Neural Network Approximation: Three Hidden Layers Are Enough

October 25, 2020 · Declared Dead · 🏛 Neural Networks

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Zuowei Shen, Haizhao Yang, Shijun Zhang
arXiv ID: 2010.14075
Category: cs.LG (Machine Learning)
Cross-listed: cs.NE, stat.ML
Citations: 143
Venue: Neural Networks
Last Checked: 4 months ago
Abstract
A three-hidden-layer neural network with super approximation power is introduced. This network is built with the floor function ($\lfloor x\rfloor$), the exponential function ($2^x$), the step function ($1_{x\geq 0}$), or their compositions as the activation function in each neuron and hence we call such networks as Floor-Exponential-Step (FLES) networks. For any width hyper-parameter $N\in\mathbb{N}^+$, it is shown that FLES networks with width $\max\{d,N\}$ and three hidden layers can uniformly approximate a Hölder continuous function $f$ on $[0,1]^d$ with an exponential approximation rate $3\lambda(2\sqrt{d})^{\alpha} 2^{-\alpha N}$, where $\alpha\in(0,1]$ and $\lambda>0$ are the Hölder order and constant, respectively. More generally for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $2\omega_f(2\sqrt{d}){2^{-N}}+\omega_f(2\sqrt{d}\,2^{-N})$. Moreover, we extend such a result to general bounded continuous functions on a bounded set $E\subseteq\mathbb{R}^d$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $\omega_f(r)$ as $r\rightarrow 0$ is moderate (e.g., $\omega_f(r)\lesssim r^{\alpha}$ for Hölder continuous functions), since the major term to be concerned in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ independent of $d$ within the modulus of continuity. Finally, we extend our analysis to derive similar approximation results in the $L^p$-norm for $p\in[1,\infty)$ via replacing Floor-Exponential-Step activation functions by continuous activation functions.
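Since no code accompanies the paper, the following is only a minimal sketch for getting a feel for the numbers in the abstract. It is not the authors' implementation and does not construct a FLES network. It evaluates the three activation functions named above and the two stated error bounds. All function names (`floor_act`, `exp2_act`, `step_act`, `holder_bound`, `general_bound`) are hypothetical and chosen here for illustration.

```python
import math


# The three FLES activations as named in the abstract (scalar versions).
def floor_act(x):
    """Floor function: largest integer <= x, returned as a float."""
    return float(math.floor(x))


def exp2_act(x):
    """Exponential function 2^x."""
    return 2.0 ** x


def step_act(x):
    """Step function 1_{x >= 0}."""
    return 1.0 if x >= 0 else 0.0


def holder_bound(N, d, alpha, lam):
    """Stated bound for Hoelder continuous f: 3*lam*(2*sqrt(d))^alpha * 2^(-alpha*N)."""
    if N < 1 or d < 1 or not (0 < alpha <= 1) or lam <= 0:
        raise ValueError("need N >= 1, d >= 1, 0 < alpha <= 1, lam > 0")
    return 3.0 * lam * (2.0 * math.sqrt(d)) ** alpha * 2.0 ** (-alpha * N)


def general_bound(omega, N, d):
    """Stated bound for continuous f with modulus of continuity omega:
    2*omega(2*sqrt(d))*2^(-N) + omega(2*sqrt(d)*2^(-N))."""
    if N < 1 or d < 1:
        raise ValueError("need N >= 1, d >= 1")
    r = 2.0 * math.sqrt(d)
    return 2.0 * omega(r) * 2.0 ** (-N) + omega(r * 2.0 ** (-N))


# Example: Lipschitz case (alpha = 1, lam = 1) in dimension d = 100.
for N in (5, 10, 20):
    print(N, holder_bound(N, d=100, alpha=1.0, lam=1.0))
```

For a Hölder modulus $\omega_f(r)=\lambda r^{\alpha}$ the general bound is never larger than the Hölder bound, because $2^{-N}\le 2^{-\alpha N}$ when $\alpha\le 1$. The dimension enters only through the factor $(2\sqrt{d})^{\alpha}$, while the decay in $N$ is exponential, which is the sense in which the abstract says the curse of dimensionality is avoided.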