M05 Approximation, Sampling, and Learning in High Dimensions
Organized by Ahmed Abdeljawad (Linz), Anupam Gumber (Wien)
This mini-symposium will explore the latest developments in approximation, sampling, and learning techniques for high-dimensional data, with a special focus on the interplay between theoretical insights and practical applications. Key topics include the approximation capabilities of neural networks and gradient descent algorithms, as well as the stability and robustness analysis of mathematical and statistical models. The main objective is to discuss emerging methodologies that address the challenges posed by the high dimensionality of modern data. The symposium will highlight data-driven approaches, such as sampling techniques, learning-based methods, and operator learning, tailored to high-dimensional problems. It aims to connect mathematical theory with real-world applications in areas such as signal and image processing, medical imaging, partial differential equations, and finance. By bringing together experts in these fields, the session seeks to foster interdisciplinary collaboration, discuss recent advancements and challenges, and showcase innovative strategies for analyzing and modeling complex data.
Part 1: Tuesday 10:30–12:30 S2 120
10:30
Approximation error bounds for quantum neural networks
Lukas Gonon, School of Computer Science, University of St. Gallen, St. Gallen, Switzerland
Quantum neural networks have recently emerged as a novel machine learning tool, suitable for implementation on quantum computing hardware. In this talk, we present quantum neural network expressivity bounds for functions with sufficient Fourier integrability, reminiscent of Barron-type bounds for classical neural networks. In particular, we provide a bound on the quantum circuit complexity required to achieve a prescribed approximation accuracy. We show that for integrable functions with integrable Fourier transform, the required circuit size grows only logarithmically in the reciprocal of the desired approximation accuracy. Our proof is based on probabilistic arguments, and the obtained approximation rates are free from the curse of dimensionality. As an application example, we consider option pricing in exponential Lévy models.
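For orientation, the classical Barron-type condition alluded to here (in our notation, not taken from the abstract) requires finiteness of the first Fourier moment,
\[ C_f \;=\; \int_{\mathbb{R}^d} |\xi|\,\bigl|\hat f(\xi)\bigr|\,\mathrm{d}\xi \;<\; \infty, \]
under which classical shallow networks with $n$ neurons achieve an $L^2$ approximation error of order $C_f/\sqrt{n}$ on a bounded domain (up to domain-dependent constants), i.e., a rate whose exponent does not deteriorate with the dimension $d$.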
11:00
Stochastic Modified Flows for Riemannian Stochastic Gradient Descent
Nimit Rana, Mathematical Finance and Stochastic Analysis Group, Department of Mathematics, University of York, York, United Kingdom
We provide quantitative estimates for the convergence rate of Riemannian stochastic gradient descent (RSGD) to both Riemannian gradient flow and the Riemannian stochastic modified flow (RSMF), a diffusion process capturing RSGD's random fluctuations. Using stochastic differential geometry, we show that in the small learning rate regime, RSGD can be approximated by the RSMF, driven by an infinite-dimensional Wiener process, offering higher accuracy than the deterministic gradient flow. Under assumptions on the retraction map, manifold geometry, and gradient estimators, we derive weak error bounds for the diffusion approximation. This is joint work with Benjamin Gess and Sebastian Kassing from TU Berlin, Germany.
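As a point of reference for readers, the simplest Euclidean diffusion approximation of SGD with learning rate $\eta$ can be sketched as follows (our notation; the Riemannian construction of the talk, driven by an infinite-dimensional Wiener process, is more involved):
\[ \mathrm{d}X_t \;=\; -\nabla f(X_t)\,\mathrm{d}t \;+\; \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t, \]
where $\Sigma(x)$ denotes the covariance of the stochastic gradient estimator at $x$; dropping the noise term yields the deterministic gradient flow, and stochastic modified flows refine this picture by $\eta$-dependent corrections so as to capture the fluctuations of SGD to higher weak order.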
11:30
Subspace estimation from quantized samples
Johannes Maly, Mathematical Data Science and Artificial Intelligence Group, Ludwig-Maximilians-University of Munich, Munich, Germany
We study subspace estimation from coarsely quantized data. In particular, we analyze two stochastic quantization schemes which use dithering: a one-bit quantizer combined with rectangular dither and a multi-bit quantizer with triangular dither. For each quantizer, we derive rigorous high probability bounds for the distances between the true and estimated signal subspaces. Using our analysis, we identify scenarios in which subspace estimation via triangular dithering qualitatively outperforms rectangular dithering. We verify in numerical simulations that our estimates are optimal in their dependence on the smallest non-zero eigenvalue of the target matrix, and apply them to spectral estimation. This is joint work with S. Dirksen and W. Li.
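As a minimal illustration of the dithering idea (our notation and normalization, assuming scalar samples $y_i$ and a quantization level $\lambda > 0$ with $|y_i| \le \lambda$), the one-bit quantizer with rectangular dither outputs
\[ q_i \;=\; \operatorname{sign}(y_i + \tau_i), \qquad \tau_i \sim \mathrm{Unif}([-\lambda,\lambda]) \ \text{i.i.d.}, \]
so that $\mathbb{E}[\lambda\, q_i \mid y_i] = y_i$, i.e., the rescaled one-bit measurements are unbiased; the multi-bit scheme studied in the talk instead combines a uniform quantizer with triangular dither.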
12:00
Explaining the impact of data augmentation with Quantum Harmonic Analysis (QHA) Tools
Monika Dörfler, Institute of Mathematics, University of Vienna, Vienna, Austria
Data that we strive to use, interpret, or classify usually undergoes some kind of representation; that is, each data point is written in some dictionary of building blocks by means of coefficients. The building blocks may or may not be derived from the data themselves, they may or may not be physically meaningful, and they may or may not depend on a few fixed parameters. In any case, we are often interested in using augmentation, which can happen along the dimensions given by the representation coefficients. Augmentation is a well-established and successful tool in many data science applications; however, it is often heuristic and not well studied mathematically. We give new insights into the impact of augmentation, which we interpret as a generalized convolution in a QHA context. Using surprising insights from QHA, we provide a possible explanation of the benefits of augmentation. (Joint work with Franz Luef and Henry McNulty.)
Part 2: Tuesday 15:30–17:30 S2 120
15:30
Approximation Rates for Neural Networks and Barron Spaces: Weighted Spaces, Unbounded Domains, and Further Generalizations
Thomas Dittrich, Mathematical Data Science, Johann Radon Institute for Computational and Applied Mathematics (RICAM), Linz, Austria
The universal approximation theorem for neural networks states that every continuous function can be approximated arbitrarily well in the uniform norm on compact sets, provided the network size is sufficiently large. Following the discovery of this general density statement, there has been an ever-increasing interest in precise approximation rates. More specifically: Which settings allow approximation without the curse of dimensionality, i.e., with a dimension-independent rate? A very general class of functions that can be approximated without the curse of dimensionality is given by the spectral Barron spaces. I will first review the history and some of the most important approximation results for these spaces. Second, I will focus on the recent rise of machine learning for partial differential equations. Namely, I will discuss our contribution to identifying more general approximation norms and topologies in which neural networks can approximate Barron functions without the curse of dimensionality.
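For reference, the spectral Barron space of smoothness $s \ge 0$ mentioned here is commonly defined (in our notation) through the norm
\[ \|f\|_{\mathcal{B}^s} \;=\; \int_{\mathbb{R}^d} \bigl(1 + |\xi|\bigr)^{s}\,\bigl|\hat f(\xi)\bigr|\,\mathrm{d}\xi, \]
with $s = 1$ essentially recovering Barron's classical condition; dimension-independent approximation rates for shallow networks are typically formulated for functions with finite norm of this type.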
16:00
Rate-distortion optimal approximation
Dennis Elbrächter, Department of Mathematics, University of Vienna, Vienna, Austria
We will present neural network approximation results with an information theoretic flavour. Specifically, we will require that a "size" M network may, canonically, be represented as a bit-string of length M (up to polylogarithmic factors). This is based on adapting David Donoho's notion of "best M-term approximation rate under polynomial depth search constraint" from the dictionary setting to the neural network setting. In particular, this allows us to compare the asymptotic expressivity of ReLU neural networks to that of a wide range of dictionaries, and thereby establish optimality for various classically studied function classes.
16:30
On Finite Sample Bounds for Operator Learning with Kernel Methods
Markus Holzleitner, Machine Learning Genoa (MaLGa) Center, Department of Mathematics, University of Genoa, Genoa, Italy
In this talk, we will explore explicit connections between specific operator learning problems and kernel methods. We will examine kernel constructions that are both precisely aligned with the target operator and capable of yielding meaningful and interpretable finite-sample rates. Our primary focus is on problems related to elliptic operators, and we will also present empirical evidence to support our findings.
17:00
On $t$-design curves and high-dimensional data representations
Clemens Karner, University of Vienna/Medical University of Vienna, Vienna, Austria
Spherical $t$-design curves have the special property that path integrals along them integrate algebraic polynomials of total degree $t$ exactly over the sphere. In classical $t$-design theory, which is based on point evaluations, a multitude of constructed $t$-design point configurations are known; in comparison, only a few $t$-design curves have been discovered. We present a novel approach for identifying such spherical $t$-design curves based on geodesic chains, with applications to dimension reduction for visualization and information extraction tasks. Furthermore, we study the analogous concept of $t$-design curves on the Grassmannian manifold. This enables us to optimize along such a curve on the Grassmannian to obtain projections that yield lower-dimensional representations of high-dimensional data for visualization or subsequent downstream tasks. Moreover, such a curve of projections can be interpreted as a "movie" in lower dimensions, a tool we claim to be valuable for data visualization.
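In formulas (our notation, consistent with the description above), a closed curve $\gamma$ on the sphere $\mathbb{S}^d$ of length $\ell(\gamma)$ is a spherical $t$-design curve if
\[ \frac{1}{\ell(\gamma)} \int_{\gamma} p \,\mathrm{d}s \;=\; \frac{1}{|\mathbb{S}^d|} \int_{\mathbb{S}^d} p \,\mathrm{d}\sigma \qquad \text{for all polynomials $p$ of degree} \le t, \]
i.e., the normalized path integral reproduces the uniform spherical average exactly; the Grassmannian analogue replaces the sphere by the Grassmannian equipped with its invariant measure.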