13:30 |
Machine learning-enhanced overlapping Schwarz solvers
Janine Weber, Universität zu Köln, Köln, Germany
Domain decomposition methods (DDMs) are robust and parallel-efficient iterative solvers for discretized partial differential equations. However, the convergence rate of classical DDMs deteriorates for coefficient distributions with large contrasts. To retain robustness for such problems, the coarse space of the DDM can be enriched by additional coarse basis functions, often obtained by solving local generalized eigenvalue problems. Within overlapping Schwarz methods, we consider the AGDSW (adaptive generalized Dryja-Smith-Widlund) coarse space to obtain robustness. However, the computation of the AGDSW coarse basis functions is computationally expensive due to the solution of many local eigenvalue problems. In this talk, we train a surrogate model based on a deep feedforward regression neural network which directly learns the necessary coarse basis functions. Additionally, we present first results where we also replace the solution of local subdomain problems by surrogate models.
This talk is based on joint work with Axel Klawonn and Martin Lanser, University of Cologne, Germany.
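As a rough illustration of the kind of surrogate described above, a minimal sketch of a deep feedforward regression network mapping a sampled coefficient distribution to nodal values of a coarse basis function; input/output sizes, layer widths, and the data representation are hypothetical and not necessarily those used in the talk.

# Illustrative sketch only: feedforward regression surrogate for coarse basis functions.
import torch
import torch.nn as nn

class CoarseBasisSurrogate(nn.Module):
    def __init__(self, n_coeff_samples=64, n_basis_values=32, width=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_coeff_samples, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, n_basis_values),
        )

    def forward(self, coeff):          # coeff: (batch, n_coeff_samples)
        return self.net(coeff)         # predicted coarse basis values

# Supervised training against basis functions computed offline by the
# eigenvalue-based adaptive method (placeholder data shown here):
model = CoarseBasisSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
coeff_batch = torch.rand(16, 64)       # placeholder coefficient samples
target_basis = torch.rand(16, 32)      # placeholder AGDSW basis values
opt.zero_grad()
loss = loss_fn(model(coeff_batch), target_basis)
loss.backward()
opt.step()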
|
14:00 |
Application of deep operator networks as surrogate models for simulations of slurry flow in chemical mechanical polishing
Georg Winkler, TU Chemnitz, Chemnitz, Germany
Chemical Mechanical Polishing (CMP) is a key process for achieving planarity in semiconductor wafer manufacturing. As modern chips consist of multiple stacked layers, surface irregularities can lead to performance reductions or complete wafer losses, directly impacting production yield. Reliable estimations of planarity are, therefore, crucial for the optimization and control of the CMP process.
The combination of the chemical interaction between the slurry's ingredients and the mechanical removal mechanisms driven by small but firm abrasives is one primary driver of material removal. Predicting the slurry flow and leveraging known empirical relations involving the resulting shear stress at the wafer surface leads to accurate estimates of the material removal rate (MRR), which serves as an indirect measure of planarity.
Finite Element Methods (FEM) enable precise modeling of the incompressible Navier–Stokes equations in complex geometries but are computationally expensive and thus impractical for the search for optimal process parameters or for real-time control. This presentation explores the use of machine learning-based surrogate models to replace direct FEM simulations. In particular, Deep Operator Networks (DeepONets) and various extensions allow mappings from boundary and geometry conditions to full-field flow solutions. The presentation covers surrogate models applied to a canonical flow problem, the Kármán vortex street, and the transfer to slurry flow prediction in CMP. Results demonstrate that DeepONets can provide fast and sufficiently accurate approximations of the flow fields, enabling accelerated parameter studies.
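For orientation, an illustrative example of the kind of empirical relation mentioned above (not necessarily the model used in this work): a Preston-type law relates the removal rate to contact pressure $p$ and relative velocity $v$, and shear-stress-based variants replace the product $p\,v$ by the wall shear stress $\tau_w$, i.e. $\mathrm{MRR} = K_P\, p\, v$ or $\mathrm{MRR} = C\, \tau_w$, with empirically fitted constants $K_P$ and $C$.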
|
14:30 |
Can physics-informed machine learning be made easier?
Chinmay Data, TU München, München, Germany
Approximating solutions to time-dependent Partial Differential Equations (PDEs) is a central challenge in computational science. While Physics-Informed Neural Networks (PINNs) offer a flexible, mesh-free framework using automatic differentiation, their reliance on gradient-based backpropagation, especially when treating time as a spatial variable, often leads to poor accuracy and slow training. In this talk, we will discuss a backpropagation-free method for training PINNs that separates space and time variables and randomly samples hidden-layer parameters. We will discuss how our approach achieves superior accuracy and speed, often by 1–5 orders of magnitude, compared to conventional PINNs.
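A minimal numpy sketch of the general idea of randomly sampling hidden-layer parameters and solving only for the linear output weights, shown here for the toy ODE $u'=-u$, $u(0)=1$ standing in for a PDE; all details are illustrative and not the specific method of the talk.

import numpy as np

# Random-feature ansatz u(t) = sum_k c_k * tanh(w_k t + b_k):
rng = np.random.default_rng(0)
K = 200
w = rng.normal(0.0, 5.0, K)            # randomly sampled hidden weights
b = rng.normal(0.0, 5.0, K)            # randomly sampled hidden biases

t = np.linspace(0.0, 1.0, 100)[:, None]           # collocation points
phi = np.tanh(w * t + b)                           # (100, K) features
dphi = w * (1.0 - phi ** 2)                        # d/dt tanh(w t + b)

# Enforce the residual u' + u = 0 at the collocation points and u(0) = 1
# as a single linear least-squares problem for the output weights c
# (no gradient-based backpropagation involved):
A = np.vstack([dphi + phi, np.tanh(b)[None, :]])
rhs = np.concatenate([np.zeros(len(t)), [1.0]])
c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

u = phi @ c                                        # approximate solution
print(np.max(np.abs(u - np.exp(-t[:, 0]))))        # error vs exact e^{-t}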
|
15:00 |
Deep Operator Networks: Some Insights into the Learning Process and the Role of Localization
Alexander Heinlein, TU Delft, Delft, The Netherlands
Operator learning refers to the use of machine learning to approximate operators, that is, maps between function spaces. A typical example is learning the solution operator of initial-boundary value problems. When neural networks are used for this purpose, the resulting models are known as neural operators. Arguably the most prominent neural operator architectures are Fourier neural operators (FNOs), introduced by Li et al. (2020), and deep operator networks (DeepONets), introduced by Lu et al. (2019). This talk considers both data-driven and physics-informed training of neural operators; the focus is therefore on DeepONets, as they naturally allow for incorporating physics-informed loss terms; see Goswami et al. (2022).
The DeepONet architecture consists of two neural networks: the branch net, which encodes the parametrization of the initial-boundary value problem, and the trunk net, which learns spatio-temporal features (or basis functions) to represent the solution. These networks are combined via a dot product to form the DeepONet output. Training DeepONets can be challenging and often requires a large amount of data.
The presentation addresses two main topics. First, it investigates how DeepONets approximate solutions and how the branch and trunk networks contribute to the overall error, particularly in data-driven training. Second, it examines how localization through domain decomposition can enhance the learning of fine-scale features.
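The branch-trunk combination described above corresponds to the dot product $G(u)(y) \approx \sum_{k=1}^{p} b_k(u)\, t_k(y)$, where $b_k$ are branch outputs and $t_k$ are trunk outputs. A minimal sketch of this structure follows; dimensions and names are illustrative, not those of any specific model from the talk.

import torch
import torch.nn as nn

class DeepONet(nn.Module):
    # Branch net encodes the input function (e.g., boundary data sampled at
    # fixed sensor points); trunk net encodes the query coordinates (x, t).
    def __init__(self, n_sensors=100, coord_dim=2, p=64, width=128):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Linear(n_sensors, width), nn.Tanh(),
            nn.Linear(width, p),
        )
        self.trunk = nn.Sequential(
            nn.Linear(coord_dim, width), nn.Tanh(),
            nn.Linear(width, p),
        )

    def forward(self, u_sensors, coords):
        b = self.branch(u_sensors)        # (batch, p)
        t = self.trunk(coords)            # (batch, p)
        return (b * t).sum(dim=-1)        # dot product -> scalar field value

model = DeepONet()
u = torch.rand(8, 100)                    # sampled input functions
xy = torch.rand(8, 2)                     # query coordinates
pred = model(u, xy)                       # predicted solution values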
|
16:00 |
Machine Learning Approximation of Semi-concave Functions with Applications to Optimal Feedback Control
Karl Kunisch, Universität Graz, Graz, Austria
A parametrization is proposed in conjunction with a learning method which preserves the property of semiconcavity of the approximated function. The method utilizes the minimization of a $W^{1,1}$ loss together with a penalty functional which controls the semiconcavity constant of the approximation. Convergence issues are addressed. The importance of these results is related to optimal feedback control, where the feedback function is merely semiconcave and not necessarily smooth. For the implementation, a setting utilizing Legendre polynomials is introduced. Numerical experiments demonstrate the strength of the method.
This is joint work with Donato Vasquez-Varas, Univ. Santiago de Chile.
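An illustrative sketch of how such a construction can be written, with notation chosen here and not necessarily the precise formulation of the talk: a function $f$ is semiconcave with constant $C$ if $x \mapsto f(x) - \tfrac{C}{2}|x|^2$ is concave, and a learning problem of the described type could take the form $\min_\theta \|f_\theta - f\|_{W^{1,1}} + \beta\, P_C(f_\theta)$, where $P_C$ penalizes violations of $D^2 f_\theta \preceq C\, I$ and $\beta > 0$ weights the penalty.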
|
16:30 |
The Price of Robustness: Stable Classifiers Need Overparameterization
Adalbert Fono, LMU München, München, Germany
In this talk, we discuss class stability, the expected distance of an input to the decision boundary, and show that it captures what classical model capacity measures, such as weight norms, fail to explain. In particular, we prove a generalization bound that improves inversely with the class stability. As a corollary, by interpreting class stability as a quantifiable notion of robustness, we derive a law of robustness for classification, thereby extending a similar result of Bubeck and Sellke beyond smoothness assumptions to discontinuous functions: any interpolating model on n data points with p ~ n parameters must be unstable, so high stability requires significant overparameterization. Preliminary experiments support our theory: empirical stability increases with model size, while norm-based measures remain uninformative.
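In symbols (a straightforward formalization of the definition stated in the abstract, with notation chosen here for illustration), the class stability of a classifier $f$ over a data distribution $\mathcal{D}$ can be written as $S(f) = \mathbb{E}_{x \sim \mathcal{D}}\!\left[\operatorname{dist}(x, \partial f)\right]$, where $\partial f$ denotes the decision boundary of $f$; the generalization bound then improves as $S(f)$ grows.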
|
17:00 |
Computing the Posterior Variance of Separable Multi-Output Gaussian Processes
Sebastian Esche, TU Chemnitz, Chemnitz, Germany
In this talk, we address the computation of the posterior variance in Gaussian processes defined over graph-time domains, also known as Multi-Output Gaussian processes (MOGP). We assume that the covariance function is separable, meaning it can be written as the product of a spatial (graph) and a temporal covariance function. This structure induces a Kronecker product form in the covariance matrix of the MOGP. The posterior variance is encoded in the diagonal of the posterior covariance matrix. However, for reasonably large graphs and time sets, computing this matrix explicitly becomes infeasible due to memory and computational constraints. Therefore, we focus on efficient methods for estimating its diagonal entries.
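A sketch of one possible approach along these lines (illustrative, not necessarily the method presented in the talk): apply the covariance through its Kronecker factors without ever forming the full matrix, and estimate the diagonal of the posterior covariance stochastically from matrix-vector products; kernel choices and sizes below are toy placeholders.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Separable covariance K = K_graph (n x n) kron K_time (m x m); never formed.
rng = np.random.default_rng(0)
n, m, sigma2 = 40, 150, 1e-2
xg = rng.random((n, 1)); K_graph = np.exp(-(xg - xg.T) ** 2)        # toy graph kernel
xt = np.linspace(0, 1, m)[:, None]; K_time = np.exp(-50 * (xt - xt.T) ** 2)

def K_matvec(v):
    # (A kron B) v = vec(A V B^T) with V = v reshaped to (n, m), row-major
    V = np.ravel(v).reshape(n, m)
    return (K_graph @ V @ K_time.T).ravel()

N = n * m
K_noisy = LinearOperator((N, N), matvec=lambda v: K_matvec(v) + sigma2 * np.ravel(v))

def posterior_cov_matvec(v):
    # Sigma_post v = K v - K (K + sigma^2 I)^{-1} K v, with CG for the inner solve
    Kv = K_matvec(v)
    w, _ = cg(K_noisy, Kv, maxiter=200)
    return Kv - K_matvec(w)

# Stochastic (Hutchinson-type) estimate of the diagonal from matvecs only:
diag_est = np.zeros(N)
n_probes = 30
for _ in range(n_probes):
    z = rng.choice([-1.0, 1.0], size=N)           # Rademacher probe vector
    diag_est += z * posterior_cov_matvec(z)
diag_est /= n_probes                               # approx. posterior variances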
|
17:30 |
DEALing with Image Reconstruction: Deep Attentive Least Squares
Erich Kobler, Johannes-Kepler Universität, Linz, Austria
State-of-the-art image reconstruction often relies on complex, highly parameterized deep architectures. We propose an alternative: a data-driven reconstruction method inspired by the classic Tikhonov regularization. Our approach iteratively refines intermediate reconstructions by solving a sequence of quadratic problems. These updates have two key components: (i) learned filters to extract salient image features, and (ii) an attention mechanism that locally adjusts the penalty of filter responses. Thereby, it achieves performance on par with leading plug-and-play and learned regularizer approaches while offering interpretability, robustness, and convergent behavior. In effect, we bridge traditional regularization and deep learning with a principled reconstruction approach.
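A hedged sketch of the kind of quadratic update described above, with notation chosen here for illustration: given the current reconstruction $x^k$, learned filters $F_i$, and attention-derived local weights collected in diagonal matrices $\Lambda_i^k \succeq 0$, the next iterate solves $x^{k+1} = \arg\min_x \tfrac{1}{2}\|A x - y\|^2 + \tfrac{1}{2}\sum_i (F_i x)^\top \Lambda_i^k (F_i x)$, whose normal equations $\big(A^\top A + \sum_i F_i^\top \Lambda_i^k F_i\big) x = A^\top y$ form a linear system that can be solved, e.g., by conjugate gradients.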
|