Conference Agenda
Overview and details of the sessions of this conference. Select a date or location to show only the sessions on that day or at that location. Select a single session for a detailed view (with abstracts and downloads, if available).
The location lists the building first, followed by the room number.
Click on "Floor plan" to orient yourself in the buildings and on the campus.
Session Overview
S 1 (3): Machine Learning
Session Topics: 1. Machine Learning
Presentations
4:20 pm - 4:45 pm
Effective fluctuating continuum models for SGD with small learning rate, or in overparameterized limits
MPI MIS Leipzig & Universität Bielefeld, Germany
In this talk we present recent results on the derivation of effective models for the training dynamics of SGD in the limits of a small learning rate or of large, shallow networks. The focus lies on developing effective limiting models that also capture the fluctuations inherent in SGD. This leads to the novel concepts of stochastic modified flows and distribution-dependent modified flows. The advantage of these limiting models is that they match the SGD dynamics to higher order and recover the correct multi-point distributions.
This is joint work with Vitalii Konarovskyi and Sebastian Kassing.
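For orientation, a schematic form in the spirit of stochastic modified equations (an illustration of the type of object meant, not necessarily the exact statement of the talk): the SGD recursion with learning rate \eta and random data index \gamma_k,

    \theta_{k+1} = \theta_k - \eta\,\nabla f_{\gamma_k}(\theta_k),

is matched to higher order than by the gradient flow through a diffusion of the form

    \mathrm{d}\Theta_t = -\nabla\Bigl(f(\Theta_t) + \tfrac{\eta}{4}\,\lVert\nabla f(\Theta_t)\rVert^2\Bigr)\,\mathrm{d}t + \sqrt{\eta}\,\Sigma(\Theta_t)^{1/2}\,\mathrm{d}W_t,

where \Sigma denotes the covariance of the gradient noise and W is a Wiener process.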
4:45 pm - 5:10 pm
Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization
TU Darmstadt, Germany; Concordia University, Canada
Image classification from independent and identically distributed random variables is considered. Image classifiers are defined that are based on a linear combination of deep convolutional networks with a max-pooling layer, where all weights are learned by stochastic gradient descent.
A general result is presented which shows that these image classifiers are able to approximate the best possible deep convolutional network. If the a posteriori probability satisfies a suitable hierarchical composition model, it is shown that the corresponding deep convolutional neural network image classifier achieves a rate of convergence that is independent of the dimension of the images.
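A minimal sketch of the estimator class described above, assuming PyTorch; the number of networks K, the depth, and the channel counts are illustrative placeholders, not the choices of the paper:

    import torch
    import torch.nn as nn

    class ConvNet(nn.Module):
        """One small convolutional network ending in global max-pooling."""
        def __init__(self, channels=8, depth=3):
            super().__init__()
            layers, c_in = [], 1
            for _ in range(depth):
                layers += [nn.Conv2d(c_in, channels, 3, padding=1), nn.ReLU()]
                c_in = channels
            self.features = nn.Sequential(*layers)

        def forward(self, x):                      # x: (batch, 1, H, W)
            h = self.features(x)
            return h.amax(dim=(2, 3)).sum(dim=1)   # global max-pooling, scalar output

    class LinearCombination(nn.Module):
        """Linear combination of K convolutional networks; all weights,
        including the mixture coefficients a, are learned by SGD."""
        def __init__(self, K=5):
            super().__init__()
            self.nets = nn.ModuleList(ConvNet() for _ in range(K))
            self.a = nn.Parameter(torch.full((K,), 1.0 / K))

        def forward(self, x):
            return sum(a_k * net(x) for a_k, net in zip(self.a, self.nets))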
5:10 pm - 5:35 pm
Optimal Rates for Forward Gradient Descent based on Multiple Queries
University of Twente, The Netherlands
We investigate the prediction error of forward gradient descent based on multiple queries in the linear model. It is shown that if the number of queries is chosen suitably large, the minimax optimal rate of convergence can be achieved, matching the performance of stochastic gradient descent. The results also address the case of low-rank training data, which can be beneficial in high-dimensional problems. As forward gradient descent only requires forward passes through a network, which are feasible in the human brain, our results show that rate-optimal convergence is achievable by biologically plausible optimization methods.
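For illustration, a minimal sketch of forward gradient descent with m queries in the linear model; names, step size, and noise level are placeholders, and the directional derivative below needs only forward evaluations, no backward pass:

    import numpy as np

    def forward_gradient_step(theta, x, y, lr, m, rng):
        """One forward-gradient update for the squared loss (x.theta - y)^2."""
        residual = x @ theta - y                  # forward pass
        est = np.zeros_like(theta)
        for _ in range(m):                        # m queries
            v = rng.standard_normal(theta.size)   # random query direction
            dir_deriv = 2.0 * residual * (x @ v)  # <grad, v>, forward mode only
            est += dir_deriv * v                  # unbiased since E[v v^T] = I
        return theta - lr * est / m

    rng = np.random.default_rng(0)
    d = 20
    theta_star = rng.standard_normal(d)           # unknown regression vector
    theta = np.zeros(d)
    for _ in range(2000):
        x = rng.standard_normal(d)
        y = x @ theta_star + 0.1 * rng.standard_normal()
        theta = forward_gradient_step(theta, x, y, lr=0.01, m=10, rng=rng)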
5:35 pm - 6:00 pm
Stochastic Modified Flows for Riemannian Stochastic Gradient Descent
Universität Bielefeld, Germany; Technische Universität Berlin, Germany; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; University of York, UK
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to the Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and thereby increases the order of approximation compared to the deterministic Riemannian gradient flow. The RSGD is built using the concept of a retraction map, that is, a cost-efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
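To make the role of the retraction concrete, a hedged sketch of RSGD on the unit sphere, where the retraction is plain renormalization; the problem, step size, and noise level are illustrative (f(x) = -x^T A x, whose minimizer over the sphere is the leading eigenvector of A):

    import numpy as np

    rng = np.random.default_rng(1)
    d, lr = 10, 0.005
    A = np.diag(np.arange(d, dtype=float))     # leading eigenvector: e_{d-1}
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)                     # start on the sphere S^{d-1}
    for _ in range(2000):
        noise = rng.standard_normal((d, d))
        A_hat = A + 0.1 * (noise + noise.T)    # noisy symmetric estimate of A
        g = -2.0 * A_hat @ x                   # Euclidean gradient of -x^T A_hat x
        g_tan = g - (g @ x) * x                # project onto the tangent space at x
        x = x - lr * g_tan                     # gradient step in the tangent space
        x /= np.linalg.norm(x)                 # retraction: pull back onto the sphere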