MS13 1: Stochastic iterative methods for inverse problems
Time:
Friday, 08/Sept/2023:
1:30pm - 3:30pm
Session Chair: Tim Jahn
Location: VG0.111
Presentations
Beating the Saturation of the Stochastic Gradient Descent for Linear Inverse Problems
Bangti Jin1, Zehui Zhou2, Jun Zou1
1The Chinese University of Hong Kong; 2Rutgers University, United States of America
Stochastic gradient descent (SGD) is a promising method for solving large-scale inverse problems, due to its excellent scalability with respect to data size. Current mathematical theory, viewed through the lens of regularization theory, predicts that SGD with a polynomially decaying stepsize schedule may suffer from an undesirable saturation phenomenon, i.e., the convergence rate does not improve further once the solution regularity index exceeds a certain range. In this talk, I will present our recent results on beating this saturation phenomenon:
(i) By using a small initial step size. We derive a refined convergence rate analysis of SGD, which shows that saturation does not occur if the initial stepsize of the schedule is sufficiently small.
(ii) By using stochastic variance reduced gradient (SVRG), a popular variance reduction technique for SGD. We prove that, with a suitable constant stepsize schedule, SVRG can achieve an optimal convergence rate in terms of the noise level (under suitable regularity conditions), which means that saturation does not occur.
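As a rough illustration of the setting (not the talk's exact scheme or analysis), a minimal SGD iteration for a linear system $Ax = y$ with a polynomially decaying stepsize $\eta_k = c_0/(k+1)^\alpha$ might look as follows; all parameter names and values here are illustrative choices:

```python
import numpy as np

def sgd_linear(A, y, n_iter=5000, c0=1.0, alpha=0.5, seed=0):
    """SGD for A x = y, sampling one equation (row) per step.

    Stepsize schedule: eta_k = c0 / (k+1)**alpha (polynomially decaying).
    A toy sketch; the defaults are illustrative, not from the talk.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for k in range(n_iter):
        i = rng.integers(n)                  # pick one equation at random
        eta = c0 / (k + 1) ** alpha          # polynomially decaying stepsize
        x -= eta * (A[i] @ x - y[i]) * A[i]  # gradient of 0.5*(A_i x - y_i)^2
    return x

# toy consistent system: SGD recovers the exact solution
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
y = A @ x_true
x_hat = sgd_linear(A, y)
```

With noise-free, consistent data the stochastic gradient vanishes at the solution, so the iterate converges to `x_true`; the saturation issue discussed in the talk concerns the noisy, regularized setting.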
Early stopping of untrained convolutional networks
Tim Jahn1, Bangti Jin2
1University of Bonn, Germany; 2Chinese University of Hong Kong
In recent years, new regularisation methods based on neural networks have shown promising performance for the solution of ill-posed problems, e.g., in imaging science. Due to the non-linearity of the networks, these methods often lack profound theoretical justification. In this talk we rigorously discuss convergence for an untrained convolutional network. Untrained networks are particularly attractive for applications, since they do not require any training data; their regularising property is based solely on the architecture of the network. Because of this, appropriate early stopping is essential for the success of the method. We show that the discrepancy principle is an adequate method for early stopping here, as it yields minimax optimal convergence rates.
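The discrepancy principle itself is simple to state. The sketch below applies it to a plain Landweber iteration as a linear stand-in (not the untrained-network setting of the talk): iterate until the residual first falls below $\tau\delta$, where $\delta$ is the noise level and $\tau > 1$ a tuning constant; all names and defaults are illustrative:

```python
import numpy as np

def landweber_discrepancy(A, y_delta, delta, tau=1.1, max_iter=10000):
    """Landweber iteration stopped by the discrepancy principle:
    stop at the first k with ||A x_k - y_delta|| <= tau * delta.
    A simple linear stand-in for illustration only."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # stepsize ensuring convergence
    for k in range(max_iter):
        r = A @ x - y_delta
        if np.linalg.norm(r) <= tau * delta:  # discrepancy principle triggers
            return x, k
        x -= step * (A.T @ r)                 # one Landweber step
    return x, max_iter

# toy mildly ill-conditioned example with known noise level
A = np.diag([1.0, 0.1])
x_true = np.array([1.0, 1.0])
noise = np.array([0.01, -0.01])
y_delta = A @ x_true + noise
delta = np.linalg.norm(noise)
x_stop, k_stop = landweber_discrepancy(A, y_delta, delta)
```

Stopping early trades a small residual mismatch for stability against the noise; iterating far beyond the stopping index would start fitting the noise instead.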
Stochastic mirror descent method for linear ill-posed problems in Banach spaces
Qinian Jin
The Australian National University, Australia
Consider linear ill-posed problems governed by the system $A_i x = y_i$ for $i = 1, \cdots, p$, where each $A_i$ is a bounded linear operator from a Banach space $X$ to a Hilbert space $Y_i$. When $p$ is huge, solving the problem by an iterative regularization method that uses the whole information at each iteration step can be very expensive, due to the huge amount of memory and the excessive computational load per iteration. To solve such large-scale ill-posed systems efficiently, we develop a stochastic mirror descent method which uses only a small portion of the equations, randomly selected at each iteration step, and incorporates convex regularization terms into the algorithm design. Therefore, our method scales very well with the problem size and is capable of capturing features of the sought solutions. The convergence of the method depends crucially on the choice of step-sizes. We consider various rules for choosing step-sizes and obtain convergence results under a priori stopping rules. Furthermore, we establish an order-optimal convergence rate when the sought solution satisfies a benchmark source condition. Various numerical simulations are reported to test the performance of the method. This is joint work with Xiliang Lu and Liuying Zhang.
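To convey the flavour of such a method (this toy is not the paper's algorithm or stepsize rule): a stochastic mirror descent with the mirror map $\psi(x) = \frac{1}{2}\|x\|^2 + \lambda\|x\|_1$, whose gradient inverse is soft-thresholding, so the convex $\ell^1$ term promotes sparse solutions while each step uses only one randomly selected equation block:

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal map of lam*||.||_1, i.e. the inverse mirror map here."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def stochastic_mirror_descent(blocks, ys, lam=0.1, eta=0.5, n_iter=2000, seed=0):
    """Sketch of stochastic mirror descent with mirror map
    psi(x) = 0.5*||x||^2 + lam*||x||_1 (so x = soft_threshold(xi, lam)).
    Each step uses one randomly chosen block (A_i, y_i).
    Parameters (eta, lam) are illustrative, not from the talk."""
    rng = np.random.default_rng(seed)
    d = blocks[0].shape[1]
    xi = np.zeros(d)                 # dual (mirror) variable
    x = soft_threshold(xi, lam)      # primal iterate
    for _ in range(n_iter):
        i = rng.integers(len(blocks))
        g = blocks[i].T @ (blocks[i] @ x - ys[i])  # stochastic gradient
        xi -= eta * g                # mirror-descent step in the dual
        x = soft_threshold(xi, lam)  # map back to the primal
    return x

# toy sparse example: three 1x3 row blocks of the identity
A = np.eye(3)
blocks = [A[i:i + 1] for i in range(3)]
x_true = np.array([1.0, 0.0, 0.0])
ys = [blocks[i] @ x_true for i in range(3)]
x_hat = stochastic_mirror_descent(blocks, ys)
```

The soft-thresholding step is what lets the iterates capture the sparsity of the sought solution, which a plain stochastic gradient step would not enforce.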
Early stopping for spectral filter estimators regularized by projection
When using iterative algorithms such as gradient descent, a classical problem is the choice of the number of iterations to perform in practice. This is a crucial question since the number of iterations determines the final statistical performance of the resulting estimator.
The main purpose of this talk is to design such data-driven stopping rules, called "early stopping rules" (ESR), that answer the above question not only for gradient descent but also for the broader class of spectral filter estimators, which includes, for instance, ridge regression.
Compared to previous works in this direction, the present contribution focuses on the computational issue raised by the use of spectral filter estimators in the context of a huge amount of data. In particular, this requires the additional use of regularization-by-projection techniques for efficiently computing (approximations to) the spectral filter estimators.
In this talk we develop a theoretical analysis of the behavior of these projection-based spectral filter (PSF) estimators. Oracle inequalities also quantify the performance of the data-driven early stopping rule applied to these PSF estimators.
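A minimal sketch of the combination, with illustrative names and parameters (not the talk's exact estimator or rule): compute a gradient-descent spectral filter on a rank-$m$ projection of the operator, and stop via a residual-based (discrepancy-type) early stopping rule:

```python
import numpy as np

def psf_gd_early_stopped(A, y_delta, delta, m, eta=None, tau=1.1, max_t=10000):
    """Gradient-descent spectral filter computed on a rank-m projection of A
    (truncated SVD), stopped by a residual-based early stopping rule.
    Names and defaults are an illustrative reading, not the talk's method."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U, s, Vt = U[:, :m], s[:m], Vt[:m]    # regularization by projection
    b = U.T @ y_delta                     # projected data
    if eta is None:
        eta = 1.0 / s[0] ** 2             # stepsize ensuring stability
    c = np.zeros(m)                       # filtered spectral coefficients
    for t in range(1, max_t + 1):
        c += eta * s * (b - s * c)        # one gradient step, spectral coords
        if np.linalg.norm(b - s * c) <= tau * delta:  # early stopping rule
            break
    return Vt.T @ c, t

# toy example: diagonal operator with mildly noisy data
A = np.diag([1.0, 0.5, 0.1])
x_true = np.array([1.0, 1.0, 0.0])
noise = np.array([0.005, -0.005, 0.0])
y_delta = A @ x_true + noise
delta = np.linalg.norm(noise)
x_hat, t_stop = psf_gd_early_stopped(A, y_delta, delta, m=2)
```

The rank-$m$ projection is what makes the filter cheap for large data: all iterations run in the $m$-dimensional spectral coordinates rather than the full space.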