PD3O formulations and Stochastic Optimisation #2021

epapoutsellis · 2024-12-19T15:38:51Z

In the update method of the PD3O algorithm, the gradient of the function f is evaluated twice:

CIL/Wrappers/Python/cil/optimisation/algorithms/PD3O.py

Line 121 in d58c330

self.f.gradient(self.x_old, out=self.grad_f)

CIL/Wrappers/Python/cil/optimisation/algorithms/PD3O.py

Line 128 in d58c330

self.f.gradient(self.x, out=self.x_old)

If f is an instance of ApproximateGradientSumFunction, then every call to f.gradient calls approximate_gradient and also:
a) selects a function number using the sampler,
b) updates the data_passes attribute.

CIL/Wrappers/Python/cil/optimisation/functions/ApproximateGradientSumFunction.py

Lines 182 to 184 in d58c330

    
self.function_num = self.sampler.next()
self._update_data_passes_indices([self.function_num])
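
The effect of calling this twice per iteration can be sketched with a toy stand-in. This is not CIL code: `ToySumFunction`, its `selected` and `data_passes` fields, the sequential sampler and the `(x, function_num)` argument order are all hypothetical, and only the sample-then-record pattern is taken from the snippet above.

```python
from itertools import cycle


class ToySumFunction:
    """Toy stand-in (not CIL code) for a sum of f_i(x) = 0.5 * (x - c_i)**2."""

    def __init__(self, centres):
        self.centres = list(centres)
        self.sampler = cycle(range(len(self.centres)))  # hypothetical sequential sampler
        self.function_num = None
        self.selected = []       # history of sampled indices
        self.data_passes = 0.0   # fraction of the data counted so far

    def approximate_gradient(self, x, function_num):
        # Gradient of the single term f_i at x; no sampling, no bookkeeping.
        return x - self.centres[function_num]

    def gradient(self, x):
        # Same pattern as the quoted lines: sample first, then record the pass.
        self.function_num = next(self.sampler)
        self.selected.append(self.function_num)
        self.data_passes += 1.0 / len(self.centres)
        return self.approximate_gradient(x, self.function_num)


f = ToySumFunction(centres=[0.0, 1.0, 2.0, 3.0])

# One "iteration" that evaluates the gradient twice, as PD3O.update does.
g_old = f.gradient(1.0)
g_new = f.gradient(1.5)

print(f.selected)      # two different indices were drawn in one iteration
print(f.data_passes)   # two terms were counted for that single iteration
```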

The implementation that we have is based on equations (5a)-(5c) from https://arxiv.org/pdf/1611.09805 and NOT on (4a)-(4c).

Both formulations are equivalent (I think I have an implementation of (4a)-(4c)) and are used in the paper to derive special cases of the PD3O algorithm, such as PDHG and PAPC.

The stochastic version of PD3O proposed in https://arxiv.org/pdf/2004.02635 follows the (4a-4c) formulation where the gradient of f is computed ONCE at x^{k}.

When we compute the gradient the second time

CIL/Wrappers/Python/cil/optimisation/algorithms/PD3O.py

Line 128 in d58c330

self.f.gradient(self.x, out=self.x_old)

another function is selected and data_passes is updated a second time, which is wrong. For instance, if $f_{5}$ is selected in the first call, we compute $\nabla f_{5}(x_{k})$, and the second call may then compute $\nabla f_{9}(x_{k+1})$. So far I have not seen any actual convergence issue, but the data passes are certainly wrong.

Actually, I need to check the update method again, because its order of steps differs both from the paper above and from the one I have implemented here.


MargaretDuff · 2025-01-15T15:47:40Z

Discussed this today with @epapoutsellis, @paskino and @jakobsj - thanks @epapoutsellis for all your work on this.

To summarise the discussion, we identified the major issue: in our implementation of PD3O the gradient of $f$ is calculated twice per iteration, once as $\nabla f(x)$ and once as $\nabla f(x^\dagger)$.

When doing stochastic PD3O, for both calls we should use the same approximation of the gradient (i.e. for SGD the approximate gradient should be calculated on the same subset).
As a secondary, more minor problem, making two calls to the gradient on each iteration means that our data passes calculation may not be correct.
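
A toy illustration of why the same subset matters. How PD3O combines the two gradient evaluations is not reproduced in this thread, so this sketch simply assumes they are eventually subtracted. The quadratic terms $f_i(x) = \tfrac{1}{2}(x - c_i)^2$ and the values in `centres` are hypothetical.

```python
def grad_term(x, centre):
    """Gradient of the toy term f_i(x) = 0.5 * (x - c_i)**2."""
    return x - centre


centres = [0.0, 1.0, 2.0, 3.0]   # hypothetical data, one value per term f_i
x = 1.5                          # pretend the iterate has stopped moving

# Same term for both evaluations: the difference vanishes when x does not move.
same = grad_term(x, centres[1]) - grad_term(x, centres[1])

# Different terms (f_1 then f_3): the difference is c_3 - c_1, whatever x is.
mixed = grad_term(x, centres[1]) - grad_term(x, centres[3])

print(same, mixed)
```

With a shared index the difference depends only on how far the iterate moved. With two different indices it also carries the mismatch between the two terms.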

One solution (for SGD, SAG, and SAGA) is to replace this line in PD3O

CIL/Wrappers/Python/cil/optimisation/algorithms/PD3O.py

Line 128 in d58c330

self.f.gradient(self.x, out=self.x_old)

with

if isinstance(self.f, ApproximateGradientSumFunction):
    self.f.approximate_gradient(self.x, self.f.function_num, out=self.x_bar)
else:
    self.f.gradient(self.x, out=self.x_bar)

For SVRG and LSVRG, this will need more careful thought when dealing with full gradient snapshot updates and on calculating data passes (as this is done in the approximate_gradient function).
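
The replacement proposed for SGD, SAG and SAGA can be sketched on a toy stand-in. This is a minimal sketch and not the CIL implementation: `ToySumFunction`, `two_gradients`, the `num_samples` counter and the `(x, function_num)` argument order are assumptions made for illustration.

```python
from itertools import cycle


class ToySumFunction:
    """Toy stand-in (not CIL code) for a stochastic sum function."""

    def __init__(self, centres):
        self.centres = list(centres)
        self.sampler = cycle(range(len(self.centres)))  # hypothetical sampler
        self.function_num = None
        self.num_samples = 0     # how many times an index was drawn

    def approximate_gradient(self, x, function_num):
        # Gradient of the chosen term only; does not sample or count.
        return x - self.centres[function_num]

    def gradient(self, x):
        self.function_num = next(self.sampler)
        self.num_samples += 1
        return self.approximate_gradient(x, self.function_num)


def two_gradients(f, x_old, x_new):
    """Evaluate at x_old and x_new, reusing the sampled term when f is stochastic."""
    grad_old = f.gradient(x_old)  # samples once
    if isinstance(f, ToySumFunction):
        # Reuse the index stored by the first call: no new sample is drawn.
        grad_new = f.approximate_gradient(x_new, f.function_num)
    else:
        grad_new = f.gradient(x_new)
    return grad_old, grad_new


f = ToySumFunction([0.0, 1.0, 2.0, 3.0])
g_old, g_new = two_gradients(f, x_old=1.0, x_new=1.5)
print(f.num_samples)   # one sample for the whole iteration
```

Both evaluations use the same term and the sampler advances once per iteration. The sketch does not address the SVRG and LSVRG concerns, where approximate_gradient itself carries state.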

epapoutsellis self-assigned this Dec 19, 2024

MargaretDuff added this to CIL work Jan 9, 2025

github-project-automation bot moved this to Todo in CIL work Jan 9, 2025

MargaretDuff moved this from Todo to Blocked in CIL work Jan 9, 2025

MargaretDuff self-assigned this Jan 15, 2025

MargaretDuff moved this from Blocked to In Progress in CIL work Jan 20, 2025

MargaretDuff linked a pull request Jan 20, 2025 that will close this issue

Bug fix to PD3O with stochastic gradient optimisers #2043

Open

9 tasks
