-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split FISTA and APGD to give more options for momentum #2061
base: master
Are you sure you want to change the base?
Conversation
Thoughts on implementation
|
From @samdporter The problem is that it can't be implemented easily within the currently suggested framework for momentum. Could we instead do something similar to the i.e
we would have something like
which can then accommodate more complex types of momentum. P.S. Would it be more appropriate to create a discussion for this?" |
@samdporter Atm, we can design Table III algorithm, using the FISTA algorithm, with ![]() Your suggestion can indeed cover more complex cases of momentum. For example the Optimised Gradient Method (OGM) which is actually faster than NAG here The issue here is more on the software design and how we want or our users to apply these algorithms.
I am more in favour of creating new algorithms that implement (P)OGM algorithms and cover all the cases above. Otherwise we need to edit the |
The initial value for the momentum coefficient. | ||
''' | ||
|
||
def __init__(self, t= 1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need t=1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think self.t=1
fits better in the __init__
as before because it is one of the parameters need to initialise the algorithm.
m = 50 | ||
A = np.random.normal(0,1, (m, n)).astype('float32') | ||
# A /= np.linalg.norm(A, axis=1, keepdims=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the rescaling.
@@ -45,7 +45,7 @@ An algorithm is designed for a particular generic optimisation problem accepts a | |||
instances of :code:`Function` derived classes and/or :code:`Operator` derived classes as input to | |||
define a specific instance of the generic optimisation problem to be solved. | |||
They are iterable objects which can be run in a for loop. | |||
The user can provide a stopping criterion different than the default max_iteration. | |||
The user can provide a stopping criterion different than the defaul. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor comments
A draft PR for discussion
Changes
Splits APGD and FISTA algorithms. This includes:
Would still need to add:
Testing you performed
Related issues/links
Checklist
Contribution Notes
Please read and adhere to the developer guide and local patterns and conventions.