Initialization and Linearization #3352
Comments
One (breaking) solution is to lean in to the |
TL; DR: I think In my opinion, an
The code for every sub-problem in the chain and the input/output piping between them should be completely locked in after calling
It should be illegal for the user to modify/remake If the user wants
All This works nicely with a

Does such a "stage-separation" or a generalization thereof work for linearization? I am not familiar enough with it to understand all the challenges you present, but I thought this sounded like a very clear and composable structure, so I hope it could be of some help.
We have a bit of a fundamental issue with running initialization in `linearization_function` and the like. Note that all of this discussion is in the context of #3347, on the assumption it is merged. With that PR, any initial conditions of the form `u0 = [x => 1, y => 2]` will be turned into the equations `x ~ x0_tmp, y ~ y0_tmp` in the initialization system. The new parameters `x0_tmp` and `y0_tmp` are only present in the initialization system, and take whatever value they have in the `ODEProblem`. This allows mutating `prob[x] = 2.0` and automatically getting the equivalent of `x ~ 2.0` in initialization.

As with all good things, there's a catch. This transformation only happens for variables that are unknowns in the system. If
`z` is an observed variable (present as an equation of the form `z ~ ...` in `observed(sys)`) then `u0 = [z => 1.0]` will be hardcoded in the initialization system as `z ~ 1.0`. This is because even if `z0_tmp` existed, where would it get its value from in the `ODEProblem`? `z` isn't in the state vector, it is calculated from the state vector. We could add `z0_tmp` as a parameter to the `ODESystem`, but the user doesn't know about this parameter. It's going to have a name with some unicode to prevent collisions, which automatically makes it difficult to type. Since systems are immutable, this would also mean that the new parameter is only present in the system stored in `prob.f.sys`, not the system the user has, the one they passed to `ODEProblem`. We'd also need to be very careful about how this parameter is managed, because it would be very easy to make it so that the parameters in `prob.f.sys` have different indexes than `sys` and thus getter functions made using `sys` won't work on the actual value providers, which is unacceptable.

Where does linearization come into this? Well, for
`ODEProblem` if the user wants to change the initial condition for `z`, they can simply call `remake` and pass `z => 2.0`. This will rebuild the initialization system accordingly. This approach isn't an option for linearization, because `linearization_function` is meant to return a function that can be called with multiple operating points and run very quickly on account of being called very often. Effectively, `linearization_function` returns a `LinearizationProblem` for which the intended use case is the equivalent of calling symbolic `remake` many times in a loop. "Many" can be upwards of 6000 times on relatively small problems. A challenge with UX here is that the system the user passes to `linearization_function` is not actually the system that is used for code generation. It goes through `io_preprocessing` to mark some variables as inputs/outputs and hence structurally change the system. This is even worse in e.g. `sensitivity_function`, which adds equations and variables before calling `linearization_function`. As a result, the user has no control and almost no intuition or feedback for which variables are observed and which are unknowns in the system used to build initialization. This is of course bad: the user should be able to tune what they want without being concerned about whether it's an unknown or not, and currently they can't.

One possible way to handle this is the same
`z0_tmp` approach, but since MTK has control over the object returned from `linearization_function` (instead of lower-level SciML infrastructure as is the case with `ODEProblem`) it can update the "`LinearizationProblem`" appropriately. The downside here is that `linearization_function` so the appropriate `var ~ var0_tmp` variables are created. `linearization_function` is stateful in a much more indirect way. If the user passes `z => 2.0` as part of `op` to `linearize(sys, lin_fun; op)`, it will mutate `lin_fun`. The next time `linearize` is called, `z0_tmp` will have the value of `2` if the user doesn't pass a new value for `z`. If they called `linearize` in the reverse order, this wouldn't be the case and they'd get a different result out. Moreover, the user doesn't know whether a variable passed to `op` will do this mutation so they can't schedule their `linearize` calls appropriately either.

@baggepinnen had phrased this problem nicely, which I'll paraphrase here: SciML follows a
`prob[var] = value; solve(prob)` user workflow. Linearization wants a `solve(prob, values)` workflow which is antithetical to how everything else is designed.

One way to look at this is if we were to solve this problem for `ODEProblem` (allow changing initial conditions of observed variables without `remake`) how would we do it? What would the user API look like?
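
To make the order dependence of the `z0_tmp` approach concrete, here is a toy sketch in plain Julia. It is not ModelingToolkit code: `ToyLinFun` and `toy_linearize!` are hypothetical stand-ins for the object returned by `linearization_function` and for `linearize`, and the only thing modelled is a hidden `z0_tmp` value that persists between calls.

```julia
# Hypothetical stand-in for the object returned by `linearization_function`.
mutable struct ToyLinFun
    z0_tmp::Float64   # hidden parameter backing the constraint z ~ z0_tmp
end

# Hypothetical stand-in for `linearize(sys, lin_fun; op)`.
# If `op` contains :z, the hidden parameter is overwritten (a mutation of lin_fun);
# otherwise whatever an earlier call left behind is silently reused.
function toy_linearize!(lin_fun::ToyLinFun; op = Dict{Symbol, Float64}())
    if haskey(op, :z)
        lin_fun.z0_tmp = op[:z]
    end
    return lin_fun.z0_tmp   # the z value this "linearization" was initialized with
end

# Order 1: pass z first, then call without z.
a = ToyLinFun(1.0)
toy_linearize!(a; op = Dict(:z => 2.0))
first_order = toy_linearize!(a)     # reuses the mutated value: 2.0

# Order 2: call without z first, then pass z.
b = ToyLinFun(1.0)
second_order = toy_linearize!(b)    # still the original value: 1.0
toy_linearize!(b; op = Dict(:z => 2.0))

println((first_order, second_order))
```

The two "no `z` given" calls return different values purely because of call order, which is the hidden statefulness described above. A `solve(prob, values)`-style function that never writes back into `lin_fun` would not have this property, at the cost of requiring every relevant value on every call.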