MTK-generated ODEFunction evaluates more slowly than hand-written function #3338

Open
vyudu opened this issue Jan 17, 2025 · 11 comments
Labels
bug Something isn't working

Comments

@vyudu
Contributor

vyudu commented Jan 17, 2025

It seems the MTK-generated ODEFunction evaluates ~5x slower than the corresponding hand-written function:

using ModelingToolkit
using BenchmarkTools
using SciMLBase
t = ModelingToolkit.t_nounits; D = ModelingToolkit.D_nounits

@parameters α β γ δ
@variables x(t) y(t)
    
eqs = [D(x) ~ α * x - β * x * y,
       D(y) ~ -γ * y + δ * x * y]
    
@mtkbuild sys = ODESystem(eqs, t)
f1 = ODEFunction(sys)
f2 = ODEFunction(sys; eval_expression = true)
f3 = ODEFunction{true, SciMLBase.FullSpecialize}(sys)
f4 = ODEFunction{true, SciMLBase.FullSpecialize}(sys; eval_expression = true)

function lotkavolterra!(du, u, p, t) 
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
end

du = zeros(2); u = rand(2); p = rand(4); t_ = 0.

@btime f1(du,u,p,t_) # 93.335 ns
du = zeros(2)
@btime f2(du,u,p,t_) # 92.068 ns
du = zeros(2)
@btime f3(du,u,p,t_) # 88.889 ns
du = zeros(2)
@btime f4(du,u,p,t_) # 85.842 ns
du = zeros(2)
@btime lotkavolterra!(du,u,p,t_) # 17.368 ns
@vyudu vyudu added the bug Something isn't working label Jan 17, 2025
@baggepinnen
Contributor

All the f functions are global variables. Do the timings change if you interpolate them into the benchmarked expression?
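For readers unfamiliar with the suggestion, here is a minimal sketch of what interpolation changes in a BenchmarkTools measurement. It reuses the hand-written function from the report; the binding `g` is a hypothetical stand-in for a non-constant global such as `f1`, and no timings are quoted because they depend on the machine.

```julia
using BenchmarkTools

# Hand-written Lotka-Volterra right-hand side, as in the report above.
function lotkavolterra!(du, u, p, t)
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
    return nothing
end

du = zeros(2); u = rand(2); p = rand(4); t_ = 0.0

# `g` is a non-constant global binding (hypothetical name), like f1..f4 in the report.
g = lotkavolterra!

# Without `$`: g, du, u, p and t_ are read as untyped globals inside the
# benchmark loop, so the measurement includes dynamic dispatch.
@btime g(du, u, p, t_)

# With `$`: the current values are spliced into the benchmarked expression,
# so the measurement covers only the call itself.
@btime $g($du, $u, $p, $t_)
```

The second form is the one that reflects how the function behaves when called from compiled code such as an ODE solver.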

@vyudu
Contributor Author

vyudu commented Jan 19, 2025

Ah yeah, that's what was going on. Interpolating everything gives 5.500 ns for all the generated functions and 4.500 ns for the hand-written one. Not sure if that's expected; if so, feel free to close this.

@baggepinnen
Contributor

Try interpolating the input arrays as well, to make sure there's no difference.

@vyudu
Contributor Author

vyudu commented Jan 20, 2025

Yep, the latter result is with all the functions and arrays interpolated.

@ChrisRackauckas
Member

using ModelingToolkit
using BenchmarkTools
using ModelingToolkit.SciMLBase
t = ModelingToolkit.t_nounits; D = ModelingToolkit.D_nounits

@parameters α β γ δ
@variables x(t) y(t)
    
eqs = [D(x) ~ α * x - β * x * y,
       D(y) ~ -γ * y + δ * x * y]
    
@mtkbuild sys = ODESystem(eqs, t)
f1 = ODEFunction(sys)
f2 = ODEFunction(sys; eval_expression = true)
f3 = ODEFunction{true, SciMLBase.FullSpecialize}(sys)
f4 = ODEFunction{true, SciMLBase.FullSpecialize}(sys; eval_expression = true)

function lotkavolterra!(du, u, p, t) 
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
end

du = zeros(2); u = rand(2); p = rand(4); t_ = 0.

@btime $f1($du,$u,$p,$t_) # 4.500 ns (0 allocations: 0 bytes)
du = zeros(2)
@btime $f2($du,$u,$p,$t_) # 4.500 ns (0 allocations: 0 bytes)
du = zeros(2)
@btime $f3.f($du,$u,$p,$t_) # 4.500 ns (0 allocations: 0 bytes)
du = zeros(2)
@btime $f4.f($du,$u,$p,$t_) # 4.500 ns (0 allocations: 0 bytes)
du = zeros(2)
@btime $lotkavolterra!($du,$u,$p,$t_) # 3.666 ns (0 allocations: 0 bytes)

I can't tell what the remaining difference is, but it might be due to using NaNMath? Maybe @oscardssmith can help track it down. JuliaMath/NaNMath.jl#67 isn't related since pow isn't used here, but I think it's something along those lines.

@oscardssmith
Contributor

NaNMath seems very unlikely to be related, since this is just doing arithmetic, which NaNMath doesn't overload.

@oscardssmith
Contributor

This is just the overhead of a RuntimeGeneratedFunction.
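One way to check a claim like this is to time a bare RuntimeGeneratedFunction against the hand-written function, with no ModelingToolkit in between. The sketch below assumes the RuntimeGeneratedFunctions.jl package is installed and uses its documented `init` / `@RuntimeGeneratedFunction` entry points; the expression is a hand-copied version of the Lotka-Volterra right-hand side, not the code ModelingToolkit actually generates, and the name `rgf` is illustrative.

```julia
using RuntimeGeneratedFunctions, BenchmarkTools

# Required once per module before constructing RuntimeGeneratedFunctions.
RuntimeGeneratedFunctions.init(@__MODULE__)

# Expression for the same right-hand side as the hand-written function.
ex = :(function f(du, u, p, t)
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
    return nothing
end)
rgf = @RuntimeGeneratedFunction(ex)

# Hand-written reference.
function lotkavolterra!(du, u, p, t)
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
    return nothing
end

du = zeros(2); u = rand(2); p = rand(4); t_ = 0.0

# Interpolate everything so only the call itself is measured.
@btime $rgf($du, $u, $p, $t_)
@btime $lotkavolterra!($du, $u, $p, $t_)
```

If the two timings match on a given machine, the remaining gap has to come from somewhere else in the call path.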

@ChrisRackauckas
Member

Interesting, it used to not have an overhead. Is it some effect thing?

@oscardssmith
Contributor

oscardssmith commented Jan 22, 2025

Oh, it's slightly more subtle than that. ModelingToolkit is exposing two RGFs, f_iip and f_oop, and the extra cost comes from the call to f(du, u, p, t) = f_iip(du, u, p, t) in ModelingToolkit/src/systems/diffeqs/abstractodesystem.jl

@ChrisRackauckas
Member

Why wouldn't that just inline?

@oscardssmith
Contributor

I believe this is because the compiler knows only the type of f_iip, not its value (since the value will be different for every possible system).
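The forwarding pattern under discussion can be sketched in isolation as follows. This is a simplified stand-in, not the ModelingToolkit source: `make_wrapper` is a hypothetical name, and `f_oop` / `f_iip` are plain closures here rather than RuntimeGeneratedFunctions, so the compiler may well inline the forwarding call and the sketch may show no overhead at all. It is meant only to show the shape of the extra call layer and how one might time it.

```julia
using BenchmarkTools

# Stand-ins for the two generated functions (plain closures, NOT RGFs).
f_oop = (u, p, t) -> [p[1]*u[1] - p[2]*u[1]*u[2],
                      -p[4]*u[2] + p[3]*u[1]*u[2]]
f_iip = (du, u, p, t) -> begin
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[4]*u[2] + p[3]*u[1]*u[2]
    nothing
end

# Hypothetical helper: builds one callable `f` with an out-of-place and an
# in-place method, each forwarding to a captured function.
function make_wrapper(f_oop, f_iip)
    f(u, p, t) = f_oop(u, p, t)
    f(du, u, p, t) = f_iip(du, u, p, t)
    return f
end

f = make_wrapper(f_oop, f_iip)
du = zeros(2); u = rand(2); p = rand(4); t_ = 0.0

# Direct call versus call through the forwarding layer.
@btime $f_iip($du, $u, $p, $t_)
@btime $f($du, $u, $p, $t_)
```

Swapping the closures for the system's actual generated functions would be the way to test whether the forwarding layer accounts for the gap reported above.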
