Benchmarking and Accuracy Documentation #7
I've been thinking of things along these lines too, so I'll summarize my ideas. I worked on creating reference implementations for special functions, and your idea of tracking accuracy over time with benchmarks is essentially how I envisioned the testing strategy working.
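For illustration, a reference implementation of that kind might look roughly like the sketch below, which assumes mpmath for the arbitrary-precision side; scipy.special.ndtr is just an example target, and the name ndtr_reference is hypothetical, not an existing API.

```python
# Sketch of an mpmath-backed reference implementation used to check a SciPy
# special function.  `ndtr_reference` is a hypothetical name for illustration.
import numpy as np
from mpmath import mp
from scipy import special


def ndtr_reference(x, dps=50):
    """Standard normal CDF evaluated with `dps` digits of precision."""
    with mp.workdps(dps):
        # ndtr(x) = erfc(-x / sqrt(2)) / 2, evaluated in arbitrary precision
        return float(mp.erfc(-mp.mpf(x) / mp.sqrt(2)) / 2)


x = -7.5
reference = ndtr_reference(x)
observed = special.ndtr(x)
print(abs(observed - reference) / abs(reference))  # relative error at this point
```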
I like the idea of considering documented inaccuracies to be opportunities for improvement rather than outright bugs, and I also support the idea of returning error estimates. Some other comments on specific points above:
That's not in my wheelhouse, but I was thinking about having interactive in-REPL tools.
I like this idea a lot, but I think it could be very difficult in many cases. We should do it when it's practical, though.
This might actually be doable, though, if we can do it in a way that doesn't require lots of coordination with other projects. Let's discuss how we could actually move this forward. I'm hoping to finish getting testing set up first.
In the meantime, I went ahead and opened numpy/numpy#28397 to float the idea.
Accurate evaluation of special functions (e.g. distribution functions, moments, entropy) with finite precision arithmetic is hard, and expecting results accurate to machine precision for arbitrary machine-representable inputs is impractical. At the time of writing, many SciPy stats and special functions produce inaccurate results without any warnings in the documentation or during execution; users typically report accuracy problems individually as "bugs", and occasionally we make improvements. Historically, there has not been a long-term, birds-eye plan. We want to do better by documenting the accuracy users can expect and benchmarking it: only results that are inconsistent with the documented accuracy will be "bugs"; the rest is opportunity for improvement.
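As a rough sketch of how such a documented accuracy statement could be made testable (the bound, the function, and the reference values below are illustrative assumptions, not actual documented accuracies):

```python
# Sketch of turning a documented accuracy statement into a test: results worse
# than the documented bound are bugs; results between machine precision and the
# bound are opportunities for improvement.  The bound and reference values are
# illustrative (in practice the references would be generated with arbitrary
# precision arithmetic and stored alongside the tests/benchmarks).
import numpy as np
from scipy import special

DOCUMENTED_MAX_RELATIVE_ERROR = 1e-14  # hypothetical documented bound for erf

x = np.array([0.5, 1.0, 2.0, 3.0])
reference = np.array([0.5204998778130465, 0.8427007929497149,
                      0.9953222650189527, 0.9999779095030014])

observed = special.erf(x)
relative_error = np.abs(observed - reference) / np.abs(reference)

assert np.all(relative_error <= DOCUMENTED_MAX_RELATIVE_ERROR)
```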
There are two things that need to happen in SciPy before we can completely satisfy these goals.
1. The idea of adding an error attribute to the returned array has not really caught on, so we just don't have a way of reporting error estimates alongside results.
2. There is not yet support for passing mparrays through the infrastructure. The idea is that we can (at least partially) automate the process of generating reference values by evaluating functions with arbitrary precision arithmetic.

In the meantime, though, I think we can plan for what we want to do with those features when they are available. For instance, we can manually add ReferenceDistributions and use those to run benchmarks and document method accuracy.
I don't have a complete picture of what those benchmarks and documentation will look like. I have a vague notion of some aspects.
For some inspiration, see how Boost presents accuracy information, e.g. here. (However, I'd like to go quite a bit beyond that.) We should also coordinate with scipy/xsf to avoid duplicating effort, since I think we'll be trying to solve similar problems.
In any case, clearly there is a lot we can do while waiting for features in the infrastructure!
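As one concrete illustration of the manual ReferenceDistribution route above, a benchmark could look roughly like this sketch (the ReferenceLogistic class and its API are hypothetical, not an existing SciPy interface):

```python
# Sketch of a ReferenceDistribution-style accuracy benchmark: evaluate a
# distribution method against an arbitrary-precision reference over a grid
# and record the worst relative error, which a benchmark suite could then
# track (and document) over time.  The class and its API are hypothetical.
import numpy as np
from mpmath import mp
from scipy import stats


class ReferenceLogistic:
    """Arbitrary-precision reference for the standard logistic distribution."""

    def __init__(self, dps=50):
        self.dps = dps

    def cdf(self, x):
        with mp.workdps(self.dps):
            # CDF of the standard logistic distribution: 1 / (1 + exp(-x))
            return float(1 / (1 + mp.exp(-mp.mpf(x))))


def max_relative_error(grid, dps=50):
    ref = ReferenceLogistic(dps)
    reference = np.array([ref.cdf(x) for x in grid])
    observed = stats.logistic.cdf(grid)
    return np.max(np.abs(observed - reference) / np.abs(reference))


# A benchmark would evaluate this over a representative grid and report the
# result so that accuracy can be compared across releases.
grid = np.linspace(-30, 10, 81)
print(max_relative_error(grid))
```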