[Nonlinear.ReverseAD] fix performance bug in Hessian computation #2783

Merged: 1 commit merged into master from od/nlp-perf on Jul 10, 2025

Conversation

@odow (Member) commented Jul 9, 2025

Closes #2782

I bisected this to #2730

I obviously didn't notice the logic change that was hidden in the refactoring: we went from zeroing a small number of elements to repeatedly zeroing a very large vector.
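To illustrate the shape of the regression (a hypothetical sketch, not the actual ReverseAD code — `zero_touched!`, `zero_all!`, and the `touched` index list are stand-ins): zeroing only the entries a sweep wrote is O(k) work per evaluation, while clearing the whole storage vector is O(n) on every call, even when only a handful of entries are nonzero.

```julia
# Sketch of the two zeroing strategies (hypothetical names, not MOI code).
function zero_touched!(storage::Vector{Float64}, touched::Vector{Int})
    # Pre-regression pattern: O(length(touched)) work per call.
    for i in touched
        storage[i] = 0.0
    end
    return storage
end

function zero_all!(storage::Vector{Float64})
    # The regression: O(length(storage)) work per call. `fill!` lowers to a
    # C memset, which is why it shows up as `bzero` in a native profile.
    fill!(storage, 0.0)
    return storage
end

storage = rand(1_000_000)
touched = [1, 42, 99]            # only these entries were written this sweep
zero_touched!(storage, touched)  # cheap: touches 3 entries
zero_all!(storage)               # walks all 10^6 entries every call
```

With one Hessian evaluation per solver iteration, the second pattern pays the full vector length on every iteration, which is how it came to dominate the runtime.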

I learned that `fill!(x, 0.0)` doesn't show up in ProfileView because it isn't implemented in Julia; it just showed as a blank span of time with no hits. But PProf.jl showed:

[image: PProf flame graph]

which gave the game away, because what is `bzero` doing off to the side, taking 80% of the total runtime?!
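For reference, a minimal PProf.jl session along these lines (the `workload` function is a stand-in, not the real code path; `Profile` and `PProf` are the real package names):

```julia
using Profile, PProf  # PProf.jl renders Julia profiles in the pprof web UI

# Hypothetical workload that repeatedly clears a large vector, mimicking
# the regressed pattern.
function workload(n)
    s = zeros(n)
    for _ in 1:10_000
        fill!(s, 0.0)
    end
    return s
end

workload(10)  # warm up so compilation doesn't dominate the profile
Profile.clear()
@profile workload(1_000_000)
pprof()  # time inside `fill!` appears as `memset`/`bzero` in the flame graph
```

Because PProf resolves native frames, the C `memset`/`bzero` under `fill!` is attributed visibly, unlike in ProfileView.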

I think this is a great example of how difficult it is to make even seemingly minor changes to the ReverseAD code. It is very temperamental. I assume that when we did the other PRs, we benchmarked against master instead of the last release, and so we missed this innocuous first regression.

@odow (Member, Author) commented Jul 9, 2025

Yes, silly Oscar, I should have noticed that 6 seconds was much too long:

#2736 (comment)

@mlubin (Member) commented Jul 10, 2025

Nice catch!

@odow (Member, Author) commented Jul 10, 2025

Original benchmark:

julia> using Revise

julia> using JuMP

julia> import PowerModels

julia> import Ipopt

julia> begin
       PowerModels.silence()
       pm = PowerModels.instantiate_model(
           "pglib_opf_case9241_pegase.m",
           PowerModels.ACPPowerModel,
           PowerModels.build_opf,
       )
       set_optimizer(pm.model, Ipopt.Optimizer)
       set_optimizer_attribute(pm.model, "max_iter", 10)
       optimize!(pm.model)
       nlp_block = JuMP.MOI.get(JuMP.unsafe_backend(pm.model), JuMP.MOI.NLPBlock())
       total_callback_time =
               nlp_block.evaluator.eval_objective_timer +
               nlp_block.evaluator.eval_objective_gradient_timer +
               nlp_block.evaluator.eval_constraint_timer +
               nlp_block.evaluator.eval_constraint_jacobian_timer +
               nlp_block.evaluator.eval_hessian_lagrangian_timer
       println("")
       println("   callbacks time:")
       println("   * obj.....: $(nlp_block.evaluator.eval_objective_timer)")
       println("   * grad....: $(nlp_block.evaluator.eval_objective_gradient_timer)")
       println("   * cons....: $(nlp_block.evaluator.eval_constraint_timer)")
       println("   * jac.....: $(nlp_block.evaluator.eval_constraint_jacobian_timer)")
       println("   * hesslag.: $(nlp_block.evaluator.eval_hessian_lagrangian_timer)")
       end
[info | PowerModels]: Suppressing information and warning messages for the rest of this session.  Use the Memento package for more fine-grained control of logging.
This is Ipopt version 3.14.17, running with linear solver MUMPS 5.8.0.

Number of nonzeros in equality constraint Jacobian...:   395686
Number of nonzeros in inequality constraint Jacobian.:    92610
Number of nonzeros in Lagrangian Hessian.............:   713775

Total number of variables............................:    85568
                     variables with only lower bounds:        0
                variables with lower and upper bounds:    76327
                     variables with only upper bounds:        0
Total number of equality constraints.................:    82679
Total number of inequality constraints...............:    46305
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:    14207
        inequality constraints with only upper bounds:    32098

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  3.2558304e+06 2.84e+01 1.00e+02  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  3.3372098e+06 2.78e+01 9.84e+01  -1.0 4.34e+01    -  1.24e-02 1.94e-02h  1
   2  3.4280547e+06 2.71e+01 9.72e+01  -1.0 2.28e+01    -  4.12e-03 2.66e-02h  1
   3  3.5289937e+06 2.58e+01 9.56e+01  -1.0 3.02e+01    -  6.36e-03 4.59e-02h  1
   4  3.5702968e+06 2.45e+01 9.32e+01  -1.0 1.07e+02    -  1.79e-02 5.04e-02h  1
   5  3.5683907e+06 2.33e+01 9.08e+01  -1.0 3.09e+02    -  2.17e-02 4.89e-02h  1
   6  3.5755535e+06 2.17e+01 8.64e+01  -1.0 4.37e+02    -  4.46e-02 6.90e-02h  1
   7  3.6404373e+06 1.99e+01 8.12e+01  -1.0 4.61e+02    -  5.66e-02 8.20e-02h  1
   8  3.8204827e+06 1.73e+01 7.37e+01  -1.0 4.59e+02    -  8.46e-02 1.33e-01H  1
   9  4.0996155e+06 1.46e+01 6.99e+01  -1.0 4.05e+02    -  3.36e-02 1.58e-01h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  4.3347590e+06 1.27e+01 6.25e+01  -1.0 3.14e+02    -  1.02e-01 1.29e-01h  1

Number of Iterations....: 10

                                   (scaled)                 (unscaled)
Objective...............:   3.1946168015315146e+04    4.3347589898830801e+06
Dual infeasibility......:   6.2495182233300511e+01    8.4799389047321492e+03
Constraint violation....:   1.2668657111871276e+01    1.2668657111871276e+01
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.6555734531625495e+06    3.6033338627257758e+08
Overall NLP error.......:   2.6555734531625495e+06    3.6033338627257758e+08


Number of objective function evaluations             = 12
Number of objective gradient evaluations             = 11
Number of equality constraint evaluations            = 12
Number of inequality constraint evaluations          = 12
Number of equality constraint Jacobian evaluations   = 11
Number of inequality constraint Jacobian evaluations = 11
Number of Lagrangian Hessian evaluations             = 10
Total seconds in IPOPT                               = 10.166

EXIT: Maximum Number of Iterations Exceeded.

   callbacks time:
   * obj.....: 0.0
   * grad....: 0.0
   * cons....: 0.993605375289917
   * jac.....: 0.4068117141723633
   * hesslag.: 1.6697413921356201

@odow odow merged commit 90bcaae into master Jul 10, 2025
94 of 97 checks passed
@odow odow deleted the od/nlp-perf branch July 10, 2025 01:21
Successfully merging this pull request may close these issues.

[Nonlinear] Performance regression of Hessian evaluation in MOI 1.41