Automatic Dense AD

OpenMDAOCore.jl can create explicit components that are differentiated automatically by the AD packages supported by DifferentiationInterface.jl. Three approaches to differentiating the components are supported:

  • Dense AD, where we assume the Jacobian is dense
  • Sparse AD, where we assume the Jacobian is sparse, and use the sparsity pattern to accelerate the differentiation
  • Matrix-free AD, where we perform Jacobian-vector or vector-Jacobian products instead of forming the entire Jacobian

This page describes the first and simplest approach (dense AD); the next couple of pages get into the other two.

The User-Defined Function

The interface for the AD functionality in OpenMDAO.jl is a bit different from the "plain" AbstractExplicitComp and AbstractImplicitComp structs described in earlier examples (see A Simple Example: Optimizing a Paraboloid or A More Complicated Example: Nonlinear Circuit). Instead of creating subtypes of AbstractExplicitComp that implement OpenMDAOCore.setup, OpenMDAOCore.compute!, etc., we'll be writing a Julia function that performs our desired computation. This user-defined function will then be passed to a constructor of the DenseADExplicitComp struct, which will implement the necessary OpenMDAOCore methods for us. (The same user-defined function can also be used for the sparse AD and matrix-free AD approaches, making it relatively simple to try all three out to see what's fastest!)

The user-defined function must take one of two forms: either an "in-place" function that writes its outputs to an output vector, or an "out-of-place" function that returns a single output vector. Both forms must also accept a params argument that contains any inputs needed for the calculation that won't be differentiated. So, an example of an in-place function would be

function f_in_place!(Y, X, params)
   # calculate stuff with X and params, storing result in Y
   return nothing
end

where X is the input vector and Y is the output vector. (The function doesn't have to return nothing, but any returned value will be ignored, so I like to include return nothing to make it clear that the return value doesn't matter.) An out-of-place function would look like

function f_out_of_place(X, params)
   # calculate stuff with X and params, returning Y
   return Y
end

where again X is the input vector.
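
To make the two calling conventions concrete, here's the same trivial doubling computation written both ways (a minimal sketch; the function names are just illustrative):

# In-place: overwrite the caller-provided output vector Y.
function f_double!(Y, X, params)
    @. Y = 2 * X
    return nothing
end

# Out-of-place: allocate and return a new output vector.
f_double(X, params) = 2 .* X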

Now, the X and Y arguments of those functions must not be plain Julia Vectors, but ComponentVectors from the ComponentArrays.jl package. What are those? They're objects that act like Vectors, but let the user define a name for each part ("component") of the vector. For example:

using ComponentArrays: ComponentVector

x1 = ComponentVector(foo=-1.0, bar=-2.0, baz=-3.0)
@show x1 x1[3] x1.foo x1[:foo]

# output

x1 = (foo = -1.0, bar = -2.0, baz = -3.0)
x1[3] = -3.0
x1.foo = -1.0
x1[:foo] = -1.0
Notice that we can get, say, the third value of x1 the usual way (x1[3]), but also by referring to the foo field value via x1.foo and by indexing the ComponentVector with the symbol :foo (x1[:foo]).

Each of the components in x1 is a scalar, but they don't have to be:

x2 = ComponentVector(foo=-1.0, bar=1:4, baz=reshape(5:10, 2, 3))
@show x2 x2[:foo] x2[:bar] x2[:baz]

# output

x2 = (foo = -1.0, bar = [1.0, 2.0, 3.0, 4.0], baz = [5.0 7.0 9.0; 6.0 8.0 10.0])
x2[:foo] = -1.0
x2[:bar] = [1.0, 2.0, 3.0, 4.0]
x2[:baz] = [5.0 7.0 9.0; 6.0 8.0 10.0]

In x2, the foo component is a scalar, bar refers to a Vector (aka a 1D Array) and baz refers to a Matrix (aka a 2D Array). But x2 still "looks like" a Vector:

@show x2[3]  # will give the third value of `x2`, which happens to be the second value of x2[:bar]
@show ndims(x2)  # Should be 1, since a Vector is 1-dimensional
@show length(x2)  # length(x2) gives the total number of entries in `x2`, aka 1 + 4 + 2*3 = 11
@show size(x2)  # size is a length-1 tuple since a Vector has just one dimension

# output

x2[3] = 2.0
ndims(x2) = 1
length(x2) = 11
size(x2) = (11,)

Now, how will we use ComponentVectors here? We'll use them to define the names and sizes of all the inputs and outputs of our component. For example, with the paraboloid component in A Simple Example: Optimizing a Paraboloid, we created one component with two inputs x and y and one output f_xy, all scalars. So for that case, our input and output ComponentVectors would be

X_ca = ComponentVector(x=1.0, y=1.0)
Y_ca = ComponentVector(f_xy=0.0)
@show X_ca Y_ca

# output

X_ca = (x = 1.0, y = 1.0)
Y_ca = (f_xy = 0.0)
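
The same pattern extends to non-scalar variables: to declare, say, a length-3 vector input and a 2×3 matrix output, we'd just use array-valued components (the names vel and forces here are hypothetical):

# Hypothetical component with a length-3 vector input and a 2x3 matrix output:
X_demo = ComponentVector(vel=zeros(3))
Y_demo = ComponentVector(forces=zeros(2, 3))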

Actually, why don't we try to implement the Paraboloid component using a DenseADExplicitComp?

DenseADExplicitComp Paraboloid

We'll start fresh, first with importing the stuff we'll need:

using ADTypes: ADTypes
using ComponentArrays: ComponentVector
using OpenMDAOCore: OpenMDAOCore
using OpenMDAO: make_component

Next, we need to define the function that implements our paraboloid equation, which, again, is

\[f(x,y) = (x - 3.0)^2 + xy + (y + 4.0)^2 - 3.0\]

That would look like this:

function f_paraboloid!(Y_ca, X_ca, params)
    # Get the inputs:
    x = @view(X_ca[:x])
    y = @view(X_ca[:y])
    # Could also do this:
    # x = X_ca.x
    # y = X_ca.y
    # or even this
    # (; x, y) = X_ca

    # Get the output:
    f_xy = @view(Y_ca[:f_xy])
    # Again, could also do this:
    # f_xy = Y_ca.f_xy
    # or
    # (; f_xy) = Y_ca

    # Do the calculation:
    @. f_xy = (x - 3.0)^2 + x*y + (y + 4.0)^2 - 3.0

    # Return value doesn't matter.
    return nothing
end
  • The @view macro is used when extracting the inputs and outputs from the X_ca and Y_ca ComponentVectors. This creates a view into the original ComponentVector, instead of a new array with a copy of the original data, which avoids unnecessary allocations and (for the outputs) allows modifications to the view to be reflected in the Y_ca array. In this example everything is a scalar, so no allocations would have happened anyway. But it doesn't hurt to use @view: it's a good habit to get into, and it allows us to use the @. broadcasting macro with the scalar f_xy output.
  • The params argument is not used in this example, but it is still required, since the DenseADExplicitComp constructor expects the function to accept it. (The same is true for SparseADExplicitComp and MatrixFreeADExplicitComp; a sketch of actually using params appears below, after we construct the component.)
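
For comparison, an out-of-place version of the paraboloid function would construct and return the output ComponentVector itself. Here's a sketch (the function name f_paraboloid is ours):

function f_paraboloid(X_ca, params)
    # Get the inputs:
    (; x, y) = X_ca

    # Do the calculation:
    f_xy = (x - 3.0)^2 + x*y + (y + 4.0)^2 - 3.0

    # Build and return the output ComponentVector:
    return ComponentVector(f_xy=f_xy)
end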

Our next step is to create the ComponentVectors that will be used to hold the inputs and outputs:

X_ca = ComponentVector(x=1.0, y=1.0)
Y_ca = ComponentVector(f_xy=0.0)
@show X_ca Y_ca

# output

X_ca = (x = 1.0, y = 1.0)
Y_ca = (f_xy = 0.0)
Use sane values for `X_ca` and `Y_ca`

The values of the entries in X_ca and Y_ca will be passed as initial values when creating the OpenMDAO ExplicitComponent. Depending on your application, they may affect things like initial guesses for nonlinear solvers or the detection of your System's sparsity pattern.

Now we're almost ready to create the DenseADExplicitComp. The last step is to decide which AD library to use. OpenMDAOCore.jl relies on the ADTypes.jl and DifferentiationInterface.jl packages to interface with the AD libraries, so in principle we can use any AD backend those packages support. We'll use ForwardDiff.jl for this example, a popular and very robust forward-mode AD package:

using ForwardDiff: ForwardDiff
ad_backend = ADTypes.AutoForwardDiff()
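
Swapping in a different backend only requires changing this one line. For example, these alternative ADTypes.jl constructors could be used instead, assuming the corresponding AD package has been loaded (how well each backend works will depend on your function):

# Alternatives (assuming e.g. `using ReverseDiff` or `using Enzyme`):
# ad_backend = ADTypes.AutoReverseDiff()
# ad_backend = ADTypes.AutoEnzyme()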

Now we are finally ready to create the component:

comp = OpenMDAOCore.DenseADExplicitComp(ad_backend, f_paraboloid!, Y_ca, X_ca; params=nothing)
parab_comp = make_component(comp)
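
As an aside, this is where the params argument comes in: whatever is passed via the params keyword is handed to the user-defined function as its third argument, and is not differentiated. A hedged sketch, with our own hypothetical names x0 and y0 for the paraboloid's offsets:

# Sketch: move the hard-coded offsets into `params` so they can be changed
# without redefining the function. `params` is not differentiated.
function f_paraboloid_shifted!(Y_ca, X_ca, params)
    (; x, y) = X_ca
    f_xy = @view(Y_ca[:f_xy])
    @. f_xy = (x - params.x0)^2 + x*y + (y + params.y0)^2 - 3.0
    return nothing
end

comp_shifted = OpenMDAOCore.DenseADExplicitComp(
    ad_backend, f_paraboloid_shifted!, Y_ca, X_ca;
    params=(x0=3.0, y0=4.0))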

make_component converts the DenseADExplicitComp into an OpenMDAO Python component that we can use like any other. So now we just proceed with the paraboloid example as usual:

using OpenMDAO: om

model = om.Group()
model.add_subsystem("parab_comp", parab_comp)

prob = om.Problem(model)

prob.driver = om.ScipyOptimizeDriver()
prob.driver.options["optimizer"] = "SLSQP"

prob.model.add_design_var("parab_comp.x")
prob.model.add_design_var("parab_comp.y")
prob.model.add_objective("parab_comp.f_xy")

prob.setup(force_alloc_complex=true)

prob.set_val("parab_comp.x", 3.0)
prob.set_val("parab_comp.y", -4.0)

prob.run_model()
println(prob["parab_comp.f_xy"])  # Should print `[-15.]`

prob.set_val("parab_comp.x", 5.0)
prob.set_val("parab_comp.y", -2.0)

prob.run_model()
println(prob.get_val("parab_comp.f_xy"))  # Should print `[-5.]`
-15.0
-5.0

Looks OK so far. But we should check our derivatives, just to be safe. We can do that with the finite difference method:

prob.check_partials(method="fd")
-----------------------------------------
Component: JuliaExplicitComp 'parab_comp'
-----------------------------------------

  parab_comp: 'f_xy' wrt 'x'

    Max Tolerance Violation (Jfwd - Jfd) - (atol + rtol * Jfd) : (-9.996325e-07)
      abs error: 1.000368e-06
      rel error: 5.001840e-07
      fwd value @ max viol: 2.000000e+00
      fd value @ max viol: 2.000001e+00 (fd:forward)

    Raw Forward Derivative (Jfwd)
    [[ 2.000000000000e+00]]

    Raw FD Derivative (Jfd)
    [[ 2.000001000368e+00]]

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

  parab_comp: 'f_xy' wrt 'y'

    Max Tolerance Violation (Jfwd - Jfd) - (atol + rtol * Jfd) : (-7.999542e-06)
      abs error: 1.000459e-06
      rel error: 1.111621e-07
      fwd value @ max viol: 9.000000e+00
      fd value @ max viol: 9.000001e+00 (fd:forward)

    Raw Forward Derivative (Jfwd)
    [[ 9.000000000000e+00]]

    Raw FD Derivative (Jfd)
    [[ 9.000001000459e+00]]

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

or the complex-step method:

prob.check_partials(method="cs")
-----------------------------------------
Component: JuliaExplicitComp 'parab_comp'
-----------------------------------------

  parab_comp: 'f_xy' wrt 'x'

    Max Tolerance Violation (Jfwd - Jfd) - (atol + rtol * Jfd) : (-2.000000e-06)
      abs error: 0.000000e+00
      rel error: 0.000000e+00
      fwd value @ max viol: 2.000000e+00
      fd value @ max viol: 2.000000e+00 (cs:None)

    Raw Forward Derivative (Jfwd)
    [[ 2.000000000000e+00]]

    Raw CS Derivative (Jfd)
    [[ 2.000000000000e+00]]

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

  parab_comp: 'f_xy' wrt 'y'

    Max Tolerance Violation (Jfwd - Jfd) - (atol + rtol * Jfd) : (-9.000000e-06)
      abs error: 0.000000e+00
      rel error: 0.000000e+00
      fwd value @ max viol: 9.000000e+00
      fd value @ max viol: 9.000000e+00 (cs:None)

    Raw Forward Derivative (Jfwd)
    [[ 9.000000000000e+00]]

    Raw CS Derivative (Jfd)
    [[ 9.000000000000e+00]]

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Derivatives look great, so let's go ahead and perform the optimization:

prob.run_driver()
println("f_xy = $(prob.get_val("parab_comp.f_xy"))")  # Should print `[-27.333333]`
println("x = $(prob.get_val("parab_comp.x"))")  # Should print `[6.666666]`
println("y = $(prob.get_val("parab_comp.y"))")  # Should print `[-7.333333]`
f_xy = -27.333333333333336
x = [6.66666667]
y = [-7.33333333]

Victory!