(ch_tao)=

# TAO: Optimization Solvers

The Toolkit for Advanced Optimization (TAO) focuses on algorithms for the
solution of large-scale optimization problems on high-performance
architectures. Methods are available for

- {any}`sec_tao_leastsquares`
- {any}`sec_tao_quadratic`
- {any}`sec_tao_unconstrained`
- {any}`sec_tao_bound`
- {any}`sec_tao_constrained`
- {any}`sec_tao_complementary`
- {any}`sec_tao_pde_constrained`

(sec_tao_getting_started)=

## Getting Started: A Simple TAO Example

To help the user start using TAO immediately, we introduce here a simple
uniprocessor example. Please read {any}`sec_tao_solvers`
for a more in-depth discussion on using the TAO solvers. The code
presented {any}`below <tao_example1>` minimizes the
extended Rosenbrock function $f: \mathbb R^n \to \mathbb R$
defined by

$$
f(x) = \sum_{i=0}^{m-1} \left( \alpha(x_{2i+1}-x_{2i}^2)^2 + (1-x_{2i})^2 \right),
$$

where $n = 2m$ is the number of variables. Note that while we use
the C language to introduce the TAO software, the package is fully
usable from C++ and Fortran.
{any}`ch_fortran` discusses additional
issues concerning Fortran usage.
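
As a concrete reading of the formula above, the following standalone sketch evaluates the extended Rosenbrock function and its gradient on a plain C array. It deliberately avoids PETSc types; the name `rosenbrock_fg` and its argument list are illustrative assumptions and are not taken from the tutorial's `FormFunctionGradient`.

```c
#include <stddef.h>

/* Illustrative helper (hypothetical name): evaluates the extended Rosenbrock
 * function on a plain array x of length n = 2*m, returns f(x), and writes
 * the gradient into g (also of length 2*m). */
double rosenbrock_fg(size_t m, double alpha, const double *x, double *g)
{
  double f = 0.0;
  for (size_t i = 0; i < m; i++) {
    double t1 = x[2 * i + 1] - x[2 * i] * x[2 * i]; /* x_{2i+1} - x_{2i}^2 */
    double t2 = 1.0 - x[2 * i];                     /* 1 - x_{2i}          */
    f += alpha * t1 * t1 + t2 * t2;
    g[2 * i]     = -4.0 * alpha * t1 * x[2 * i] - 2.0 * t2; /* df/dx_{2i}   */
    g[2 * i + 1] = 2.0 * alpha * t1;                        /* df/dx_{2i+1} */
  }
  return f;
}
```

At the minimizer $x = (1, 1, \ldots, 1)$ both the function value and every gradient component are zero, which gives a quick sanity check for any implementation.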

The code in {any}`the example <tao_example1>` contains many of
the components needed to write most TAO programs and thus is
illustrative of the features present in complex optimization problems.
Note that for display purposes we have omitted some nonessential lines
of code as well as the (essential) code required for the routine
`FormFunctionGradient`, which evaluates the function and gradient, and
the code for `FormHessian`, which evaluates the Hessian matrix for
Rosenbrock’s function. The complete code is available in
<a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/tao/unconstrained/tutorials/rosenbrock1.c.html">\$TAO_DIR/src/unconstrained/tutorials/rosenbrock1.c</a>.
The following sections annotate the lines of code in
{any}`the example <tao_example1>`.

(tao_example1)=

:::{admonition} Listing: `src/tao/unconstrained/tutorials/rosenbrock1.c`
```{literalinclude} /../src/tao/unconstrained/tutorials/rosenbrock1.c
:append: return ierr;}
:end-at: PetscFinalize
:prepend: '#include <petsctao.h>'
:start-at: typedef struct
```
:::

(sec_tao_workflow)=

## TAO Workflow

Many TAO applications will follow an ordered set of procedures for
solving an optimization problem: The user creates a `Tao` context and
selects a default algorithm. Call-back routines as well as vector
(`Vec`) and matrix (`Mat`) data structures are then set. These
call-back routines will be used for evaluating the objective function,
gradient, and perhaps the Hessian matrix. The user then invokes TAO to
solve the optimization problem and finally destroys the `Tao` context.
A list of the necessary functions for performing these steps using TAO
is shown below.

```
TaoCreate(MPI_Comm comm, Tao *tao);
TaoSetType(Tao tao, TaoType type);
TaoSetSolution(Tao tao, Vec x);
TaoSetObjectiveAndGradient(Tao tao, Vec g, PetscErrorCode (*FormFGradient)(Tao, Vec, PetscReal*, Vec, PetscCtx), PetscCtx ctx);
TaoSetHessian(Tao tao, Mat H, Mat Hpre, PetscErrorCode (*FormHessian)(Tao, Vec, Mat, Mat, PetscCtx), PetscCtx ctx);
TaoSolve(Tao tao);
TaoDestroy(Tao *tao);
```

Note that the solver algorithm selected through the function
`TaoSetType()` can be overridden at runtime by using an options
database. Through this database, the user not only can select a
minimization method (e.g., limited-memory variable metric, conjugate
gradient, Newton with line search or trust region) but also can
prescribe the convergence tolerance, set various monitoring routines,
set iterative methods and preconditioners for solving the linear systems,
and so forth. See {any}`sec_tao_solvers` for more
information on the solver methods available in TAO.

### Header File

TAO applications written in C/C++ should have the statement

```
#include <petsctao.h>
```

in each file that uses a routine in the TAO libraries.

### Creation and Destruction

A TAO solver can be created by calling the

```
TaoCreate(MPI_Comm, Tao*);
```

routine. Much like creating PETSc vector and matrix objects, the first
argument is an MPI *communicator*. An MPI [^mpi]
communicator indicates a collection of processors that will be used to
evaluate the objective function, compute constraints, and provide
derivative information. When only one processor is being used, the
communicator `PETSC_COMM_SELF` can be used with no understanding of
MPI. Even parallel users need to be familiar with only the basic
concepts of message passing and distributed-memory computing. Most
applications running TAO in parallel environments can employ the
communicator `PETSC_COMM_WORLD` to indicate all processes known to
PETSc in a given run.

The routine

```
TaoSetType(Tao, TaoType);
```

can be used to set the algorithm TAO uses to solve the application. The
various types of TAO solvers and the flags that identify them will be
discussed in the following sections. The solution method should be
carefully chosen depending on the problem being solved. Some solvers,
for instance, are meant for problems with no constraints, whereas other
solvers acknowledge constraints in the problem and handle them
accordingly. The user must also be aware of the derivative information
that is available. Some solvers require second-order information, while
other solvers require only gradient or function information. The command
line option `-tao_type` followed by
a TAO method will override any method specified by the second argument.
The command line option `-tao_type bqnls`, for instance, will
specify the limited-memory quasi-Newton line search method for
bound-constrained problems. Note that the `TaoType` variable is a string that
requires quotation marks in an application program, but quotation marks
are not required at the command line.

Each TAO solver that has been created should also be destroyed by using
the

```
TaoDestroy(Tao *tao);
```

command. This routine frees the internal data structures used by the
solver.

### Command-line Options

Additional options for the TAO solver can be set from the command
line by using the

```
TaoSetFromOptions(Tao);
```

routine. This command also provides information about runtime options
when the user includes the `-help` option on the command line.

In addition to common command line options shared by all TAO solvers, each TAO
method also implements its own specialized options. Please refer to the
documentation for individual methods for more details.

### Defining Variables

In all the optimization solvers, the application must provide a `Vec`
object of appropriate dimension to represent the variables. This vector
will be cloned by the solvers to create additional work space within the
solver. If this vector is distributed over multiple processors, it
should have a parallel distribution that allows for efficient scaling,
inner products, and function evaluations. This vector can be passed to
the application object by using the

```
TaoSetSolution(Tao, Vec);
```

routine. When using this routine, the application should initialize the
vector with an approximate solution of the optimization problem before
calling the TAO solver. This vector will be used by the TAO solver to
store the solution. Elsewhere in the application, this solution vector
can be retrieved from the application object by using the

```
TaoGetSolution(Tao, Vec*);
```

routine. This routine takes the address of a `Vec` in the second
argument and sets it to the solution vector used in the application.

### User Defined Call-back Routines

Users of TAO are required to provide routines that perform function
evaluations. Depending on the solver chosen, they may also have to write
routines that evaluate the gradient vector and Hessian matrix.

#### Application Context

Writing a TAO application may require use of an *application context*.
An application context is a structure or object defined by an
application developer, passed into a routine also written by the
application developer, and used within the routine to perform its stated
task.

For example, a routine that evaluates an objective function may need
parameters, work vectors, and other information. This information, which
may be specific to an application and necessary to evaluate the
objective, can be collected in a single structure and used as one of the
arguments in the routine. The address of this structure will be cast as
type `(void*)` and passed to the routine in the final argument. Many
examples of these structures are included in the TAO distribution.

This technique offers several advantages. In particular, it allows for a
uniform interface between TAO and the applications. The fundamental
information needed by TAO appears in the arguments of the routine, while
data specific to an application and its implementation is confined to an
opaque pointer. The routines can access information created outside the
local scope without the use of global variables. The TAO solvers and
application objects will never access this structure, so the application
developer has complete freedom to define it. If no such structure is
needed by the application, then a NULL pointer can be used.
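
The opaque-pointer pattern described above can be sketched in plain C without any PETSc types. Everything in this block is a hypothetical stand-in chosen for illustration: `AppCtx`, `ObjectiveFn`, `my_objective`, and `call_objective` are not TAO names, and `call_objective` only mimics the way a solver forwards the context without inspecting it.

```c
#include <stddef.h>

/* Hypothetical application context: data the objective needs but that the
 * solver never inspects. */
typedef struct {
  size_t n;     /* number of variables                 */
  double alpha; /* problem parameter used by objective */
} AppCtx;

/* Generic callback type: the caller only ever sees an opaque pointer. */
typedef int (*ObjectiveFn)(const double *x, double *f, void *ctx);

/* Application routine: casts the opaque pointer back to its own type and
 * computes f(x) = alpha * sum_i x_i^2. */
int my_objective(const double *x, double *f, void *ctx)
{
  AppCtx *user = (AppCtx *)ctx;
  double  sum  = 0.0;
  for (size_t i = 0; i < user->n; i++) sum += user->alpha * x[i] * x[i];
  *f = sum;
  return 0; /* zero signals success */
}

/* Stand-in for a solver: forwards ctx untouched to the callback. */
int call_objective(ObjectiveFn fn, const double *x, double *f, void *ctx)
{
  return fn(x, f, ctx);
}
```

Because the context travels as an argument, two objectives with different parameters can coexist in one program without any global state.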

(sec_tao_fghj)=

#### Objective Function and Gradient Routines

TAO solvers that minimize an objective function require the application
to evaluate the objective function. Some solvers may also require the
application to evaluate derivatives of the objective function. Routines
that perform these computations must be identified to the application
object and must follow a strict calling sequence.

Routines should follow the form

```
PetscErrorCode EvaluateObjective(Tao, Vec, PetscReal*, PetscCtx);
```

in order to evaluate an objective function
$f: \, \mathbb R^n \to \mathbb R$. The first argument is the TAO
Solver object, the second argument is the $n$-dimensional vector
that identifies where the objective should be evaluated, and the fourth
argument is an application context. This routine should use the third
argument to return the objective value evaluated at the point specified
by the vector in the second argument.

This routine, and the application context, should be passed to the
application object by using the

```
TaoSetObjective(Tao, PetscErrorCode(*)(Tao, Vec, PetscReal*, PetscCtx), PetscCtx);
```

routine. The first argument in this routine is the TAO solver object,
the second argument is a function pointer to the routine that evaluates
the objective, and the third argument is the pointer to an appropriate
application context. Although the final argument may point to anything,
it must be cast as a `(void*)` type. This pointer will be passed back
to the developer in the fourth argument of the routine that evaluates
the objective. In this routine, the pointer can be cast back to the
appropriate type. Examples of these structures and their usage are
provided in the distribution.

Many TAO solvers also require gradient information from the application.
The gradient of the objective function is specified in a similar manner.
Routines that evaluate the gradient should have the calling sequence

```
PetscErrorCode EvaluateGradient(Tao, Vec, Vec, PetscCtx);
```

where the first argument is the TAO solver object, the second argument
is the variable vector, the third argument is the gradient vector, and
the fourth argument is the user-defined application context. Only the
third argument in this routine is different from the arguments in the
routine for evaluating the objective function. The numbers in the
gradient vector have no meaning when passed into this routine, but they
should represent the gradient of the objective at the specified point at
the end of the routine. This routine, and the user-defined pointer, can
be passed to the application object by using the

```
TaoSetGradient(Tao, Vec, PetscErrorCode (*)(Tao, Vec, Vec, PetscCtx), PetscCtx);
```

routine. In this routine, the first argument is the Tao object, the second
argument is the optional vector to hold the computed gradient, the
third argument is the function pointer, and the fourth argument is the
application context, cast to `(void*)`.

Instead of evaluating the objective and its gradient in separate
routines, TAO also allows the user to evaluate the function and the
gradient in the same routine. In fact, some solvers are more efficient
when both function and gradient information can be computed in the same
routine. These routines should follow the form

```
PetscErrorCode EvaluateFunctionAndGradient(Tao, Vec, PetscReal*, Vec, PetscCtx);
```

where the first argument is the TAO solver and the second argument
points to the input vector for use in evaluating the function and
gradient. The third argument should return the function value, while the
fourth argument should return the gradient vector. The fifth argument is
a pointer to a user-defined context. This context and the name of the
routine should be set with the call

```
TaoSetObjectiveAndGradient(Tao, Vec, PetscErrorCode (*)(Tao, Vec, PetscReal*, Vec, PetscCtx), PetscCtx);
```

where the arguments are the TAO application, the optional vector to be
used to hold the computed gradient, a function pointer, and a
pointer to a user-defined context.

The TAO example problems demonstrate the use of these application
contexts as well as specific instances of function, gradient, and
Hessian evaluation routines. All these routines should return the
integer $0$ after successful completion and a nonzero integer if
the function is undefined at that point or an error occurred.

(sec_tao_matrixfree)=

#### Hessian Evaluation

Some optimization routines also require a Hessian matrix from the user.
The routine that evaluates the Hessian should have the form

```
PetscErrorCode EvaluateHessian(Tao, Vec, Mat, Mat, PetscCtx);
```

where the first argument of this routine is a TAO solver object. The
second argument is the point at which the Hessian should be evaluated.
The third argument is the Hessian matrix, and the fifth argument is a
user-defined context. Since the Hessian matrix is usually used in
solving a system of linear equations, a preconditioner for the matrix is
often needed. The fourth argument is the matrix that will be used for
preconditioning the linear system; in most cases, this matrix will be
the same as the Hessian matrix.

One can set the Hessian evaluation routine by calling the

```
TaoSetHessian(Tao, Mat, Mat, PetscErrorCode (*)(Tao, Vec, Mat, Mat, PetscCtx), PetscCtx);
```

routine. The first argument is the TAO Solver object. The second and
third arguments are, respectively, the Mat object where the Hessian will
be stored and the Mat object that will be used for the preconditioning
(they may be the same). The fourth argument is the function that
evaluates the Hessian, and the fifth argument is a pointer to a
user-defined context, cast to `(void*)`.
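
To make the content of a Hessian evaluation concrete, the sketch below fills a dense, row-major Hessian of the extended Rosenbrock function from the getting-started example. It is a standalone illustration on plain arrays, not the tutorial's `FormHessian`; the name `rosenbrock_hessian` and the dense storage are assumptions made for readability, whereas a real TAO routine would insert the values into a `Mat`.

```c
#include <stddef.h>

/* Illustrative helper (hypothetical name): fills the dense n-by-n Hessian
 * (row-major, n = 2*m) of the extended Rosenbrock function at x.
 * The matrix is block diagonal with one symmetric 2x2 block per pair. */
void rosenbrock_hessian(size_t m, double alpha, const double *x, double *H)
{
  size_t n = 2 * m;
  for (size_t k = 0; k < n * n; k++) H[k] = 0.0;
  for (size_t i = 0; i < m; i++) {
    size_t r  = 2 * i;
    double x0 = x[r], x1 = x[r + 1];
    H[r * n + r]           = 12.0 * alpha * x0 * x0 - 4.0 * alpha * x1 + 2.0;
    H[r * n + r + 1]       = -4.0 * alpha * x0;
    H[(r + 1) * n + r]     = -4.0 * alpha * x0;
    H[(r + 1) * n + r + 1] = 2.0 * alpha;
  }
}
```

The block-diagonal structure means only three distinct values per pair of variables need to be computed, which is why a sparse matrix format is the natural choice for larger $n$.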

##### Finite Differences

Finite-difference approximations can be used to compute the gradient and
the Hessian of an objective function. These approximations will slow the
solve considerably and are recommended primarily for checking the
accuracy of hand-coded gradients and Hessians. These routines are

```
TaoDefaultComputeGradient(Tao, Vec, Vec, PetscCtx);
```

and

```
TaoDefaultComputeHessian(Tao, Vec, Mat, Mat, PetscCtx);
```

respectively. They can be set by using `TaoSetGradient()` and
`TaoSetHessian()` or through the options database with the
options `-tao_fdgrad` and `-tao_fd`, respectively.

The efficiency of the finite-difference Hessian can be improved if the
coloring of the matrix is known. If the application programmer creates a
PETSc `MatFDColoring` object, it can be applied to the
finite-difference approximation by setting the Hessian evaluation
routine to

```
TaoDefaultComputeHessianColor(Tao, Vec, Mat, Mat, PetscCtx);
```

and using the `MatFDColoring` object as the last (`void *`) argument
to `TaoSetHessian()`.

One also can use finite-difference approximations to directly check the
correctness of the gradient and/or Hessian evaluation routines. This
process can be initiated from the command line by using the special TAO
solver `tao_fd_test` together with the option `-tao_test_gradient`
or `-tao_test_hessian`.
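
The idea behind such a check can be sketched in plain C: approximate each gradient component by a difference quotient and compare the result with the hand-coded gradient. The block below is a standalone illustration under stated assumptions. It uses central differences with a caller-chosen step `h`, which is not necessarily the scheme TAO applies internally, and all names (`fd_gradient`, `max_abs_diff`, `sample_objective`, `sample_gradient`) are hypothetical.

```c
#include <math.h>
#include <stddef.h>

typedef double (*Objective)(size_t n, const double *x, void *ctx);

/* Sample objective used for the check: f(x) = x0^2 + 3*x0*x1 (n = 2). */
double sample_objective(size_t n, const double *x, void *ctx)
{
  (void)n;
  (void)ctx;
  return x[0] * x[0] + 3.0 * x[0] * x[1];
}

/* Hand-coded gradient of sample_objective. */
void sample_gradient(const double *x, double *g)
{
  g[0] = 2.0 * x[0] + 3.0 * x[1];
  g[1] = 3.0 * x[0];
}

/* Central-difference approximation of the gradient of f at x.
 * x is perturbed in place and restored before returning. */
void fd_gradient(Objective f, size_t n, double *x, double h, double *g, void *ctx)
{
  for (size_t i = 0; i < n; i++) {
    double xi = x[i];
    x[i]      = xi + h;
    double fp = f(n, x, ctx);
    x[i]      = xi - h;
    double fm = f(n, x, ctx);
    x[i]      = xi;
    g[i]      = (fp - fm) / (2.0 * h);
  }
}

/* Largest absolute componentwise difference between two vectors. */
double max_abs_diff(size_t n, const double *a, const double *b)
{
  double d = 0.0;
  for (size_t i = 0; i < n; i++) {
    double e = fabs(a[i] - b[i]);
    if (e > d) d = e;
  }
  return d;
}
```

A large value returned by `max_abs_diff` relative to the gradient's magnitude usually points to a mistake in the hand-coded derivative rather than in the difference quotient.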

##### Matrix-Free Methods

TAO fully supports matrix-free methods. The matrices specified in the
Hessian evaluation routine need not be conventional matrices; instead,
they can point to the data required to implement a particular
matrix-free method. The matrix-free variant is allowed *only* when the
linear systems are solved by an iterative method in combination with no
preconditioning (`PCNONE` or `-pc_type none`), a user-provided
matrix from which to construct the preconditioner, or a user-provided preconditioner shell
(`PCSHELL`). In other words, matrix-free methods cannot be used if a
direct solver is to be employed. Details about using matrix-free methods
are provided in the {doc}`/manual/index`.
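
In a matrix-free setting, the only operation an iterative solver needs is the product of the operator with a vector. As a minimal sketch of that idea, the routine below applies the Hessian of the extended Rosenbrock function from the getting-started example to a vector without ever storing the matrix. It works on plain arrays, and the name `rosenbrock_hessian_apply` is hypothetical; how such a routine is attached to a PETSc `Mat` is outside the scope of this sketch.

```c
#include <stddef.h>

/* Illustrative helper (hypothetical name): computes y = H(x) * v for the
 * extended Rosenbrock Hessian (n = 2*m) without forming H. Each pair of
 * variables contributes one symmetric 2x2 block [h00 h01; h01 h11]. */
void rosenbrock_hessian_apply(size_t m, double alpha, const double *x, const double *v, double *y)
{
  for (size_t i = 0; i < m; i++) {
    size_t r   = 2 * i;
    double x0  = x[r], x1 = x[r + 1];
    double h00 = 12.0 * alpha * x0 * x0 - 4.0 * alpha * x1 + 2.0;
    double h01 = -4.0 * alpha * x0;
    double h11 = 2.0 * alpha;
    y[r]     = h00 * v[r] + h01 * v[r + 1];
    y[r + 1] = h01 * v[r] + h11 * v[r + 1];
  }
}
```

Storage drops from $n^2$ entries for a dense matrix to none at all, at the price of recomputing the block entries on every product.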

:::{figure} /images/manual/taofig.svg
:name: fig_taocallbacks

Tao use of PETSc and callbacks
:::

(sec_tao_bounds)=

#### Constraints

Some optimization problems also impose constraints on the variables or
intermediate application states. The user defines these constraints through
the appropriate TAO interface functions and call-back routines where necessary.

##### Variable Bounds

The simplest type of constraint on an optimization problem puts lower or
upper bounds on the variables. Vectors that represent lower and upper
bounds for each variable can be set with the

```
TaoSetVariableBounds(Tao, Vec, Vec);
```

command. The first vector and second vector should contain the lower and
upper bounds, respectively. When no upper or lower bound exists for a
variable, the bound may be set to `PETSC_INFINITY` or `PETSC_NINFINITY`.
After the two bound vectors have been set, they may be accessed with the
command `TaoGetVariableBounds()`.

Since not all solvers recognize the presence of bound constraints on
variables, the user must be careful to select a solver that acknowledges
these bounds.
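
The meaning of the two bound vectors, including the use of infinite values for missing bounds, can be illustrated with a small feasibility check in plain C. This is only a sketch: `within_bounds` is a hypothetical helper, and the standard C macro `INFINITY` stands in here for the role that `PETSC_INFINITY` and `PETSC_NINFINITY` play in a TAO program.

```c
#include <math.h>
#include <stddef.h>

/* Illustrative helper (hypothetical name): returns 1 if l[i] <= x[i] <= u[i]
 * for every i, and 0 otherwise. Use -INFINITY in l or INFINITY in u for a
 * variable that has no lower or upper bound. */
int within_bounds(size_t n, const double *x, const double *l, const double *u)
{
  for (size_t i = 0; i < n; i++) {
    if (x[i] < l[i] || x[i] > u[i]) return 0;
  }
  return 1;
}
```

With an infinite bound the corresponding comparison can never fail for a finite `x[i]`, so unbounded variables need no special case in the loop.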

(sec_tao_programming)=

##### General Constraints

Some TAO algorithms also support general constraints as a linear or nonlinear
function of the optimization variables. These constraints can be imposed either
as equalities or inequalities. TAO currently does not make any distinctions
between linear and nonlinear constraints, and implements them through the
same software interfaces.

In the equality constrained case, TAO assumes that the constraints are
formulated as $c_e(x) = 0$ and requires the user to implement a call-back
routine for evaluating $c_e(x)$ at a given vector of optimization
variables,

```
PetscErrorCode EvaluateEqualityConstraints(Tao, Vec, Vec, PetscCtx);
```

As in the previous call-back routines, the first argument is the TAO solver
object. The second and third arguments are the vector of optimization variables
(input) and vector of equality constraints (output), respectively. The final
argument is a pointer to the user-defined application context, cast into
`(void*)`.

Generally constrained TAO algorithms also require a second user call-back
function to compute the constraint Jacobian matrix $\nabla_x c_e(x)$,

```
PetscErrorCode EvaluateEqualityJacobian(Tao, Vec, Mat, Mat, PetscCtx);
```

where the first and last arguments are the TAO solver object and the application
context pointer as before. The second argument is the vector of optimization
variables at which the computation takes place. The third and fourth arguments
are the constraint Jacobian and its pseudo-inverse, respectively. The
pseudo-inverse is optional; if it is not available, the user can simply set it
to the constraint Jacobian itself.

These call-back functions are then given to the TAO solver using the
interface functions

```
TaoSetEqualityConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao, Vec, Vec, PetscCtx), PetscCtx);
```

and

```
TaoSetJacobianEqualityRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao, Vec, Mat, Mat, PetscCtx), PetscCtx);
```

Inequality constraints are assumed to be formulated as $c_i(x) \geq 0$
and follow the same workflow as equality constraints using the
`TaoSetInequalityConstraintsRoutine()` and `TaoSetJacobianInequalityRoutine()`
interfaces.

Some TAO algorithms may adopt an alternative double-sided
$c_l \leq c_i(x) \leq c_u$ formulation and require the lower and upper
bounds $c_l$ and $c_u$ to be set using the
`TaoSetInequalityBounds(Tao, Vec, Vec)` interface. Please refer to the
documentation for each TAO algorithm for further details.
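
As a small illustration of the $c_e(x) = 0$ convention and its Jacobian, the sketch below evaluates two equality constraints on $x \in \mathbb R^2$ together with the matrix $\nabla_x c_e(x)$. The constraints themselves, the names `eval_equality` and `eval_equality_jacobian`, and the dense row-major storage are all assumptions chosen for the example; a TAO call-back would write into `Vec` and `Mat` objects instead.

```c
#include <stddef.h>

/* Equality constraints c_e(x) = 0 for x in R^2 (illustrative choice):
 *   c[0] = x0^2 + x1^2 - 1   (nonlinear: x lies on the unit circle)
 *   c[1] = x0 - x1           (linear: both components are equal)   */
void eval_equality(const double *x, double *c)
{
  c[0] = x[0] * x[0] + x[1] * x[1] - 1.0;
  c[1] = x[0] - x[1];
}

/* Jacobian of c_e, stored row-major as a dense 2x2 array:
 * J[i*2 + j] = d c[i] / d x[j]. */
void eval_equality_jacobian(const double *x, double *J)
{
  J[0] = 2.0 * x[0];
  J[1] = 2.0 * x[1];
  J[2] = 1.0;
  J[3] = -1.0;
}
```

Note that the linear and the nonlinear constraint go through the same pair of routines, mirroring the statement above that TAO does not distinguish between the two kinds.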

### Solving

Once the application and solver have been set up, the solver can be
called by using the

```
TaoSolve(Tao);
```

routine. We discuss several universal options below.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_customize)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Convergence
7f296bb3SBarry Smith
7f296bb3SBarry SmithAlthough TAO and its solvers set default parameters that are useful for
7f296bb3SBarry Smithmany problems, the user may need to modify these parameters in order to
7f296bb3SBarry Smithchange the behavior and convergence of various algorithms.
7f296bb3SBarry Smith
7f296bb3SBarry SmithOne convergence criterion for most algorithms concerns the number of
7f296bb3SBarry Smithdigits of accuracy needed in the solution. In particular, the
7f296bb3SBarry Smithconvergence test employed by TAO attempts to stop when the error in the
7f296bb3SBarry Smithconstraints is less than $\epsilon_{crtol}$ and either
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
7f296bb3SBarry Smith||g(X)|| &\leq& \epsilon_{gatol}, \\
7f296bb3SBarry Smith||g(X)||/|f(X)| &\leq& \epsilon_{grtol}, \quad \text{or} \\
||g(X)||/||g(X_0)|| &\leq& \epsilon_{gttol},
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $X$ is the current approximation to the true solution
7f296bb3SBarry Smith$X^*$ and $X_0$ is the initial guess. $X^*$ is
7f296bb3SBarry Smithunknown, so TAO estimates $f(X) - f(X^*)$ with either the square
7f296bb3SBarry Smithof the norm of the gradient or the duality gap. A relative tolerance of
7f296bb3SBarry Smith$\epsilon_{frtol}=0.01$ indicates that two significant digits are
7f296bb3SBarry Smithdesired in the objective function. Each solver sets its own convergence
7f296bb3SBarry Smithtolerances, but they can be changed by using the routine
7f296bb3SBarry Smith`TaoSetTolerances()`. Another set of convergence tolerances terminates
7f296bb3SBarry Smiththe solver when the norm of the gradient function (or Lagrangian
7f296bb3SBarry Smithfunction for bound-constrained problems) is sufficiently close to zero.
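
The gradient part of the stopping test can be sketched in plain C as follows. This is an illustration of the three inequalities only, not TAO's implementation: the function name `gradient_test_passed` is hypothetical, the constraint tolerance $\epsilon_{crtol}$ is left out, and the guards against division by zero are an assumption of this sketch.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Gradient-based stopping test.
   gnorm = ||g(X)||, f = f(X), gnorm0 = ||g(X_0)||.
   Returns true if any one of the three inequalities holds.
   Hypothetical helper for illustration, not part of the TAO API. */
bool gradient_test_passed(double gnorm, double f, double gnorm0,
                          double gatol, double grtol, double gttol)
{
  assert(gnorm >= 0.0 && gnorm0 >= 0.0);
  if (gnorm <= gatol) return true;                            /* absolute test               */
  if (fabs(f) > 0.0 && gnorm / fabs(f) <= grtol) return true; /* relative to |f(X)|          */
  if (gnorm0 > 0.0 && gnorm / gnorm0 <= gttol) return true;   /* relative to initial gradient */
  return false;
}
```

For example, with `gnorm = 1e-3` and `gnorm0 = 10.0`, the third test passes for `gttol = 1e-3` even when the absolute and relative-to-$f$ tests do not.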
7f296bb3SBarry Smith
Other stopping criteria include a minimum trust-region radius or a
maximum number of iterations. These parameters can be set with the
routines `TaoSetTrustRegionTolerance()` and
`TaoSetMaximumIterations()`. Similarly, a maximum number of function
evaluations can be set with the command
`TaoSetMaximumFunctionEvaluations()`. The iteration and function
evaluation limits can also be set at run time with the command line
options `-tao_max_it` and `-tao_max_funcs`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Viewing Status
7f296bb3SBarry Smith
7f296bb3SBarry SmithTo see parameters and performance statistics for the solver, the routine
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoView(Tao tao)
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithcan be used. This routine will display to standard output the number of
function evaluations needed by the solver and other information specific
7f296bb3SBarry Smithto the solver. This same output can be produced by using the command
7f296bb3SBarry Smithline option `-tao_view`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe progress of the optimization solver can be monitored with the
7f296bb3SBarry Smithruntime option `-tao_monitor`. Although monitoring routines can be
7f296bb3SBarry Smithcustomized, the default monitoring routine will print out several
7f296bb3SBarry Smithrelevant statistics to the screen.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe user also has access to information about the current solution. The
7f296bb3SBarry Smithcurrent iteration number, objective function value, gradient norm,
7f296bb3SBarry Smithinfeasibility norm, and step length can be retrieved with the following
7f296bb3SBarry Smithcommand.
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoGetSolutionStatus(Tao tao, PetscInt* iterate, PetscReal* f,
7f296bb3SBarry Smith                  PetscReal* gnorm, PetscReal* cnorm, PetscReal* xdiff,
7f296bb3SBarry Smith                  TaoConvergedReason* reason)
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe last argument returns a code that indicates the reason that the
7f296bb3SBarry Smithsolver terminated. Positive numbers indicate that a solution has been
7f296bb3SBarry Smithfound, while negative numbers indicate a failure. A list of reasons can
7f296bb3SBarry Smithbe found in the manual page for `TaoGetConvergedReason()`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Obtaining a Solution
7f296bb3SBarry Smith
7f296bb3SBarry SmithAfter exiting the `TaoSolve()` function, the solution and the gradient can be
7f296bb3SBarry Smithrecovered with the following routines.
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoGetSolution(Tao, Vec*);
7f296bb3SBarry SmithTaoGetGradient(Tao, Vec*, NULL, NULL);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry SmithNote that the `Vec` returned by `TaoGetSolution()` will be the
7f296bb3SBarry Smithsame vector passed to `TaoSetSolution()`. This information can be
obtained during user-defined routines, such as a function evaluation or a
customized monitoring routine, or after the solver has terminated.
7f296bb3SBarry Smith
### Special Problem Structures
7f296bb3SBarry Smith
7f296bb3SBarry SmithCertain special classes of problems solved with TAO utilize specialized
7f296bb3SBarry Smithcode interfaces that are described below per problem type.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_pde_constrained)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### PDE-constrained Optimization
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO solves PDE-constrained optimization problems of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{u,v} & f(u,v) \\
7f296bb3SBarry Smith\text{subject to} & g(u,v) = 0,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere the state variable $u$ is the solution to the discretized
7f296bb3SBarry Smithpartial differential equation defined by $g$ and parametrized by
7f296bb3SBarry Smiththe design variable $v$, and $f$ is an objective function.
7f296bb3SBarry SmithThe Lagrange multipliers on the constraint are denoted by $y$.
7f296bb3SBarry SmithThis method is set by using the linearly constrained augmented
7f296bb3SBarry SmithLagrangian TAO solver `tao_lcl`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe make two main assumptions when solving these problems: the objective
7f296bb3SBarry Smithfunction and PDE constraints have been discretized so that we can treat
7f296bb3SBarry Smiththe optimization problem as finite dimensional and
7f296bb3SBarry Smith$\nabla_u g(u,v)$ is invertible for all $u$ and $v$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithUnlike other TAO solvers where the solution vector contains only the
7f296bb3SBarry Smithoptimization variables, PDE-constrained problems solved with `tao_lcl`
7f296bb3SBarry Smithcombine the design and state variables together in a monolithic solution vector
7f296bb3SBarry Smith$x^T = [u^T, v^T]$. Consequently, the user must provide index sets to
7f296bb3SBarry Smithseparate the two,
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoSetStateDesignIS(Tao, IS, IS);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere the first IS is a PETSc IndexSet containing the indices of the
7f296bb3SBarry Smithstate variables and the second IS the design variables.
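
The role of the two index sets can be pictured with plain arrays. The sketch below is not PETSc code: ordinary index arrays stand in for the `IS` objects, and the helper name `gather` is hypothetical. It extracts the state part $u$ or the design part $v$ from a monolithic vector $x$.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the entries of x selected by idx[0..k-1] into out[0..k-1].
   With idx holding the state (or design) indices, this extracts u (or v)
   from the monolithic vector x of length n.
   Hypothetical helper for illustration, not part of the TAO API. */
void gather(const double *x, size_t n, const size_t *idx, size_t k, double *out)
{
  assert(x && idx && out);
  for (size_t j = 0; j < k; j++) {
    assert(idx[j] < n); /* every index must refer to an entry of x */
    out[j] = x[idx[j]];
  }
}
```

With three state variables stored first and two design variables stored last, the state indices would be `{0, 1, 2}` and the design indices `{3, 4}`.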
7f296bb3SBarry Smith
PDE constraints have the general form $g(x) = 0$,
where $g: \mathbb R^n \to \mathbb R^m$. These constraints should
7f296bb3SBarry Smithbe specified in a routine, written by the user, that evaluates
7f296bb3SBarry Smith$g(x)$. The routine that evaluates the constraint equations
7f296bb3SBarry Smithshould have the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithPetscErrorCode EvaluateConstraints(Tao, Vec, Vec, PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe first argument of this routine is a TAO solver object. The second
7f296bb3SBarry Smithargument is the variable vector at which the constraint function should
7f296bb3SBarry Smithbe evaluated. The third argument is the vector of function values
7f296bb3SBarry Smith$g(x)$, and the fourth argument is a pointer to a user-defined
7f296bb3SBarry Smithcontext. This routine and the user-defined context should be set in the
7f296bb3SBarry SmithTAO solver with the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithTaoSetConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao, Vec, Vec, PetscCtx), PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
command. In this function, the first argument is the TAO solver object,
the second argument is a vector in which to store the constraints, the
third argument is a function pointer to the routine for evaluating the
constraints, and the fourth argument is a pointer to a user-defined
context.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Jacobian of $g(x)$ is the matrix in
7f296bb3SBarry Smith$\mathbb R^{m \times n}$ such that each column contains the
7f296bb3SBarry Smithpartial derivatives of $g(x)$ with respect to one variable. The
evaluation of the Jacobian of $g$ should be performed in the
user-written
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithPetscErrorCode JacobianState(Tao, Vec, Mat, Mat, Mat, PetscCtx);
*2a8381b2SBarry SmithPetscErrorCode JacobianDesign(Tao, Vec, Mat*, PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
routines. In these functions, the first argument is the TAO solver
7f296bb3SBarry Smithobject. The second argument is the variable vector at which to evaluate
7f296bb3SBarry Smiththe Jacobian matrix, the third argument is the Jacobian matrix, and the
7f296bb3SBarry Smithlast argument is a pointer to a user-defined context. The fourth and
7f296bb3SBarry Smithfifth arguments of the Jacobian evaluation with respect to the state
7f296bb3SBarry Smithvariables are for providing PETSc matrix objects for the preconditioner
7f296bb3SBarry Smithand for applying the inverse of the state Jacobian, respectively. This
7f296bb3SBarry Smithinverse matrix may be `PETSC_NULL`, in which case TAO will use a PETSc
7f296bb3SBarry SmithKrylov subspace solver to solve the state system. These evaluation
7f296bb3SBarry Smithroutines should be registered with TAO by using the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
TaoSetJacobianStateRoutine(Tao, Mat, Mat, Mat,
                        PetscErrorCode (*)(Tao, Vec, Mat, Mat, Mat, PetscCtx),
                        PetscCtx);
TaoSetJacobianDesignRoutine(Tao, Mat,
                        PetscErrorCode (*)(Tao, Vec, Mat*, PetscCtx),
                        PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithroutines. The first argument is the TAO solver object, and the second
7f296bb3SBarry Smithargument is the matrix in which the Jacobian information can be stored.
7f296bb3SBarry SmithFor the state Jacobian, the third argument is the matrix that will be
7f296bb3SBarry Smithused for preconditioning, and the fourth argument is an optional matrix
7f296bb3SBarry Smithfor the inverse of the state Jacobian. One can use `PETSC_NULL` for
7f296bb3SBarry Smiththis inverse argument and let PETSc apply the inverse using a KSP
7f296bb3SBarry Smithmethod, but faster results may be obtained by manipulating the structure
7f296bb3SBarry Smithof the Jacobian and providing an inverse. The fifth argument is the
7f296bb3SBarry Smithfunction pointer, and the sixth argument is an optional user-defined
7f296bb3SBarry Smithcontext. Since no solve is performed with the design Jacobian, there is
7f296bb3SBarry Smithno need to provide preconditioner or inverse matrices.
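
One way to see why only the state Jacobian needs a preconditioner or an inverse is to linearize the constraint about the current point. The following is a standard reduced-space sketch rather than a description of the exact `tao_lcl` iteration; the step symbols $\delta u$ and $\delta v$ are introduced here for illustration only.

$$
g(u,v) + \nabla_u g(u,v)\,\delta u + \nabla_v g(u,v)\,\delta v = 0
\quad\Longrightarrow\quad
\delta u = -\left[\nabla_u g(u,v)\right]^{-1}\left( g(u,v) + \nabla_v g(u,v)\,\delta v \right).
$$

The design Jacobian $\nabla_v g$ enters only through a matrix-vector product, while the state Jacobian $\nabla_u g$ must be inverted, which is possible under the invertibility assumption stated earlier.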
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_evalsof)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Nonlinear Least Squares
7f296bb3SBarry Smith
7f296bb3SBarry SmithFor nonlinear least squares applications, we are solving the
7f296bb3SBarry Smithoptimization problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\min_{x} \;\frac{1}{2}||r(x)||_2^2.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithFor these problems, the objective function value should be computed as a
7f296bb3SBarry Smithvector of residuals, $r(x)$, computed with a function of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithPetscErrorCode EvaluateResidual(Tao, Vec, Vec, PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithand set with the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
TaoSetResidualRoutine(Tao, Vec, PetscErrorCode (*)(Tao, Vec, Vec, PetscCtx), PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithroutine. If required by the algorithm, the Jacobian of the residual,
7f296bb3SBarry Smith$J = \partial r(x) / \partial x$, should be computed with a
7f296bb3SBarry Smithfunction of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
PetscErrorCode EvaluateJacobian(Tao, Vec, Mat, Mat, PetscCtx);
```

and set with the

```
TaoSetJacobianResidualRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao, Vec, Mat, Mat, PetscCtx), PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithroutine.
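
To make the relationship between the residual, its Jacobian, and the objective concrete, the plain-C sketch below computes $\frac{1}{2}||r(x)||_2^2$ and its gradient $J^T r$ from arrays. It is not TAO code: the helper names `half_sq_norm` and `ls_gradient` and the row-major storage of $J$ are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Objective 0.5*||r||_2^2 for a residual vector r of length m.
   Hypothetical helper for illustration, not part of the TAO API. */
double half_sq_norm(const double *r, size_t m)
{
  double s = 0.0;

  assert(r);
  for (size_t i = 0; i < m; i++) s += r[i] * r[i];
  return 0.5 * s;
}

/* Gradient g = J^T r of the objective above, with J stored row-major
   as an m-by-n array, J[i*n + j] = d r_i / d x_j. */
void ls_gradient(const double *J, const double *r, size_t m, size_t n, double *g)
{
  assert(J && r && g);
  for (size_t j = 0; j < n; j++) g[j] = 0.0;
  for (size_t i = 0; i < m; i++)
    for (size_t j = 0; j < n; j++) g[j] += J[i * n + j] * r[i];
}
```

This is why an algorithm that is given only $r(x)$ and $J$ can still form the objective value and gradient it needs.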
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_complementary)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Complementarity
7f296bb3SBarry Smith
7f296bb3SBarry SmithComplementarity applications have equality constraints in the form of
7f296bb3SBarry Smithnonlinear equations $C(X) = 0$, where
7f296bb3SBarry Smith$C: \mathbb R^n \to \mathbb R^m$. These constraints should be
7f296bb3SBarry Smithspecified in a routine written by the user with the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithPetscErrorCode EqualityConstraints(Tao, Vec, Vec, PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smiththat evaluates $C(X)$. The first argument of this routine is a TAO
7f296bb3SBarry SmithSolver object. The second argument is the variable vector $X$ at
7f296bb3SBarry Smithwhich the constraint function should be evaluated. The third argument is
7f296bb3SBarry Smiththe output vector of function values $C(X)$, and the fourth
7f296bb3SBarry Smithargument is a pointer to a user-defined context.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis routine and the user-defined context must be registered with TAO by
7f296bb3SBarry Smithusing the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
TaoSetConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao, Vec, Vec, PetscCtx), PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
command. In this command, the first argument is the TAO Solver object, the
second argument is the vector in which to store the function values, the
7f296bb3SBarry Smiththird argument is the user-defined routine that evaluates $C(X)$,
7f296bb3SBarry Smithand the fourth argument is a pointer to a user-defined context that will
7f296bb3SBarry Smithbe passed back to the user.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Jacobian of the function is the matrix in
7f296bb3SBarry Smith$\mathbb R^{m \times n}$ such that each column contains the
partial derivatives of $C$ with respect to one variable. The
7f296bb3SBarry Smithevaluation of the Jacobian of $C$ should be performed in a routine
7f296bb3SBarry Smithof the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithPetscErrorCode EvaluateJacobian(Tao, Vec, Mat, Mat, PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
In this function, the first argument is the TAO Solver object and the
second argument is the variable vector at which to evaluate the Jacobian
matrix. The third argument is the Jacobian matrix, and the fifth
argument is a pointer to a user-defined context. Since the Jacobian
matrix may be used in solving a system of linear equations, a
preconditioner for the matrix may be needed. The fourth argument is the
matrix that will be used for preconditioning the linear system; in most
cases, this matrix will be the same as the Jacobian matrix.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis routine should be specified to TAO by using the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
*2a8381b2SBarry SmithTaoSetJacobianRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao, Vec, Mat, Mat, PetscCtx), PetscCtx);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithcommand. The first argument is the TAO Solver object; the second and
7f296bb3SBarry Smiththird arguments are the Mat objects in which the Jacobian will be stored
7f296bb3SBarry Smithand the Mat object that will be used for the preconditioning (they may
7f296bb3SBarry Smithbe the same), respectively. The fourth argument is the function pointer;
7f296bb3SBarry Smithand the fifth argument is an optional user-defined context. The Jacobian
7f296bb3SBarry Smithmatrix should be created in a way such that the product of it and the
7f296bb3SBarry Smithvariable vector can be stored in the constraint vector.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_solvers)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith## TAO Algorithms
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO includes a variety of optimization algorithms for several classes of
7f296bb3SBarry Smithproblems (unconstrained, bound-constrained, and PDE-constrained
7f296bb3SBarry Smithminimization, nonlinear least-squares, and complementarity). The TAO
algorithms for solving these problems are detailed in this section. A
particular algorithm can be chosen by using the `TaoSetType()` function
or the command line argument `-tao_type <name>`. For those
7f296bb3SBarry Smithinterested in extending these algorithms or using new ones, please see
7f296bb3SBarry Smith{any}`sec_tao_addsolver` for more information.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_unconstrained)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Unconstrained Minimization
7f296bb3SBarry Smith
7f296bb3SBarry SmithUnconstrained minimization is used to minimize a function of many
7f296bb3SBarry Smithvariables without any constraints on the variables, such as bounds. The
7f296bb3SBarry Smithmethods available in TAO for solving these problems can be classified
7f296bb3SBarry Smithaccording to the amount of derivative information required:
7f296bb3SBarry Smith
7f296bb3SBarry Smith1. Function evaluation only – Nelder-Mead method (`tao_nm`)
7f296bb3SBarry Smith2. Function and gradient evaluations – limited-memory, variable-metric
7f296bb3SBarry Smith   method (`tao_lmvm`) and nonlinear conjugate gradient method
7f296bb3SBarry Smith   (`tao_cg`)
7f296bb3SBarry Smith3. Function, gradient, and Hessian evaluations – Newton Krylov methods:
7f296bb3SBarry Smith   Newton line search (`tao_nls`), Newton trust-region (`tao_ntr`),
7f296bb3SBarry Smith   and Newton trust-region line-search (`tao_ntl`)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe best method to use depends on the particular problem being solved
7f296bb3SBarry Smithand the accuracy required in the solution. If a Hessian evaluation
7f296bb3SBarry Smithroutine is available, then the Newton line search and Newton
7f296bb3SBarry Smithtrust-region methods will likely perform best. When a Hessian evaluation
7f296bb3SBarry Smithroutine is not available, then the limited-memory, variable-metric
7f296bb3SBarry Smithmethod is likely to perform best. The Nelder-Mead method should be used
7f296bb3SBarry Smithonly as a last resort when no gradient information is available.
7f296bb3SBarry Smith
7f296bb3SBarry SmithEach solver has a set of options associated with it that can be set with
7f296bb3SBarry Smithcommand line arguments. These algorithms and the associated options are
7f296bb3SBarry Smithbriefly discussed in this section.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Newton-Krylov Methods
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO features three Newton-Krylov algorithms, separated by their globalization methods
7f296bb3SBarry Smithfor unconstrained optimization: line search (NLS), trust region (NTR), and trust
7f296bb3SBarry Smithregion with a line search (NTL). They are available via the TAO solvers
7f296bb3SBarry Smith`TAONLS`, `TAONTR` and `TAONTL`, respectively, or the `-tao_type`
7f296bb3SBarry Smith`nls`/`ntr`/`ntl` flag.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Newton Line Search Method (NLS)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Newton line search method solves the symmetric system of equations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithH_k d_k = -g_k
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithto obtain a step $d_k$, where $H_k$ is the Hessian of the
7f296bb3SBarry Smithobjective function at $x_k$ and $g_k$ is the gradient of the
7f296bb3SBarry Smithobjective function at $x_k$. For problems where the Hessian matrix
7f296bb3SBarry Smithis indefinite, the perturbed system of equations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith(H_k + \rho_k I) d_k = -g_k
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithis solved to obtain the direction, where $\rho_k$ is a positive
7f296bb3SBarry Smithconstant. If the direction computed is not a descent direction, the
7f296bb3SBarry Smith(scaled) steepest descent direction is used instead. Having obtained the
7f296bb3SBarry Smithdirection, a Moré-Thuente line search is applied to obtain a step
7f296bb3SBarry Smithlength, $\tau_k$, that approximately solves the one-dimensional
7f296bb3SBarry Smithoptimization problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\min_\tau f(x_k + \tau d_k).
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Newton line search method can be selected by using the TAO solver
7f296bb3SBarry Smith`tao_nls`. The options available for this solver are listed in
7f296bb3SBarry Smith{numref}`table_nlsoptions`. For the best efficiency, function and
7f296bb3SBarry Smithgradient evaluations should be performed simultaneously when using this
7f296bb3SBarry Smithalgorithm.
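
The direction computation described above can be sketched for a two-variable problem in plain C. This is a simplified illustration, not TAO's implementation: the helper name `newton_direction_2x2` is hypothetical, the $2 \times 2$ system is solved directly with Cramer's rule instead of a Krylov method, the fallback is unscaled steepest descent, and no line search is performed.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Solve (H + rho*I) d = -g for a symmetric 2x2 H given by h11, h12, h22.
   If the system is singular or d is not a descent direction (g^T d >= 0),
   fall back to the steepest descent direction d = -g.
   Returns true when the Newton-type direction was kept.
   Hypothetical helper for illustration, not part of the TAO API. */
bool newton_direction_2x2(double h11, double h12, double h22, double rho,
                          const double g[2], double d[2])
{
  double a = h11 + rho, b = h12, c = h22 + rho;
  double det = a * c - b * b;

  assert(g && d && rho >= 0.0);
  if (fabs(det) > 0.0) {
    /* Cramer's rule for [a b; b c] d = -g */
    d[0] = (-g[0] * c + g[1] * b) / det;
    d[1] = (-g[1] * a + g[0] * b) / det;
    if (g[0] * d[0] + g[1] * d[1] < 0.0) return true;
  }
  d[0] = -g[0];
  d[1] = -g[1];
  return false;
}
```

For the indefinite matrix $H = \mathrm{diag}(1, -1)$ and $g = (0, 1)^T$, the unperturbed system ($\rho = 0$) yields an ascent direction and the helper falls back to $-g$, whereas $\rho = 2$ makes $H + \rho I$ positive definite and the Newton-type direction is kept.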
7f296bb3SBarry Smith
7f296bb3SBarry Smith> ```{eval-rst}
7f296bb3SBarry Smith> .. table:: Summary of ``nls`` options
7f296bb3SBarry Smith>    :name: table_nlsoptions
7f296bb3SBarry Smith>
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    | Name  ``-tao_nls_``      | Value          | Default            | Description        |
7f296bb3SBarry Smith>    +==========================+================+====================+====================+
7f296bb3SBarry Smith>    |          ``ksp_type``    | cg, nash,      | stcg               | KSPType for        |
7f296bb3SBarry Smith>    |                          |                |                    | linear system      |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pc_type``     | none, jacobi   | lmvm               | PCType for linear  |
7f296bb3SBarry Smith>    |                          |                |                    | system             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``sval``        | real           | :math:`0`          | Initial            |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``imin``        | real           | :math:`10^{-4}`    | Minimum            |
7f296bb3SBarry Smith>    |                          |                |                    | initial            |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``imax``        | real           | :math:`100`        | Maximum            |
7f296bb3SBarry Smith>    |                          |                |                    | initial            |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``imfac``       | real           | :math:`0.1`        | Gradient norm      |
7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
7f296bb3SBarry Smith>    |                          |                |                    | initializing       |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pmax``        | real           | :math:`100`        | Maximum            |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    |                          |                |                    | when               |
7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pgfac``       | real           | :math:`10`         | Perturbation growth|
7f296bb3SBarry Smith>    |                          |                |                    | when               |
7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pmgfac``      | real           | :math:`0.1`        | Gradient norm      |
7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pmin``        | real           | :math:`10^{-12}`   | Minimum non-zero   |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    |                          |                |                    | when               |
7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``psfac``       | real           | :math:`0.4`        | Perturbation shrink|
7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
7f296bb3SBarry Smith>    |                          |                |                    | value              |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``pmsfac``      | real           | :math:`0.1`        | Gradient norm      |
7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``nu1``         | real           | 0.25               | :math:`\nu_1`      |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``nu2``         | real           | 0.50               | :math:`\nu_2`      |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``nu3``         | real           | 1.00               | :math:`\nu_3`      |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``nu4``         | real           | 1.25               | :math:`\nu_4`      |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``omega1``      | real           | 0.25               | :math:`\omega_1`   |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``omega2``      | real           | 0.50               | :math:`\omega_2`   |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``omega3``      | real           | 1.00               | :math:`\omega_3`   |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``omega4``      | real           | 2.00               | :math:`\omega_4`   |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``omega5``      | real           | 4.00               | :math:`\omega_5`   |
7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``eta1``        | real           | :math:`10^{-4}`    | :math:`\eta_1`     |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``eta2``        | real           | 0.25               | :math:`\eta_2`     |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``eta3``        | real           | 0.50               | :math:`\eta_3`     |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``eta4``        | real           | 0.90               | :math:`\eta_4`     |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``alpha1``      | real           | 0.25               | :math:`\alpha_1`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``alpha2``      | real           | 0.50               | :math:`\alpha_2`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``alpha3``      | real           | 1.00               | :math:`\alpha_3`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``alpha4``      | real           | 2.00               | :math:`\alpha_4`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``alpha5``      | real           | 4.00               | :math:`\alpha_5`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``mu1``         | real           | 0.10               | :math:`\mu_1`      |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``mu2``         | real           | 0.50               | :math:`\mu_2`      |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``gamma1``      | real           | 0.25               | :math:`\gamma_1`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``gamma2``      | real           | 0.50               | :math:`\gamma_2`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``gamma3``      | real           | 2.00               | :math:`\gamma_3`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``gamma4``      | real           | 4.00               | :math:`\gamma_4`   |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith>    |          ``theta``       | real           | 0.05               | :math:`\theta`     |
7f296bb3SBarry Smith>    |                          |                |                    | in                 |
7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
7f296bb3SBarry Smith>    |                          |                |                    | update             |
7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
7f296bb3SBarry Smith> ```
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe system of equations is approximately solved by applying the
7f296bb3SBarry Smithconjugate gradient method, Nash conjugate gradient method,
7f296bb3SBarry SmithSteihaug-Toint conjugate gradient method, generalized Lanczos method, or
7f296bb3SBarry Smithan alternative Krylov subspace method supplied by PETSc. The method used
7f296bb3SBarry Smithto solve the systems of equations is specified with the command line
7f296bb3SBarry Smithargument `-tao_nls_ksp_type <cg,nash,stcg,gltr,gmres,...>`; `stcg`
7f296bb3SBarry Smithis the default. See the PETSc manual for further information on changing
7f296bb3SBarry Smiththe behavior of the linear system solvers.
7f296bb3SBarry Smith
7f296bb3SBarry SmithA good preconditioner reduces the number of iterations required to solve
7f296bb3SBarry Smiththe linear system of equations. For the conjugate gradient methods and
7f296bb3SBarry Smithgeneralized Lanczos method, this preconditioner must be symmetric and
7f296bb3SBarry Smithpositive definite. The available options are to use no preconditioner,
7f296bb3SBarry Smiththe absolute value of the diagonal of the Hessian matrix, a
7f296bb3SBarry Smithlimited-memory BFGS approximation to the Hessian matrix, or one of the
other preconditioners provided by the PETSc package. The preconditioner
is selected with the command line argument
`-tao_nls_pc_type <none,jacobi,icc,ilu,lmvm>`. The
7f296bb3SBarry Smithdefault is the `lmvm` preconditioner, which uses a BFGS approximation
7f296bb3SBarry Smithof the inverse Hessian. See the PETSc manual for further information on
7f296bb3SBarry Smithchanging the behavior of the preconditioners.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe perturbation $\rho_k$ is added when the direction returned by
7f296bb3SBarry Smiththe Krylov subspace method is not a descent direction, the Krylov method
7f296bb3SBarry Smithdiverged due to an indefinite preconditioner or matrix, or a direction
7f296bb3SBarry Smithof negative curvature was found. In the last two cases, if the step
7f296bb3SBarry Smithreturned is a descent direction, it is used during the line search.
7f296bb3SBarry SmithOtherwise, a steepest descent direction is used during the line search.
7f296bb3SBarry SmithThe perturbation is decreased as long as the Krylov subspace method
7f296bb3SBarry Smithreports success and increased if further problems are encountered. There
7f296bb3SBarry Smithare three cases: initializing, increasing, and decreasing the
7f296bb3SBarry Smithperturbation. These cases are described below.
7f296bb3SBarry Smith
7f296bb3SBarry Smith1. If $\rho_k$ is zero and a problem was detected with either the
7f296bb3SBarry Smith   direction or the Krylov subspace method, the perturbation is
7f296bb3SBarry Smith   initialized to
7f296bb3SBarry Smith
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith   \rho_{k+1} = \text{median}\left\{\text{imin}, \text{imfac} * \|g(x_k)\|, \text{imax}\right\},
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith
7f296bb3SBarry Smith   where $g(x_k)$ is the gradient of the objective function and
7f296bb3SBarry Smith   `imin` is set with the command line argument
7f296bb3SBarry Smith   `-tao_nls_imin <real>` with a default value of $10^{-4}$,
7f296bb3SBarry Smith   `imfac` by `-tao_nls_imfac` with a default value of 0.1, and
7f296bb3SBarry Smith   `imax` by `-tao_nls_imax` with a default value of 100. When using
7f296bb3SBarry Smith   the `gltr` method to solve the system of equations, an estimate of
7f296bb3SBarry Smith   the minimum eigenvalue $\lambda_1$ of the Hessian matrix is
7f296bb3SBarry Smith   available. This value is used to initialize the perturbation to
7f296bb3SBarry Smith   $\rho_{k+1} = \max\left\{\rho_{k+1}, -\lambda_1\right\}$ in
7f296bb3SBarry Smith   this case.
7f296bb3SBarry Smith
7f296bb3SBarry Smith2. If $\rho_k$ is nonzero and a problem was detected with either
7f296bb3SBarry Smith   the direction or Krylov subspace method, the perturbation is
7f296bb3SBarry Smith   increased to
7f296bb3SBarry Smith
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith   \rho_{k+1} = \min\left\{\text{pmax}, \max\left\{\text{pgfac} * \rho_k, \text{pmgfac} * \|g(x_k)\|\right\}\right\},
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith
7f296bb3SBarry Smith   where $g(x_k)$ is the gradient of the objective function and
7f296bb3SBarry Smith   `pgfac` is set with the command line argument `-tao_nls_pgfac`
7f296bb3SBarry Smith   with a default value of 10, `pmgfac` by `-tao_nls_pmgfac` with a
7f296bb3SBarry Smith   default value of 0.1, and `pmax` by `-tao_nls_pmax` with a
7f296bb3SBarry Smith   default value of 100.
7f296bb3SBarry Smith
7f296bb3SBarry Smith3. If $\rho_k$ is nonzero and no problems were detected with
7f296bb3SBarry Smith   either the direction or Krylov subspace method, the perturbation is
7f296bb3SBarry Smith   decreased to
7f296bb3SBarry Smith
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith   \rho_{k+1} = \min\left\{\text{psfac} * \rho_k, \text{pmsfac} * \|g(x_k)\|\right\},
7f296bb3SBarry Smith   $$
7f296bb3SBarry Smith
7f296bb3SBarry Smith   where $g(x_k)$ is the gradient of the objective function,
7f296bb3SBarry Smith   `psfac` is set with the command line argument `-tao_nls_psfac`
7f296bb3SBarry Smith   with a default value of 0.4, and `pmsfac` is set by
7f296bb3SBarry Smith   `-tao_nls_pmsfac` with a default value of 0.1. Moreover, if
7f296bb3SBarry Smith   $\rho_{k+1} < \text{pmin}$, then $\rho_{k+1} = 0$, where
7f296bb3SBarry Smith   `pmin` is set with the command line argument `-tao_nls_pmin` and
7f296bb3SBarry Smith   has a default value of $10^{-12}$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithNear a local minimizer to the unconstrained optimization problem, the
7f296bb3SBarry SmithHessian matrix will be positive-semidefinite; the perturbation will
7f296bb3SBarry Smithshrink toward zero, and one would eventually observe a superlinear
7f296bb3SBarry Smithconvergence rate.
7f296bb3SBarry Smith
When using `nash`, `stcg`, or `gltr` to solve the linear systems
of equations, a trust-region radius needs to be initialized and updated.
7f296bb3SBarry SmithThis trust-region radius simultaneously limits the size of the step
7f296bb3SBarry Smithcomputed and reduces the number of iterations of the conjugate gradient
7f296bb3SBarry Smithmethod. The method for initializing the trust-region radius is set with
7f296bb3SBarry Smiththe command line argument
7f296bb3SBarry Smith`-tao_nls_init_type <constant,direction,interpolation>`;
7f296bb3SBarry Smith`interpolation`, which chooses an initial value based on the
7f296bb3SBarry Smithinterpolation scheme found in {cite}`cgt`, is the default.
7f296bb3SBarry SmithThis scheme performs a number of function and gradient evaluations to
7f296bb3SBarry Smithdetermine a radius such that the reduction predicted by the quadratic
7f296bb3SBarry Smithmodel along the gradient direction coincides with the actual reduction
7f296bb3SBarry Smithin the nonlinear function. The iterate obtaining the best objective
7f296bb3SBarry Smithfunction value is used as the starting point for the main line search
7f296bb3SBarry Smithalgorithm. The `constant` method initializes the trust-region radius
7f296bb3SBarry Smithby using the value specified with the `-tao_trust0 <real>` command
7f296bb3SBarry Smithline argument, where the default value is 100. The `direction`
7f296bb3SBarry Smithtechnique solves the first quadratic optimization problem by using a
7f296bb3SBarry Smithstandard conjugate gradient method and initializes the trust region to
7f296bb3SBarry Smith$\|s_0\|$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe method for updating the trust-region radius is set with the command
7f296bb3SBarry Smithline argument `-tao_nls_update_type <step,reduction,interpolation>`;
7f296bb3SBarry Smith`step` is the default. The `step` method updates the trust-region
7f296bb3SBarry Smithradius based on the value of $\tau_k$. In particular,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
7f296bb3SBarry Smith\omega_1 \text{min}(\Delta_k, \|d_k\|) & \text{if } \tau_k \in [0, \nu_1) \\
7f296bb3SBarry Smith\omega_2 \text{min}(\Delta_k, \|d_k\|) & \text{if } \tau_k \in [\nu_1, \nu_2) \\
7f296bb3SBarry Smith\omega_3 \Delta_k & \text{if } \tau_k \in [\nu_2, \nu_3) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \omega_4 \|d_k\|) & \text{if } \tau_k \in [\nu_3, \nu_4) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \omega_5 \|d_k\|) & \text{if } \tau_k \in [\nu_4, \infty),
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith\right.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere
7f296bb3SBarry Smith$0 < \omega_1 < \omega_2 < \omega_3 = 1 < \omega_4 < \omega_5$ and
7f296bb3SBarry Smith$0 < \nu_1 < \nu_2 < \nu_3 < \nu_4$ are constants. The
7f296bb3SBarry Smith`reduction` method computes the ratio of the actual reduction in the
7f296bb3SBarry Smithobjective function to the reduction predicted by the quadratic model for
7f296bb3SBarry Smiththe full step,
$\kappa_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(x_k) - q_k(x_k + d_k)}$,
where $q_k$ is the quadratic model. The radius is then updated as
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
7f296bb3SBarry Smith\alpha_1 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in (-\infty, \eta_1) \\
7f296bb3SBarry Smith\alpha_2 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in [\eta_1, \eta_2) \\
7f296bb3SBarry Smith\alpha_3 \Delta_k & \text{if } \kappa_k \in [\eta_2, \eta_3) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_4 \|d_k\|) & \text{if } \kappa_k \in [\eta_3, \eta_4) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_5 \|d_k\|) & \text{if } \kappa_k \in [\eta_4, \infty),
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith\right.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere
7f296bb3SBarry Smith$0 < \alpha_1 < \alpha_2 < \alpha_3 = 1 < \alpha_4 < \alpha_5$ and
7f296bb3SBarry Smith$0 < \eta_1 < \eta_2 < \eta_3 < \eta_4$ are constants. The
7f296bb3SBarry Smith`interpolation` method uses the same interpolation mechanism as in the
7f296bb3SBarry Smithinitialization to compute a new value for the trust-region radius.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
7f296bb3SBarry Smiththe Bounded Newton Line Search (BNLS) algorithm that can solve both
7f296bb3SBarry Smithbound constrained and unconstrained problems.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Newton Trust-Region Method (NTR)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Newton trust-region method solves the constrained quadratic
7f296bb3SBarry Smithprogramming problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\min_d  & \frac{1}{2}d^T H_k d  + g_k^T d \\
7f296bb3SBarry Smith\text{subject to} & \|d\| \leq \Delta_k
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithto obtain a direction $d_k$, where $H_k$ is the Hessian of
7f296bb3SBarry Smiththe objective function at $x_k$, $g_k$ is the gradient of
7f296bb3SBarry Smiththe objective function at $x_k$, and $\Delta_k$ is the
7f296bb3SBarry Smithtrust-region radius. If $x_k + d_k$ sufficiently reduces the
7f296bb3SBarry Smithnonlinear objective function, then the step is accepted, and the
7f296bb3SBarry Smithtrust-region radius is updated. However, if $x_k + d_k$ does not
7f296bb3SBarry Smithsufficiently reduce the nonlinear objective function, then the step is
7f296bb3SBarry Smithrejected, the trust-region radius is reduced, and the quadratic program
7f296bb3SBarry Smithis re-solved by using the updated trust-region radius. The Newton
7f296bb3SBarry Smithtrust-region method can be set by using the TAO solver `tao_ntr`. The
7f296bb3SBarry Smithoptions available for this solver are listed in
7f296bb3SBarry Smith{numref}`table_ntroptions`. For the best efficiency, function and
7f296bb3SBarry Smithgradient evaluations should be performed separately when using this
7f296bb3SBarry Smithalgorithm.
7f296bb3SBarry Smith
7f296bb3SBarry Smith> ```{eval-rst}
7f296bb3SBarry Smith> .. table:: Summary of ``ntr`` options
7f296bb3SBarry Smith>    :name: table_ntroptions
7f296bb3SBarry Smith>
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    | Name ``-tao_ntr_``        | Value          | Default          | Description          |
7f296bb3SBarry Smith>    +===========================+================+==================+======================+
7f296bb3SBarry Smith>    | ``ksp_type``              | nash, stcg     | stcg             | KSPType for          |
7f296bb3SBarry Smith>    |                           |                |                  | linear system        |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    | ``pc_type``               | none, jacobi   | lmvm             | PCType for linear    |
7f296bb3SBarry Smith>    |                           |                |                  | system               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``init_type``    | constant,      | interpolation    | Radius               |
7f296bb3SBarry Smith>    |                           | direction,     |                  | initialization       |
7f296bb3SBarry Smith>    |                           | interpolation  |                  | method               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``mu1_i``        | real           | 0.35             | :math:`\mu_1`        |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``mu2_i``        | real           | 0.50             | :math:`\mu_2`        |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma1_i``     | real           | 0.0625           | :math:`\gamma_1`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma2_i``     | real           | 0.50             | :math:`\gamma_2`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma3_i``     | real           | 2.00             | :math:`\gamma_3`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma4_i``     | real           | 5.00             | :math:`\gamma_4`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``theta_i``      | real           | 0.25             | :math:`\theta`       |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | init                 |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``update_type``  | step,          | step             | Radius               |
7f296bb3SBarry Smith>    |                           | reduction,     |                  | update method        |
7f296bb3SBarry Smith>    |                           | interpolation  |                  |                      |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``eta1``         | real           | :                | :math:`\eta_1`       |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``eta2``         | real           | 0.25             | :math:`\eta_2`       |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``eta3``         | real           | 0.50             | :math:`\eta_3`       |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``eta4``         | real           | 0.90             | :math:`\eta_4`       |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``alpha1``       | real           | 0.25             | :math:`\alpha_1`     |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``alpha2``       | real           | 0.50             | :math:`\alpha_2`     |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``alpha3``       | real           | 1.00             | :math:`\alpha_3`     |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``alpha4``       | real           | 2.00             | :math:`\alpha_4`     |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``alpha5``       | real           | 4.00             | :math:`\alpha_5`     |
7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``mu1``          | real           | 0.10             | :math:`\mu_1`        |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``mu2``          | real           | 0.50             | :math:`\mu_2`        |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma1``       | real           | 0.25             | :math:`\gamma_1`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma2``       | real           | 0.50             | :math:`\gamma_2`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma3``       | real           | 2.00             | :math:`\gamma_3`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``gamma4``       | real           | 4.00             | :math:`\gamma_4`     |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith>    |          ``theta``        | real           | 0.05             | :math:`\theta`       |
7f296bb3SBarry Smith>    |                           |                |                  | in                   |
7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
7f296bb3SBarry Smith>    |                           |                |                  | update               |
7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
7f296bb3SBarry Smith> ```
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe quadratic optimization problem is approximately solved by applying
7f296bb3SBarry Smiththe Nash or Steihaug-Toint conjugate gradient methods or the generalized
7f296bb3SBarry SmithLanczos method to the symmetric system of equations
7f296bb3SBarry Smith$H_k d = -g_k$. The method used to solve the system of equations
7f296bb3SBarry Smithis specified with the command line argument
7f296bb3SBarry Smith`-tao_ntr_ksp_type <nash,stcg,gltr>`; `stcg` is the default. See the
7f296bb3SBarry SmithPETSc manual for further information on changing the behavior of these
7f296bb3SBarry Smithlinear system solvers.
7f296bb3SBarry Smith
A good preconditioner reduces the number of iterations required to
compute the direction. For the Nash and Steihaug-Toint conjugate
gradient methods and the generalized Lanczos method, this preconditioner
must be symmetric and positive definite. The available options are to
use no preconditioner (`none`), the absolute value of the diagonal of
the Hessian matrix (`jacobi`), one of the other preconditioners provided
by the PETSc package (such as `icc` or `ilu`), or a limited-memory BFGS
approximation to the Hessian matrix (`lmvm`). The preconditioner is
specified by the command line argument
`-tao_ntr_pc_type <none,jacobi,icc,ilu,lmvm>`. The
default is the `lmvm` preconditioner. See the PETSc manual for further
information on changing the behavior of the preconditioners.
7f296bb3SBarry Smith
The method for computing an initial trust-region radius is set with the
command line argument
`-tao_ntr_init_type <constant,direction,interpolation>`;
7f296bb3SBarry Smith`interpolation`, which chooses an initial value based on the
7f296bb3SBarry Smithinterpolation scheme found in {cite}`cgt`, is the default.
7f296bb3SBarry SmithThis scheme performs a number of function and gradient evaluations to
7f296bb3SBarry Smithdetermine a radius such that the reduction predicted by the quadratic
7f296bb3SBarry Smithmodel along the gradient direction coincides with the actual reduction
7f296bb3SBarry Smithin the nonlinear function. The iterate obtaining the best objective
7f296bb3SBarry Smithfunction value is used as the starting point for the main trust-region
7f296bb3SBarry Smithalgorithm. The `constant` method initializes the trust-region radius
7f296bb3SBarry Smithby using the value specified with the `-tao_trust0 <real>` command
7f296bb3SBarry Smithline argument, where the default value is 100. The `direction`
7f296bb3SBarry Smithtechnique solves the first quadratic optimization problem by using a
7f296bb3SBarry Smithstandard conjugate gradient method and initializes the trust region to
7f296bb3SBarry Smith$\|s_0\|$.
7f296bb3SBarry Smith
The method for updating the trust-region radius is set with the command
line argument `-tao_ntr_update_type <reduction,interpolation>`;
`reduction` is the default. The `reduction` method computes the
ratio of the actual reduction in the objective function to the reduction
predicted by the quadratic model for the full step,
$\kappa_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(x_k) - q_k(x_k + d_k)}$,
where $q_k$ is the quadratic model. The radius is then updated as
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
7f296bb3SBarry Smith\alpha_1 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in (-\infty, \eta_1) \\
7f296bb3SBarry Smith\alpha_2 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in [\eta_1, \eta_2) \\
7f296bb3SBarry Smith\alpha_3 \Delta_k & \text{if } \kappa_k \in [\eta_2, \eta_3) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_4 \|d_k\|) & \text{if } \kappa_k \in [\eta_3, \eta_4) \\
7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_5 \|d_k\|) & \text{if } \kappa_k \in [\eta_4, \infty),
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith\right.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere
7f296bb3SBarry Smith$0 < \alpha_1 < \alpha_2 < \alpha_3 = 1 < \alpha_4 < \alpha_5$ and
7f296bb3SBarry Smith$0 < \eta_1 < \eta_2 < \eta_3 < \eta_4$ are constants. The
7f296bb3SBarry Smith`interpolation` method uses the same interpolation mechanism as in the
7f296bb3SBarry Smithinitialization to compute a new value for the trust-region radius.
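
The `reduction` rule can be sketched in a few lines of plain C. This is only an illustration of the piecewise formula above, not the TAO source; the names `ReductionParams`, `reduction_ratio`, and `update_radius` are hypothetical, and the caller is assumed to supply constants that satisfy the stated orderings.

```c
/* Constants of the reduction update (hypothetical container, not a TAO type).
   Expected: 0 < eta1 < eta2 < eta3 < eta4 and
             0 < alpha1 < alpha2 < alpha3 = 1 < alpha4 < alpha5. */
typedef struct {
  double eta1, eta2, eta3, eta4;
  double alpha1, alpha2, alpha3, alpha4, alpha5;
} ReductionParams;

static double min2(double a, double b) { return a < b ? a : b; }
static double max2(double a, double b) { return a > b ? a : b; }

/* kappa = actual reduction / predicted reduction.
   The caller must ensure the predicted reduction q_old - q_new is nonzero. */
double reduction_ratio(double f_old, double f_new, double q_old, double q_new)
{
  return (f_old - f_new) / (q_old - q_new);
}

/* Return the next radius given the current radius delta, the step norm dnorm,
   and the ratio kappa, following the five cases of the update formula. */
double update_radius(const ReductionParams *p, double delta, double dnorm, double kappa)
{
  if (kappa < p->eta1) return p->alpha1 * min2(delta, dnorm);
  if (kappa < p->eta2) return p->alpha2 * min2(delta, dnorm);
  if (kappa < p->eta3) return p->alpha3 * delta;
  if (kappa < p->eta4) return max2(delta, p->alpha4 * dnorm);
  return max2(delta, p->alpha5 * dnorm);
}
```

With `alpha3 = 1` the middle case leaves the radius unchanged, so the radius shrinks only when the model predicts the reduction poorly and grows only when the agreement is good and the step is long enough.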
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
7f296bb3SBarry Smiththe Bounded Newton Trust Region (BNTR) algorithm that can solve both
7f296bb3SBarry Smithbound constrained and unconstrained problems.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Newton Trust Region with Line Search (NTL)
7f296bb3SBarry Smith
7f296bb3SBarry SmithNTL safeguards the trust-region globalization such that a line search
7f296bb3SBarry Smithis used in the event that the step is initially rejected by the
7f296bb3SBarry Smithpredicted versus actual decrease comparison. If the line search fails to
7f296bb3SBarry Smithfind a viable step length for the Newton step, it falls back onto a
7f296bb3SBarry Smithscaled gradient or a gradient descent step. The trust radius is then
7f296bb3SBarry Smithmodified based on the line search step length.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
7f296bb3SBarry Smiththe Bounded Newton Trust Region with Line Search (BNTL) algorithm that
7f296bb3SBarry Smithcan solve both bound constrained and unconstrained problems.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Limited-Memory Variable-Metric Method (LMVM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe limited-memory, variable-metric method (LMVM) computes a positive definite
7f296bb3SBarry Smithapproximation to the Hessian matrix from a limited number of previous
7f296bb3SBarry Smithiterates and gradient evaluations. A direction is then obtained by
7f296bb3SBarry Smithsolving the system of equations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithH_k d_k = -\nabla f(x_k),
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $H_k$ is the Hessian approximation obtained by using the
7f296bb3SBarry SmithBFGS update formula. The inverse of $H_k$ can readily be applied
7f296bb3SBarry Smithto obtain the direction $d_k$. Having obtained the direction, a
7f296bb3SBarry SmithMoré-Thuente line search is applied to compute a step length,
7f296bb3SBarry Smith$\tau_k$, that approximately solves the one-dimensional
7f296bb3SBarry Smithoptimization problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\min_\tau f(x_k + \tau d_k).
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe current iterate and Hessian approximation are updated, and the
7f296bb3SBarry Smithprocess is repeated until the method converges. This algorithm is the
7f296bb3SBarry Smithdefault unconstrained minimization solver and can be selected by using
7f296bb3SBarry Smiththe TAO solver `tao_lmvm`. For best efficiency, function and gradient
7f296bb3SBarry Smithevaluations should be performed simultaneously when using this
7f296bb3SBarry Smithalgorithm.
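
One standard way to apply the inverse of a limited-memory BFGS approximation is the textbook two-loop recursion. The sketch below is plain C written for illustration and is not the `MATLMVM` implementation; the function name `lbfgs_direction`, the storage layout, the `LBFGS_MAX_PAIRS` limit, and the scalar initial scaling taken from the newest pair are all assumptions of this example.

```c
#include <stddef.h>

#define LBFGS_MAX_PAIRS 32 /* illustrative limit on stored pairs */

static double dot(const double *a, const double *b, int n)
{
  double s = 0.0;
  for (int i = 0; i < n; i++) s += a[i] * b[i];
  return s;
}

/*
 * Compute d = -H^{-1} g with the two-loop recursion.
 * S and Y hold m pairs row by row (oldest first), each of length n:
 *   s_j = x_{j+1} - x_j,  y_j = grad f(x_{j+1}) - grad f(x_j).
 * Requires 1 <= m <= LBFGS_MAX_PAIRS and y_j . s_j > 0 for every pair.
 */
void lbfgs_direction(int n, int m, const double *S, const double *Y, const double *g, double *d)
{
  double alpha[LBFGS_MAX_PAIRS];

  for (int i = 0; i < n; i++) d[i] = g[i];

  /* First loop: newest pair to oldest. */
  for (int j = m - 1; j >= 0; j--) {
    const double *s = S + (size_t)j * n, *y = Y + (size_t)j * n;
    alpha[j] = dot(s, d, n) / dot(y, s, n);
    for (int i = 0; i < n; i++) d[i] -= alpha[j] * y[i];
  }

  /* Scalar initial inverse Hessian gamma * I built from the newest pair. */
  {
    const double *s = S + (size_t)(m - 1) * n, *y = Y + (size_t)(m - 1) * n;
    double gamma = dot(s, y, n) / dot(y, y, n);
    for (int i = 0; i < n; i++) d[i] *= gamma;
  }

  /* Second loop: oldest pair to newest. */
  for (int j = 0; j < m; j++) {
    const double *s = S + (size_t)j * n, *y = Y + (size_t)j * n;
    double beta = dot(y, d, n) / dot(y, s, n);
    for (int i = 0; i < n; i++) d[i] += (alpha[j] - beta) * s[i];
  }

  /* Flip the sign to obtain the descent direction. */
  for (int i = 0; i < n; i++) d[i] = -d[i];
}
```

The recursion never forms a matrix: its cost is a handful of dot products and vector updates per stored pair, which is why the direction can be obtained cheaply from a limited history.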
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe primary factors determining the behavior of this algorithm are the
7f296bb3SBarry Smithtype of Hessian approximation used, the number of vectors stored for the
7f296bb3SBarry Smithapproximation and the initialization/scaling of the approximation. These
7f296bb3SBarry Smithoptions can be configured using the `-tao_lmvm_mat_lmvm` prefix. For
7f296bb3SBarry Smithfurther detail, we refer the reader to the `MATLMVM` matrix type
7f296bb3SBarry Smithdefinitions in the PETSc Manual.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe LMVM algorithm also allows the user to define a custom initial
7f296bb3SBarry SmithHessian matrix $H_{0,k}$ through the interface function
7f296bb3SBarry Smith`TaoLMVMSetH0()`. This user-provided initialization overrides any
7f296bb3SBarry Smithother scalar or diagonal initialization inherent to the LMVM
7f296bb3SBarry Smithapproximation. The provided $H_{0,k}$ must be a PETSc `Mat` type
7f296bb3SBarry Smithobject that represents a positive-definite matrix. The approximation
7f296bb3SBarry Smithprefers `MatSolve()` if the provided matrix has `MATOP_SOLVE`
7f296bb3SBarry Smithimplemented. Otherwise, `MatMult()` is used in a KSP solve to perform
7f296bb3SBarry Smiththe inversion of the user-provided initial Hessian.
7f296bb3SBarry Smith
In applications where `TaoSolve()` on the LMVM algorithm is repeatedly
called to solve similar or related problems, the `-tao_lmvm_recycle` flag
can be used to prevent resetting the LMVM approximation between
7f296bb3SBarry Smithsubsequent solutions. This recycling also avoids one extra function and
7f296bb3SBarry Smithgradient evaluation, instead re-using the values already computed at the
7f296bb3SBarry Smithend of the previous solution.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
7f296bb3SBarry Smiththe Bounded Quasi-Newton Line Search (BQNLS) algorithm that can solve
7f296bb3SBarry Smithboth bound constrained and unconstrained problems.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Nonlinear Conjugate Gradient Method (CG)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe nonlinear conjugate gradient method can be viewed as an extension of
7f296bb3SBarry Smiththe conjugate gradient method for solving symmetric, positive-definite
7f296bb3SBarry Smithlinear systems of equations. This algorithm requires only function and
7f296bb3SBarry Smithgradient evaluations as well as a line search. The TAO implementation
7f296bb3SBarry Smithuses a Moré-Thuente line search to obtain the step length. The nonlinear
7f296bb3SBarry Smithconjugate gradient method can be selected by using the TAO solver
7f296bb3SBarry Smith`tao_cg`. For the best efficiency, function and gradient evaluations
7f296bb3SBarry Smithshould be performed simultaneously when using this algorithm.
7f296bb3SBarry Smith
7f296bb3SBarry SmithFive variations are currently supported by the TAO implementation: the
Fletcher-Reeves method, the Polak-Ribière method, the Polak-Ribière-Plus
7f296bb3SBarry Smithmethod {cite}`nocedal2006numerical`, the Hestenes-Stiefel method, and the
7f296bb3SBarry SmithDai-Yuan method. These conjugate gradient methods can be specified by
7f296bb3SBarry Smithusing the command line argument `-tao_cg_type <fr,pr,prp,hs,dy>`,
7f296bb3SBarry Smithrespectively. The default value is `prp`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe conjugate gradient method incorporates automatic restarts when
7f296bb3SBarry Smithsuccessive gradients are not sufficiently orthogonal. TAO measures the
7f296bb3SBarry Smithorthogonality by dividing the inner product of the gradient at the
7f296bb3SBarry Smithcurrent point and the gradient at the previous point by the square of
7f296bb3SBarry Smiththe Euclidean norm of the gradient at the current point. When the
7f296bb3SBarry Smithabsolute value of this ratio is greater than $\eta$, the algorithm
7f296bb3SBarry Smithrestarts using the gradient direction. The parameter $\eta$ can be
7f296bb3SBarry Smithset by using the command line argument `-tao_cg_eta <real>`; 0.1 is
7f296bb3SBarry Smiththe default value.
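
The restart test described above is a single ratio of inner products. The following plain C sketch shows it next to the textbook Polak-Ribière-Plus coefficient; it is an illustration rather than the TAO code, and the names `cg_should_restart` and `cg_beta_prp` are hypothetical.

```c
static double dot(const double *a, const double *b, int n)
{
  double s = 0.0;
  for (int i = 0; i < n; i++) s += a[i] * b[i];
  return s;
}

/* Return 1 when |g_new . g_old| / ||g_new||^2 > eta, i.e. when successive
   gradients are not sufficiently orthogonal; return 0 otherwise. */
int cg_should_restart(const double *g_new, const double *g_old, int n, double eta)
{
  double gg = dot(g_new, g_new, n);
  double ratio;

  if (gg == 0.0) return 0; /* zero gradient: the ratio is undefined, so do not restart */
  ratio = dot(g_new, g_old, n) / gg;
  if (ratio < 0.0) ratio = -ratio;
  return ratio > eta;
}

/* Textbook Polak-Ribiere-Plus coefficient:
   beta = max(0, g_new . (g_new - g_old) / ||g_old||^2).
   The caller must ensure g_old is nonzero. */
double cg_beta_prp(const double *g_new, const double *g_old, int n)
{
  double beta = (dot(g_new, g_new, n) - dot(g_new, g_old, n)) / dot(g_old, g_old, n);
  return beta > 0.0 ? beta : 0.0;
}
```

When the test fires, the next search direction is simply the negative gradient, which corresponds to using a coefficient of zero for that iteration.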
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
7f296bb3SBarry Smiththe Bounded Nonlinear Conjugate Gradient (BNCG) algorithm that can solve
7f296bb3SBarry Smithboth bound constrained and unconstrained problems.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Nelder-Mead Simplex Method (NM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Nelder-Mead algorithm {cite}`nelder.mead:simplex` is a
7f296bb3SBarry Smithdirect search method for finding a local minimum of a function
7f296bb3SBarry Smith$f(x)$. This algorithm does not require any gradient or Hessian
7f296bb3SBarry Smithinformation of $f$ and therefore has some expected advantages and
7f296bb3SBarry Smithdisadvantages compared to the other TAO solvers. The obvious advantage
7f296bb3SBarry Smithis that it is easier to write an application when no derivatives need to
7f296bb3SBarry Smithbe calculated. The downside is that this algorithm can be slow to
7f296bb3SBarry Smithconverge or can even stagnate, and it performs poorly for large numbers
7f296bb3SBarry Smithof variables.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis solver keeps a set of $N+1$ sorted vectors
$\{x_1,x_2,\ldots,x_{N+1}\}$ and their corresponding objective
7f296bb3SBarry Smithfunction values $f_1 \leq f_2 \leq \ldots \leq f_{N+1}$. At each
7f296bb3SBarry Smithiteration, $x_{N+1}$ is removed from the set and replaced with
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithx(\mu) = (1+\mu) \frac{1}{N} \sum_{i=1}^N x_i - \mu x_{N+1},
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $\mu$ can be one of
$\{\mu_0,2\mu_0,\frac{1}{2}\mu_0,-\frac{1}{2}\mu_0\}$ depending on
7f296bb3SBarry Smiththe values of each possible $f(x(\mu))$.
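
The candidate point $x(\mu)$ is a combination of the centroid of the $N$ best vectors and the worst vector. A minimal C sketch of that formula follows; the function name `nm_trial_point` and the row-by-row storage of the simplex are assumptions made for this example, not part of TAO.

```c
#include <stddef.h>

/*
 * X stores the N+1 simplex vertices row by row, each of dimension N,
 * sorted so that row N holds the worst point x_{N+1}.
 * Writes x(mu) = (1 + mu) * centroid(x_1..x_N) - mu * x_{N+1} into out.
 */
void nm_trial_point(int N, const double *X, double mu, double *out)
{
  const double *worst = X + (size_t)N * N;

  for (int j = 0; j < N; j++) {
    double centroid = 0.0;
    /* average coordinate j over the N best vertices */
    for (int i = 0; i < N; i++) centroid += X[(size_t)i * N + j];
    centroid /= N;
    out[j] = (1.0 + mu) * centroid - mu * worst[j];
  }
}
```

Calling the function with $\mu_0$, $2\mu_0$, $\frac{1}{2}\mu_0$, and $-\frac{1}{2}\mu_0$ yields the four candidate points; positive values of $\mu$ move away from the worst vertex through the centroid, and the negative value stays on the same side as the worst vertex.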
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe algorithm terminates when the residual $f_{N+1} - f_1$ becomes
7f296bb3SBarry Smithsufficiently small. Because of the way new vectors can be added to the
7f296bb3SBarry Smithsorted set, the minimum function value and/or the residual may not be
7f296bb3SBarry Smithimpacted at each iteration.
7f296bb3SBarry Smith
7f296bb3SBarry SmithTwo options can be set specifically for the Nelder-Mead algorithm:
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-tao_nm_lambda <value>`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: sets the initial set of vectors ($x_0$ plus `value` in each
7f296bb3SBarry Smith  coordinate direction); the default value is $1$.
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-tao_nm_mu <value>`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: sets the value of $\mu_0$; the default is $\mu_0=1$.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bound)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Bound-Constrained Optimization
7f296bb3SBarry Smith
7f296bb3SBarry SmithBound-constrained optimization algorithms solve optimization problems of
7f296bb3SBarry Smiththe form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll} \displaystyle
7f296bb3SBarry Smith\min_{x} & f(x) \\
7f296bb3SBarry Smith\text{subject to} & l \leq x \leq u.
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThese solvers use the bounds on the variables as well as objective
7f296bb3SBarry Smithfunction, gradient, and possibly Hessian information.
7f296bb3SBarry Smith
7f296bb3SBarry SmithFor any unbounded variables, the bound value for the associated index
7f296bb3SBarry Smithcan be set to `PETSC_INFINITY` for the upper bound and
7f296bb3SBarry Smith`PETSC_NINFINITY` for the lower bound. If all bounds are set to
7f296bb3SBarry Smithinfinity, then the bounded algorithms are equivalent to their
7f296bb3SBarry Smithunconstrained counterparts.
7f296bb3SBarry Smith
7f296bb3SBarry SmithBefore introducing specific methods, we will first define two projection
7f296bb3SBarry Smithoperations used by all bound constrained algorithms.
7f296bb3SBarry Smith
7f296bb3SBarry Smith- Gradient projection:
7f296bb3SBarry Smith
7f296bb3SBarry Smith  $$
7f296bb3SBarry Smith  \mathfrak{P}(g) = \left\{\begin{array}{ll}
  0 & \text{if} \; (x_i \leq l_i \land g_i > 0) \lor (x_i \geq u_i \land g_i < 0) \\
7f296bb3SBarry Smith  g_i & \text{otherwise}
7f296bb3SBarry Smith  \end{array}
7f296bb3SBarry Smith  \right.
7f296bb3SBarry Smith  $$
7f296bb3SBarry Smith
7f296bb3SBarry Smith- Bound projection:
7f296bb3SBarry Smith
7f296bb3SBarry Smith  $$
7f296bb3SBarry Smith  \mathfrak{B}(x) = \left\{\begin{array}{ll}
7f296bb3SBarry Smith  l_i & \text{if} \; x_i < l_i \\
7f296bb3SBarry Smith  u_i & \text{if} \; x_i > u_i \\
7f296bb3SBarry Smith  x_i & \text{otherwise}
7f296bb3SBarry Smith  \end{array}
7f296bb3SBarry Smith  \right.
7f296bb3SBarry Smith  $$
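
Both projections act on each component independently, so they reduce to short loops. The sketch below uses plain C arrays instead of PETSc `Vec` objects, and the function names `bound_project` and `gradient_project` are hypothetical.

```c
/* Bound projection: clip every component of x into [l_i, u_i]. */
void bound_project(int n, const double *x, const double *l, const double *u, double *out)
{
  for (int i = 0; i < n; i++) {
    if (x[i] < l[i]) out[i] = l[i];
    else if (x[i] > u[i]) out[i] = u[i];
    else out[i] = x[i];
  }
}

/* Gradient projection: zero the components of g for which a descent step
   would push x_i further outside (or along) an active bound. */
void gradient_project(int n, const double *x, const double *g, const double *l, const double *u, double *out)
{
  for (int i = 0; i < n; i++) {
    int at_lower = (x[i] <= l[i]) && (g[i] > 0.0);
    int at_upper = (x[i] >= u[i]) && (g[i] < 0.0);
    out[i] = (at_lower || at_upper) ? 0.0 : g[i];
  }
}
```

A variable sitting on its lower bound with a positive gradient component would leave the feasible region under a descent step, which is why that component is zeroed; the same reasoning applies with the signs reversed at the upper bound.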
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bnk)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bounded Newton-Krylov Methods
7f296bb3SBarry Smith
TAO features three bounded Newton-Krylov (BNK) algorithms,
separated by their globalization methods: projected line search (BNLS),
trust region (BNTR), and trust region with a projected line search
fall-back (BNTL). They are available via the TAO solvers `TAOBNLS`,
`TAOBNTR`, and `TAOBNTL`, respectively, or the
`-tao_type <bnls,bntr,bntl>` option.
7f296bb3SBarry Smith
The BNK class of methods uses an active-set approach to solve the
symmetric system of equations,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithH_k p_k = -g_k,
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithonly for inactive variables in the interior of the bounds. The
7f296bb3SBarry Smithactive-set estimation is based on Bertsekas
7f296bb3SBarry Smith{cite}`bertsekas:projected` with the following variable
7f296bb3SBarry Smithindex categories:
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{rlll} \displaystyle
7f296bb3SBarry Smith\text{lower bounded}: & \mathcal{L}(x) & = & \{ i \; : \; x_i \leq l_i + \epsilon \; \land \; g(x)_i > 0 \}, \\
\text{upper bounded}: & \mathcal{U}(x) & = & \{ i \; : \; x_i \geq u_i - \epsilon \; \land \; g(x)_i < 0 \}, \\
7f296bb3SBarry Smith\text{fixed}: & \mathcal{F}(x) & = & \{ i \; : \; l_i = u_i \}, \\
7f296bb3SBarry Smith\text{active-set}: & \mathcal{A}(x) & = & \{ \mathcal{L}(x) \; \bigcup \; \mathcal{U}(x) \; \bigcup \; \mathcal{F}(x) \}, \\
7f296bb3SBarry Smith\text{inactive-set}: & \mathcal{I}(x) & = & \{ 1,2,\ldots,n \} \; \backslash \; \mathcal{A}(x).
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
At each iteration, the bound tolerance is estimated as
$\epsilon_{k+1} = \min(\epsilon_k, \|w_k\|_2)$ with
$w_k = x_k - \mathfrak{B}(x_k - \beta D_k g_k)$, where the
diagonal matrix $D_k$ is an approximation of the Hessian inverse
$H_k^{-1}$. The initial bound tolerance $\epsilon_0$ and the
step length $\beta$ have default values of $0.001$ and can
be adjusted using the `-tao_bnk_as_tol` and `-tao_bnk_as_step` flags,
respectively. The active-set estimation can be disabled using the option
7f296bb3SBarry Smith`-tao_bnk_as_type none`, in which case the algorithm simply uses the
7f296bb3SBarry Smithcurrent iterate with no bound tolerances to determine which variables
7f296bb3SBarry Smithare actively bounded and which are free.
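
The index categories can be illustrated with a per-variable classification routine. This is a sketch in plain C, not the TAO implementation: the names `VarStatus` and `classify_variable` are hypothetical, the fixed test is checked first so that each index receives exactly one label, and the upper-bound test is written as $x_i \geq u_i - \epsilon$, the mirror image of the lower-bound test.

```c
typedef enum { VAR_INACTIVE = 0, VAR_LOWER, VAR_UPPER, VAR_FIXED } VarStatus;

/* Classify one variable from its value x, gradient component g,
   bounds l <= u, and the current bound tolerance eps >= 0. */
VarStatus classify_variable(double x, double g, double l, double u, double eps)
{
  if (l == u) return VAR_FIXED;                   /* fixed: bounds coincide */
  if (x <= l + eps && g > 0.0) return VAR_LOWER;  /* near lower bound, descent pushes down */
  if (x >= u - eps && g < 0.0) return VAR_UPPER;  /* near upper bound, descent pushes up */
  return VAR_INACTIVE;                            /* free variable: part of the reduced system */
}
```

Only the indices labeled `VAR_INACTIVE` would enter the reduced Newton system; passing `eps = 0` corresponds to classifying variables from the current iterate with no bound tolerance.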
7f296bb3SBarry Smith
BNK algorithms invert the reduced Hessian using a Krylov iterative
method. Trust-region conjugate gradient methods (`KSPNASH`,
`KSPSTCG`, and `KSPGLTR`) are required for the BNTR and BNTL
algorithms, and recommended for the BNLS algorithm. The preconditioner
type can be changed using the
`-tao_bnk_pc_type <none,ilu,icc,jacobi,lmvm>` option. The `lmvm` option,
which is also the default, preconditions the Krylov solution with a
`MATLMVM` matrix. The remaining supported preconditioner types are
default PETSc types. If Jacobi is selected, the diagonal values are
safeguarded to be positive. The `icc` and `ilu` options produce good
results for problems with dense Hessians. The LMVM and Jacobi
preconditioners are also used as the approximate inverse-Hessian in the
active-set estimation. If neither is available, or if the Hessian
matrix does not have `MATOP_GET_DIAGONAL` defined, then the active-set
estimation falls back onto using an identity matrix in place of
$D_k$ (this is equivalent to estimating the active-set using a
gradient descent step).
7f296bb3SBarry Smith
7f296bb3SBarry SmithA special option is available to *accelerate* the convergence of the BNK
7f296bb3SBarry Smithalgorithms by taking a finite number of BNCG iterations at each Newton
7f296bb3SBarry Smithiteration. By default, the number of BNCG iterations is set to zero and
7f296bb3SBarry Smiththe algorithms do not take any BNCG steps. This can be changed using the
7f296bb3SBarry Smithoption flag `-tao_bnk_max_cg_its <i>`. While this reduces the number
7f296bb3SBarry Smithof Newton iterations, in practice it simply trades off the Hessian
7f296bb3SBarry Smithevaluations in the BNK solver for more function and gradient evaluations
7f296bb3SBarry Smithin the BNCG solver. However, it may be useful for certain types of
7f296bb3SBarry Smithproblems where the Hessian evaluation is disproportionately more
7f296bb3SBarry Smithexpensive than the objective function or its gradient.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bnls)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Bounded Newton Line Search (BNLS)
7f296bb3SBarry Smith
BNLS safeguards the Newton step by falling back onto BFGS, scaled
gradient, or gradient steps based on descent direction verifications.
7f296bb3SBarry SmithFor problems with indefinite Hessian matrices, the step direction is
7f296bb3SBarry Smithcalculated using a perturbed system of equations,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith(H_k + \rho_k I)p_k = -g_k,
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $\rho_k$ is a dynamically adjusted positive constant. The
7f296bb3SBarry Smithstep is globalized using a projected Moré-Thuente line search. If a
7f296bb3SBarry Smithtrust-region conjugate gradient method is used for the Hessian
7f296bb3SBarry Smithinversion, the trust radius is modified based on the line search step
7f296bb3SBarry Smithlength.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bntr)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Bounded Newton Trust Region (BNTR)
7f296bb3SBarry Smith
7f296bb3SBarry SmithBNTR globalizes the Newton step using a trust region method based on the
7f296bb3SBarry Smithpredicted versus actual reduction in the cost function. The trust radius
7f296bb3SBarry Smithis increased only if the accepted step is at the trust region boundary.
7f296bb3SBarry SmithThe reduction check features a safeguard for numerical values below
7f296bb3SBarry Smithmachine epsilon, scaled by the latest function value, where the full
7f296bb3SBarry SmithNewton step is accepted without modification.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bntl)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Bounded Newton Trust Region with Line Search (BNTL)
7f296bb3SBarry Smith
7f296bb3SBarry SmithBNTL safeguards the trust-region globalization such that a line search
7f296bb3SBarry Smithis used in the event that the step is initially rejected by the
7f296bb3SBarry Smithpredicted versus actual decrease comparison. If the line search fails to
7f296bb3SBarry Smithfind a viable step length for the Newton step, it falls back onto a
7f296bb3SBarry Smithscaled gradient or a gradient descent step. The trust radius is then
7f296bb3SBarry Smithmodified based on the line search step length.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bqnls)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bounded Quasi-Newton Line Search (BQNLS)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe BQNLS algorithm uses the BNLS infrastructure, but replaces the step
7f296bb3SBarry Smithcalculation with a direct inverse application of the approximate Hessian
7f296bb3SBarry Smithbased on quasi-Newton update formulas. No Krylov solver is used in the
7f296bb3SBarry Smithsolution, and therefore the quasi-Newton method chosen must guarantee a
positive-definite Hessian approximation. This algorithm is available via
`-tao_type bqnls`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bqnk)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bounded Quasi-Newton-Krylov
7f296bb3SBarry Smith
7f296bb3SBarry SmithBQNK algorithms use the BNK infrastructure, but replace the exact
Hessian with a quasi-Newton approximation. The matrix-free forward
product operation based on quasi-Newton update formulas is used in
conjunction with Krylov solvers to compute step directions. The
7f296bb3SBarry Smithquasi-Newton inverse application is used to precondition the Krylov
7f296bb3SBarry Smithsolution, and typically helps converge to a step direction in
7f296bb3SBarry Smith$\mathcal{O}(10)$ iterations. This approach is most useful with
7f296bb3SBarry Smithquasi-Newton update types such as Symmetric Rank-1 that cannot strictly
7f296bb3SBarry Smithguarantee positive-definiteness. The BNLS framework with Hessian
7f296bb3SBarry Smithshifting, or the BNTR framework with trust region safeguards, can
7f296bb3SBarry Smithsuccessfully compensate for the Hessian approximation becoming
7f296bb3SBarry Smithindefinite.
7f296bb3SBarry Smith
Similar to the full Newton-Krylov counterpart, BQNK algorithms come in
three forms separated by the globalization technique: line search
(BQNKLS), trust region (BQNKTR), and trust region with line search
fall-back (BQNKTL). These algorithms are available via
`-tao_type <bqnkls, bqnktr, bqnktl>`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bncg)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bounded Nonlinear Conjugate Gradient (BNCG)
7f296bb3SBarry Smith
7f296bb3SBarry SmithBNCG extends the unconstrained nonlinear conjugate gradient algorithm to
7f296bb3SBarry Smithbound constraints via gradient projections and a bounded Moré-Thuente
7f296bb3SBarry Smithline search.
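
To make the idea of a gradient projection concrete, here is a small self-contained C sketch. It shows a common textbook form of the projection onto a box and of the projected gradient; the function names are hypothetical, and the sketch does not reproduce TAO's active-set logic or tolerances.

```c
/* Project x onto the box [l, u] componentwise (in place). */
void project_onto_bounds(int n, const double *l, const double *u, double *x)
{
  for (int i = 0; i < n; i++) {
    if (x[i] < l[i]) x[i] = l[i];
    else if (x[i] > u[i]) x[i] = u[i];
  }
}

/* Zero the gradient components whose descent step -g[i] would push a
   variable that already sits on a bound outside the box. */
void project_gradient(int n, const double *l, const double *u, const double *x, const double *g, double *pg)
{
  for (int i = 0; i < n; i++) {
    int at_lower = (x[i] <= l[i]) && (g[i] > 0.0);
    int at_upper = (x[i] >= u[i]) && (g[i] < 0.0);

    pg[i] = (at_lower || at_upper) ? 0.0 : g[i];
  }
}
```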
7f296bb3SBarry Smith
Like its unconstrained counterpart, BNCG offers gradient descent and a
variety of CG updates: Fletcher-Reeves, Polak-Ribière,
Polak-Ribière-Plus, Hestenes-Stiefel, Dai-Yuan, Hager-Zhang, Dai-Kou,
Kou-Dai, and the Self-Scaling Memoryless (SSML) BFGS, DFP, and Broyden
methods. These methods can be specified by using the command line
argument
`-tao_bncg_type <gd,fr,pr,prp,hs,dy,hz,dk,kd,ssml_bfgs,ssml_dfp,ssml_brdn>`,
respectively. The default value is `ssml_bfgs`. Scalar
preconditioning is available for these methods and is controlled by the flag
`-tao_bncg_alpha`. To disable rescaling, use $\alpha = -1.0$;
otherwise $\alpha \in [0, 1]$. BNCG is available via the TAO
solver `TAOBNCG` or the `-tao_type bncg` flag.
7f296bb3SBarry Smith
Some individual methods also contain their own parameters. The
Hager-Zhang and Dai-Kou methods have a parameter that determines the
minimum amount of contribution the previous search direction gives to
the next search direction. The flags are `-tao_bncg_hz_eta` and
`-tao_bncg_dk_eta`, and by default are set to $0.4$ and
$0.5$ respectively. The Kou-Dai method has multiple parameters.
`-tao_bncg_zeta` serves the same purpose as the previous two and is set to
$0.1$ by default. There is also a parameter to scale the
7f296bb3SBarry Smithcontribution of $y_k \equiv \nabla f(x_k) - \nabla f(x_{k-1})$ in
7f296bb3SBarry Smiththe search direction update. It is controlled by `-tao_bncg_xi`, and
7f296bb3SBarry Smithis equal to $1.0$ by default. There are also times where we want
7f296bb3SBarry Smithto maximize the descent as measured by $\nabla f(x_k)^T d_k$, and
7f296bb3SBarry Smiththat may be done by using a negative value of $\xi$; this achieves
7f296bb3SBarry Smithbetter performance when not using the diagonal preconditioner described
7f296bb3SBarry Smithnext. This is enabled by default, and is controlled by
`-tao_bncg_neg_xi`. Finally, the Broyden method has its convex
combination parameter, set with `-tao_bncg_theta`. It is 1.0
by default, which corresponds to the BFGS method. One can also
individually tweak the BFGS and DFP contributions using the
multiplicative constants `-tao_bncg_scale`; both are set to $1$
by default.
7f296bb3SBarry Smith
7f296bb3SBarry SmithAll methods can be scaled using the parameter `-tao_bncg_alpha`, which
7f296bb3SBarry Smithcontinuously varies in $[0, 1]$. The default value is set
7f296bb3SBarry Smithdepending on the method from initial testing.
7f296bb3SBarry Smith
7f296bb3SBarry SmithBNCG also offers a special type of method scaling. It employs Broyden
7f296bb3SBarry Smithdiagonal scaling as an option for its CG methods, turned on with the
7f296bb3SBarry Smithflag `-tao_bncg_diag_scaling`. Formulations for both the forward
7f296bb3SBarry Smith(regular) and inverse Broyden methods are developed, controlled by the
7f296bb3SBarry Smithflag `-tao_bncg_mat_lmvm_forward`. It is set to True by default.
7f296bb3SBarry SmithWhether one uses the forward or inverse formulations depends on the
7f296bb3SBarry Smithmethod being used. For example, in our preliminary computations, the
7f296bb3SBarry Smithforward formulation works better for the SSML_BFGS method, but the
7f296bb3SBarry Smithinverse formulation works better for the Hestenes-Stiefel method. The
7f296bb3SBarry Smithconvex combination parameter for the Broyden scaling is controlled by
7f296bb3SBarry Smith`-tao_bncg_mat_lmvm_theta`, and is 0 by default. We also employ
7f296bb3SBarry Smithrescaling of the Broyden diagonal, which aids the linesearch immensely.
7f296bb3SBarry SmithThe rescaling parameter is controlled by `-tao_bncg_mat_lmvm_alpha`,
7f296bb3SBarry Smithand should be $\in [0, 1]$. One can disable rescaling of the
7f296bb3SBarry SmithBroyden diagonal entirely by setting
7f296bb3SBarry Smith`-tao_bncg_mat_lmvm_sigma_hist 0`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithOne can also supply their own preconditioner, serving as a Hessian
7f296bb3SBarry Smithinitialization to the above diagonal scaling. The appropriate user
7f296bb3SBarry Smithfunction in the code is `TaoBNCGSetH0(tao, H0)` where `H0` is the
7f296bb3SBarry Smithuser-defined `Mat` object that serves as a preconditioner. For an
7f296bb3SBarry Smithexample of similar usage, see `tao/tutorials/ex3.c`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe active set estimation uses the Bertsekas-based method described in
7f296bb3SBarry Smith{any}`sec_tao_bnk`, which can be deactivated using
7f296bb3SBarry Smith`-tao_bncg_as_type none`, in which case the algorithm will use the
7f296bb3SBarry Smithcurrent iterate to determine the bounded variables with no tolerances
7f296bb3SBarry Smithand no look-ahead step. As in the BNK algorithm, the initial bound
7f296bb3SBarry Smithtolerance and estimator step length used in the Bertsekas method can be
7f296bb3SBarry Smithset via `-tao_bncg_as_tol` and `-tao_bncg_as_step`, respectively.
7f296bb3SBarry Smith
7f296bb3SBarry SmithIn addition to automatic scaled gradient descent restarts under certain
7f296bb3SBarry Smithlocal curvature conditions, we also employ restarts based on a check on
7f296bb3SBarry Smithdescent direction such that
7f296bb3SBarry Smith$\nabla f(x_k)^T d_k \in [-10^{11}, -10^{-9}]$. Furthermore, we
7f296bb3SBarry Smithallow for a variety of alternative restart strategies, all disabled by
7f296bb3SBarry Smithdefault. The `-tao_bncg_unscaled_restart` flag allows one to disable
7f296bb3SBarry Smithrescaling of the gradient for gradient descent steps. The
7f296bb3SBarry Smith`-tao_bncg_spaced_restart` flag tells the solver to restart every
7f296bb3SBarry Smith$Mn$ iterations, where $n$ is the problem dimension and
7f296bb3SBarry Smith$M$ is a constant determined by `-tao_bncg_min_restart_num` and
is 6 by default. We also have dynamic restart strategies based on
checking if a function is locally quadratic; if so, the solver takes a gradient
descent step. The flag is `-tao_bncg_dynamic_restart`, disabled by
default since the CG solver usually does better in those cases anyway.
7f296bb3SBarry SmithThe minimum number of quadratic-like steps before a restart is set using
7f296bb3SBarry Smith`-tao_bncg_min_quad` and is 6 by default.
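
The two simplest restart tests described above, the descent-direction window and the spaced restart every $Mn$ iterations, can be sketched as a single predicate. The function name and argument layout are hypothetical; only the interval $[-10^{11}, -10^{-9}]$ and the $Mn$ spacing come from the text.

```c
/* Returns 1 if a restart (gradient descent step) should be taken.
   gtd    : directional derivative grad f(x_k)^T d_k
   iter   : current iteration count
   n      : problem dimension
   spaced : nonzero if spaced restarts are enabled
   M      : restart spacing multiplier */
int needs_restart(double gtd, long iter, long n, int spaced, long M)
{
  /* Restart when the direction is not a usable descent direction. */
  if (!(gtd >= -1e11 && gtd <= -1e-9)) return 1;
  /* Optional restart every M*n iterations. */
  if (spaced && iter > 0 && iter % (M * n) == 0) return 1;
  return 0;
}
```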
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_constrained)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Generally Constrained Solvers
7f296bb3SBarry Smith
7f296bb3SBarry SmithConstrained solvers solve optimization problems that incorporate either or both
7f296bb3SBarry Smithequality and inequality constraints, and may optionally include bounds on
7f296bb3SBarry Smithsolution variables.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Alternating Direction Method of Multipliers (ADMM)
7f296bb3SBarry Smith
The TAOADMM algorithm is intended to blend the decomposability
of dual ascent with the superior convergence properties of the method of
multipliers {cite}`boyd`. The algorithm solves problems in
the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
\displaystyle \min_{x,z} & f(x) + g(z) \\
7f296bb3SBarry Smith\text{subject to} & Ax + Bz = c
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $x \in \mathbb R^n$, $z \in \mathbb R^m$,
7f296bb3SBarry Smith$A \in \mathbb R^{p \times n}$,
7f296bb3SBarry Smith$B \in \mathbb R^{p \times m}$, and $c \in \mathbb R^p$.
Essentially, ADMM is a wrapper over two TAO solvers, one for
$f(x)$, and one for $g(z)$. With the method of multipliers, one
can form the augmented Lagrangian
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithL_{\rho}(x,z,y) = f(x) + g(z) + y^T(Ax+Bz-c) + (\rho/2)||Ax+Bz-c||_2^2
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThen, ADMM consists of the iterations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithx^{k+1} := \text{argmin}L_{\rho}(x,z^k,y^k)
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithz^{k+1} := \text{argmin}L_{\rho}(x^{k+1},z,y^k)
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithy^{k+1} := y^k + \rho(Ax^{k+1}+Bz^{k+1}-c)
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
In certain formulations of ADMM, the $z^{k+1}$ subproblem may have a
closed-form solution. Currently ADMM provides one default implementation
for $z^{k+1}$, which is soft-threshold. It can be selected with either
`TaoADMMSetRegularizerType()` or
`-tao_admm_regularizer_type <regularizer_soft_thresh>`. The user can also
pass the spectral penalty value, $\rho$, with either
`TaoADMMSetSpectralPenalty()` or `-tao_admm_spectral_penalty`.
Currently, the user can use
7f296bb3SBarry Smith
7f296bb3SBarry Smith- `TaoADMMSetMisfitObjectiveAndGradientRoutine()`
7f296bb3SBarry Smith- `TaoADMMSetRegularizerObjectiveAndGradientRoutine()`
7f296bb3SBarry Smith- `TaoADMMSetMisfitHessianRoutine()`
7f296bb3SBarry Smith- `TaoADMMSetRegularizerHessianRoutine()`
7f296bb3SBarry Smith
Any other combination of routines is currently not supported. Hessian
matrices can either be constant or non-constant, which can be
set via `TaoADMMSetMisfitHessianChangeStatus()` and
`TaoADMMSetRegularizerHessianChangeStatus()`. Also, in
certain cases the augmented Lagrangian’s Hessian may become nearly
singular depending on $\rho$, which may change in the case of
`-tao_admm_dual_update <update_adaptive, update_adaptive_relaxed>`.
This issue can be prevented by `TaoADMMSetMinimumSpectralPenalty()`.
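
To show the three ADMM updates and the soft-threshold closed form in the smallest possible setting, the sketch below applies them to the scalar problem $\min \tfrac{1}{2}(x-b)^2 + \lambda |z|$ subject to $x - z = 0$ (that is, $A = 1$, $B = -1$, $c = 0$). It is plain C with hypothetical function names and does not use the TAOADMM API; the $x$-update is solved in closed form rather than by a nested solver.

```c
/* Soft-threshold operator S_kappa(v) = sign(v) * max(|v| - kappa, 0). */
double soft_threshold(double v, double kappa)
{
  if (v > kappa) return v - kappa;
  if (v < -kappa) return v + kappa;
  return 0.0;
}

/* ADMM for min 0.5*(x - b)^2 + lambda*|z| subject to x - z = 0,
   with unscaled multiplier y and penalty rho. Returns the final z. */
double admm_scalar_lasso(double b, double lambda, double rho, int iters)
{
  double x = 0.0, z = 0.0, y = 0.0;

  for (int k = 0; k < iters; k++) {
    /* x-update: minimize 0.5*(x-b)^2 + y*(x-z) + (rho/2)*(x-z)^2 */
    x = (b - y + rho * z) / (1.0 + rho);
    /* z-update: closed form via soft-thresholding */
    z = soft_threshold(x + y / rho, lambda / rho);
    /* multiplier update: y += rho * (A x + B z - c) */
    y += rho * (x - z);
  }
  return z;
}
```

The exact minimizer of this scalar problem is $S_{\lambda}(b)$, so the iteration should approach 2 for $b = 3$, $\lambda = 1$ and return 0 for $b = 0.5$, $\lambda = 1$.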
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Augmented Lagrangian Method of Multipliers (ALMM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe TAOALMM method solves generally constrained problems of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{x} & f(x) \\
7f296bb3SBarry Smith\text{subject to} & g(x) = 0\\
7f296bb3SBarry Smith                  & h(x) \geq 0 \\
7f296bb3SBarry Smith                  & l \leq x \leq u
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $g(x)$ are equality constraints, $h(x)$ are inequality
7f296bb3SBarry Smithconstraints and $l$ and $u$ are lower and upper bounds on
7f296bb3SBarry Smiththe optimization variables, respectively.
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAOALMM converts the above general constrained problem into a sequence
7f296bb3SBarry Smithof bound constrained problems at each outer iteration
7f296bb3SBarry Smith$k = 1,2,\dots$
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{x} & L(x, \lambda_k) \\
7f296bb3SBarry Smith\text{subject to} & l \leq x \leq u
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
where $L(x, \lambda_k)$ is the augmented Lagrangian merit function
and $\lambda_k$ are the Lagrange multiplier estimates at outer
iteration $k$.
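
The outer loop can be illustrated on a one-variable problem, $\min x^2$ subject to $g(x) = x - 1 = 0$, with no bounds. The sketch assumes the sign convention $L(x, \lambda) = f(x) + \lambda g(x) + \tfrac{\rho}{2} g(x)^2$ and a fixed penalty $\rho$, neither of which is taken from the TAOALMM source, and it replaces the nested TAO subsolver with the exact minimizer of the inner problem. The function name is hypothetical.

```c
/* Augmented Lagrangian outer loop for min x^2 s.t. x - 1 = 0.
   Returns the final x and writes the final multiplier to lambda_out. */
double almm_equality_demo(double rho, int outer_iters, double *lambda_out)
{
  double x = 0.0, lambda = 0.0;

  for (int k = 0; k < outer_iters; k++) {
    /* Inner problem: 2x + lambda + rho*(x - 1) = 0, solved exactly. */
    x = (rho - lambda) / (2.0 + rho);
    /* First-order multiplier update using the constraint violation. */
    lambda += rho * (x - 1.0);
  }
  *lambda_out = lambda;
  return x;
}
```

For this problem the iterates approach $x = 1$ and $\lambda = -2$, and the multiplier error contracts by a factor of $2/(2+\rho)$ per outer iteration.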
7f296bb3SBarry Smith
TAOALMM offers two versions of the augmented Lagrangian formulation: the
canonical Hestenes-Powell augmented
Lagrangian {cite}`hestenes1969multiplier` {cite}`powell1969method`
with inequality constraints converted to equality constraints via slack
variables, and the slack-less Powell-Hestenes-Rockafellar (PHR)
formulation {cite}`rockafellar1974augmented` that utilizes a
pointwise `max()` on the inequality constraints. For most
7f296bb3SBarry Smithapplications, the canonical Hestenes-Powell formulation is likely to
7f296bb3SBarry Smithperform better. However, the PHR formulation may be desirable for
7f296bb3SBarry Smithproblems featuring very large numbers of inequality constraints as it
7f296bb3SBarry Smithavoids inflating the dimension of the subproblem with slack variables.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe inner subproblem is solved using a nested bound-constrained
first-order TAO solver. By default, TAOALMM uses a quasi-Newton-Krylov
7f296bb3SBarry Smithtrust-region method (TAOBQNKTR). Other first-order methods such as
7f296bb3SBarry SmithTAOBNCG and TAOBQNLS are also appropriate, but a trust-region
7f296bb3SBarry Smithglobalization is strongly recommended for most applications.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Primal-Dual Interior-Point Method (PDIPM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe TAOPDIPM method (`-tao_type pdipm`) implements a primal-dual interior
7f296bb3SBarry Smithpoint method for solving general nonlinear programming problems of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{x} & f(x) \\
7f296bb3SBarry Smith\text{subject to} & g(x) = 0 \\
7f296bb3SBarry Smith                  & h(x) \geq 0 \\
7f296bb3SBarry Smith                  & x^- \leq x \leq x^+
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$ (eq_nlp_gen1)
7f296bb3SBarry Smith
7f296bb3SBarry SmithHere, $f(x)$ is the nonlinear objective function, $g(x)$,
7f296bb3SBarry Smith$h(x)$ are the equality and inequality constraints, and
7f296bb3SBarry Smith$x^-$ and $x^+$ are the lower and upper bounds on decision
7f296bb3SBarry Smithvariables $x$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithPDIPM converts the inequality constraints to equalities using slack variables
7f296bb3SBarry Smith$z$ and a log-barrier term, which transforms {eq}`eq_nlp_gen1` to
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith    \text{min}~&f(x) - \mu\sum_{i=1}^{nci}\ln z_i\\
7f296bb3SBarry Smith    \text{s.t.}& \\
7f296bb3SBarry Smith        &ce(x) = 0 \\
7f296bb3SBarry Smith        &ci(x) - z = 0 \\
7f296bb3SBarry Smith    \end{aligned}
7f296bb3SBarry Smith$$ (eq_nlp_gen2)
7f296bb3SBarry Smith
Here, $ce(x)$ is the set of equality constraints that includes
$g(x)$ and fixed decision variables, i.e., $x^- = x = x^+$.
7f296bb3SBarry SmithSimilarly, $ci(x)$ are inequality constraints including
7f296bb3SBarry Smith$h(x)$ and lower/upper/box-constraints on $x$. $\mu$
7f296bb3SBarry Smithis a parameter that is driven to zero as the optimization progresses.
7f296bb3SBarry Smith
The Lagrangian for {eq}`eq_nlp_gen2` is
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithL_{\mu}(x,\lambda_{ce},\lambda_{ci},z) = f(x) + \lambda_{ce}^Tce(x) - \lambda_{ci}^T(ci(x) - z) - \mu\sum_{i=1}^{nci}\ln z_i
7f296bb3SBarry Smith$$ (eq_lagrangian)
7f296bb3SBarry Smith
where $\lambda_{ce}$ and $\lambda_{ci}$ are the Lagrangian
multipliers for the equality and inequality constraints, respectively.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe first order KKT conditions for optimality are as follows
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\nabla L_{\mu}(x,\lambda_{ce},\lambda_{ci},z)    =
7f296bb3SBarry Smith    \begin{bmatrix}
7f296bb3SBarry Smith        \nabla f(x) + \nabla ce(x)^T\lambda_{ce} -  \nabla ci(x)^T \lambda_{ci} \\
7f296bb3SBarry Smith        ce(x) \\
7f296bb3SBarry Smith        ci(x) - z \\
7f296bb3SBarry Smith        Z\Lambda_{ci}e - \mu e
7f296bb3SBarry Smith    \end{bmatrix}
7f296bb3SBarry Smith= 0
7f296bb3SBarry Smith$$ (eq_nlp_kkt)
7f296bb3SBarry Smith
7f296bb3SBarry Smith{eq}`eq_nlp_kkt` is solved iteratively using Newton’s
7f296bb3SBarry Smithmethod using PETSc’s SNES object. After each Newton iteration, a
7f296bb3SBarry Smithline-search is performed to update $x$ and enforce
7f296bb3SBarry Smith$z,\lambda_{ci} \geq 0$. The barrier parameter $\mu$ is also
7f296bb3SBarry Smithupdated after each Newton iteration. The Newton update is obtained by
7f296bb3SBarry Smithsolving the second-order KKT system $Hd = -\nabla L_{\mu}$.
Here, $H$ is the Hessian matrix of the KKT system. For
interior-point methods such as PDIPM, the Hessian matrix tends to be
ill-conditioned, thus necessitating the use of a direct solver. We
recommend using the LU preconditioner `-pc_type lu` and a direct
linear solver package such as `SuperLU_Dist` or `MUMPS`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith### PDE-Constrained Optimization
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO solves PDE-constrained optimization problems of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{u,v} & f(u,v) \\
7f296bb3SBarry Smith\text{subject to} & g(u,v) = 0,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere the state variable $u$ is the solution to the discretized
7f296bb3SBarry Smithpartial differential equation defined by $g$ and parametrized by
7f296bb3SBarry Smiththe design variable $v$, and $f$ is an objective function.
7f296bb3SBarry SmithThe Lagrange multipliers on the constraint are denoted by $y$.
This method is set by using the linearly constrained augmented
Lagrangian TAO solver, `-tao_type lcl`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe make two main assumptions when solving these problems: the objective
7f296bb3SBarry Smithfunction and PDE constraints have been discretized so that we can treat
7f296bb3SBarry Smiththe optimization problem as finite dimensional and
7f296bb3SBarry Smith$\nabla_u g(u,v)$ is invertible for all $u$ and $v$.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_lcl)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Linearly-Constrained Augmented Lagrangian Method (LCL)
7f296bb3SBarry Smith
7f296bb3SBarry SmithGiven the current iterate $(u_k, v_k, y_k)$, the linearly
7f296bb3SBarry Smithconstrained augmented Lagrangian method approximately solves the
7f296bb3SBarry Smithoptimization problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{u,v} & \tilde{f}_k(u, v) \\
7f296bb3SBarry Smith\text{subject to} & A_k (u-u_k) + B_k (v-v_k) + g_k = 0,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $A_k = \nabla_u g(u_k,v_k)$,
7f296bb3SBarry Smith$B_k = \nabla_v g(u_k,v_k)$, and $g_k = g(u_k, v_k)$ and
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
\tilde{f}_k(u,v) = f(u,v) - g(u,v)^T y_k + \frac{\rho_k}{2} \| g(u,v) \|^2
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithis the augmented Lagrangian function. This optimization problem is
7f296bb3SBarry Smithsolved in two stages. The first computes the Newton direction and finds
7f296bb3SBarry Smitha feasible point for the linear constraints. The second computes a
7f296bb3SBarry Smithreduced-space direction that maintains feasibility with respect to the
7f296bb3SBarry Smithlinearized constraints and improves the augmented Lagrangian merit
7f296bb3SBarry Smithfunction.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Newton Step
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Newton direction is obtained by fixing the design variables at their
7f296bb3SBarry Smithcurrent value and solving the linearized constraint for the state
7f296bb3SBarry Smithvariables. In particular, we solve the system of equations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithA_k du = -g_k
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithto obtain a direction $du$. We need a direction that provides
7f296bb3SBarry Smithsufficient descent for the merit function
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\frac{1}{2} \|g(u,v)\|^2.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThat is, we require $g_k^T A_k du < 0$.
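
To see when this condition holds, note that the gradient of the merit function with respect to $u$ at the current iterate is $A_k^T g_k$. If the linear system is solved exactly, substituting $A_k du = -g_k$ gives

$$
g_k^T A_k du = -g_k^T g_k = -\|g_k\|^2 < 0 \quad \text{whenever } g_k \neq 0 .
$$

Under this reading, the descent check matters mainly when the system is solved only approximately, for example by an iterative solver that stops early.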
7f296bb3SBarry Smith
7f296bb3SBarry SmithIf the Newton direction is a descent direction, then we choose a penalty
7f296bb3SBarry Smithparameter $\rho_k$ so that $du$ is also a sufficient descent
7f296bb3SBarry Smithdirection for the augmented Lagrangian merit function. We then find
7f296bb3SBarry Smith$\alpha$ to approximately minimize the augmented Lagrangian merit
7f296bb3SBarry Smithfunction along the Newton direction.
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\displaystyle \min_{\alpha \geq 0} \; \tilde{f}_k(u_k + \alpha du, v_k).
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe can enforce either the sufficient decrease condition or the Wolfe
7f296bb3SBarry Smithconditions during the search procedure. The new point,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
7f296bb3SBarry Smithu_{k+\frac{1}{2}} & = & u_k + \alpha_k du \\
7f296bb3SBarry Smithv_{k+\frac{1}{2}} & = & v_k,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithsatisfies the linear constraint
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry SmithA_k (u_{k+\frac{1}{2}} - u_k) + B_k (v_{k+\frac{1}{2}} - v_k) + \alpha_k g_k = 0.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithIf the Newton direction computed does not provide descent for the merit
7f296bb3SBarry Smithfunction, then we can use the steepest descent direction
7f296bb3SBarry Smith$du = -A_k^T g_k$ during the search procedure. However, the
7f296bb3SBarry Smithimplication that the intermediate point approximately satisfies the
7f296bb3SBarry Smithlinear constraint is no longer true.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Modified Reduced-Space Step
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe are now ready to compute a reduced-space step for the modified
7f296bb3SBarry Smithoptimization problem:
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{u,v} & \tilde{f}_k(u, v) \\
7f296bb3SBarry Smith\text{subject to} & A_k (u-u_k) + B_k (v-v_k) + \alpha_k g_k = 0.
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe begin with the change of variables
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{du,dv} & \tilde{f}_k(u_k+du, v_k+dv) \\
7f296bb3SBarry Smith\text{subject to} & A_k du + B_k dv + \alpha_k g_k = 0
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithand make the substitution
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithdu = -A_k^{-1}(B_k dv + \alpha_k g_k).
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithHence, the unconstrained optimization problem we need to solve is
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{dv} & \tilde{f}_k(u_k-A_k^{-1}(B_k dv + \alpha_k g_k), v_k+dv), \\
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhich is equivalent to
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{dv} & \tilde{f}_k(u_{k+\frac{1}{2}} - A_k^{-1} B_k dv, v_{k+\frac{1}{2}}+dv). \\
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe apply one step of a limited-memory quasi-Newton method to this
problem. The direction is obtained by solving the quadratic problem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{dv} & \frac{1}{2} dv^T \tilde{H}_k dv + \tilde{g}_{k+\frac{1}{2}}^T dv,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $\tilde{H}_k$ is the limited-memory quasi-Newton
7f296bb3SBarry Smithapproximation to the reduced Hessian matrix, a positive-definite matrix,
7f296bb3SBarry Smithand $\tilde{g}_{k+\frac{1}{2}}$ is the reduced gradient.
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
\tilde{g}_{k+\frac{1}{2}} & = & \nabla_v \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}}) -
          \nabla_u \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}}) A_k^{-1} B_k \\
       & = & d_{k+\frac{1}{2}} - c_{k+\frac{1}{2}} A_k^{-1} B_k
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
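Here $c_{k+\frac{1}{2}} = \nabla_u \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}})$ and
$d_{k+\frac{1}{2}} = \nabla_v \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}})$.
The reduced gradient is obtained from one linearized adjoint solve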
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithy_{k+\frac{1}{2}} = A_k^{-T}c_{k+\frac{1}{2}}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithand some linear algebra
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
\tilde{g}_{k+\frac{1}{2}} = d_{k+\frac{1}{2}} - y_{k+\frac{1}{2}}^T B_k.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithBecause the Hessian approximation is positive definite and we know its
7f296bb3SBarry Smithinverse, we obtain the direction
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
dv = -\tilde{H}_k^{-1} \tilde{g}_{k+\frac{1}{2}}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithand recover the full-space direction from one linearized forward solve,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithdu = -A_k^{-1} B_k dv.
7f296bb3SBarry Smith$$
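
The sequence of solves in the reduced-space step can be traced with scalars standing in for $A_k$, $B_k$, and the reduced Hessian approximation. The sketch uses the sign convention $\tilde{g} = d - y^T B_k$ with $c = \nabla_u \tilde{f}_k$ and $d = \nabla_v \tilde{f}_k$; the struct and function names are hypothetical, and a real implementation would replace each division with a linear solve.

```c
typedef struct {
  double y;     /* adjoint variable, from A^T y = c */
  double g_red; /* reduced gradient d - y*B */
  double dv;    /* design direction */
  double du;    /* state direction */
} ReducedStep;

/* Scalar model of the reduced-space step: A, B are the constraint
   Jacobians, c and d the merit-function gradients in u and v, and H a
   positive approximation of the reduced Hessian. */
ReducedStep reduced_space_step(double A, double B, double c, double d, double H)
{
  ReducedStep s;

  s.y     = c / A;            /* one linearized adjoint solve */
  s.g_red = d - s.y * B;      /* reduced gradient */
  s.dv    = -s.g_red / H;     /* quasi-Newton direction */
  s.du    = -(B * s.dv) / A;  /* one linearized forward solve */
  return s;
}
```

By construction the resulting pair satisfies the homogeneous linearized constraint $A\,du + B\,dv = 0$, so the step preserves whatever linear feasibility was reached at the intermediate point.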
7f296bb3SBarry Smith
7f296bb3SBarry SmithHaving the full-space direction, which satisfies the linear constraint,
7f296bb3SBarry Smithwe now approximately minimize the augmented Lagrangian merit function
7f296bb3SBarry Smithalong the direction.
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
\displaystyle \min_{\beta \geq 0} & \tilde{f}_k(u_{k+\frac{1}{2}} + \beta du, v_{k+\frac{1}{2}} + \beta dv)
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe enforce the Wolfe conditions during the search procedure. The new
7f296bb3SBarry Smithpoint is
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
7f296bb3SBarry Smithu_{k+1} & = & u_{k+\frac{1}{2}} + \beta_k du \\
7f296bb3SBarry Smithv_{k+1} & = & v_{k+\frac{1}{2}} + \beta_k dv.
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe reduced gradient at the new point is computed from
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{lcl}
7f296bb3SBarry Smithy_{k+1} & = & A_k^{-T}c_{k+1} \\
7f296bb3SBarry Smith\tilde{g}_{k+1} & = & d_{k+1} - y_{k+1}^T B_k,
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $c_{k+1} = \nabla_u \tilde{f}_k (u_{k+1},v_{k+1})$ and
7f296bb3SBarry Smith$d_{k+1} = \nabla_v \tilde{f}_k (u_{k+1},v_{k+1})$. The
7f296bb3SBarry Smithmultipliers $y_{k+1}$ become the multipliers used in the next
7f296bb3SBarry Smithiteration of the code. The quantities $v_{k+\frac{1}{2}}$,
7f296bb3SBarry Smith$v_{k+1}$, $\tilde{g}_{k+\frac{1}{2}}$, and
$\tilde{g}_{k+1}$ are used to update $\tilde{H}_k$ to obtain the
7f296bb3SBarry Smithlimited-memory quasi-Newton approximation to the reduced Hessian matrix
7f296bb3SBarry Smithused in the next iteration of the code. The update is skipped if it
7f296bb3SBarry Smithcannot be performed.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_leastsquares)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Nonlinear Least-Squares
7f296bb3SBarry Smith
7f296bb3SBarry SmithGiven a function $F: \mathbb R^n \to \mathbb R^m$, the nonlinear
7f296bb3SBarry Smithleast-squares problem minimizes
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithf(x)= \| F(x) \|_2^2 = \sum_{i=1}^m F_i(x)^2.
7f296bb3SBarry Smith$$ (eq_nlsf)
7f296bb3SBarry Smith
The residual function $F$ should be specified with the function
`TaoSetResidualRoutine()`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_pounders)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bound-constrained Regularized Gauss-Newton (BRGN)
7f296bb3SBarry Smith
The TAOBRGN algorithm is a Gauss-Newton method that iteratively solves nonlinear
least-squares problems with the iterations
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithx_{k+1} = x_k - \alpha_k(J_k^T J_k)^{-1} J_k^T r(x_k)
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $r(x)$ is the least-squares residual vector,
7f296bb3SBarry Smith$J_k = \partial r(x_k)/\partial x$ is the Jacobian of the
residual, and $\alpha_k$ is the step length parameter. In other
words, for the objective $\frac{1}{2}\|r(x)\|_2^2$ the Gauss-Newton method
approximates the Hessian as $H_k \approx J_k^T J_k$ and uses the gradient
$g_k = J_k^T r(x_k)$. The least-squares Jacobian, $J$,
should be provided to Tao using the `TaoSetJacobianResidualRoutine()` routine.
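
As a concrete illustration of the iteration above, the following minimal C sketch computes one Gauss-Newton step for a problem with a single unknown, where $J_k$ reduces to a column of $m$ derivatives and $(J_k^T J_k)^{-1}$ to a scalar division. The function name `gauss_newton_step_1d` and its arguments are hypothetical and are not part of the TAO interface; TAOBRGN itself operates on PETSc `Vec` and `Mat` objects.

```c
#include <stddef.h>

/* One Gauss-Newton step for a scalar unknown x:
 *   step = -alpha * (J^T r) / (J^T J),
 * where r[i] is the i-th residual and J[i] = d r_i / d x at the current x.
 * Returns 0.0 if J^T J vanishes, since the step is undefined in that case. */
double gauss_newton_step_1d(const double *r, const double *J, size_t m, double alpha)
{
  double JtJ = 0.0, Jtr = 0.0;

  for (size_t i = 0; i < m; i++) {
    JtJ += J[i] * J[i];
    Jtr += J[i] * r[i];
  }
  if (JtJ == 0.0) return 0.0;
  return -alpha * Jtr / JtJ;
}
```

For the linear residuals $r_i(x) = x - y_i$ with data $y = (1, 2)$ and $x_k = 3$, we have $r = (2, 1)$ and $J = (1, 1)$, so the full step ($\alpha_k = 1$) is $-1.5$ and $x_{k+1} = 1.5$, which is the least-squares solution (the mean of the data).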
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe BRGN (`-tao_type brgn`) implementation adds a regularization term $\beta(x)$ such
7f296bb3SBarry Smiththat
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
\min_{x} \; \frac{1}{2}||r(x)||_2^2 + \lambda\beta(x),
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $\lambda$ is the scalar weight of the regularizer. BRGN
provides three default implementations for $\beta(x)$:
7f296bb3SBarry Smith
7f296bb3SBarry Smith- **L2-norm** - $\beta(x) = \frac{1}{2}||x_k||_2^2$
7f296bb3SBarry Smith- **L2-norm Proximal Point** -
7f296bb3SBarry Smith  $\beta(x) = \frac{1}{2}||x_k - x_{k-1}||_2^2$
7f296bb3SBarry Smith- **L1-norm with Dictionary** -
7f296bb3SBarry Smith  $\beta(x) = ||Dx||_1 \approx \sum_{i} \sqrt{y_i^2 + \epsilon^2}-\epsilon$
7f296bb3SBarry Smith  where $y = Dx$ and $\epsilon$ is the smooth approximation
7f296bb3SBarry Smith  parameter.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe regularizer weight can be controlled with either
7f296bb3SBarry Smith`TaoBRGNSetRegularizerWeight()` or `-tao_brgn_regularizer_weight`
7f296bb3SBarry Smithcommand line option, while the smooth approximation parameter can be set
7f296bb3SBarry Smithwith either `TaoBRGNSetL1SmoothEpsilon()` or
7f296bb3SBarry Smith`-tao_brgn_l1_smooth_epsilon`. For the L1-norm term, the user can
7f296bb3SBarry Smithsupply a dictionary matrix with `TaoBRGNSetDictionaryMatrix()`. If no
7f296bb3SBarry Smithdictionary is provided, the dictionary is assumed to be an identity
7f296bb3SBarry Smithmatrix and the regularizer reduces to a sparse solution term.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe regularization selection can be made using the command line option
7f296bb3SBarry Smith`-tao_brgn_regularization_type <l2pure, l2prox, l1dict, user>` where the `user` option allows
the user to define a custom $\mathcal{C}^2$-continuous
7f296bb3SBarry Smithregularization term. This custom term can be defined by using the
7f296bb3SBarry Smithinterface functions:
7f296bb3SBarry Smith
- `TaoBRGNSetRegularizerObjectiveAndGradientRoutine()` - Provide a
  user callback for evaluating the function value and gradient
  of the regularization term.
- `TaoBRGNSetRegularizerHessianRoutine()` - Provide a user callback
  for evaluating the Hessian of the regularization term.
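
As an illustration, the BRGN options described in this section could be combined on the command line as follows. The weight and epsilon values are arbitrary placeholders chosen for the example, not recommended settings.

```
-tao_type brgn -tao_brgn_regularization_type l1dict -tao_brgn_regularizer_weight 1.0e-4 -tao_brgn_l1_smooth_epsilon 1.0e-6
```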
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### POUNDERS
7f296bb3SBarry Smith
7f296bb3SBarry SmithOne algorithm for solving the least squares problem
7f296bb3SBarry Smith({eq}`eq_nlsf`) when the Jacobian of the residual vector
7f296bb3SBarry Smith$F$ is unavailable is the model-based POUNDERS (Practical
7f296bb3SBarry SmithOptimization Using No Derivatives for sums of Squares) algorithm
7f296bb3SBarry Smith(`tao_pounders`). POUNDERS employs a derivative-free trust-region
7f296bb3SBarry Smithframework as described in {cite}`dfobook` in order to
7f296bb3SBarry Smithconverge to local minimizers. An example of this version of POUNDERS
7f296bb3SBarry Smithapplied to a practical least-squares problem can be found in
7f296bb3SBarry Smith{cite}`unedf0`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Derivative-Free Trust-Region Algorithm
7f296bb3SBarry Smith
7f296bb3SBarry SmithIn each iteration $k$, the algorithm maintains a model
7f296bb3SBarry Smith$m_k(x)$, described below, of the nonlinear least squares function
7f296bb3SBarry Smith$f$ centered about the current iterate $x_k$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithIf one assumes that the maximum number of function evaluations has not
7f296bb3SBarry Smithbeen reached and that $\|\nabla m_k(x_k)\|_2>$`gtol`, the next
7f296bb3SBarry Smithpoint $x_+$ to be evaluated is obtained by solving the
7f296bb3SBarry Smithtrust-region subproblem
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\min\left\{
7f296bb3SBarry Smith m_k(x) :
 \|x-x_k\|_{p} \leq \Delta_k
7f296bb3SBarry Smith \right \},
7f296bb3SBarry Smith$$ (eq_poundersp)
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $\Delta_k$ is the current trust-region radius. By default we
7f296bb3SBarry Smithuse a trust-region norm with $p=\infty$ and solve
7f296bb3SBarry Smith({eq}`eq_poundersp`) with the BLMVM method described in
7f296bb3SBarry Smith{any}`sec_tao_blmvm`. While the subproblem is a
7f296bb3SBarry Smithbound-constrained quadratic program, it may not be convex and the BQPIP
7f296bb3SBarry Smithand GPCG methods may not solve the subproblem. Therefore, a bounded
7f296bb3SBarry SmithNewton-Krylov Method should be used; the default is the BNTR
7f296bb3SBarry Smithalgorithm. Note: BNTR uses its own internal
7f296bb3SBarry Smithtrust region that may interfere with the infinity-norm trust region used
7f296bb3SBarry Smithin the model problem ({eq}`eq_poundersp`).
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe residual vector is then evaluated to obtain $F(x_+)$ and hence
7f296bb3SBarry Smith$f(x_+)$. The ratio of actual decrease to predicted decrease,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\rho_k = \frac{f(x_k)-f(x_+)}{m_k(x_k)-m_k(x_+)},
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithas well as an indicator, `valid`, on the model’s quality of
7f296bb3SBarry Smithapproximation on the trust region is then used to update the iterate,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithx_{k+1} = \left\{\begin{array}{ll}
7f296bb3SBarry Smithx_+ & \text{if } \rho_k \geq \eta_1 \\
7f296bb3SBarry Smithx_+ & \text{if } 0<\rho_k <\eta_1  \text{ and \texttt{valid}=\texttt{true}}
7f296bb3SBarry Smith\\
7f296bb3SBarry Smithx_k & \text{else},
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith\right.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithand trust-region radius,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
7f296bb3SBarry Smith \text{min}(\gamma_1\Delta_k, \Delta_{\max}) & \text{if } \rho_k \geq
7f296bb3SBarry Smith\eta_1 \text{ and } \|x_+-x_k\|_p\geq \omega_1\Delta_k \\
7f296bb3SBarry Smith\gamma_0\Delta_k & \text{if } \rho_k < \eta_1 \text{ and
7f296bb3SBarry Smith\texttt{valid}=\texttt{true}} \\
7f296bb3SBarry Smith\Delta_k &  \text{else,}
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith\right.
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithwhere $0 < \eta_1 < 1$, $0 < \gamma_0 < 1 < \gamma_1$,
7f296bb3SBarry Smith$0<\omega_1<1$, and $\Delta_{\max}$ are constants.
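
The two update rules above can be sketched in plain C as follows. This is only an illustration of the formulas, not the POUNDERS source; the struct `TRParams` and the function names are hypothetical, and the constants must be supplied by the caller.

```c
#include <stdbool.h>

/* Constants of the update rules: 0 < eta1 < 1, 0 < gamma0 < 1 < gamma1,
 * 0 < omega1 < 1, and the maximum radius delta_max. */
typedef struct {
  double eta1, gamma0, gamma1, omega1, delta_max;
} TRParams;

/* Returns true if the trial point x_+ becomes the next iterate. */
bool tr_accept_step(double rho, bool valid, const TRParams *p)
{
  return rho >= p->eta1 || (rho > 0.0 && valid);
}

/* Returns the next trust-region radius; step_norm is ||x_+ - x_k||_p. */
double tr_update_radius(double delta, double rho, double step_norm, bool valid, const TRParams *p)
{
  if (rho >= p->eta1 && step_norm >= p->omega1 * delta) {
    double grown = p->gamma1 * delta;
    return grown < p->delta_max ? grown : p->delta_max; /* expand, capped */
  }
  if (rho < p->eta1 && valid) return p->gamma0 * delta; /* shrink */
  return delta;                                         /* keep */
}
```

Note that the radius shrinks only when the model is `valid`. A poor ratio $\rho_k$ obtained with an invalid model leaves the radius unchanged, which gives the algorithm a chance to improve the model first.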
7f296bb3SBarry Smith
7f296bb3SBarry SmithIf $\rho_k\leq 0$ and `valid` is `false`, the iterate and
7f296bb3SBarry Smithtrust-region radius remain unchanged after the above updates, and the
7f296bb3SBarry Smithalgorithm tests whether the direction $x_+-x_k$ improves the
7f296bb3SBarry Smithmodel. If not, the algorithm performs an additional evaluation to obtain
7f296bb3SBarry Smith$F(x_k+d_k)$, where $d_k$ is a model-improving direction.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe iteration counter is then updated, and the next model $m_{k}$
7f296bb3SBarry Smithis obtained as described next.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Forming the Trust-Region Model
7f296bb3SBarry Smith
7f296bb3SBarry SmithIn each iteration, POUNDERS uses a subset of the available evaluated
7f296bb3SBarry Smithresidual vectors $\{ F(y_1), F(y_2), \cdots \}$ to form an
7f296bb3SBarry Smithinterpolatory quadratic model of each residual component. The $m$
7f296bb3SBarry Smithquadratic models
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithq_k^{(i)}(x) =
7f296bb3SBarry Smith F_i(x_k) + (x-x_k)^T g_k^{(i)} + \frac{1}{2} (x-x_k)^T H_k^{(i)} (x-x_k),
7f296bb3SBarry Smith \qquad i = 1, \ldots, m
7f296bb3SBarry Smith$$ (eq_models)
7f296bb3SBarry Smith
7f296bb3SBarry Smiththus satisfy the interpolation conditions
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smithq_k^{(i)}(y_j) = F_i(y_j), \qquad i=1, \ldots, m; \, j=1,\ldots , l_k
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithon a common interpolation set $\{y_1, \cdots , y_{l_k}\}$ of size
7f296bb3SBarry Smith$l_k\in[n+1,$`npmax`$]$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe gradients and Hessians of the models in
({eq}`eq_models`) are then used to construct the main
7f296bb3SBarry Smithmodel,
7f296bb3SBarry Smith
$$
m_k(x) = f(x_k) +
2(x-x_k)^T \sum_{i=1}^{m} F_i(x_k) g_k^{(i)} + (x-x_k)^T \sum_{i=1}^{m} \left( g_k^{(i)} \left(g_k^{(i)}\right)^T +  F_i(x_k) H_k^{(i)}\right) (x-x_k).
$$ (eq_newton2)
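
One way to see where the main model comes from (a sketch of the algebra, not taken from the cited references) is to write $s = x - x_k$, square each component model, and keep terms up to second order in $s$:

$$
\left(q_k^{(i)}(x)\right)^2 = F_i(x_k)^2 + 2 F_i(x_k)\, s^T g_k^{(i)} + s^T \left( g_k^{(i)} \left(g_k^{(i)}\right)^T + F_i(x_k) H_k^{(i)} \right) s + O(\|s\|^3).
$$

Summing over $i = 1, \ldots, m$, using $f(x_k) = \sum_{i=1}^m F_i(x_k)^2$, and dropping the higher-order terms gives the expression for $m_k(x)$.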
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe process of forming these models also computes the indicator
7f296bb3SBarry Smith`valid` of the model’s local quality.
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Parameters
7f296bb3SBarry Smith
7f296bb3SBarry SmithPOUNDERS supports the following parameters that can be set from the
7f296bb3SBarry Smithcommand line or PETSc options file:
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-tao_pounders_delta <delta>`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: The initial trust-region radius ($>0$, real). This is used to
7f296bb3SBarry Smith  determine the size of the initial neighborhood within which the
7f296bb3SBarry Smith  algorithm should look.
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-tao_pounders_npmax <npmax>`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: The maximum number of interpolation points used ($n+2\leq$
7f296bb3SBarry Smith  `npmax` $\leq 0.5(n+1)(n+2)$). This input is made available
7f296bb3SBarry Smith  to advanced users. We recommend the default value
7f296bb3SBarry Smith  (`npmax`$=2n+1$) be used by others.
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-tao_pounders_gqt`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: Use the gqt algorithm to solve the
7f296bb3SBarry Smith  subproblem ({eq}`eq_poundersp`) (uses $p=2$)
7f296bb3SBarry Smith  instead of BQPIP.
7f296bb3SBarry Smith
7f296bb3SBarry Smith`-pounders_subsolver`
7f296bb3SBarry Smith
7f296bb3SBarry Smith: If the default BQPIP algorithm is used to solve the
7f296bb3SBarry Smith  subproblem ({eq}`eq_poundersp`), the parameters of
7f296bb3SBarry Smith  the subproblem solver can be accessed using the command line options
7f296bb3SBarry Smith  prefix `-pounders_subsolver_`. For example,
7f296bb3SBarry Smith
7f296bb3SBarry Smith  ```
7f296bb3SBarry Smith  -pounders_subsolver_tao_gatol 1.0e-5
7f296bb3SBarry Smith  ```
7f296bb3SBarry Smith
7f296bb3SBarry Smith  sets the gradient tolerance of the subproblem solver to
7f296bb3SBarry Smith  $10^{-5}$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithAdditionally, the user provides an initial solution vector, a vector for
7f296bb3SBarry Smithstoring the separable objective function, and a routine for evaluating
7f296bb3SBarry Smiththe residual vector $F$. These are described in detail in
7f296bb3SBarry Smith{any}`sec_tao_fghj` and
7f296bb3SBarry Smith{any}`sec_tao_evalsof`. Here we remark that because gradient
7f296bb3SBarry Smithinformation is not available for scaling purposes, it can be useful to
7f296bb3SBarry Smithensure that the problem is reasonably well scaled. A simple way to do so
7f296bb3SBarry Smithis to rescale the decision variables $x$ so that their typical
7f296bb3SBarry Smithvalues are expected to lie within the unit hypercube $[0,1]^n$.
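
A simple affine rescaling of this kind might look as follows in C. The sketch assumes that the user knows a typical range $[\mathrm{lo}_i, \mathrm{hi}_i]$ with $\mathrm{hi}_i > \mathrm{lo}_i$ for each variable; the function names are hypothetical helpers, not TAO routines.

```c
#include <stddef.h>

/* Map each x[i] from its typical range [lo[i], hi[i]] to z[i] in [0, 1].
 * Assumes hi[i] > lo[i] for every i. */
void scale_to_unit_cube(const double *x, const double *lo, const double *hi, double *z, size_t n)
{
  for (size_t i = 0; i < n; i++) z[i] = (x[i] - lo[i]) / (hi[i] - lo[i]);
}

/* Inverse map, used inside the residual routine to recover the
 * original variables from the scaled ones seen by the solver. */
void unscale_from_unit_cube(const double *z, const double *lo, const double *hi, double *x, size_t n)
{
  for (size_t i = 0; i < n; i++) x[i] = lo[i] + z[i] * (hi[i] - lo[i]);
}
```

In this arrangement the solver works with the scaled variables $z$, and the residual routine calls the inverse map before evaluating $F$.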
7f296bb3SBarry Smith
7f296bb3SBarry Smith##### Convergence Notes
7f296bb3SBarry Smith
7f296bb3SBarry SmithBecause the gradient function is not provided to POUNDERS, the norm of
7f296bb3SBarry Smiththe gradient of the objective function is not available. Therefore, for
7f296bb3SBarry Smithconvergence criteria, this norm is approximated by the norm of the model
7f296bb3SBarry Smithgradient and used only when the model gradient is deemed to be a
reasonable approximation of the gradient of the objective. In practice,
the typical reason for terminating expensive derivative-free
problems is reaching the maximum number of function evaluations allowed.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_complementarity)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Complementarity
7f296bb3SBarry Smith
7f296bb3SBarry SmithMixed complementarity problems, or box-constrained variational
7f296bb3SBarry Smithinequalities, are related to nonlinear systems of equations. They are
7f296bb3SBarry Smithdefined by a continuously differentiable function,
7f296bb3SBarry Smith$F:\mathbb R^n \to \mathbb R^n$, and bounds,
7f296bb3SBarry Smith$\ell \in \{\mathbb R\cup \{-\infty\}\}^n$ and
7f296bb3SBarry Smith$u \in \{\mathbb R\cup \{\infty\}\}^n$, on the variables such that
7f296bb3SBarry Smith$\ell \leq u$. Given this information,
$x^* \in [\ell,u]$ is a solution to
7f296bb3SBarry SmithMCP($F$, $\ell$, $u$) if for each
7f296bb3SBarry Smith$i \in \{1, \ldots, n\}$ we have at least one of the following:
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry SmithF_i(x^*) \geq 0 & \text{if } x^*_i = \ell_i \\
7f296bb3SBarry SmithF_i(x^*) = 0 & \text{if } \ell_i < x^*_i < u_i \\
7f296bb3SBarry SmithF_i(x^*) \leq 0 & \text{if } x^*_i = u_i.
7f296bb3SBarry Smith\end{array}\end{aligned}
7f296bb3SBarry Smith$$
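
The conditions above can be checked component by component. The following C sketch does so up to a user-chosen tolerance `tol`, which is an assumption of this example since the definition itself is exact; the function name `mcp_component_ok` is hypothetical. Infinite bounds are represented with `INFINITY` from `<math.h>`.

```c
#include <math.h>
#include <stdbool.h>

/* Check whether the pair (x_i, F_i(x)) satisfies the MCP conditions
 * for the bounds l_i <= u_i, up to the tolerance tol. */
bool mcp_component_ok(double x, double F, double l, double u, double tol)
{
  if (x < l - tol || x > u + tol) return false; /* outside the box */

  bool at_lower = fabs(x - l) <= tol;
  bool at_upper = fabs(u - x) <= tol;

  if (at_lower && F >= -tol) return true; /* F_i >= 0 at the lower bound */
  if (at_upper && F <= tol) return true;  /* F_i <= 0 at the upper bound */
  return fabs(F) <= tol;                  /* F_i = 0 otherwise */
}
```

A point $x^*$ solves the problem when this check passes for every index $i$.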
7f296bb3SBarry Smith
7f296bb3SBarry SmithNote that when $\ell = \{-\infty\}^n$ and
7f296bb3SBarry Smith$u = \{\infty\}^n$, we have a nonlinear system of equations, and
7f296bb3SBarry Smith$\ell = \{0\}^n$ and $u = \{\infty\}^n$ correspond to the
7f296bb3SBarry Smithnonlinear complementarity problem {cite}`cottle:nonlinear`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithSimple complementarity conditions arise from the first-order optimality
7f296bb3SBarry Smithconditions from optimization
7f296bb3SBarry Smith{cite}`karush:minima` {cite}`kuhn.tucker:nonlinear`. In the simple
7f296bb3SBarry Smithbound-constrained optimization case, these conditions correspond to
7f296bb3SBarry SmithMCP($\nabla f$, $\ell$, $u$), where
7f296bb3SBarry Smith$f: \mathbb R^n \to \mathbb R$ is the objective function. In a
7f296bb3SBarry Smithone-dimensional setting these conditions are intuitive. If the solution
7f296bb3SBarry Smithis at the lower bound, then the function must be increasing and
7f296bb3SBarry Smith$\nabla f \geq 0$. If the solution is at the upper bound, then the
7f296bb3SBarry Smithfunction must be decreasing and $\nabla f \leq 0$. If the solution
7f296bb3SBarry Smithis strictly between the bounds, we must be at a stationary point and
7f296bb3SBarry Smith$\nabla f = 0$. Other complementarity problems arise in economics
7f296bb3SBarry Smithand engineering {cite}`ferris.pang:engineering`, game theory
7f296bb3SBarry Smith{cite}`nash:equilibrium`, and finance
7f296bb3SBarry Smith{cite}`huang.pang:option`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithEvaluation routines for $F$ and its Jacobian must be supplied
7f296bb3SBarry Smithprior to solving the application. The bounds, $[\ell,u]$, on the
7f296bb3SBarry Smithvariables must also be provided. If no starting point is supplied, a
7f296bb3SBarry Smithdefault starting point of all zeros is used.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Semismooth Methods
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO has two implementations of semismooth algorithms
7f296bb3SBarry Smith{cite}`munson.facchinei.ea:semismooth` {cite}`deluca.facchinei.ea:semismooth`
7f296bb3SBarry Smith{cite}`facchinei.fischer.ea:semismooth` for solving mixed complementarity
7f296bb3SBarry Smithproblems. Both are based on a reformulation of the mixed complementarity
7f296bb3SBarry Smithproblem as a nonsmooth system of equations using the Fischer-Burmeister
7f296bb3SBarry Smithfunction {cite}`fischer:special`. A nonsmooth Newton method
7f296bb3SBarry Smithis applied to the reformulated system to calculate a solution. The
7f296bb3SBarry Smiththeoretical properties of such methods are detailed in the
7f296bb3SBarry Smithaforementioned references.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe Fischer-Burmeister function, $\phi:\mathbb R^2 \to \mathbb R$,
7f296bb3SBarry Smithis defined as
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith\phi(a,b) := \sqrt{a^2 + b^2} - a - b.\end{aligned}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis function has the following key property,
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith\begin{array}{lcr}
7f296bb3SBarry Smith        \phi(a,b) = 0 & \Leftrightarrow & a \geq 0,\; b \geq 0,\; ab = 0,
7f296bb3SBarry Smith\end{array}\end{aligned}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry Smithused when reformulating the mixed complementarity problem as the system
7f296bb3SBarry Smithof equations $\Phi(x) = 0$, where
7f296bb3SBarry Smith$\Phi:\mathbb R^n \to \mathbb R^n$. The reformulation is defined
7f296bb3SBarry Smithcomponentwise as
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith\Phi_i(x) := \left\{ \begin{array}{ll}
   \phi(x_i - \ell_i, F_i(x)) & \text{if } -\infty < \ell_i < u_i = \infty, \\
   -\phi(u_i-x_i, -F_i(x)) & \text{if } -\infty = \ell_i < u_i < \infty, \\
   \phi(x_i - \ell_i, \phi(u_i - x_i, - F_i(x))) & \text{if } -\infty < \ell_i < u_i < \infty, \\
   -F_i(x) & \text{if } -\infty = \ell_i < u_i = \infty, \\
   \ell_i - x_i & \text{if } -\infty < \ell_i = u_i < \infty.
7f296bb3SBarry Smith   \end{array} \right.\end{aligned}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithWe note that $\Phi$ is not differentiable everywhere but satisfies
7f296bb3SBarry Smitha semismoothness property
7f296bb3SBarry Smith{cite}`mifflin:semismooth` {cite}`qi:convergence` {cite}`qi.sun:nonsmooth`.
7f296bb3SBarry SmithFurthermore, the natural merit function,
7f296bb3SBarry Smith$\Psi(x) := \frac{1}{2} \| \Phi(x) \|_2^2$, is continuously
7f296bb3SBarry Smithdifferentiable.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe two semismooth TAO solvers both solve the system $\Phi(x) = 0$
7f296bb3SBarry Smithby applying a nonsmooth Newton method with a line search. We calculate a
7f296bb3SBarry Smithdirection, $d^k$, by solving the system
7f296bb3SBarry Smith$H^kd^k = -\Phi(x^k)$, where $H^k$ is an element of the
7f296bb3SBarry Smith$B$-subdifferential {cite}`qi.sun:nonsmooth` of
7f296bb3SBarry Smith$\Phi$ at $x^k$. If the direction calculated does not
7f296bb3SBarry Smithsatisfy a suitable descent condition, then we use the negative gradient
7f296bb3SBarry Smithof the merit function, $-\nabla \Psi(x^k)$, as the search
7f296bb3SBarry Smithdirection. A standard Armijo search
7f296bb3SBarry Smith{cite}`armijo:minimization` is used to find the new
7f296bb3SBarry Smithiteration. Nonmonotone searches
7f296bb3SBarry Smith{cite}`grippo.lampariello.ea:nonmonotone` are also available
7f296bb3SBarry Smithby setting appropriate runtime options. See
7f296bb3SBarry Smith{any}`sec_tao_linesearch` for further details.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe first semismooth algorithm available in TAO is not guaranteed to
7f296bb3SBarry Smithremain feasible with respect to the bounds, $[\ell, u]$, and is
7f296bb3SBarry Smithtermed an infeasible semismooth method. This method can be specified by
7f296bb3SBarry Smithusing the `tao_ssils` solver. In this case, the descent test used is
7f296bb3SBarry Smiththat
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{aligned}
7f296bb3SBarry Smith\nabla \Psi(x^k)^Td^k \leq -\delta\| d^k \|^\rho.\end{aligned}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
7f296bb3SBarry SmithBoth $\delta > 0$ and $\rho > 2$ can be modified by using
7f296bb3SBarry Smiththe runtime options `-tao_ssils_delta <delta>` and
7f296bb3SBarry Smith`-tao_ssils_rho <rho>`, respectively. By default,
7f296bb3SBarry Smith$\delta = 10^{-10}$ and $\rho = 2.1$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithAn alternative is to remain feasible with respect to the bounds by using
7f296bb3SBarry Smitha projected Armijo line search. This method can be specified by using
7f296bb3SBarry Smiththe `tao_ssfls` solver. The descent test used is the same as above
7f296bb3SBarry Smithwhere the direction in this case corresponds to the first part of the
7f296bb3SBarry Smithpiecewise linear arc searched by the projected line search. Both
7f296bb3SBarry Smith$\delta > 0$ and $\rho > 2$ can be modified by using the
7f296bb3SBarry Smithruntime options `-tao_ssfls_delta <delta>` and
7f296bb3SBarry Smith`-tao_ssfls_rho <rho>` respectively. By default,
7f296bb3SBarry Smith$\delta = 10^{-10}$ and $\rho = 2.1$.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe recommended algorithm is the infeasible semismooth method,
7f296bb3SBarry Smith`tao_ssils`, because of its strong global and local convergence
7f296bb3SBarry Smithproperties. However, if it is known that $F$ is not defined
7f296bb3SBarry Smithoutside of the box, $[\ell,u]$, perhaps because of the presence of
7f296bb3SBarry Smith$\log$ functions, the feasibility-enforcing version of the
7f296bb3SBarry Smithalgorithm, `tao_ssfls`, is a reasonable alternative.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Active-Set Methods
7f296bb3SBarry Smith
TAO also contains two active-set semismooth methods for solving
7f296bb3SBarry Smithcomplementarity problems. These methods solve a reduced system
7f296bb3SBarry Smithconstructed by block elimination of active constraints. The
7f296bb3SBarry Smithsubdifferential in these cases enables this block elimination.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe first active-set semismooth algorithm available in TAO is not guaranteed to
7f296bb3SBarry Smithremain feasible with respect to the bounds, $[\ell, u]$, and is
7f296bb3SBarry Smithtermed an infeasible active-set semismooth method. This method can be
7f296bb3SBarry Smithspecified by using the `tao_asils` solver.
7f296bb3SBarry Smith
7f296bb3SBarry SmithAn alternative is to remain feasible with respect to the bounds by using
7f296bb3SBarry Smitha projected Armijo line search. This method can be specified by using
7f296bb3SBarry Smiththe `tao_asfls` solver.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_quadratic)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Quadratic Solvers
7f296bb3SBarry Smith
7f296bb3SBarry SmithQuadratic solvers solve optimization problems of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith\begin{array}{ll}
7f296bb3SBarry Smith\displaystyle \min_{x} & \frac{1}{2}x^T Q x + c^T x \\
\text{subject to} & l \leq x \leq u
7f296bb3SBarry Smith\end{array}
7f296bb3SBarry Smith$$
7f296bb3SBarry Smith
where the Hessian of the objective, $Q$, is constant and the gradient, $Qx + c$, is affine in $x$.
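
For a symmetric $Q$, the gradient of this objective is $Qx + c$ and the Hessian is $Q$. The following C sketch evaluates the objective and gradient for a small dense problem; it assumes $Q$ is stored row-major in a plain array, and the function name is hypothetical. TAO itself works with PETSc `Mat` and `Vec` objects.

```c
#include <stddef.h>

/* Returns f(x) = 0.5 * x^T Q x + c^T x and stores grad = Q x + c.
 * Q is a dense, symmetric n-by-n matrix in row-major order. */
double quadratic_objective_and_gradient(const double *Q, const double *c, const double *x, double *grad, size_t n)
{
  double f = 0.0;

  for (size_t i = 0; i < n; i++) {
    double Qx_i = 0.0;

    for (size_t j = 0; j < n; j++) Qx_i += Q[i * n + j] * x[j];
    grad[i] = Qx_i + c[i];
    f += x[i] * (0.5 * Qx_i + c[i]);
  }
  return f;
}
```

With $Q = \mathrm{diag}(2, 4)$, $c = (-2, -4)$, and $x = (1, 1)$, the gradient is zero and the objective value is $-3$, so $x$ is the unconstrained minimizer.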
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Gradient Projection Conjugate Gradient Method (GPCG)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe GPCG {cite}`more-toraldo` algorithm is much like the
7f296bb3SBarry SmithTRON algorithm, discussed in Section {any}`sec_tao_tron`, except that
7f296bb3SBarry Smithit assumes that the objective function is quadratic and convex.
7f296bb3SBarry SmithTherefore, it evaluates the function, gradient, and Hessian only once.
7f296bb3SBarry SmithSince the objective function is quadratic, the algorithm does not use a
7f296bb3SBarry Smithtrust region. All the options that apply to TRON except for trust-region
7f296bb3SBarry Smithoptions also apply to GPCG. It can be set by using the TAO solver
`tao_gpcg` or via the option flag `-tao_type gpcg`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_bqpip)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Interior-Point Newton’s Method (BQPIP)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe BQPIP algorithm is an interior-point method for bound constrained
quadratic optimization. It can be set by using the TAO solver
`tao_bqpip` or via the option flag `-tao_type bqpip`. Since it
7f296bb3SBarry Smithassumes the objective function is quadratic, it evaluates the function,
7f296bb3SBarry Smithgradient, and Hessian only once. This method also requires the solution
7f296bb3SBarry Smithof systems of linear equations, whose solver can be accessed and
7f296bb3SBarry Smithmodified with the command `TaoGetKSP()`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Legacy and Contributed Solvers
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bundle Method for Regularized Risk Minimization (BMRM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithBMRM is a numerical approach to optimizing an
7f296bb3SBarry Smithunconstrained objective in the form of
$f(x) + \frac{1}{2} \lambda \| x \|^2$. Here $f$ is a convex
function that is finite on the whole space, $\lambda$ is a
positive weight parameter, and $\| x \|$ is the Euclidean norm of
7f296bb3SBarry Smith$x$. The algorithm only requires a routine which, given an
7f296bb3SBarry Smith$x$, returns the value of $f(x)$ and the gradient of
7f296bb3SBarry Smith$f$ at $x$.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Orthant-Wise Limited-memory Quasi-Newton (OWLQN)
7f296bb3SBarry Smith
7f296bb3SBarry SmithOWLQN {cite}`owlqn` is a numerical approach to optimizing
7f296bb3SBarry Smithan unconstrained objective in the form of
$f(x) + \lambda \|x\|_1$. Here $f$ is a convex and differentiable
7f296bb3SBarry Smithfunction, $\lambda$ is a positive weight parameter, and
7f296bb3SBarry Smith$\| x \|_1$ is the $\ell_1$ norm of $x$:
7f296bb3SBarry Smith$\sum_i |x_i|$. The algorithm only requires evaluating the value
7f296bb3SBarry Smithof $f$ and its gradient.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_tron)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Trust-Region Newton Method (TRON)
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe TRON {cite}`lin_c3` algorithm is an active-set method
7f296bb3SBarry Smiththat uses a combination of gradient projections and a preconditioned
7f296bb3SBarry Smithconjugate gradient method to minimize an objective function. Each
7f296bb3SBarry Smithiteration of the TRON algorithm requires function, gradient, and Hessian
evaluations. In each iteration, the algorithm first applies several
gradient projection iterations. After these iterations, the TRON solver
momentarily ignores the variables that equal one of their bounds and
7f296bb3SBarry Smithapplies a preconditioned conjugate gradient method to a quadratic model
7f296bb3SBarry Smithof the remaining set of *free* variables.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe TRON algorithm solves a reduced linear system defined by the rows
7f296bb3SBarry Smithand columns corresponding to the variables that lie between the upper
7f296bb3SBarry Smithand lower bounds. The TRON algorithm applies a trust region to the
7f296bb3SBarry Smithconjugate gradients to ensure convergence. The initial trust-region
7f296bb3SBarry Smithradius can be set by using the command
7f296bb3SBarry Smith`TaoSetInitialTrustRegionRadius()`, and the current trust region size
7f296bb3SBarry Smithcan be found by using the command `TaoGetCurrentTrustRegionRadius()`.
7f296bb3SBarry SmithThe initial trust region can significantly alter the rate of convergence
7f296bb3SBarry Smithfor the algorithm and should be tuned and adjusted for optimal
7f296bb3SBarry Smithperformance.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version in favor of the
7f296bb3SBarry SmithBounded Newton Trust Region (BNTR) algorithm.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_blmvm)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Bound-constrained Limited-Memory Variable-Metric Method (BLMVM)
7f296bb3SBarry Smith
7f296bb3SBarry SmithBLMVM is a limited-memory, variable-metric method and is the
7f296bb3SBarry Smithbound-constrained variant of the LMVM method for unconstrained
7f296bb3SBarry Smithoptimization. It uses projected gradients to approximate the Hessian,
7f296bb3SBarry Smitheliminating the need for Hessian evaluations. The method can be set by
7f296bb3SBarry Smithusing the TAO solver `tao_blmvm`. For more details, please see the
7f296bb3SBarry SmithLMVM section in the unconstrained algorithms as well as the LMVM matrix
7f296bb3SBarry Smithdocumentation in the PETSc manual.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis algorithm will be deprecated in the next version in favor of the
7f296bb3SBarry SmithBounded Quasi-Newton Line Search (BQNLS) algorithm.
7f296bb3SBarry Smith
7f296bb3SBarry Smith## Advanced Options
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis section discusses options and routines that apply to most TAO
7f296bb3SBarry Smithsolvers and problem classes. In particular, we focus on linear solvers,
7f296bb3SBarry Smithconvergence tests, and line searches.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_linearsolvers)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Linear Solvers
7f296bb3SBarry Smith
7f296bb3SBarry SmithOne of the most computationally intensive phases of many optimization
7f296bb3SBarry Smithalgorithms involves the solution of linear systems of equations. The
7f296bb3SBarry Smithperformance of the linear solver may be critical to an efficient
7f296bb3SBarry Smithcomputation of the solution. Since linear equation solvers often have a
7f296bb3SBarry Smithwide variety of options associated with them, TAO allows the user to
7f296bb3SBarry Smithaccess the linear solver with the
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoGetKSP(Tao, KSP *);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smithcommand. With access to the KSP object, users can customize it for their
7f296bb3SBarry Smithapplication to achieve improved performance. Additional details on the
7f296bb3SBarry SmithKSP options in PETSc can be found in the {doc}`/manual/index`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Monitors
7f296bb3SBarry Smith
7f296bb3SBarry SmithBy default the TAO solvers run silently without displaying information
7f296bb3SBarry Smithabout the iterations. The user can initiate monitoring with the command
7f296bb3SBarry Smith
```
TaoMonitorSet(Tao, PetscErrorCode (*mon)(Tao,void*), void *ctx, PetscErrorCode (*destroy)(void**));
```

The routine `mon` indicates a user-defined monitoring routine, `ctx`
denotes an optional user-defined context for private data for the
monitor routine, and `destroy` is an optional routine that frees this
context (it may be `NULL`).
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe routine set by `TaoMonitorSet()` is called once during each
7f296bb3SBarry Smithiteration of the optimization solver. Hence, the user can employ this
7f296bb3SBarry Smithroutine for any application-specific computations that should be done
7f296bb3SBarry Smithafter the solution update.
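
The callback-plus-context pattern described above can be illustrated without PETSc. The following is a minimal stand-alone sketch in plain C, not TAO code: `ToySolver`, `toy_monitor_set()`, `toy_solve()`, `History`, and `record_history()` are hypothetical names introduced only for this illustration. It shows a monitor that receives its private data through a `void*` and is invoked once per iteration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a solver object; this is not a TAO type. */
typedef struct ToySolver ToySolver;
typedef int (*ToyMonitorFn)(const ToySolver *solver, void *ctx);

struct ToySolver {
  int          iter;    /* current iteration number */
  double       f;       /* current objective value */
  ToyMonitorFn monitor; /* user monitor, may be NULL */
  void        *mctx;    /* user context handed back to the monitor */
};

/* Store the monitor together with its private data. */
static void toy_monitor_set(ToySolver *solver, ToyMonitorFn mon, void *ctx)
{
  solver->monitor = mon;
  solver->mctx    = ctx;
}

/* Toy iteration: call the monitor once per iteration, then halve f. */
static int toy_solve(ToySolver *solver, int max_it)
{
  for (solver->iter = 0; solver->iter < max_it; solver->iter++) {
    if (solver->monitor) {
      int err = solver->monitor(solver, solver->mctx);
      if (err) return err;
    }
    solver->f *= 0.5;
  }
  return 0;
}

/* User-defined context: a small buffer for the objective history. */
typedef struct {
  double values[16];
  size_t count;
} History;

/* User-defined monitor: records f in the context instead of printing it. */
static int record_history(const ToySolver *solver, void *ctx)
{
  History *h = (History *)ctx;
  if (h->count < sizeof(h->values) / sizeof(h->values[0])) h->values[h->count++] = solver->f;
  return 0;
}
```

Because the context is opaque to the solver, the same solve loop can drive monitors that print, log to a file, or accumulate statistics, depending only on what the user passes in.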
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_convergence)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Convergence Tests
7f296bb3SBarry Smith
Convergence of a solver can be defined in many ways. The methods TAO
uses by default are mentioned in {any}`sec_tao_customize`.
These methods include absolute and relative convergence tolerances as
well as a maximum number of iterations or function evaluations. If these
choices are not sufficient, users can set their own customized
convergence tests of the form
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithPetscErrorCode  conv(Tao, void*);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
The second argument is a pointer to a structure defined by the user.
Within this routine, the solver can be queried for the solution vector,
gradient vector, or other statistics at the current iteration through
routines such as `TaoGetSolutionStatus()` and `TaoGetTolerances()`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithTo use this convergence test within a TAO solver, one uses the command
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoSetConvergenceTest(Tao, PetscErrorCode (*conv)(Tao,void*), void*);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
The second argument of this command is the convergence routine, and the
final argument is an optional user-defined context for private data. The
convergence routine receives the TAO solver and this private data
structure. Within the convergence routine, the termination flag can be
set by using the routine
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithTaoSetConvergedReason(Tao, TaoConvergedReason);
7f296bb3SBarry Smith```
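
The logic of a customized convergence test can be sketched in plain C without PETSc. In the stand-alone model below, `ToyStatus`, `ToyReason`, `MyTestCtx`, `toy_set_converged_reason()`, and `my_convergence_test()` are hypothetical names chosen for illustration; the enumerators only echo the naming style of `TaoConvergedReason` and are not its actual values. The test stops on a small gradient norm or a user-supplied objective target, and otherwise reports an iteration limit or asks the solver to continue.

```c
/* Hypothetical termination reasons; not the TaoConvergedReason values. */
typedef enum {
  TOY_DIVERGED_MAXITS    = -1,
  TOY_CONTINUE_ITERATING = 0,
  TOY_CONVERGED_USER     = 1
} ToyReason;

/* Hypothetical snapshot of the solver state at the current iteration. */
typedef struct {
  int       iter;   /* iteration number */
  double    f;      /* objective value */
  double    gnorm;  /* gradient norm */
  ToyReason reason; /* termination flag */
} ToyStatus;

/* User-defined private data for the convergence test. */
typedef struct {
  double f_target; /* stop once f drops to this value */
  double gtol;     /* stop once the gradient norm drops to this value */
  int    max_it;   /* give up after this many iterations */
} MyTestCtx;

/* Model of setting the termination flag on the solver. */
static void toy_set_converged_reason(ToyStatus *st, ToyReason reason)
{
  st->reason = reason;
}

/* Customized test: receives the solver state and the user context. */
static int my_convergence_test(ToyStatus *st, void *ctx)
{
  const MyTestCtx *c = (const MyTestCtx *)ctx;

  if (st->gnorm <= c->gtol || st->f <= c->f_target) {
    toy_set_converged_reason(st, TOY_CONVERGED_USER);
  } else if (st->iter >= c->max_it) {
    toy_set_converged_reason(st, TOY_DIVERGED_MAXITS);
  } else {
    toy_set_converged_reason(st, TOY_CONTINUE_ITERATING);
  }
  return 0;
}
```

Checking the success criteria before the iteration limit means that a point satisfying the tolerance on the last permitted iteration is still reported as converged.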
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_linesearch)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Line Searches
7f296bb3SBarry Smith
The line search used by a solver can be selected with the command line
option `-tao_ls_type`. Available line searches include Moré-Thuente
{cite}`more:92`, Armijo, gpcg, and unit.

The line search routines involve several parameters, which are set to
defaults that are reasonable for many applications. The user can
override the defaults by using the following options:
7f296bb3SBarry Smith
7f296bb3SBarry Smith- `-tao_ls_max_funcs <max>`
7f296bb3SBarry Smith- `-tao_ls_stepmin <min>`
7f296bb3SBarry Smith- `-tao_ls_stepmax <max>`
7f296bb3SBarry Smith- `-tao_ls_ftol <ftol>`
7f296bb3SBarry Smith- `-tao_ls_gtol <gtol>`
7f296bb3SBarry Smith- `-tao_ls_rtol <rtol>`
7f296bb3SBarry Smith
7f296bb3SBarry SmithOne should run a TAO program with the option `-help` for details.
7f296bb3SBarry SmithUsers may write their own customized line search codes by modeling them
7f296bb3SBarry Smithafter one of the defaults provided.
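
As a rough illustration of what an Armijo-type line search does with parameters of this kind, here is a minimal backtracking sketch in plain C that does not use PETSc. `LineSearchParams`, `armijo_backtrack()`, and the sample objective `quad()` are hypothetical names for this illustration. The structure fields are named after the options above for readability only; the sketch makes no claim about how TAO's own line searches interpret each option.

```c
#include <stddef.h>

typedef double (*ObjectiveFn)(const double *x, size_t n);

/* Hypothetical parameter set, named after the options listed above. */
typedef struct {
  double ftol;      /* sufficient-decrease constant */
  double stepmin;   /* smallest step that is still tried */
  double stepmax;   /* largest step that is ever tried */
  int    max_funcs; /* cap on objective evaluations */
} LineSearchParams;

/* Sample objective for the usage below: f(x) = x0^2 + 2*x1^2. */
static double quad(const double *x, size_t n)
{
  (void)n;
  return x[0] * x[0] + 2.0 * x[1] * x[1];
}

/*
 * Backtracking line search along direction s from point x.
 * f0 is f(x) and gdx is the directional derivative g.s (negative for descent).
 * On success the accepted step is returned and xnew holds x + step*s.
 * On failure 0.0 is returned. nfev counts the objective evaluations.
 */
static double armijo_backtrack(ObjectiveFn f, const double *x, const double *s, size_t n,
                               double f0, double gdx, double step0,
                               const LineSearchParams *p, double *xnew, int *nfev)
{
  double step = step0 > p->stepmax ? p->stepmax : step0;

  *nfev = 0;
  while (step >= p->stepmin && *nfev < p->max_funcs) {
    for (size_t i = 0; i < n; i++) xnew[i] = x[i] + step * s[i];
    double fnew = f(xnew, n);
    (*nfev)++;
    /* Armijo condition: require a decrease proportional to the step */
    if (fnew <= f0 + p->ftol * step * gdx) return step;
    step *= 0.5; /* otherwise shrink the step and try again */
  }
  return 0.0;
}
```

For example, starting from `x = (2, 1)` with the steepest-descent direction `s = (-4, -4)`, the unit step overshoots, the halved step is accepted, and the search returns `0.5` after two objective evaluations.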
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_recyclehistory)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Recycling History
7f296bb3SBarry Smith
7f296bb3SBarry SmithSome TAO algorithms can re-use information accumulated in the previous
7f296bb3SBarry Smith`TaoSolve()` call to hot-start the new solution. This can be enabled
7f296bb3SBarry Smithusing the `-tao_recycle_history` flag, or in code via the
7f296bb3SBarry Smith`TaoSetRecycleHistory()` interface.
7f296bb3SBarry Smith
7f296bb3SBarry SmithFor the nonlinear conjugate gradient solver (`TAOBNCG`), this option
7f296bb3SBarry Smithre-uses the latest search direction from the previous `TaoSolve()`
7f296bb3SBarry Smithcall to compute the initial search direction of a new `TaoSolve()`. By
7f296bb3SBarry Smithdefault, the feature is disabled and the algorithm sets the initial
7f296bb3SBarry Smithdirection as the negative gradient.
7f296bb3SBarry Smith
7f296bb3SBarry SmithFor the quasi-Newton family of methods (`TAOBQNLS`, `TAOBQNKLS`,
7f296bb3SBarry Smith`TAOBQNKTR`, `TAOBQNKTL`), this option re-uses the accumulated
7f296bb3SBarry Smithquasi-Newton Hessian approximation from the previous `TaoSolve()`
7f296bb3SBarry Smithcall. By default, the feature is disabled and the algorithm will reset
7f296bb3SBarry Smiththe quasi-Newton approximation to the identity matrix at the beginning
7f296bb3SBarry Smithof every new `TaoSolve()`.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe option flag has no effect on other TAO solvers.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(sec_tao_addsolver)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith## Adding a Solver
7f296bb3SBarry Smith
One of the strengths of both TAO and PETSc is the ability to allow users
to extend the built-in solvers with new user-defined algorithms. It is
certainly possible to develop new optimization algorithms outside of the
TAO framework, but using TAO to implement a solver has many advantages:
7f296bb3SBarry Smith
7f296bb3SBarry Smith1. TAO includes other optimization solvers with an identical interface,
7f296bb3SBarry Smith   so application problems may conveniently switch solvers to compare
7f296bb3SBarry Smith   their effectiveness.
2. TAO provides support for function evaluations and derivative
   information. It allows for the direct evaluation of this information
   by the application developer, contains limited support for finite
   difference approximations, and allows the use of matrix-free
   methods. The solvers can obtain this function and derivative
   information through a simple interface while the details of its
   computation are handled within the toolkit.
7f296bb3SBarry Smith3. TAO provides line searches, convergence tests, monitoring routines,
7f296bb3SBarry Smith   and other tools that are helpful in an optimization algorithm. The
7f296bb3SBarry Smith   availability of these tools means that the developers of the
7f296bb3SBarry Smith   optimization solver do not have to write these utilities.
7f296bb3SBarry Smith4. PETSc offers vectors, matrices, index sets, and linear solvers that
7f296bb3SBarry Smith   can be used by the solver. These objects are standard mathematical
7f296bb3SBarry Smith   constructions that have many different implementations. The objects
7f296bb3SBarry Smith   may be distributed over multiple processors, restricted to a single
7f296bb3SBarry Smith   processor, have a dense representation, use a sparse data structure,
7f296bb3SBarry Smith   or vary in many other ways. TAO solvers do not need to know how these
7f296bb3SBarry Smith   objects are represented or how the operations defined on them have
7f296bb3SBarry Smith   been implemented. Instead, the solvers apply these operations through
7f296bb3SBarry Smith   an abstract interface that leaves the details to PETSc and external
7f296bb3SBarry Smith   libraries. This abstraction allows solvers to work seamlessly with a
7f296bb3SBarry Smith   variety of data structures while allowing application developers to
7f296bb3SBarry Smith   select data structures tailored for their purposes.
7f296bb3SBarry Smith5. PETSc provides the user a convenient method for setting options at
7f296bb3SBarry Smith   runtime, performance profiling, and debugging.
7f296bb3SBarry Smith
7f296bb3SBarry Smith(header_file_1)=
7f296bb3SBarry Smith
7f296bb3SBarry Smith### Header File
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO solver implementation files must include the TAO implementation file
7f296bb3SBarry Smith`taoimpl.h`:
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry Smith#include "petsc/private/taoimpl.h"
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry SmithThis file contains data elements that are generally kept hidden from
7f296bb3SBarry Smithapplication programmers, but may be necessary for solver implementations
7f296bb3SBarry Smithto access.
7f296bb3SBarry Smith
7f296bb3SBarry Smith### TAO Interface with Solvers
7f296bb3SBarry Smith
TAO solvers must be written in C or C++ and include several routines
with a particular calling sequence. Two of these routines are mandatory:
one that initializes the TAO structure with the appropriate information
and one that applies the algorithm to a problem instance. Additional
routines may be written to set options within the solver, view the
solver, set up appropriate data structures, and destroy these data
structures. In order to implement the conjugate gradient algorithm, for
example, the following structure is useful.
7f296bb3SBarry Smith
```
typedef struct {
  PetscReal beta;        /* conjugate gradient update parameter */
  PetscReal eta;         /* restart tolerance */
  PetscReal delta_min;   /* minimum delta value */
  PetscReal delta_max;   /* maximum delta value */
  PetscInt  cg_type;     /* conjugate gradient variant */
  PetscInt  ngradsteps;  /* number of gradient steps taken */
  PetscInt  nresetsteps; /* number of reset steps taken */
  Vec       X_old;       /* work vector */
  Vec       G_old;       /* work vector */
} TAO_CG;
```

This structure contains the algorithm parameters, a selector for the
conjugate gradient variant, two counters, and two work vectors. Vectors
for the solution and gradient are not needed here because the TAO
structure has pointers to them.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Solver Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithAll TAO solvers have a routine that accepts a TAO structure and computes
7f296bb3SBarry Smitha solution. TAO will call this routine when the application program uses
7f296bb3SBarry Smiththe routine `TaoSolve()` and will pass to the solver information about
7f296bb3SBarry Smiththe objective function and constraints, pointers to the variable vector
7f296bb3SBarry Smithand gradient vector, and support for line searches, linear solvers, and
7f296bb3SBarry Smithconvergence monitoring. As an example, consider the following code that
7f296bb3SBarry Smithsolves an unconstrained minimization problem using the conjugate
7f296bb3SBarry Smithgradient method.
7f296bb3SBarry Smith
```
PetscErrorCode TaoSolve_CG(Tao tao)
{
  TAO_CG  *cg = (TAO_CG *) tao->data;
  Vec x = tao->solution;
  Vec g = tao->gradient;
  Vec s = tao->stepdirection;
  PetscInt     iter=0;
  PetscReal  gnormPrev,gdx,f,gnorm,steplength=0;
  TaoLineSearchConvergedReason lsflag=TAO_LINESEARCH_CONTINUE_ITERATING;
  TaoConvergedReason reason=TAO_CONTINUE_ITERATING;

  PetscFunctionBegin;

  /* Evaluate the objective and gradient at the initial point */
  PetscCall(TaoComputeObjectiveAndGradient(tao,x,&f,g));
  PetscCall(VecNorm(g,NORM_2,&gnorm));

  PetscCall(VecSet(s,0));

  cg->beta=0;
  gnormPrev = gnorm;

  /* Enter loop */
  while (1){

    /* Test for convergence */
    PetscCall(TaoMonitor(tao,iter,f,gnorm,0.0,steplength,&reason));
    if (reason!=TAO_CONTINUE_ITERATING) break;

    /* Update the search direction: s = beta*s - g */
    cg->beta=(gnorm*gnorm)/(gnormPrev*gnormPrev);
    PetscCall(VecScale(s,cg->beta));
    PetscCall(VecAXPY(s,-1.0,g));

    PetscCall(VecDot(s,g,&gdx));
    if (gdx>=0){     /* If not a descent direction, use gradient */
      PetscCall(VecCopy(g,s));
      PetscCall(VecScale(s,-1.0));
      gdx=-gnorm*gnorm;
    }

    /* Line Search */
    gnormPrev = gnorm;  steplength=1.0;
    PetscCall(TaoLineSearchSetInitialStepLength(tao->linesearch,1.0));
    PetscCall(TaoLineSearchApply(tao->linesearch,x,&f,g,s,&steplength,&lsflag));
    PetscCall(TaoAddLineSearchCounts(tao));
    PetscCall(VecNorm(g,NORM_2,&gnorm));
    iter++;
  }

  PetscFunctionReturn(PETSC_SUCCESS);
}
```
7f296bb3SBarry Smith
The first line of this routine casts the solver-specific data stored in
`tao->data` to a pointer to a `TAO_CG` data structure. This structure
holds the parameters, counters, and work vectors needed by the
algorithm.

After declaring and initializing several variables, the solver lets TAO
evaluate the function and gradient at the current point using the
routine `TaoComputeObjectiveAndGradient()`. Other routines may be used
to evaluate the Hessian matrix or evaluate constraints. TAO may obtain
this information using direct evaluation or other means, but these
details do not affect our implementation of the algorithm.
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe norm of the gradient is a standard measure used by unconstrained
7f296bb3SBarry Smithminimization solvers to define convergence. This quantity is always
7f296bb3SBarry Smithnonnegative and equals zero at the solution. The solver will pass this
7f296bb3SBarry Smithquantity, the current function value, the current iteration number, and
7f296bb3SBarry Smitha measure of infeasibility to TAO with the routine
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithPetscErrorCode TaoMonitor(Tao tao, PetscInt iter, PetscReal f,
7f296bb3SBarry Smith               PetscReal res, PetscReal cnorm, PetscReal steplength,
7f296bb3SBarry Smith               TaoConvergedReason *reason);
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry SmithMost optimization algorithms are iterative, and solvers should include
7f296bb3SBarry Smiththis command somewhere in each iteration. This routine records this
7f296bb3SBarry Smithinformation, and applies any monitoring routines and convergence tests
7f296bb3SBarry Smithset by default or the user. In this routine, the second argument is the
7f296bb3SBarry Smithcurrent iteration number, and the third argument is the current function
7f296bb3SBarry Smithvalue. The fourth argument is a nonnegative error measure associated
7f296bb3SBarry Smithwith the distance between the current solution and the optimal solution.
7f296bb3SBarry SmithExamples of this measure are the norm of the gradient or the square root
7f296bb3SBarry Smithof a duality gap. The fifth argument is a nonnegative error that usually
7f296bb3SBarry Smithrepresents a measure of the infeasibility such as the norm of the
7f296bb3SBarry Smithconstraints or violation of bounds. This number should be zero for
7f296bb3SBarry Smithunconstrained solvers. The sixth argument is a nonnegative steplength,
7f296bb3SBarry Smithor the multiple of the step direction added to the previous iterate. The
7f296bb3SBarry Smithresults of the convergence test are returned in the last argument. If
7f296bb3SBarry Smiththe termination reason is `TAO_CONTINUE_ITERATING`, the algorithm
7f296bb3SBarry Smithshould continue.
7f296bb3SBarry Smith
7f296bb3SBarry SmithAfter this monitoring routine, the solver computes a step direction
7f296bb3SBarry Smithusing the conjugate gradient algorithm and computations using Vec
7f296bb3SBarry Smithobjects. These methods include adding vectors together and computing an
7f296bb3SBarry Smithinner product. A full list of these methods can be found in the manual
7f296bb3SBarry Smithpages.
7f296bb3SBarry Smith
7f296bb3SBarry SmithNonlinear conjugate gradient algorithms also require a line search. TAO
7f296bb3SBarry Smithprovides several line searches and support for using them. The routine
7f296bb3SBarry Smith
```
TaoLineSearchApply(TaoLineSearch ls, Vec x, PetscReal *f, Vec g,
                       Vec s, PetscReal *steplength,
                       TaoLineSearchConvergedReason *lsflag)
```
7f296bb3SBarry Smith
7f296bb3SBarry Smithpasses the current solution, gradient, and objective value to the line
7f296bb3SBarry Smithsearch and returns a new solution, gradient, and objective value. More
7f296bb3SBarry Smithdetails on line searches can be found in
7f296bb3SBarry Smith{any}`sec_tao_linesearch`. The details of the
7f296bb3SBarry Smithline search applied are specified elsewhere, when the line search is
7f296bb3SBarry Smithcreated.
7f296bb3SBarry Smith
7f296bb3SBarry SmithTAO also includes support for linear solvers using PETSc KSP objects.
7f296bb3SBarry SmithAlthough this algorithm does not require one, linear solvers are an
7f296bb3SBarry Smithimportant part of many algorithms. Details on the use of these solvers
7f296bb3SBarry Smithcan be found in the PETSc users manual.
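
The control flow of the solver routine above can be tried out without PETSc. The following stand-alone C sketch mirrors its structure (convergence check, Fletcher-Reeves update, descent safeguard, line search) under several assumptions that are not part of TAO: the objective is a fixed two-variable quadratic, a simple backtracking search stands in for the TAO line search, squared gradient norms are used to avoid a square root, and `CGResult`, `cg_minimize()`, and `objective_and_gradient()` are hypothetical names.

```c
/* Example objective f(x) = x0^2 + 2*x1^2; returns f and fills the gradient g. */
static double objective_and_gradient(const double x[2], double g[2])
{
  g[0] = 2.0 * x[0];
  g[1] = 4.0 * x[1];
  return x[0] * x[0] + 2.0 * x[1] * x[1];
}

typedef struct {
  double f;         /* final objective value */
  double gnorm2;    /* final squared gradient norm */
  int    iters;     /* number of accepted steps */
  int    converged; /* 1 if the gradient tolerance was met */
} CGResult;

static CGResult cg_minimize(double x[2], double gtol, int max_it)
{
  double   g[2], gt[2], xt[2], s[2] = {0.0, 0.0};
  double   f          = objective_and_gradient(x, g);
  double   gnorm2     = g[0] * g[0] + g[1] * g[1];
  double   gnorm2prev = gnorm2;
  CGResult r          = {f, gnorm2, 0, 0};

  for (;;) {
    /* Convergence check, playing the role of the TaoMonitor() call */
    r.f      = f;
    r.gnorm2 = gnorm2;
    if (gnorm2 <= gtol * gtol) { r.converged = 1; break; }
    if (r.iters >= max_it) break;

    /* Fletcher-Reeves direction: s = beta*s - g */
    double beta = gnorm2 / gnorm2prev;
    s[0] = beta * s[0] - g[0];
    s[1] = beta * s[1] - g[1];
    double gdx = s[0] * g[0] + s[1] * g[1];
    if (gdx >= 0.0) { /* not a descent direction: use the negative gradient */
      s[0] = -g[0];
      s[1] = -g[1];
      gdx  = -gnorm2;
    }

    /* Backtracking line search with an Armijo sufficient-decrease test */
    double step = 1.0, ft = f;
    int    accepted = 0;
    for (int k = 0; k < 60 && !accepted; k++) {
      xt[0] = x[0] + step * s[0];
      xt[1] = x[1] + step * s[1];
      ft    = objective_and_gradient(xt, gt);
      if (ft <= f + 1e-4 * step * gdx) accepted = 1;
      else step *= 0.5;
    }
    if (!accepted) break; /* line search failure */

    /* Accept the step and update the iterate, gradient, and norms */
    gnorm2prev = gnorm2;
    x[0] = xt[0]; x[1] = xt[1];
    g[0] = gt[0]; g[1] = gt[1];
    f      = ft;
    gnorm2 = g[0] * g[0] + g[1] * g[1];
    r.iters++;
  }
  return r;
}
```

Calling `cg_minimize()` from the starting point `(2, 1)` takes a first step of length `0.5` to `(0, -1)`, which lowers the objective from `6` to `2`; every later accepted step can only lower it further because of the sufficient-decrease test.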
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Creation Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe TAO solver is initialized for a particular algorithm in a separate
7f296bb3SBarry Smithroutine. This routine sets default convergence tolerances, creates a
7f296bb3SBarry Smithline search or linear solver if needed, and creates structures needed by
7f296bb3SBarry Smiththis solver. For example, the routine that creates the nonlinear
7f296bb3SBarry Smithconjugate gradient algorithm shown above can be implemented as follows.
7f296bb3SBarry Smith
```
PETSC_EXTERN PetscErrorCode TaoCreate_CG(Tao tao)
{
  TAO_CG *cg;
  const char *morethuente_type = TAOLINESEARCH_MT;

  PetscFunctionBegin;

  /* Allocate the solver-specific data and attach it to the Tao object */
  PetscCall(PetscNew(&cg));
  tao->data = (void*)cg;
  cg->eta = 0.1;
  cg->delta_min = 1e-7;
  cg->delta_max = 100;
  cg->cg_type = CG_PolakRibierePlus;

  tao->max_it = 2000;
  tao->max_funcs = 4000;

  tao->ops->setup = TaoSetUp_CG;
  tao->ops->solve = TaoSolve_CG;
  tao->ops->view = TaoView_CG;
  tao->ops->setfromoptions = TaoSetFromOptions_CG;
  tao->ops->destroy = TaoDestroy_CG;

  PetscCall(TaoLineSearchCreate(((PetscObject)tao)->comm, &tao->linesearch));
  PetscCall(TaoLineSearchSetType(tao->linesearch, morethuente_type));
  PetscCall(TaoLineSearchUseTaoRoutines(tao->linesearch, tao));

  PetscFunctionReturn(PETSC_SUCCESS);
}
```

This routine declares some variables and then allocates memory for the
`TAO_CG` data structure. Notice that the `Tao` object now has a
pointer to this data structure (`tao->data`) so it can be accessed by
the other functions written for this solver implementation.

This routine also sets some default parameters particular to the
conjugate gradient algorithm, sets default convergence tolerances, and
creates a particular line search. These defaults could be specified in
the routine that solves the problem, but specifying them here gives the
user the opportunity to modify these parameters either by using direct
calls setting parameters or by using options.

Finally, this solver passes to TAO the names of all the other routines
used by the solver.

Note that this routine is declared with the `PETSC_EXTERN` macro. This
macro is required to preserve the name of this function without any
name-mangling from the C++ compiler (if used).
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Destroy Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithAnother routine needed by most solvers destroys the data structures
7f296bb3SBarry Smithcreated by earlier routines. For the nonlinear conjugate gradient method
7f296bb3SBarry Smithdiscussed earlier, the following routine destroys the two work vectors
7f296bb3SBarry Smithand the `TAO_CG` structure.
7f296bb3SBarry Smith
```
PetscErrorCode TaoDestroy_CG(Tao tao)
{
  TAO_CG *cg = (TAO_CG *) tao->data;

  PetscFunctionBegin;

  PetscCall(VecDestroy(&cg->X_old));
  PetscCall(VecDestroy(&cg->G_old));

  PetscCall(PetscFree(tao->data));
  tao->data = NULL;

  PetscFunctionReturn(PETSC_SUCCESS);
}
```
7f296bb3SBarry Smith
This routine is called from within the `TaoDestroy()` routine. Only
algorithm-specific data objects are destroyed in this routine; any
objects held by the `Tao` object itself (`tao->linesearch`, `tao->ksp`,
`tao->gradient`, etc.) will be destroyed by TAO immediately after the
algorithm-specific destroy routine completes.
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### SetUp Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithIf the SetUp routine has been set by the initialization routine, TAO
7f296bb3SBarry Smithwill call it during the execution of `TaoSolve()`. While this routine
7f296bb3SBarry Smithis optional, it is often provided to allocate the gradient vector, work
7f296bb3SBarry Smithvectors, and other data structures required by the solver. It should
7f296bb3SBarry Smithhave the following form.
7f296bb3SBarry Smith
7f296bb3SBarry Smith```
7f296bb3SBarry SmithPetscErrorCode TaoSetUp_CG(Tao tao)
7f296bb3SBarry Smith{
7f296bb3SBarry Smith  TAO_CG *cg = (TAO_CG*)tao->data;
7f296bb3SBarry Smith  PetscFunctionBegin;
7f296bb3SBarry Smith
7f296bb3SBarry Smith  PetscCall(VecDuplicate(tao->solution,&tao->gradient));
7f296bb3SBarry Smith  PetscCall(VecDuplicate(tao->solution,&tao->stepdirection));
7f296bb3SBarry Smith  PetscCall(VecDuplicate(tao->solution,&cg->X_old));
7f296bb3SBarry Smith  PetscCall(VecDuplicate(tao->solution,&cg->G_old));
7f296bb3SBarry Smith
7f296bb3SBarry Smith  PetscFunctionReturn(PETSC_SUCCESS);
7f296bb3SBarry Smith}
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### SetFromOptions Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe SetFromOptions routine should be used to check for any
7f296bb3SBarry Smithalgorithm-specific options set by the user and will be called when the
7f296bb3SBarry Smithapplication makes a call to `TaoSetFromOptions()`. It should have the
7f296bb3SBarry Smithfollowing form.
7f296bb3SBarry Smith
```
PetscErrorCode TaoSetFromOptions_CG(Tao tao, void *solver)
{
  TAO_CG *cg = (TAO_CG*)solver;
  PetscFunctionBegin;
  PetscCall(PetscOptionsReal("-tao_cg_eta","restart tolerance","",cg->eta,&cg->eta,NULL));
  PetscCall(PetscOptionsReal("-tao_cg_delta_min","minimum delta value","",cg->delta_min,&cg->delta_min,NULL));
  PetscCall(PetscOptionsReal("-tao_cg_delta_max","maximum delta value","",cg->delta_max,&cg->delta_max,NULL));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### View Routine
7f296bb3SBarry Smith
7f296bb3SBarry SmithThe View routine should be used to output any algorithm-specific
7f296bb3SBarry Smithinformation or statistics at the end of a solve. This routine will be
7f296bb3SBarry Smithcalled when the application makes a call to `TaoView()` or when the
7f296bb3SBarry Smithcommand line option `-tao_view` is used. It should have the following
7f296bb3SBarry Smithform.
7f296bb3SBarry Smith
```
PetscErrorCode TaoView_CG(Tao tao, PetscViewer viewer)
{
  TAO_CG *cg = (TAO_CG*)tao->data;

  PetscFunctionBegin;
  PetscCall(PetscViewerASCIIPushTab(viewer));
  PetscCall(PetscViewerASCIIPrintf(viewer,"Grad. steps: %" PetscInt_FMT "\n",cg->ngradsteps));
  PetscCall(PetscViewerASCIIPrintf(viewer,"Reset steps: %" PetscInt_FMT "\n",cg->nresetsteps));
  PetscCall(PetscViewerASCIIPopTab(viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```
7f296bb3SBarry Smith
7f296bb3SBarry Smith#### Registering the Solver
7f296bb3SBarry Smith
Once a new solver is implemented, TAO needs to know the name of the
solver and what function to use to create the solver. To this end, one
can use the routine

```
TaoRegister(const char *name,
                PetscErrorCode (*create) (Tao));
```

where `name` is the name of the solver (e.g., `blmvm`) and `create` is
a pointer to the routine that creates the solver (in our case,
`TaoCreate_CG`).

Once the solver has been registered, the new solver can be selected
either by using the `TaoSetType()` function or by using the
`-tao_type` command line option.
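
The idea behind registration, a table that maps a solver name to its creation routine, can be modeled in a few lines of plain C. The sketch below does not use PETSc and is not how TAO stores its solver list; `ToySolver`, `toy_register()`, `toy_set_type()`, `toy_create_mycg()`, and `toy_solve_mycg()` are hypothetical names for this illustration. It shows a creation routine filling in defaults and a function pointer, and a type selection that looks up the name and calls the matching creation routine.

```c
#include <stddef.h>
#include <string.h>

typedef struct ToySolver ToySolver;
typedef int (*ToyCreateFn)(ToySolver *);

/* Hypothetical solver object with one operation and one default. */
struct ToySolver {
  const char *type_name;      /* name under which the solver was created */
  int (*solve)(ToySolver *);  /* filled in by the creation routine */
  int max_it;                 /* default set by the creation routine */
};

/* Fixed-size registry of (name, creation routine) pairs. */
#define TOY_MAX_TYPES 8
static struct {
  const char *name;
  ToyCreateFn create;
} toy_registry[TOY_MAX_TYPES];
static size_t toy_registry_size = 0;

/* Record a solver name and its creation routine; returns nonzero if full. */
static int toy_register(const char *name, ToyCreateFn create)
{
  if (toy_registry_size >= TOY_MAX_TYPES) return 1;
  toy_registry[toy_registry_size].name   = name;
  toy_registry[toy_registry_size].create = create;
  toy_registry_size++;
  return 0;
}

/* Look up the name and run the matching creation routine; nonzero if unknown. */
static int toy_set_type(ToySolver *solver, const char *name)
{
  for (size_t i = 0; i < toy_registry_size; i++) {
    if (strcmp(toy_registry[i].name, name) == 0) {
      solver->type_name = toy_registry[i].name;
      return toy_registry[i].create(solver);
    }
  }
  return 1;
}

/* A user-defined solver: the solve routine is a placeholder. */
static int toy_solve_mycg(ToySolver *solver)
{
  (void)solver;
  return 0;
}

/* Creation routine: sets defaults and installs the solve routine. */
static int toy_create_mycg(ToySolver *solver)
{
  solver->max_it = 2000;
  solver->solve  = toy_solve_mycg;
  return 0;
}
```

Registering `toy_create_mycg` under the name `"mycg"` and then selecting that name leaves the solver object with the defaults and the solve routine installed by the creation routine, while an unregistered name is reported as an error and leaves the object unchanged.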
7f296bb3SBarry Smith
7f296bb3SBarry Smith```{rubric} Footnotes
7f296bb3SBarry Smith```
7f296bb3SBarry Smith
7f296bb3SBarry Smith[^mpi]: For more on MPI and PETSc, see {any}`sec_running`.
7f296bb3SBarry Smith
7f296bb3SBarry Smith```{eval-rst}
7f296bb3SBarry Smith.. bibliography:: /petsc.bib
7f296bb3SBarry Smith   :filter: docname in docnames
7f296bb3SBarry Smith```