First-order optimality is a measure of how close a point x is to optimal. Most Optimization Toolbox™ solvers use this measure, though it has different definitions for different algorithms. First-order optimality is a necessary condition, but it is not a sufficient condition. In other words:
The first-order optimality measure must be zero at a minimum.
A point with first-order optimality equal to zero is not necessarily a minimum.
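As a one-line illustration of why the condition is necessary but not sufficient, consider f(x) = x³ at x = 0 (a hypothetical example, not from the toolbox): the gradient vanishes there, so the first-order optimality measure is zero, yet the point is not a minimum. A minimal sketch:

```python
# Hypothetical example: f(x) = x**3 has zero gradient at x = 0,
# so its first-order optimality measure is 0 there, yet x = 0 is
# not a minimum (f decreases for x < 0).
def grad_f(x):
    return 3 * x**2          # derivative of x**3

first_order_opt = abs(grad_f(0.0))   # infinity norm of the gradient (scalar case)
print(first_order_opt)               # 0.0
```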
For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.
The OptimalityTolerance tolerance relates to the first-order optimality measure. Typically, solver iterations end when the first-order optimality measure is less than OptimalityTolerance.
Some solvers or algorithms use relative first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than μ times OptimalityTolerance, where μ is either:
The infinity norm (maximum) of the gradient of the objective function at x0
The infinity norm (maximum) of inputs to the solver, such as f or b in linprog or H in quadprog
A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.
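The scale invariance can be sketched as follows; the objective, points, and tolerance value here are assumptions for illustration only:

```python
# Assumed example: relative stopping test measure < mu * OptimalityTolerance,
# with mu taken as the infinity norm of the gradient at x0.
def grad_f(x, scale):
    # gradient of f(x) = scale * (x1**2 + x2**2)
    return [scale * 2 * x[0], scale * 2 * x[1]]

optimality_tolerance = 1e-6
x0 = [3.0, 4.0]
x_current = [1e-7, 0.0]

for scale in (1.0, 1e6):
    mu = max(abs(c) for c in grad_f(x0, scale))            # scales with the objective
    measure = max(abs(c) for c in grad_f(x_current, scale))
    print(measure < mu * optimality_tolerance)             # True for both scales
```

An unscaled test `measure < optimality_tolerance` would pass for scale 1.0 but fail for scale 1e6, even though the two problems have the same minimizer.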
Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.
For a smooth unconstrained problem, the first-order optimality measure is the infinity norm (meaning the maximum absolute value) of the gradient:

first-order optimality measure = \max_i |(\nabla f(x))_i| = \|\nabla f(x)\|_\infty.
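For a smooth unconstrained problem the measure is the infinity norm of the gradient; a minimal sketch, using the Rosenbrock function as an assumed example:

```python
# Assumed example: the unconstrained measure is the infinity norm of the gradient.
def rosenbrock_grad(x, y):
    # gradient of f(x, y) = 100*(y - x**2)**2 + (1 - x)**2
    dfdx = -400 * x * (y - x**2) - 2 * (1 - x)
    dfdy = 200 * (y - x**2)
    return (dfdx, dfdy)

g = rosenbrock_grad(1.0, 1.0)           # evaluated at the minimizer (1, 1)
measure = max(abs(c) for c in g)        # infinity norm
print(measure)                          # 0.0
```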
This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.
For a smooth constrained problem, let g and h be vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints):

g(x) \le 0, \quad h(x) = 0.
The KKT conditions use the auxiliary Lagrangian function:

L(x,\lambda) = f(x) + \sum_i \lambda_{g,i}\, g_i(x) + \sum_i \lambda_{h,i}\, h_i(x). (3-1)

The vector \lambda, the concatenation of \lambda_g and \lambda_h, is the Lagrange multiplier vector; its length is the total number of constraints.
The KKT conditions are:

\nabla_x L(x,\lambda) = 0, (3-2)

\lambda_{g,i}\, g_i(x) = 0 \quad \forall i, (3-3)

g(x) \le 0, \quad h(x) = 0, \quad \lambda_{g,i} \ge 0. (3-4)
The optimality measure associated with Equation 3-2 is

\|\nabla_x L(x,\lambda)\|. (3-5)

The optimality measure associated with Equation 3-3 is

\|\lambda_g\, g(x)\|, (3-6)

where the norm in Equation 3-6 means the infinity norm (maximum) of the vector whose elements are \lambda_{g,i}\, g_i(x).
The combined optimality measure is the maximum of the values
calculated in Equation 3-5 and Equation 3-6. Solvers that
accept nonlinear constraint functions report constraint violations g(x) > 0 or |h(x)| > 0 as ConstraintTolerance
violations.
See Tolerances and Stopping Criteria.
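The combined measure can be sketched on a tiny assumed problem: minimize f(x) = x² subject to g(x) = 1 − x ≤ 0, whose solution is x = 1 with multiplier λ = 2:

```python
# Assumed example: combined first-order optimality measure at a KKT point.
x, lam = 1.0, 2.0

grad_f = 2 * x                  # gradient of f(x) = x**2
grad_g = -1.0                   # gradient of g(x) = 1 - x
g = 1 - x

stationarity = abs(grad_f + lam * grad_g)   # Equation 3-5 style: norm of Lagrangian gradient
complementarity = abs(lam * g)              # Equation 3-6 style: |lambda * g(x)|
measure = max(stationarity, complementarity)
print(measure)                              # 0.0
```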
Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 3-5 and Equation 3-6:
\|\nabla_x L(x,\lambda)\| = \Big\| \nabla f(x) + A^T \lambda_{ineqlin} + Aeq^T \lambda_{eqlin} + \sum_i \lambda_{ineqnonlin,i}\, \nabla c_i(x) + \sum_i \lambda_{eqnonlin,i}\, \nabla ceq_i(x) + \sum_i \lambda_{upper,i}\, e_i - \sum_i \lambda_{lower,i}\, e_i \Big\|, (3-7)

\Big\| \big( |l_i - x_i|\,\lambda_{lower,i},\; |x_i - u_i|\,\lambda_{upper,i},\; |(Ax - b)_i|\,\lambda_{ineqlin,i},\; |c_i(x)|\,\lambda_{ineqnonlin,i} \big) \Big\|, (3-8)

where e_i is the ith unit vector and the norm of the vectors in Equation 3-7 and Equation 3-8 is the infinity norm (maximum). The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 3-7 range over all constraints. If a bound is ±Inf, that term is not constrained, so it is not part of the summation.
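As a sketch of the solver-form computation, consider an assumed bound-constrained problem, minimize f(x) = (x₁ − 2)² + (x₂ − 2)² subject to x ≤ u = [1, 1], solved at x = [1, 1] with upper-bound multipliers λ_upper = [2, 2]:

```python
# Assumed example: solver-form optimality measure with bounds only.
x = [1.0, 1.0]
u = [1.0, 1.0]
lam_upper = [2.0, 2.0]

grad_f = [2 * (x[0] - 2), 2 * (x[1] - 2)]            # [-2.0, -2.0]

# Equation 3-7 style: gradient of the Lagrangian (upper bounds add +lambda_upper)
stationarity = max(abs(grad_f[i] + lam_upper[i]) for i in range(2))

# Equation 3-8 style: complementarity terms |x_i - u_i| * lambda_upper_i
complementarity = max(abs(x[i] - u[i]) * lam_upper[i] for i in range(2))

print(max(stationarity, complementarity))            # 0.0
```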
For some large-scale problems with only linear equalities, the first-order optimality measure is the infinity norm of the projected gradient. In other words, the first-order optimality measure is the size of the gradient projected onto the null space of Aeq.
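A sketch of the projected-gradient measure, with an assumed single linear equality and an assumed gradient value:

```python
import numpy as np

# Assumed example: infinity norm of the gradient projected onto null(Aeq).
Aeq = np.array([[1.0, 1.0]])          # single equality: x1 + x2 = beq
grad = np.array([3.0, 3.0])           # example gradient at the current point

# Orthogonal projector onto null(Aeq): P = I - Aeq' * inv(Aeq*Aeq') * Aeq
P = np.eye(2) - Aeq.T @ np.linalg.solve(Aeq @ Aeq.T, Aeq)
measure = np.max(np.abs(P @ grad))
print(measure)   # 0.0, because this gradient is a multiple of Aeq' and has
                 # no component in the null space
```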
For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the first-order optimality measure is the maximum over i of |vi*gi|. Here gi is the ith component of the gradient, x is the current point, and

v_i = |x_i - b_i| if the negative gradient points toward bound b_i; v_i = 1 otherwise.
If xi is at a bound, vi is zero. If xi is not at a bound, then at a minimizing point the gradient gi should be zero. Therefore the first-order optimality measure should be zero at a minimizing point.
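A sketch with assumed values for a lower-bounded problem:

```python
# Assumed example: measure = max_i |v_i * g_i| for a bounds-only problem,
# with v_i = |x_i - b_i| when the negative gradient points toward bound b_i,
# and v_i = 1 otherwise.
x = [0.0, 0.5]        # x[0] sits at its lower bound
lb = [0.0, 0.0]
g = [5.0, 0.0]        # gradient; -g[0] points down toward the lower bound

v = [abs(x[i] - lb[i]) if g[i] > 0 else 1.0 for i in range(2)]
measure = max(abs(v[i] * g[i]) for i in range(2))
print(measure)        # 0.0: v[0] is zero at the bound, g[1] is zero in the interior
```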