Many Optimization Toolbox™ solvers minimize a scalar function of a multidimensional vector. The objective function is the function the solvers attempt to minimize. Several solvers accept vector-valued objective functions, and some solvers use objective functions you specify by vectors or matrices.
Objective Type | Solvers | How to Write Objectives |
---|---|---|
Scalar |
| Writing Scalar Objective Functions |
Nonlinear least squares |
| Writing Vector and Matrix Objective Functions |
Multivariable equation solving |
| |
Multiobjective |
| |
Linear programming |
| Writing Objective Functions for Linear or Quadratic Problems |
Mixed-integer linear programming |
| |
Linear least squares |
| |
Quadratic programming |
|
A scalar objective function file accepts one input, say x
,
and returns one scalar output, say f
. The input x
can
be a scalar, vector, or matrix.
A function file can return more outputs (see Including Derivatives).
For example, suppose your objective is a function of three variables, x, y, and z:
f(x) = 3*(x – y)4 + 4*(x + z)2 / (1 + x2 + y2 + z2) + cosh(x – 1) + tanh(y + z).
Write this function as a file that accepts the vector xin
= [x;y;z]
and returns f:
function f = myObjective(xin) f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+norm(xin)^2) ... + cosh(xin(1)-1) + tanh(xin(2)+xin(3));
Save it as a file named myObjective.m
to
a folder on your MATLAB® path.
Check that the function evaluates correctly:
myObjective([1;2;3]) ans = 9.2666
For information on how to include extra parameters, see Passing Extra Parameters. For more complex examples of function files, see Minimization with Gradient and Hessian Sparsity Pattern or Minimization with Bound Constraints and Banded Preconditioner.
Local Functions and Nested Functions. Functions can exist inside other files as local functions or nested functions. Using local functions or nested functions can lower the number of distinct files you save. Using nested functions also lets you access extra parameters, as shown in Nested Functions.
For example, suppose you want to minimize the myObjective.m
objective
function, described in Function Files,
subject to the ellipseparabola.m
constraint, described
in Nonlinear Constraints. Instead
of writing two files, myObjective.m
and ellipseparabola.m
,
write one file that contains both functions as local functions:
function [x fval] = callObjConstr(x0,options) % Using a local function for just one file if nargin < 2 options = optimoptions('fmincon','Algorithm','interior-point'); end [x fval] = fmincon(@myObjective,x0,[],[],[],[],[],[], ... @ellipseparabola,options); function f = myObjective(xin) f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+sum(xin.^2)) ... + cosh(xin(1)-1) + tanh(xin(2)+xin(3)); function [c,ceq] = ellipseparabola(x) c(1) = (x(1)^2)/9 + (x(2)^2)/4 - 1; c(2) = x(1)^2 - x(2) - 1; ceq = [];
Solve the constrained minimization starting from the point [1;1;1]
:
[x fval] = callObjConstr(ones(3,1)) Local minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the default value of the function tolerance, and constraints are satisfied to within the default value of the constraint tolerance. x = 1.1835 0.8345 -1.6439 fval = 0.5383
Use anonymous functions to write simple objective functions. For more information about anonymous functions, see What Are Anonymous Functions? in the MATLAB Programming Fundamentals documentation. Rosenbrock's function is simple enough to write as an anonymous function:
anonrosen = @(x)(100*(x(2) - x(1)^2)^2 + (1-x(1))^2);
anonrosen
evaluates correctly at [-1
2]
:anonrosen([-1 2]) ans = 104
anonrosen
with fminunc
yields
the following results:options = optimoptions(@fminunc,'Algorithm','quasi-newton'); [x fval] = fminunc(anonrosen,[-1;2],options) Local minimum found. Optimization completed because the size of the gradient is less than the default value of the function tolerance. x = 1.0000 1.0000 fval = 1.2262e-10
For fmincon
and fminunc
,
you can include gradients in the objective function. You can also
include Hessians, depending on the algorithm. The Hessian matrix Hi,j(x) = ∂2f/∂xi∂xj.
The following table shows which algorithms can use gradients and Hessians.
Solver | Algorithm | Gradient | Hessian |
---|---|---|---|
fmincon | active-set | Optional | No |
interior-point | Optional | Optional separate function (see Hessian) | |
sqp | Optional | No | |
trust-region-reflective | Required | Optional | |
fminunc | trust-region | Required | Optional |
quasi-newton | Optional | No |
Benefits of Including Derivatives. If you do not provide gradients, solvers estimate gradients via finite differences. If you provide gradients, your solver need not perform this finite difference estimation, so can save time and be more accurate. Furthermore, solvers use an approximate Hessian, which can be far from the true Hessian. Providing a Hessian can yield a solution in fewer iterations.
For constrained problems, providing a gradient has another advantage.
A solver can reach a point x
such that x
is
feasible, but, for this x
, finite differences around x
always
lead to an infeasible point. Suppose further that the objective function
at an infeasible point returns a complex output, Inf
, NaN
,
or error. In this case, a solver can fail or halt prematurely. Providing
a gradient allows a solver to proceed. To obtain this benefit, you
might also need to include the gradient of a nonlinear constraint
function, and set the GradConstr
option to 'on'
.
See Nonlinear Constraints.
Choose Input Hessian for interior-point fmincon. The fmincon
interior-point
algorithm
has many options for selecting an input Hessian. For syntax details,
see Hessian. Here are the options,
along with estimates of their relative characteristics.
Hessian | Relative Memory Usage | Relative Efficiency |
---|---|---|
'bfgs' (default) | High (for large problems) | High |
'lbfgs' | Low to Moderate | Moderate |
'fin-diff-grads' | Low | Moderate |
'user-supplied' with 'HessMult' | Low (can depend on your code) | Moderate |
'user-supplied' with 'HessFcn' | ? (depends on your code) | High (depends on your code) |
Use the default 'bfgs'
Hessian unless you
Run out of memory — Try 'lbfgs'
instead
of 'bfgs'
. If you can provide your own gradients,
try 'fin-diff-grads'
, and set the GradObj
and GradConstr
options
to 'on'
.
Want more efficiency — Provide your own gradients and Hessian. See fmincon Interior-Point Algorithm with Analytic Hessian and Symbolic Math Toolbox Calculates Gradients and Hessians.
The reason 'lbfgs'
has only moderate efficiency
is twofold. It has relatively expensive Sherman-Morrison updates.
And the resulting iteration step can be somewhat inaccurate due to
the 'lbfgs'
limited memory.
The reason 'fin-diff-grads'
and HessMult
have
only moderate efficiency is that they use a conjugate gradient approach.
They accurately estimate the Hessian of the objective function, but
they do not generate the most accurate iteration step. For more information,
see fmincon Interior Point Algorithm,
and its discussion of the LDL approach and the conjugate gradient
approach to solving Equation 6-52.
How to Include Derivatives.
Write code that returns:
The objective function (scalar) as the first output
The gradient (vector) as the second output
Optionally, the Hessian (matrix) as the third output
Set the GradObj
option to 'on'
with optimoptions
.
Optionally, set the Hessian
option
to 'on'
or 'user-supplied'
.
For the fmincon
interior-point
solver,
set the Hessian
option to 'user-supplied'
and
set the 'HessFcn'
option to @
,
where hessianfcn
is
a function that computes the Hessian of the Lagrangian. For details,
see Hessian. For an example, see fmincon Interior-Point Algorithm with Analytic Hessian.hessianfcn
Optionally, check if your gradient function matches a finite-difference approximation. See Checking Validity of Gradients or Jacobians.
Tip
For most flexibility, write conditionalized code.
Conditionalized means that the number of function outputs can vary,
as shown in the following example. Conditionalized code does not error
depending on the value of the |
For example, consider Rosenbrock's function
which is described and plotted in Solve a Constrained Nonlinear Problem. The gradient of f(x) is
and the Hessian H(x) is
rosenthree
is an unconditionalized function
that returns the Rosenbrock function with its gradient and Hessian:
function [f g H] = rosenthree(x) % Calculate objective f, gradient g, Hessian H f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2; g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1)); 200*(x(2)-x(1)^2)]; H = [1200*x(1)^2-400*x(2)+2, -400*x(1); -400*x(1), 200];
rosenboth
is a conditionalized function that
returns whatever the solver requires:
function [f g H] = rosenboth(x) % Calculate objective f f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2; if nargout > 1 % gradient required g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1)); 200*(x(2)-x(1)^2)]; if nargout > 2 % Hessian required H = [1200*x(1)^2-400*x(2)+2, -400*x(1); -400*x(1), 200]; end end
nargout
checks the number of arguments
that a calling function specifies. See Find Number of Function Arguments in the MATLAB Programming
Fundamentals documentation.
The fminunc
solver, designed for unconstrained
optimization, allows you to minimize Rosenbrock's function. Tell fminunc
to
use the gradient and Hessian by setting options
:
options = optimoptions(@fminunc,'Algorithm','trust-region',... 'GradObj','on','Hessian','on');
Run fminunc
starting at [-1;2]
:
[x fval] = fminunc(@rosenboth,[-1;2],options) Local minimum found. Optimization completed because the size of the gradient is less than the default value of the function tolerance. x = 1.0000 1.0000 fval = 1.9310e-017
If you have a Symbolic Math Toolbox™ license, you can calculate gradients and Hessians automatically, as described in Symbolic Math Toolbox Calculates Gradients and Hessians.
Some solvers, such as fsolve
and lsqcurvefit
,
have objective functions that are vectors or matrices. The main difference
in usage between these types of objective functions and scalar
objective functions is the way to write their derivatives.
The first-order partial derivatives of a vector-valued or matrix-valued
function is called a Jacobian; the first-order partial derivatives
of a scalar function is called a gradient.
If x is a vector of independent variables, and F(x) is a vector function, the Jacobian J(x) is
If F has m components, and x has k components, J is an m-by-k matrix.
For example, if
then J(x) is
The function file associated with this example is:
function [F jacF] = vectorObjective(x) F = [x(1)^2 + x(2)*x(3); sin(x(1) + 2*x(2) - 3*x(3))]; if nargout > 1 % need Jacobian jacF = [2*x(1),x(3),x(2); cos(x(1)+2*x(2)-3*x(3)),2*cos(x(1)+2*x(2)-3*x(3)), ... -3*cos(x(1)+2*x(2)-3*x(3))]; end
The Jacobian of a matrix F(x) is defined by changing the matrix to a vector, column by column. For example, rewrite the matrix
as a vector f:
The Jacobian of F is as the Jacobian of f,
If F is an m-by-n matrix, and x is a k-vector, the Jacobian is an mn-by-k matrix.
For example, if
then the Jacobian of F is
If x is a matrix, define the Jacobian of F(x) by changing the matrix x to a vector, column by column. For example, if
then the gradient is defined in terms of the vector
With
and with f the vector form of F as above, the Jacobian of F(X) is defined as the Jacobian of f(x):
So, for example,
If F is an m-by-n matrix and x is a j-by-k matrix, then the Jacobian is an mn-by-jk matrix.
The following solvers handle linear or quadratic objective functions:
linprog
and intlinprog
:
minimize
f'x
= f(1)*x(1) + f(2)*x(2)
+...+ f(n)*x(n)
.
Input the vector f
for the objective. See
the examples in Linear Programming and Mixed-Integer Linear Programming.
lsqlin
and lsqnonneg
:
minimize
∥Cx - d
∥.
Input the matrix C
and the vector d
for
the objective. See Linear Least Squares with Bound Constraints.
quadprog
: minimize
1/2 * x'Hx
+ f'x
= 1/2 * (x(1)*H(1,1)*x(1) + 2*x(1)*H(1,2)*x(2)
+...
.
+ x(n)*H(n,n)*x(n)) + f(1)*x(1) + f(2)*x(2) +...+
f(n)*x(n)
Input both the vector f
and the symmetric
matrix H
for the objective. See Quadratic Programming.
All solvers attempt to minimize an objective function. If you have a maximization problem, that is, a problem of the form
then define g(x) = –f(x), and minimize g.
For example, to find the maximum of tan(cos(x)) near x = 5, evaluate:
[x fval] = fminunc(@(x)-tan(cos(x)),5) Local minimum found. Optimization completed because the size of the gradient is less than the default value of the function tolerance. x = 6.2832 fval = -1.5574
fval
),
and occurs at x = 6.2832. This answer is correct since,
to five digits, the maximum is tan(1) = 1.5574, which occurs at x = 2π = 6.2832.