The following program uses the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method to find the minimum of a function. BFGS is a quasi-Newton method: it follows the general pattern of Newton's method, but instead of computing the Hessian matrix of second derivatives directly, it builds an approximation of the inverse Hessian from successive gradient values and uses that approximation to update the guess refinements.
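Concretely, the listing below maintains a matrix B that approximates the inverse Hessian. After each step d = Xnew - Xold, with corresponding change in gradient g = gradnew - gradold, the code refreshes B with the BFGS update (written here in the same MATLAB-style notation used in the listing):

B = B + (1 + g'*B*g/(d'*g)) * (d*d')/(d'*g) - (d*g'*B)/(d'*g) - (B*g*d')/(d'*g)

The next search direction is then S = -B*grad, which the code normalizes before a one-dimensional line search picks the step size lambda.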
The function bfgs has the following input parameters:

- N - the number of variables
- X - the array of initial guesses
- gradToler - the tolerance for the norm of the gradient
- fxToler - the tolerance for the function value
- DxToler - the array of tolerances for the guess refinements (delta X)
- MaxIter - the maximum number of iterations
- myFx - the name of the file that implements the optimized function

The function generates the following output:

- X - the array of optimized variables
- F - the value of the function at the optimum
- Iters - the number of iterations needed to reach the optimum
Here is a sample session to find the optimum for the following function:
y = 10 + (X(1) - 2)^2 + (X(2) + 5)^2
The above function resides in the file fx1.m. The search for the optimum of the two variables starts with the initial guess [0 0] and uses a minimum guess-refinement vector of [1e-5 1e-5]. The search employs a maximum of 100 iterations, a function tolerance of 1e-7, and a gradient tolerance of 1e-7:
>> [X,F,Iters] = bfgs(2, [0 0], 1e-7, 1e-7, [1e-5 1e-5], 100, 'fx1')
X =
2.0000 -5.0000
F =
10
Iters =
1
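As a quick sanity check (this call is not part of the listing), you can minimize the same function with MATLAB's built-in fminsearch by wrapping fx1 in an anonymous function; the result should be very close to the analytic minimum at [2 -5] with a function value of 10:

>> [Xchk, Fchk] = fminsearch(@(x) fx1(x, 2), [0 0])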
Here is the MATLAB listing:
% File fx1.m
function y = fx1(X, N)
  y = 10 + (X(1) - 2)^2 + (X(2) + 5)^2;
end

% File bfgs.m (the subfunctions below reside in the same file)
function [X,F,Iters] = bfgs(N, X, gradToler, fxToler, DxToler, MaxIter, myFx)
% Function bfgs performs multivariate optimization using the
% Broyden-Fletcher-Goldfarb-Shanno method.
%
% Input
%
% N - number of variables
% X - array of initial guesses
% gradToler - tolerance for the norm of the gradient
% fxToler - tolerance for the function value (not used in this version)
% DxToler - array of delta X tolerances
% MaxIter - maximum number of iterations
% myFx - name of the optimized function
%
% Output
%
% X - array of optimized variables
% F - function value at optimum
% Iters - number of iterations
%

B = eye(N,N);        % initial approximation of the inverse Hessian

bGoOn = true;
Iters = 0;

% calculate initial gradient
grad1 = FirstDerivatives(X, N, myFx);
grad1 = grad1';

while bGoOn

  Iters = Iters + 1;
  if Iters > MaxIter
    break;
  end

  S = -1 * B * grad1;
  S = S' / norm(S);  % normalize search direction S

  lambda = 1;
  lambda = linsearch(X, N, lambda, S, myFx);
  % calculate the new X() with the given lambda
  d = lambda * S;
  X = X + d;

  % get new gradient
  grad2 = FirstDerivatives(X, N, myFx);
  grad2 = grad2';

  g = grad2 - grad1;
  grad1 = grad2;

  % test for convergence on the step size
  if all(abs(d) <= DxToler)
    break
  end

  % test for convergence on the gradient norm
  if norm(grad1) < gradToler
    break
  end

  % BFGS update of the inverse Hessian approximation
  d = d';            % make d a column vector
  x1 = d * d';
  x2 = d' * g;
  x5 = g' * B * g;
  x6 = d * g' * B;
  x7 = B * g * d';

  B = B + (1 + x5 / x2) * x1 / x2 - x6 / x2 - x7 / x2;

end

F = feval(myFx, X, N);

% end of bfgs

function y = myFxEx(N, X, DeltaX, lambda, myFx)
% evaluates the objective function at X + lambda * DeltaX

  X = X + lambda * DeltaX;
  y = feval(myFx, X, N);

% end of myFxEx

function FirstDerivX = FirstDerivatives(X, N, myFx)
% approximates the gradient using central differences

for iVar = 1:N
  xt = X(iVar);
  h = 0.01 * (1 + abs(xt));
  X(iVar) = xt + h;
  fp = feval(myFx, X, N);
  X(iVar) = xt - h;
  fm = feval(myFx, X, N);
  X(iVar) = xt;
  FirstDerivX(iVar) = (fp - fm) / 2 / h;
end

% end of FirstDerivatives

function lambda = linsearch(X, N, lambda, D, myFx)
% one-dimensional Newton search for the step size lambda along direction D

  MaxIt = 100;
  Toler = 0.000001;

  iter = 0;
  bGoOn = true;
  while bGoOn
    iter = iter + 1;
    if iter > MaxIt
      lambda = 0;
      break
    end

    h = 0.01 * (1 + abs(lambda));
    f0 = myFxEx(N, X, D, lambda, myFx);
    fp = myFxEx(N, X, D, lambda + h, myFx);
    fm = myFxEx(N, X, D, lambda - h, myFx);
    deriv1 = (fp - fm) / 2 / h;
    deriv2 = (fp - 2 * f0 + fm) / h^2;
    if deriv2 == 0
      break
    end
    diff = deriv1 / deriv2;
    lambda = lambda - diff;
    if abs(diff) < Toler
      bGoOn = false;
    end
  end

% end of linsearch
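A note on the numerical derivatives: both FirstDerivatives and linsearch avoid analytic derivatives and instead use central-difference approximations with a step size h = 0.01 * (1 + |x|):

f'(x) ≈ (f(x + h) - f(x - h)) / (2*h)
f''(x) ≈ (f(x + h) - 2*f(x) + f(x - h)) / h^2

linsearch uses both approximations to run a one-dimensional Newton iteration on the step size, lambda = lambda - f'(lambda)/f''(lambda), stopping when the correction falls below its internal tolerance and returning lambda = 0 if it fails to converge within its own iteration limit.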
Copyright (c) Namir Shammas. All rights reserved.