Algorithms for Minimization without Derivatives

11. R. P. Brent, Algorithms for Minimization without Derivatives, Prentice-Hall, Englewood Cliffs, New Jersey, 1973, 195 pp. ISBN 0-13-022335-2. CR 15#26544; MR 49#4251, 51#7283. Reprinted in paperback by Dover Publications, Mineola, New York, January 2002. ISBN 0-486-41998-3. Reissued 2013. ISBN-13: 978-0-486-41998-5 (pbk.), ISBN-10: 0-486-41998-3 (pbk.).

Available from Dover Publications.

Abstract: dvi (3K), pdf (76K), ps (27K).

Preface to the Dover editions: html (3K).

Erratum: Mathematics of Computation TE 520 (1975), 1166: pdf (22K), or see below.

Review by D. Goldfarb in Mathematics of Computation 28 (1974), 865-866: pdf (218K).

Excerpt from Goldfarb's review:

"This well written very readable book should be of particular interest to numerical analysts working on methods for finding zeros and extrema of functions. The book is concerned primarily with functions of a single variable and exclusively with methods which use only function values; no derivative evaluations are required by any of the algorithms presented. It also emphasizes algorithmic details that are extremely important for developing reliable computer codes ... This book is an excellent reference work for optimizers and root finders who should find the programs in it of great value."

Abstract

This monograph describes and analyzes some practical methods for finding approximate zeros and minima of functions, using only function (not derivative) evaluations. Particular attention is paid to giving implementations which are reliable in the presence of rounding errors and floating-point under/overflow.

Summary

[This is an abbreviated version of the summary given in Chapter 1.]

Chapter 2 collects some results on Taylor series, Lagrange interpolation, and divided differences. Most of these results are needed later, and in some cases are slight generalizations of classical results.
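To make the divided-difference machinery concrete, here is a minimal Python sketch (my own illustration, not code from the book): the coefficients of the Newton form of the interpolating polynomial are precisely the leading divided differences, computed below in place.

    def divided_differences(xs, ys):
        # Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0,...,xn].
        coef = list(ys)
        n = len(xs)
        for j in range(1, n):
            for i in range(n - 1, j - 1, -1):
                coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
        return coef

    def newton_eval(xs, coef, x):
        # Evaluate the Newton form at x by Horner-style nesting.
        result = coef[-1]
        for i in range(len(coef) - 2, -1, -1):
            result = result * (x - xs[i]) + coef[i]
        return result

    # A quadratic is recovered exactly from three nodes:
    xs = [0.0, 1.0, 3.0]
    coef = divided_differences(xs, [t * t for t in xs])
    print(newton_eval(xs, coef, 2.0))   # prints 4.0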

Chapter 3 provides a theoretical foundation for the algorithms on zero-finding and minimization described in Chapters 4 and 5. In particular, it is shown when the algorithms will converge superlinearly, and what the order of convergence will be. The assumptions about differentiability are weaker than those of previous authors such as Ostrowski (1966). The final sections of Chapter 3 are devoted to methods for accelerating convergence, and include some numerical examples.
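The flavour of these convergence results can be checked numerically. The short Python script below (an illustration, not from the book) applies successive linear interpolation (the secant rule) to a function with a simple zero; the ratios log e_{n+1} / log e_n approach the golden ratio (1 + sqrt(5))/2 = 1.618..., the order of convergence established for this method.

    import math

    def f(x):
        return x ** 3 - 2.0                  # simple zero at 2**(1/3)

    root = 2.0 ** (1.0 / 3.0)
    x0, x1 = 1.0, 1.5
    errors = []
    for _ in range(8):
        x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))   # secant step
        x0, x1 = x1, x2
        errors.append(abs(x1 - root))

    for e0, e1 in zip(errors, errors[1:]):
        if e0 > 0 and e1 > 0:
            print(math.log(e1) / math.log(e0))   # tends to about 1.618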

Chapter 4 describes an algorithm for finding a zero of a function which changes sign in a given interval. The algorithm is based on a combination of successive linear interpolation and bisection, in much the same way as Dekker's algorithm. However, the new algorithm never converges much more slowly than bisection, whereas Dekker's algorithm may converge extremely slowly in certain cases.

It is well known that bisection is optimal, in a minimax sense, for finding zeros of functions which change sign in an interval. The motivation for Dekker's algorithm and the new algorithm of Chapter 4 is that bisection is not optimal if the class of allowable functions is suitably restricted. Both algorithms exhibit superlinear convergence to a simple zero of a C1 function (by the results of Chapter 3). Thus, in practice convergence is usually much faster than for bisection. The new algorithm incorporates inverse quadratic interpolation as well as linear interpolation, so it is often slightly faster than Dekker's algorithm on well-behaved functions (examples are given in Section 4.4).
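The following Python sketch shows the skeleton of such a safeguarded method. It is my own simplification, not the book's ALGOL procedure zero: the inverse quadratic interpolation refinement is omitted, and the safeguard here is simply that an interpolation step is taken only when it lands strictly inside the current bracket, with a bisection forced whenever the previous step failed to halve the bracket, so the method can never be much slower than bisection.

    def zeroin(f, a, b, tol=1e-12):
        # Simplified sketch: secant step when it falls strictly inside the
        # bracket, bisection otherwise; a bisection is also forced whenever
        # the previous step failed to halve the bracket.
        fa, fb = f(a), f(b)
        if fa * fb > 0:
            raise ValueError("f must change sign on [a, b]")
        prev_width = abs(b - a)
        bisect_next = False
        for _ in range(200):                 # ample for double precision
            if abs(b - a) <= tol:
                break
            m = 0.5 * (a + b)
            if bisect_next or fb == fa:
                x = m                        # safe fallback: bisection
            else:
                s = b - fb * (b - a) / (fb - fa)          # secant step
                x = s if min(a, b) < s < max(a, b) else m
            fx = f(x)
            if fx == 0.0:
                return x
            if fa * fx < 0:                  # keep the sign change in [a, b]
                b, fb = x, fx
            else:
                a, fa = x, fx
            bisect_next = abs(b - a) > 0.5 * prev_width
            prev_width = abs(b - a)
        return 0.5 * (a + b)

    print(zeroin(lambda x: x * x - 2.0, 0.0, 2.0))   # ~1.4142135623730951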

An algorithm for finding a local minimum of a function of one variable is described in Chapter 5. The algorithm combines golden section search and successive parabolic interpolation, in the same way as bisection and successive linear interpolation are combined in the zero-finding algorithm of Chapter 4. Convergence in a reasonable number of function evaluations is guaranteed (Section 5.5). For a C2 function with positive curvature at the minimum, the results of Chapter 3 show that convergence is superlinear (provided that the minimum is at an interior point of the interval). Other algorithms given in the literature either fail to have these two desirable properties, or their order of convergence in typical cases is less than that for the algorithm of Chapter 5.
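A compact Python sketch of this combination follows. Again this is a simplification for illustration, not the book's procedure localmin, which uses more careful acceptance tests: here a parabola is fitted through the three best points, and its vertex is used as the next evaluation point only when it falls safely inside the bracket; otherwise a golden-section step is taken into the larger subinterval.

    def localmin(f, a, b, tol=1e-8):
        # x: best point so far; w: second best; v: previous value of w.
        golden = 0.5 * (3.0 - 5.0 ** 0.5)     # ~0.382
        x = w = v = a + golden * (b - a)
        fx = fw = fv = f(x)
        for _ in range(500):                  # iteration cap for this sketch
            if b - a <= tol:
                break
            m = 0.5 * (a + b)
            # Vertex of the parabola through (v,fv), (w,fw), (x,fx) is x - p/q.
            r = (x - w) * (fx - fv)
            q = (x - v) * (fx - fw)
            p = (x - v) * q - (x - w) * r
            q = 2.0 * (q - r)
            u = x - p / q if q != 0.0 else None
            if u is None or not (a + tol < u < b - tol) or u == x:
                # Parabolic step unsafe: golden-section step instead.
                u = x + golden * ((b - x) if x < m else (a - x))
            fu = f(u)
            if fu <= fx:                      # u becomes the new best point
                if u < x:
                    b = x
                else:
                    a = x
                v, fv, w, fw, x, fx = w, fw, x, fx, u, fu
            else:                             # x stays best; u shrinks bracket
                if u < x:
                    a = u
                else:
                    b = u
                if fu <= fw or w == x:
                    v, fv, w, fw = w, fw, u, fu
                elif fu <= fv or v == x or v == w:
                    v, fv = u, fu
        return x

    print(localmin(lambda t: (t - 2.0) ** 2 + 1.0, 0.0, 5.0))   # ~2.0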

Sections 5.2 and 5.3 consider the effect of rounding errors, and analyse the limitations on the accuracy of any algorithm which is based entirely on limited-precision function evaluations.
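The essential point can be seen in a few lines of Python (an illustration, not the book's analysis): near a minimum, f varies quadratically, so a perturbation h changes f(x) by O(h^2); once h^2 falls below the machine precision relative to f(x), the computed function values cannot distinguish x from x + h. Hence no algorithm based only on function values can locate the minimizer to better than about sqrt(eps) relative accuracy.

    import sys

    def f(x):
        return (x - 1.0) ** 2 + 1.0          # minimum at x = 1

    eps = sys.float_info.epsilon             # about 2.2e-16 in IEEE double
    for h in (1e-6, 1e-8, 1e-9):
        print(h, f(1.0 + h) == f(1.0))       # True once h**2 drops below eps

Here sqrt(eps) is about 1.5e-8, and indeed the computed values become identical between h = 1e-6 and h = 1e-8.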

Chapter 6 considers the problem of finding an approximation to the global minimum of a function f, defined on a finite interval, if some a priori information about f is given. Usually this information is an upper bound on f". A good (in fact close to optimal) algorithm is described. Particular attention is paid to the problem of giving guaranteed bounds in the presence of rounding errors. Insight into the behaviour of the algorithm is obtained by considering some tractable special cases. Finally, the algorithm is generalized to handle functions of several real variables, given suitable a priori information.
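The sketch below shows one way to exploit such a bound; it is a hypothetical simplification in Python, not the book's algorithm glomin. It rests on the standard error bound for linear interpolation: if |f''| <= M on [a, b], then f(x) >= L(x) - (M/2)(x - a)(b - x), where L is the chord through the endpoint values. Each interval therefore has a computable lower bound, and a branch-and-bound search can discard intervals that cannot contain a better minimum.

    import math

    def lower_bound(fa, fb, a, b, M):
        # Minimum over [a, b] of the bound f(x) >= L(x) - (M/2)(x-a)(b-x).
        s = (fb - fa) / (b - a)                      # chord slope
        x = min(b, max(a, 0.5 * (a + b) - s / M))    # critical point, clamped
        return fa + s * (x - a) - 0.5 * M * (x - a) * (b - x)

    def glomin_sketch(f, a, b, M, tol=1e-6):
        # Branch and bound: split any interval whose lower bound is more
        # than tol below the best function value found so far.
        fa, fb = f(a), f(b)
        best_x, best_f = (a, fa) if fa <= fb else (b, fb)
        stack = [(a, fa, b, fb)]
        while stack:
            lo, flo, hi, fhi = stack.pop()
            if lower_bound(flo, fhi, lo, hi, M) >= best_f - tol:
                continue                             # cannot beat the incumbent
            mid = 0.5 * (lo + hi)
            fmid = f(mid)
            if fmid < best_f:
                best_x, best_f = mid, fmid
            stack.append((lo, flo, mid, fmid))
            stack.append((mid, fmid, hi, fhi))
        return best_x, best_f

    x, fx = glomin_sketch(math.sin, 0.0, 2.0 * math.pi, M=1.0)
    print(x, fx)    # close to 3*pi/2 and -1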

Chapter 7 describes a modification of Powell's (1964) algorithm for finding a local minimum of a function of several variables without calculating derivatives. The modification is designed to ensure quadratic convergence, and to avoid the difficulties with Powell's criterion for accepting new search directions.
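For orientation, here is a bare-bones Python sketch of the underlying direction-set idea (the unmodified scheme, not Brent's algorithm): line-minimize along each direction in turn, then replace the oldest direction by the overall displacement of the sweep. It is precisely this naive replacement rule that can produce nearly dependent directions; Brent's modification chooses the directions more carefully so that they stay linearly independent and quadratic convergence is retained.

    def golden_min(f, a, b, tol=1e-8):
        # Golden-section search for a 1-D minimum of f on [a, b].
        g = 0.5 * (5.0 ** 0.5 - 1.0)          # ~0.618
        c, d = b - g * (b - a), a + g * (b - a)
        fc, fd = f(c), f(d)
        while b - a > tol:
            if fc < fd:
                b, d, fd = d, c, fc
                c = b - g * (b - a)
                fc = f(c)
            else:
                a, c, fc = c, d, fd
                d = a + g * (b - a)
                fd = f(d)
        return 0.5 * (a + b)

    def powell_sketch(f, x0, step=1.0, sweeps=20):
        # Line search is over a fixed bracket [-step, step] for simplicity.
        n = len(x0)
        dirs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        x = list(x0)
        for _ in range(sweeps):
            x_old = list(x)
            for d in dirs:
                phi = lambda t: f([xi + t * di for xi, di in zip(x, d)])
                t = golden_min(phi, -step, step)
                x = [xi + t * di for xi, di in zip(x, d)]
            disp = [xi - xo for xi, xo in zip(x, x_old)]
            if any(abs(c) > 1e-12 for c in disp):
                dirs.pop(0)                   # naive update: this is the step
                dirs.append(disp)             # that can degrade the basis
        return x

    print(powell_sketch(lambda v: (v[0] - 1.0) ** 2 + 10.0 * (v[1] + 2.0) ** 2,
                        [0.0, 0.0]))          # approaches [1.0, -2.0]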

The book includes a bibliography of over 300 items, and an Appendix contains Fortran translations of the Algol procedures given in Chapters 4-6.

Erratum (Prentice-Hall edition)

On page 163, immediately before the line reading

COMMENT: TRANSPOSE V FOR MINFIT;

the following was inadvertently omitted:

FOR J := 1 UNTIL N DO V(I,J) := SL*V(I,J) END END;

Erratum (Dover edition, first printing, 2002)

On page 99, there is a glitch in the reproduction of Diagram 4.1. Between the x-axis and the caption, the Greek letters alpha0, alpha1, ..., alpha5 are missing (they should label the six local minima of the curve). The strange black and white stripe above the caption should be ignored. Here is the correct version.

Errata (all Dover and Prentice-Hall editions)

On page 80, line 11, "p < q × (a - x)" should be "p > q × (a - x)".
The corresponding Fortran code on page 189 is correct, as is the ALGOL W code on page 137 of [6].
Thanks to Jason M. Lenthe for finding this error.

On page 80, line 9, it appears that the variable "d" might be uninitialized (though this is not possible if "tol" is non-negative, as it should be).
To avoid a possible compiler diagnostic, initialize "d := 0;" at line -5 on page 79.
Similarly, set "D = 0.0" after line 5 of the Fortran code on page 189.
Thanks to Roy Lowrance for pointing out this problem.

Excerpts from the book

These are mainly gif page or double-page images:
front cover of the Prentice-Hall edition: 1;
front cover of the 2002 Dover edition: 1;
front cover of the 2013 Dover edition: 1;
dustjacket blurb (Prentice-Hall edition): 1-2;
back cover of the 2002 Dover edition: 1;
back cover of the 2013 Dover edition: 1;
inside covers of the 2002 Dover edition: 1, 2;
frontmatter: 1, 2, 3, 4;
table of contents: 1, 2, 3;
preface to the Dover editions: 1;
preface: 1, 2;
page 99: 1;
index: 1, 2.

All the above gif files are available in a single pdf file (859K).

Comment

The book is a revision of the author's Ph.D. thesis. Some of the material also appeared in Brent [57].

Download

Since the Prentice-Hall edition has been out of print for many years, we provide a scanned copy here (pdf, 9948K).

Software

Fortran and C versions of some of the zero-finding and minimization routines given in the book are available -- please contact the author.

Update

Since the first edition of Algorithms for Minimization without Derivatives was published in 1973, there has been a great deal of research on algorithms for optimization of functions of several variables, the topic of Chapter 7. Also, techniques for computing derivatives analytically have been refined by my former student Andreas Griewank and others, so there is now less reason to consider algorithms which use only function values.

If you are interested in developments since 1973, a good place to start would be with one or more of the following books. Reviews of most of them can be found at amazon.com.

A. R. Conn, K. Scheinberg and L. N. Vicente, Introduction to Derivative-Free Optimization, SIAM, Philadelphia, 2009. ISBN 9780898716689. (Reviewed in SIAM Review 53, 2 (2011), 395-396.)

J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Classics in Applied Mathematics, 16, SIAM, 1996, ISBN 0898713641. (Material dates from 1983.)

R. Fletcher, Practical Methods of Optimization, second edition, John Wiley and Sons, 2000, ISBN 0471494631. (Material dates from 1987.)

C. T. Kelley, Iterative Methods for Optimization, Frontiers in Applied Mathematics, 18, SIAM, 1999, ISBN 0898714338.

J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research, Springer-Verlag, 1999, ISBN 0387987932.

P. E. Gill, W. Murray and M. H. Wright, Practical Optimization, Academic Press, reprinted 1997, ISBN 0122839528. (Material dates from 1981.)
