The Simplex Method


The simplex method generates a sequence of feasible iterates by repeatedly moving from one vertex of the feasible set to an adjacent vertex with a lower value of the objective function . When it is not possible to find an adjoining vertex with a lower value of , the current vertex must be optimal, and termination occurs.

After its discovery by Dantzig in the 1940s, the simplex method was unrivaled, until the late 1980s, for its utility in solving practical linear programming problems. Although never observed on practical problems, the poor worst-case behavior of the algorithm---the number of iterations may be exponential in the number of unknowns---led to an ongoing search for algorithms with better computational complexity. This search continued until the late 1970s, when the first polynomial-time algorithm (Khachiyan's ellipsoid method) appeared. Most interior-point methods, which we describe later, also have polynomial complexity.

Algebraically speaking, the simplex method is based on the observation that at least of the components of are zero if is a vertex of the feasible set. Accordingly, the components of can be partitioned at each vertex into a set of basic variables---all nonnegative---and a set of nonbasic variables-all zero. If we gather the basic variables into a subvector, , and the nonbasics into another subvector, , we can partition the columns of as , where contains the columns that correspond to . (Note that is a square matrix.)

At each iteration of the simplex method, a basic variable (a component of ) is reclassified as nonbasic, and vice versa. In other words, and exchange a component. Geometrically, this swapping process corresponds to a move from one vertex of the feasible set to an adjacent vertex. We therefore need to choose which component of should enter (that is, be allowed to move off its zero bound) and which component of should enter (that is, be driven to zero). In fact, we need make only the first of these choices, since the second choice is implied by the feasibility constraints and . In selecting the entering component, we note that can be expressed as a function of alone. We can express in terms of by noting that implies that

Hence, partitioning into and in the obvious way, we have

The vector

is the reduced-cost vector. If all components of are strictly positive and some component (say, the th component) of is negative, we can decrease the value of by allowing component of to become positive while adjusting to maintain feasibility. Unless there exist feasible points that make arbitrarily negative, the requirement imposes an upper bound on , the th component of .

In principle, we can choose any component with as an entering variable. If there are no negative entries in , the current point is optimal. If there is more than one, we would ideally pick the component that will lead to the largest reduction in on the current iteration. Heuristics for making this selection are discussed below.

It follows from and the fact that the remaining elements of are held at zero that

where denotes the column of that corresponds to . We choose the new value of to be the largest value that maintains . To obtain explicitly, we can rearrange the previous equation to obtain

The index that achieves the minimum in this formula indicates the basic variable that is to become nonbasic. If more than one such component achieves the minimum simultaneously, the one with the largest value of is usually selected.

Most of the computational cost in simplex algorithms arises from the need to compute the vectors and and the need to keep track of the changes in and resulting from the changes in the basis at each iteration. We could simply recompute and store explicitly after each step. This strategy is undesirable for two reasons. First, the matrix is usually dense even though the original is sparse. Hence, explicit calculation of requires prohibitive amounts of computing time and storage. Second, since changes only slightly from one iteration to the next, we should be able to update information about and rather than recompute it anew at every step.

A technique that is used in many commercial codes is to store an factorization of . That is, to keep track of matrices , , , and such that

where and are permutation matrices (identity matrices whose rows have been reordered), while and are lower- and upper-triangular matrices, respectively. Since and , we can calculate by performing the following sequence of operations:

The first two operations are back- and forward-substitutions with triangular matrices, which can be performed efficiently, while the final operation is a simple rearrangement of the elements of . Calculation of proceeds similarly. The permutation matrices and are chosen so that the factorization is reasonably stable and the factors and are not too much denser than the original matrix .

When is changed by a single column, the factorization can be updated by applying a number of elementary transformations; that is, additions of multiples of one row of the matrix to another row. Rather than applying these transformations explicitly to the existing factors, they are usually stored in a compact form. When the storage occupied by the elementary transformations becomes excessive, they are discarded, and the current basis matrix is refactored from scratch.

We return to strategies for choosing the component of to enter the basis, an operation that is known as pricing in linear programming parlance. The simplest strategy is to choose to correspond to the most negative component of the reduced-cost vector . This approach, known as Dantzig's rule, gives the fastest decrease in the objective function per unit increase in the entering variable. However, change in the entering variable often does not give a good indication of how far we actually have to move; it could be that a small perturbation in the entering component corresponds to a huge step along the corresponding edge of the feasible polytope, so we actually have to move a long way to get the benefits promised by Dantzig's rule. This observation is the motivation behind the steepest-edge strategy, in which we choose the edge along which the objective function decreases most rapidly per unit of distance along the edge. The extra computation needed to identify the steepest edge is often more than offset by a reduction in the number of iterations, and this strategy is an option in packages such as CPLEX and OSL.

When the linear program is too large for the data to be stored in core memory, the cost of computing the complete reduced-cost vector at each iteration may require too much traffic with secondary storage, and it may take too long. In this situation, a partial pricing strategy may be appropriate. This strategy finds only a subvector of and chooses the entering variable from those components that are actually computed. Of course, the subset of indices that defines the subvector of should be changed frequently.

A problem with all of these strategies is that they do not predict the actual decrease in the objective function that will occur on this iteration. It may happen that we are able to move only a short distance along the chosen edge before encountering another vertex, so the reduction may be minimal. A multiple pricing strategy selects a small group of columns with negative reduced costs and computes the actual reduction that would be achieved if any one of the corresponding variables entered the basis. This process is expensive, since it requires the calculation of for each candidate . One of the candidates is chosen, and the remainder are retained as candidates for the next iteration, since the marginal cost of updating the column to correspond to the new basis matrix is not too high. Of course, the candidate list must be refreshed frequently.

The CPLEX, C-WHIZ, FortLP, LAMPS, LINDO, MINOS, OSL, and PC-PROG packages can be used to solve large-scale problems. Each of these packages accepts input in the industry-standard MPS format. Additionally, some have their own customized input format (for example, CPLEX LP format for CPLEX, direct screen input for PC-PROG). Others can be operated in conjunction with modeling languages ( CPLEX, LAMPS, MINOS, and OSL interface with GAMS; LINDO and OSL interface with the LINGO modeling language; CPLEX, MINOS, and OSL interface with AMPL). Recently, interfaces between spreadsheet programs and linear programming packages have become available. The What's Best! package links a wide range of standard spreadsheets (including Lotus 1-2-3 and Quattro-Pro) to LINDO.

The IMSL, NAG (FORTRAN) and NAG (C) libraries contain simplex-based subroutines. These codes use dense linear algebra techniques to manipulate the basis matrix and hence are suited for small- to medium-scale problems, rather than large-scale problems. The linear programming problem must be specified in the form

where is an matrix. The user must set up the problem data and pass it to the appropriate subroutine.

The BQPD package is aimed primarily at quadratic programming problems, but it does solve linear programs as a special case. BQPD can take advantage of sparsity. As with the libraries above, it is the user's responsibility to supply the problem data through subroutine arguments.

The packages LSSOL and QPOPT are aimed at linear least squares or quadratic programming problems but, as part of their capability, they can solve small- to medium-sized linear programs.


Up To:

Linear Programming.


 

treesig.gif (5961 bytes)

[ OTC Home Page | NEOS Guide | NEOS Server | Optimization Tree ]


Updated 28 March 1996