# GLPK/Modeling tips

This page provides suggestions for formulating optimization problems in GLPK — be they expressed as MathProg statements plus any required GLPSOL options or coded up as custom programs which use GLPK functions either directly or via bindings.

## Contents

## Big M[edit]

The big M method is a method of formulating certain types of problem by introducing an artificial quantity and assigning it a sufficiently large constant value. Take, for instance, the following MathProg model whereby has to be **either** less than parameter **or** greater than parameter and also where :

param x1; param x2; param M; var x; var b, binary; # constraint is active when b = 0 s.t. c1 : x <= x1 + b * M; # constraint is active when b = 1 s.t. c2 : x >= x2 - (1 - b) * M;

should be chosen such that it is close to the **smallest** value that does not reduce the solution space. Selecting too large an in the circumstances may result in a low quality solution due to numerical imprecision. Further information on the big M method can be found in textbooks.

## Identifying all solutions[edit]

From time to time, the question arises of identifying all feasible solutions or all optimal solutions to a given LP, MIP, or IP optimization problem. GLPK can only return one solution, which incidentally is guaranteed to be globally optimal by virtue of the feasible region being both convex and bounded.

This April 2011 thread (one of several) discusses the merits of adding solution enumeration features to GLPK. One reason for providing all optimal (or even close to optimal) solutions is to allow subsequent human intervention to select which solution to deploy, based on secondary criteria not readily amenable to formal modeling.

### All optimal solutions[edit]

In some cases, the optimal solution is not unique.

GLPK cannot enumerate all the optimal solutions. Nevertheless, with the use of external coding, one can create an iterative method by which the current optimal solution is made infeasible by applying a spurious binary constraint and the modified problem then resubmitted. On termination, the various solutions can be collected for further analysis. This method is not deemed inelegant by any means. Khachiyan et al (2008)^{[1]} and Provan (1994)^{[2]} provide further general background. See also this January 2010 posting which suggests the use of symphony.

### The *N* best solutions[edit]

A variation on this theme for IP and MIP problems is the requirement to identify the best solutions. One strategy involves keeping a dynamic pool of good solutions, discarding and replacing these as better solutions are revealed. The feasibility of this method has yet to be established but this May 2011 thread should offer insights.

### All feasible solutions[edit]

In some circumstances, all the feasible solutions are sought.

GLPK cannot enumerate all the feasible solutions. A similar method to that described above can be used, but with the objective function set to null (or omitted). GLPK will naturally stop at the first feasible solution when the objective function is null.

An alternative is to use SCIP, which is also capable of counting the number of feasible solutions. The project homepage has more details. SCIP is released under the ZIB academic license, which precludes commercial use.

## Close-to-zero values[edit]

Automatically generated problems can easily become badly scaled if the routine that generates them also creates coefficients that are numerically close-to-zero — say around 1.0e−10 in value — either as an artifact of floating point arithmetic, by poor design, or both. Equally, a solution, whether from a hand crafted or generated problem, can be hard to interpret when the results are peppered with close-to-zero values.

GLPK does not offer support for input rounding — the setting of any close-to-zero values to true zero on problem building. That task remains the responsibility of the client code. Neither does GLPK offer support for output rounding — the setting of close-to-zero values to true zero on reporting.^{[3]}

The choice of a suitable tolerance at which to test and reset close-to-zero values is a decision for the client, but 1.0e−09 is sometimes suggested.

### MathProg output[edit]

To omit close-to-zero valued variables when writing out model results with MathProg — in this case, from the matrix `x[i,j]` at a tolerance of 1.0e−03 — use the following statement as a template:

printf{i in I, j in J: abs(x[i,j]) >= 0.001} "...", x[i,j], ... ;

You can also invoke other conditionals to select or suppress output. And you can call `display` instead of `printf`.

## Non-convex functions[edit]

A non-convex function — be it a constraint or an objective — *cannot* be represented directly using mixed-integer (linear) programming. But in some circumstances, a non-convex function may be captured satisfactorily using a piecewise linear approximation. Croxton et al (2002) describe three equivalent formulations.^{[4]}

The general method uses binary variables to "switch in" the required piecewise segment and relies on special ordered set of type 2 (SOS2) constraints. The following document, glpk-sos2.pdf, explains in detail how to model non-convex piecewise linear functions in GLPK.

## Nonlinear objective functions[edit]

### Extremum terms[edit]

A nonlinear objective function in the form

can be modeled as an LP (the objective function is still convex) using GLPK MathProg as follows:

z = max_x1_x2 + max_x3_x4 + ... s.t. max_x1_x2 >= x1 max_x1_x2 >= x2 max_x3_x4 >= x3 max_x3_x4 >= x4 . . .

The more complicated case

can be tackled by substituting for as the two are equivalent. Using a similar idea, the case of maximization

can be similarly transformed by multiplying through by minus one (not shown). Refer to this April 2011 thread for more information.

### SLP techniques[edit]

Successive linear programming (SLP) employs Taylor series expansions to approximate nonlinear objective functions. See this April 2011 thread for a discussion on SLP techniques and GLPK and, in particular, the management of GLPK problem objects and use of warm starts. Note also the C# example file `t1.cs` in the GLPK examples directory.

## Special ordered sets[edit]

### SOS1 : special ordered set of type 1[edit]

A special ordered set of type one (SOS1), sometimes notated , means that, at most, only one of the variables can be non-zero.

If all the variables are binary, SOS1 is equivalent to, in MathProg:

x[1] + x[2] + ... + x[n] <= 1

If the variables are continuous, assuming that `0 <= x[j] <= u[j]`, where `u[j]` is an upper bound of `x[j]`, SOS1 can be modeled in MathProg thus:

x[j] <= u[j] * z[j] for j = 1,...,n, z[1] + ... + z[n] <= 1,

where `z[j]` are auxiliary binary variables.

### SOS2 : special ordered sets of type 2[edit]

Please refer here.

As an example of SOS2 modeling of a piecewise linear function in mathprog, here is a simple example of creating a piecewise linear model of electricity pool price based on published price sensitivities in the Queensland region in the Australian National Electricity Market. Each half hour in the market, the price sensitivity is calculated for every half hour trading interval prior to the actual trading interval. For the period ending 2012-11-22 13:00 the following price sensitivities were published:

Demand offset (MW) | Price |
---|---|

-500 | $49.47 |

-200 | $51.54 |

-100 | $51.97 |

+100 | $56.09 |

+200 | $57.25 |

+500 | $155.01 |

+1000 | $450.34 |

To determine the price at any demand offset **-500 MW <= x <= +1000 MW** then the following GLPK mathprog model can be run:

# # SOS2 example # # Piecewise linear approximation of 2012-11-22 13:00 # QLD price sensitivities # # Dr. H J Mackenzie # HARD software # hjm@hardsoftware.com # param n >= 0; set NO_POINTS := 1..n; set NO_SEGMENTS := 1..(n-1); param x{NO_POINTS}; param y{NO_POINTS}; param x_value; var z{i in NO_SEGMENTS} binary; var s{i in NO_SEGMENTS} >= 0; var y_value; s.t. z_constraint : 1 = sum {i in NO_SEGMENTS} z[i]; s.t. s_constraint {i in NO_SEGMENTS} : s[i] <= z[i]; s.t. x_constraint : x_value = sum {i in NO_SEGMENTS} ((z[i] * x[i]) + ((x[i+1] - x[i]) * s[i])); s.t. y_constraint : y_value = sum {i in NO_SEGMENTS} ((z[i] * y[i]) + ((y[i+1] - y[i]) * s[i])); maximize total : y_value; solve; printf "\nx value = %10.2f\n", x_value; printf "y value = %10.2f\n\n", y_value; printf "\n\n"; data; param n := 7; param : x y := 1 -500 49.47 2 -200 51.54 3 -100 51.97 4 100 56.09 5 200 57.25 6 500 155.01 7 1000 450.34 ; param x_value := 260; end;

## Network programming problems[edit]

The term "network programming" derives from operations research. The underlying mathematics falls under graph theory and, in particular, that part devoted to digraphs.

One commonly encountered network programming task is the minimum cost flow problem (mincost) which seeks a set of network flows that minimize a given set of flow-dependent specific costs. The mincost problem can be used to model a surprising wide range of practical optimization questions.

GLPK offers specialist support for network programming through the DIMACS graph data format and various network algorithms.

### DIMACS graph format[edit]

GLPK supports a variation of the DIMACS format for graph theory problems. The details are covered comprehensively in the official documentation and are therefore not repeated here. Additional background can be found on the interoperability page.

### API calls[edit]

GLPK supports the out-of-kilter mincost algorithm through the function `glp_mincost_okalg`. Again consult the official documentation for more details.

In terms of relative complexity, one user reports (April 2011) that, in round terms, out-of-kilter runs about 5 times faster than simplex, for the same problem, and uses 2/3 the memory.

Version 4.49 of GLPK adds the relaxation method from Bertsekas and Tseng for solving minimum cost flow problems. On large instances, RELAX-IV should be 100–1000 times faster than the standard primal simplex method. This feature is a translation from the original Fortran code and is now licensed under GPLv3. The API call is through function `glp_mincost_relax4`. The algorithm itself was developed by Dimitri Bertsekas at the Massachusetts Institute of Technology.^{[5]}

Neither `glp_mincost_relax4` nor `glp_mincost_okalg` allow non-integer data — all flow lower bounds, arc capacities, and per-unit costs should be integer-valued (although they are passed to their respective functions as type double).

### Minimum cost flow (MCF) problem benchmarks[edit]

Below are some benchmarks for a subset of Klingman's standard minimum cost flow instances (generated by `glp_netgen` and `glp_netgen_prob`).

Problem | Nodes | Arcs | Optimum | Primal iterations | simplex time [s] | OKALG time [s] | RELAX-IV time [s] |
---|---|---|---|---|---|---|---|

101 | 5000 | 25536 | 6191726 | 17560 | 104.4 | 32.5 | 0.1 |

102 | 5000 | 25387 | 72337144 | 23633 | 152.3 | 49.5 | 0.2 |

103 | 5000 | 25355 | 218947553 | 28101 | 186.4 | 68.4 | 0.3 |

104 | 5000 | 25344 | −19100371 | 16808 | 104.3 | 117.6 | 0.2 |

105 | 5000 | 25332 | 31192578 | 17239 | 107.4 | 28.0 | 0.2 |

106 | 5000 | 12870 | 4314276 | 11401 | 44.2 | 16.6 | 0.1 |

107 | 5000 | 37832 | 7393769 | 20315 | 171.0 | 48.5 | 0.2 |

108 | 5000 | 50309 | 8405738 | 22806 | 244.5 | 76.7 | 0.3 |

109 | 5000 | 75299 | 9190300 | 27267 | 411.9 | 110.7 | 0.5 |

110 | 5000 | 12825 | 8975048 | 11215 | 44.2 | 17.3 | 0.1 |

OKALG is the out-of-kilter algorithm.

See this posting for more details.

### GLPSOL[edit]

Unfortunately, GLPSOL — the command-line interface to the GLPK solver — does not provide access to the network programming algorithms. The `--mincost` option currently only indicates that the data to be input is in DIMACS format. As of April 2013, GLPSOL solves a mincost problem by converting it to a linear program and then calling the simplex solver. The out-of-kilter and RELAX-IV algorithms are not available in this context.

### Third-party initiatives[edit]

LEMON is an open source graph library written in C++ which, because it is part of the COIN-OR project, can communicate with the GLPK solver. The LEMON interface supports combinatorial optimization tasks related specifically to graphs and networks and can recognize an number common graph-based data structures. LEMON is release under the Boost free software license.

## Sets and numbers[edit]

A set contains objects — such as numbers, strings, and other sets. A set cannot be said to hold numeric entries, but `param`'s can. So the trick is to create a `param` array that translates the number objects in a set into their numeric value. Consider the following code:

model; param tmax; set T:= 1..tmax; param tt{t in T} := t; data; param tmax := 720; end;

Later on in the model section, use the construct:

tt[t]

to recover a genuine value — one that can be manipulated or printed as a floating point number.

## References[edit]

- ↑ Khachiyan, Leonid; Boros, Endre; Borys, Konrad; Elbassioni, Khaled M; Gurvich, Vladimir (2008). "Generating all vertices of a polyhedron is hard".
*Discrete and Computational Geometry***39**: 174-190. doi:10.1007/s00454-008-9050-5. ISSN 0179-5376. http://www.springerlink.com/content/9273811760307743/. - ↑ Provan, J Scott (1994). "Efficient enumeration of the vertices of polyhedra associated with network LP's".
*Mathematical Programming***63**(1-3): 47-64. doi:10.1007/BF01582058. ISSN 0025-5610. http://www.springerlink.com/content/g0jh04630w248865/. - ↑ While a call to the deprecated
`lpx_set_int_parm(lp, LPX_K_ROUND, 1)`will still compile, it has no effect. - ↑ Croxton, Keely L; Gendron, Bernard; Magnanti, Thomas L (July 2002).
*A comparison of mixed-integer programming models for non-convex piecewise linear cost minimization problems — Operations Research Center Working Paper OR 363-02*. Boston, USA: Operations Research Center, Massachusetts Institute of Technology. http://hdl.handle.net/1721.1/5233. - ↑ Bertsekas, Dimitri P; Tseng, Paul (November 1994).
*RELAX-IV: a faster version of the RELAX code for solving minimum cost flow problems*. Boston, USA: LIDS, Massachusetts Institute of Technology. http://web.mit.edu/dimitrib/www/RELAX4_doc.pdf.