MAST20018
Discrete Maths and Operations Research:
Linear Programming
1 / 423
Lecture Outline
Introduction to Linear Programming
The Geometry of Linear Programming
Geometry of LP in Higher Dimensions
Basic Feasible Solutions
Fundamental Theorem of Linear Programming
Preview of the Simplex Method
The Simplex Method
Solution Possibilities
Non-Standard Formulations
Duality Theory
Sensitivity Analysis
§1 – Introduction to Linear Programming
Synopsis
What is mathematical optimisation?
What is linear programming?
What do we mean by linear constraints and a linear objective
function?
§1.1 – Applications of Mathematical Optimisation
Examples of optimisation problems:
Airline scheduling;
Shipping;
Public transportation;
Supply chain and logistics;
Defence;
Mining;
Electricity and water resources;
Emergency and disaster management.
§1.2 – What is Mathematical Optimisation
There are two aspects to optimisation: modelling and solving.
An optimisation model contains:
Decision variables
Objective function
Constraints
§1.3 – Types of Optimisation Formulations
Optimisation models can be placed into distinct classes depending
on the best way to express the objective function or the
constraints.
For deterministic problems, we are most interested in determining
whether the objective function and constraints are linear or
nonlinear, and whether the variables are continuous or
integer-valued.
These factors determine the class of algorithms available to solve
the problem.
Types of Optimisation Formulations
NLP: Nonlinear program        LP: Linear Program     ILP: Integer LP
max f(x) or min f(x)          min cᵀx                min cᵀx
gi(x) = 0, i = 1, ..., k      Ax = a                 Ax = a
hj(x) ≤ 0, j = 1, ..., m      Bx ≤ b                 Bx ≤ b
                              x ≥ 0                  x ≥ 0
                              x ∈ Rn                 x ∈ Zn
In LP and ILP, c ∈ Rn, and A and B are real matrices with n columns.
§1.4 – Solving Linear Programs
There are many linear programming solvers available both
commercially and free for academic use.
Examples of commercial solvers (also with free academic licences)
are
CPLEX,
Gurobi, and
FICO Xpress
Common mathematical software, such as MATLAB, has packages
for solving linear programs.
Microsoft Excel Solver also solves linear (integer) and nonlinear
optimisation problems.
Computational speed-up
Bixby1 tested the various versions of CPLEX on the same
computer. One of the data sets was a linear program with 410,640
constraints and 1,584,000 variables.
1988 CPLEX 1.0 29.8 days
1997 CPLEX 5.0 1.5 hours
2002 CPLEX 8.0 86.7 seconds
2003 CPLEX 9.0 59.1 seconds
According to Bixby, the machine-independent algorithmic
improvement is a factor of 3,300 and the machine improvement a
factor of 1,600, giving a total improvement of 5.2 million times in a
period of 16 years (1988 to 2004)!
1Bob Bixby, Operations Research, Jan 2002, pp. 1-13, updated in 2004.
§1.5 – Linear Programming
These are optimisation problems whose objective functions and
constraints are linear.
Definition 1.1
A real-valued function f, defined on Rn, is a linear function if
and only if there exists c ∈ Rn such that
f(x) = c1x1 + c2x2 + ... + cnxn.
Definition 1.2
A linear equation is an expression of the form
c1x1 + c2x2 + …+ cnxn = b
A linear inequality is an expression of the form
c1x1 + c2x2 + …+ cnxn ≤ b
or
c1x1 + c2x2 + …+ cnxn ≥ b.
An example of a linear optimisation problem with a linear objective
function and linear constraints is:
Example 1.1
z := max_x 4x1 + 3x2
subject to
x1 + x2 ≤ 30
x1 ≥ 0
x2 ≥ 0.
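A minimal Python sketch of Example 1.1 (ours, not part of the original slides): the feasible region is the triangle with corners (0, 0), (30, 0) and (0, 30), and, previewing a result proved later in the course, it is enough to compare the objective at those corners.

```python
# Example 1.1: maximise 4*x1 + 3*x2 subject to x1 + x2 <= 30, x1, x2 >= 0.
# The feasible region is a triangle, so we evaluate the objective at its
# three corner points and take the best.
vertices = [(0, 0), (30, 0), (0, 30)]
z, best = max((4 * x1 + 3 * x2, (x1, x2)) for x1, x2 in vertices)
print(z, best)  # the maximum, 120, is attained at (30, 0)
```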
§1.6 – The Diet Problem
A student wants to plan a nutritious diet, but they are on a limited
budget: they want to spend as little money as possible.
Their minimum nutritional daily requirements are as follows: 2000
calories, 55g protein, 800mg calcium. The student is considering
the following foods2:
Food Serving size Energy (Cal) Protein (g) Calcium (mg) Price per serving ($)
Oatmeal 28g 110 4 2 0.30
Chicken 100g 205 32 12 2.40
Eggs 2 large 160 13 54 1.30
Whole milk 250ml 160 8 285 0.90
Cherry pie 170g 420 4 22 0.20
Pork and beans 260g 260 14 80 1.90
2Taken from Bob Bixby’s notes from the Combinatorial Optimisation at
Work meeting
http://co-at-work.zib.de/berlin2009/downloads/2009-09-24/
The Diet Problem
We can represent the number of servings of each type of food in
the diet by the variables:
x1 the daily number of servings of Oatmeal
x2 the daily number of servings of Chicken
x3 the daily number of servings of Eggs
x4 the daily number of servings of Milk
x5 the daily number of servings of Cherry Pie
x6 the daily number of servings of Pork and Beans
The Diet Problem
and can express the problem as a linear program as follows:
Minimise 0.3x1 + 2.40x2 + 1.3x3 + 0.9x4 + 2.0x5 + 1.9x6,
subject to
110x1 + 205x2 + 160x3 + 160x4 + 420x5 + 260x6 ≥ 2000
4x1 + 32x2 + 13x3 + 8x4 + 4x5 + 14x6 ≥ 55
2x1 + 12x2 + 54x3 + 285x4 + 22x5 + 80x6 ≥ 800
x1, x2, x3, x4, x5, x6 ≥ 0.
Here the first three inequalities are functional constraints.
The constraints x1, x2, x3, x4, x5, x6 ≥ 0 are non-negativity
constraints, not functional constraints.
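A small Python sketch of the diet model: the nutrient data come from the table above and the prices from the objective function; the servings vector tried here is a hypothetical guess of ours, not the optimal diet.

```python
# Nutrient content per serving (oatmeal, chicken, eggs, milk, pie, beans)
energy  = [110, 205, 160, 160, 420, 260]
protein = [4, 32, 13, 8, 4, 14]
calcium = [2, 12, 54, 285, 22, 80]
price   = [0.3, 2.40, 1.3, 0.9, 2.0, 1.9]  # per-serving cost, as in the LP

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_feasible(x):
    """Does the servings vector x meet the minimum daily requirements?"""
    return (dot(energy, x) >= 2000 and dot(protein, x) >= 55
            and dot(calcium, x) >= 800 and min(x) >= 0)

x = (8, 1, 1, 3, 1, 1)  # hypothetical diet: mostly oatmeal, plus milk
print(is_feasible(x), round(dot(price, x), 2))  # feasible, costing $12.70
```

Finding the cheapest feasible diet is exactly the LP above; the checker only verifies candidate solutions.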
§1.7 – Formulation of a Linear Program
We are moving towards a convenient form for expressing linear
programming problems. Let
m denote the number of functional constraints
n denote the number of decision variables
bi denote the available level of resource i, i = 1, 2, 3, …,m
xj denote the level of activity j, j = 1, 2, 3, …, n
z denote the value of the objective function
cj denote the return/cost per unit of activity j
aij denote the amount of resource i consumed/produced by
each unit of activity j.
Assumptions of LP Problems
Activity levels are continuous (divisibility). For activity j, the
activity level xj is a continuous variable.
(However, quite often a problem requires that the activity
levels be integers. In this case, a continuous formulation of
such a problem is only an approximation.)
The contribution to the objective function (and to the LHS of
each constraint) from decision variable xj is proportional to
the level of activity j (proportionality).
Independence of the effects of activities (additivity):
f(x1, …, xn) = c1x1 + …+ cjxj + …+ cnxn,
ai,1×1 + …+ ai,jxj + …+ ai,nxn ≤ bi,
or ai,1×1 + …+ ai,jxj + …+ ai,nxn ≥ bi.
The input parameters are known with certainty.
Steps to Take
The steps that need to be taken are:
1. Choose decision variables
These are the items about which we need to make decisions.
For linear programs they are continuous. For example, we may
need to determine the start times for several activities. We
could then let our decision variables be: xi := the start time
for activity i. Remember to include units in your definition of
the decision variables.
2. Write down the objective function
This is a linear expression that we wish to either maximise or
minimise. For example, we wish to minimise the makespan of
a project, or we wish to maximise profit.
3. Write down the constraints (and do not forget to write that
the variables are non-negative).
Steps to Take
Example 1.2
A steel company must decide how to allocate next week’s time on
a rolling mill, which is a machine that takes slabs of steel and can
produce either bands (at 200 tonnes/hour) or coils (at 140
tonnes/hour).
Bands and coils can be sold for $25/tonne and $30/tonne
respectively.
Based on currently booked orders, the company must produce at
least 6000 tonnes of bands and 4000 tonnes of coils.
Given that there are 40 hours of production time this week, how
many tonnes of bands and coils should be produced to yield the
greatest revenue?
Steps to Take
Choose the decision variables:
Let x1 be the amount of time (in hours) devoted to making
bands
Let x2 be the amount of time (in hours) devoted to making
coils.
Write down the objective function:
This is max (25 × 200)x1 + (30 × 140)x2, i.e. max 5000x1 + 4200x2.
Write down the constraints:
We need at least 6000 tonnes of bands, so 200x1 ≥ 6000, and
4000 tonnes of coils, so 140x2 ≥ 4000. We are limited to 40
hours of production time, so x1 + x2 ≤ 40. The decision
variables must be nonnegative, i.e. x1, x2 ≥ 0.
Question: Is this LP feasible?
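The question can be settled with a short calculation; here is a sketch of it in Python (ours, using exact fractions), looking only at the hours each demand constraint forces.

```python
from fractions import Fraction

# The demand constraints alone force x1 >= 6000/200 and x2 >= 4000/140.
min_bands = Fraction(6000, 200)     # x1 >= 30 hours
min_coils = Fraction(4000, 140)     # x2 >= 200/7, about 28.6 hours
min_hours = min_bands + min_coils   # 410/7, about 58.6 hours in total
print(min_hours > 40)  # True: the two demands alone exceed the 40 hours
```

So the LP is infeasible: no production plan can satisfy both booked orders within the 40-hour week.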
Example 1.3 (The (balanced) transportation problem)
This is a classical Operations Research problem with
a supply of some commodity;
destinations where the commodity is needed.
The total available supply equals total demand.
The formulation can be adapted for general allocation and
scheduling problems.
(The facts in this example have been amended to simplify the
problem.)
Adelaide has two water catchment storage facilities:
Storage 1 can store up to 400 megalitres per day
Storage 2 can store up to 500 megalitres per day
We assume that the storage facilities are topped up by a third dam
with infinite capacity – the storage facilities can therefore supply
any amount up to their maximum on demand.
Three secondary dams are supplied from these two facilities:
Barossa (Dam 1) needs at least 300 megalitres per day
Happy Valley (Dam 2) needs at least 200 megalitres per day
Kangaroo Creek (Dam 3) needs at least 400 megalitres per
day
The distances between the storage facilities and the secondary
dams are as follows (in kilometres):
Dam 1 Dam 2 Dam 3
Storage Facility 1 75 40 85
Storage Facility 2 20 55 75
Task 1
Formulate a linear program that minimises the total transportation
distances to meet the demands of the secondary dams, such that
capacities of the storage facilities are not violated.
Analysis
Decision Variables:
Let xij be the quantity of water delivered from catchment i to
location j, i = 1, 2; j = 1, 2, 3.
Objective function:
f(x) := 75x11 + 40x12 + 85x13 + 20x21 + 55x22 + 75x23.
Constraints:
Production capacities:
x11 + x12 + x13 ≤ 400 (storage 1)
x21 + x22 + x23 ≤ 500 (storage 2).
Location requirements:
x11 + x21 ≥ 300 (location 1)
x12 + x22 ≥ 200 (location 2)
x13 + x23 ≥ 400 (location 3).
Non-negativity:
x11, x12, x13, x21, x22, x23 ≥ 0.
Complete Formulation
z := min 75x11 + 40x12 + 85x13 + 20x21 + 55x22 + 75x23,
such that
x11 + x12 + x13 ≤ 400
x21 + x22 + x23 ≤ 500
x11 + x21 ≥ 300
x12 + x22 ≥ 200
x13 + x23 ≥ 400
x11, x12, x13, x21, x22, x23 ≥ 0.
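The formulation above can be checked with a brute-force Python sketch (ours). All supply and demand figures are multiples of 100, and a transportation problem with integer data has an integer optimal extreme point, so searching over multiples of 100 is sufficient here; each demand is met exactly, since over-delivering can only increase the objective.

```python
from itertools import product

# Distances, supplies and demands from the water transportation example.
dist = [[75, 40, 85], [20, 55, 75]]
supply = [400, 500]
demand = [300, 200, 400]

best = None
for x1 in product(range(0, 401, 100), repeat=3):  # storage 1 deliveries
    if sum(x1) > supply[0]:
        continue
    x2 = tuple(d - v for d, v in zip(demand, x1))  # storage 2 covers rest
    if min(x2) < 0 or sum(x2) > supply[1]:
        continue
    z = sum(c * v for c, v in zip(dist[0], x1)) \
        + sum(c * v for c, v in zip(dist[1], x2))
    if best is None or z < best[0]:
        best = (z, x1, x2)
print(best)  # (46000, (0, 200, 200), (300, 0, 200))
```

The minimum, 46000, sends Barossa's water from Storage 2, Happy Valley's from Storage 1, and splits Kangaroo Creek between them.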
§2 – The Geometry of Linear Programming
Synopsis
What is a feasible region?
How do we obtain the optimal solution via the graphical
method?
Are there pathological cases?
§2.1 – The Feasible Region
There are strong relationships between the geometric and algebraic
features of LP problems.
We shall first examine this aspect in two dimensions (n = 2) and
then consider higher dimensions, where an understanding of the
relationship between the geometric and algebraic aspects of linear
programming problems becomes essential.
Example 2.1
Consider the problem
z := max 4x1 + 3x2
subject to
2x1 + x2 ≤ 40 (production machine time)
x1 + x2 ≤ 30 (production packaging time)
x1 ≤ 15 (market demand)
x1 ≥ 0
x2 ≥ 0.
First constraint:
2x1 + x2 ≤ 40
Corresponding hyperplane:
2x1 + x2 = 40
Corresponding halfspace:
2x1 + x2 ≤ 40
Second constraint:
x1 + x2 ≤ 30
Corresponding hyperplane:
x1 + x2 = 30
Corresponding halfspace:
x1 + x2 ≤ 30
Third constraint:
x1 ≤ 15
Corresponding hyperplane:
x1 = 15
Corresponding halfspace:
x1 ≤ 15
Taking note of the nonnegativity constraints, the overall feasible
region is therefore:
[Figure: the feasible region bounded by the three constraints and the axes]
Definition 2.1
For an LP problem with n variables, a vector x ∈ Rn is called a
feasible solution if it satisfies all constraints of the problem.
Definition 2.2
The set of all feasible solutions of an LP problem is called the
feasible region.
§2.2 – The Graphical Method
We can use a graphical method to solve LP problems in two
dimensions.
The objective function is
f(x) = z = 4x1 + 3x2
Hence
x2 = (z − 4x1)/3
so that, for a given value of z, the level curve is a straight
line with slope −4/3. We can plot it for various values of z.
We can identify the (x1, x2) pair that yields the largest
feasible value for z.
Important Observations
The graphical method can be used to identify the hyperplanes
specifying the optimal solution.
The optimal solution itself is determined by solving the
respective equations.
Don’t be tempted to “read” the optimal solution directly from
the graph.
The optimal solution in this example is an extreme point of
the feasible region.
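The observations above can be sketched in Python (ours, a minimal vertex-enumeration idea, not an industrial method): intersect the constraint boundaries pairwise, keep the feasible intersection points, and compare the objective there rather than reading values off the graph.

```python
from itertools import combinations

# Constraints of Example 2.1, each written as a*x1 + b*x2 <= c.
cons = [(2, 1, 40), (1, 1, 30), (1, 0, 15), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Intersection point of the two boundary lines, or None if parallel."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in cons)

verts = [p for c1, c2 in combinations(cons, 2)
         if (p := intersect(c1, c2)) and feasible(p)]
z, x = max((4 * p[0] + 3 * p[1], p) for p in verts)
print(z, x)  # 100.0 at (10.0, 20.0)
```

The optimum sits exactly where the machine-time and packaging-time hyperplanes meet, confirming that the solution is obtained by solving 2x1 + x2 = 40 and x1 + x2 = 30.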
Questions
What guarantee is there that an optimal solution exists?
In fact, is there any a priori guarantee that a feasible solution
exists?
Could there be more than one optimal solution?
§2.3 – ‘Pathological’ Cases (infinitely many solutions)
For two-variable problems the above method works well. However
‘pathological’ cases still exist.
What if z = 4x1 + 2x2 in the previous example?
There are infinitely many points in the feasible region where
z = 80.
§2.3 – “Pathological” Cases (infeasibility)
What if we had another constraint 4x1 + 3x2 ≥ 150?
In this case the feasible region is empty. Since there is no feasible
solution, there is definitely no optimal solution.
§2.3 – “Pathological” Cases (LP is unbounded)
What if we only had the constraint 4x1 + 3x2 ≥ 101?
In this case the feasible region is unbounded. For certain
optimisation problems (for example, if we are minimising
z = 4x1 + 3x2) there can still be an optimal solution. However, if
the objective function increases as we 'head further into the
unbounded region', no optimal solution exists.
§2.3 – “Pathological” Cases (feasible region is not closed)
Sometimes the feasible region is not closed. For example,
2x1 + x2 < 40
x1 + x2 < 30
x1 ≤ 15
x1 ≥ 0
x2 ≥ 0
The ‘corner point’ at (10, 20) is not feasible.
BUT, we will only ever consider closed regions in practice.
§3 – The Geometry of LP in Higher Dimensions
Synopsis
What are convex sets?
What is a hyperplane?
What is a halfspace?
What is linear independence?
What is a polytope?
How do we get from standard form to canonical form?
What are slack variables?
§3.1 – The Geometry in Higher Dimensions
Let x(1), ..., x(k) be a collection of points in Rn.
Definition 3.1
A point w such that
w = α1x(1) + α2x(2) + ... + αkx(k)
is a linear combination of x(1), ..., x(k).
Definition 3.2
A point w such that
w = α1x(1) + α2x(2) + ... + αkx(k),
where 0 ≤ αi ≤ 1 and Σi αi = 1, is a convex combination of
x(1), ..., x(k).
Example 3.1
The point x = (3, 2, 1) is a linear combination of
x(1) = (1, 0, 0)
x(2) = (0, 1, 0)
x(3) = (0, 0, 1),
using the coefficients α1 = 3, α2 = 2 and α3 = 1.
The point y = (9, 4, 1) is not a linear combination of
x(1) = (3, 1, 0)
x(2) = (2, 4, 0)
x(3) = (4, 3, 0).
Why?
Example 3.2
The point x = (1/2, 1/3, 1/6) is a convex combination of
x(1) = (1, 0, 0)
x(2) = (0, 1, 0)
x(3) = (0, 0, 1),
using the coefficients α1 = 1/2, α2 = 1/3 and α3 = 1/6.
The point y = (3, 3, 2) is a convex combination of
x(1) = (3, 1, 0)
x(2) = (2, 4, 0)
x(3) = (4, 4, 6).
Why?
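To answer the "Why?" for y = (3, 3, 2), here is a Python check (ours); the coefficients α = (1/3, 1/3, 1/3) are our guess, verified below with exact rational arithmetic.

```python
from fractions import Fraction

# Example 3.2: exhibit coefficients showing y = (3, 3, 2) is a convex
# combination of the three given points.
pts = [(3, 1, 0), (2, 4, 0), (4, 4, 6)]
alpha = [Fraction(1, 3)] * 3  # each in [0, 1], summing to 1
w = tuple(sum(a * p[k] for a, p in zip(alpha, pts)) for k in range(3))
print(w, sum(alpha))  # w equals (3, 3, 2) and the alphas sum to 1
```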
Definition 3.3
The line segment joining two points y, z in Rn is the collection of
all points x such that
x = λy + (1 − λ)z
for some 0 ≤ λ ≤ 1.
Definition 3.4
A subset C of Rn is convex if for any two points y, z ∈ C and any
0 ≤ λ ≤ 1, the point
λy + (1 − λ)z
is also in C.
Geometrically this means that, for any pair of points y, z ∈ C, the
line segment connecting these points is in C.
Theorem 3.1
Let C1, ..., Cn be a collection of convex sets. Then
C1 ∩ C2 ∩ ... ∩ Cn
is a convex set.
That is, the intersection of any finite number of convex sets is a
convex set.
Proof.
Definition 3.5
A set of points H ⊆ Rn satisfying a linear equation of the form
a1x1 + a2x2 + ... + anxn = b,
for (a1, a2, ..., an) ≠ (0, 0, ..., 0), is a hyperplane.
Hyperplanes are of dimension n − 1. (Why?)
Definition 3.6
The two closed half-spaces defined by the hyperplane
a1x1 + a2x2 + ...+ anxn = b
are the sets defined by
a1x1 + a2x2 + ... + anxn ≥ b (positive half-space)
and
a1x1 + a2x2 + ... + anxn ≤ b (negative half-space).
Theorem 3.2
Hyperplanes and their half-spaces are convex sets.
Proof.
Definition 3.7
A polytope is a set that can be expressed as the intersection of a
finite number of closed half-spaces.
Definition 3.8
A polyhedron is a non-empty bounded polytope.
Definition 3.9
A point x of a convex set C is said to be an extreme point or
vertex of C if whenever we write
x = λy + (1 − λ)z,
with y, z ∈ C and 0 < λ < 1, then y = z = x.
Geometrically, this means that there are no points y and z in C
(different from x) with x lying on the line segment connecting y
and z.
§3.2 – Linear Independence
Definition 3.10
A collection of vectors x(1), ..., x(k) in Rn is said to be linearly
independent if
α1x(1) + α2x(2) + ... + αkx(k) = 0
implies that
α1 = α2 = ... = αk = 0.
Geometrically this means that no vector in the collection can be
expressed as a linear combination of the other vectors in the
collection.
Example 3.3
The vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) are linearly
independent.
The vectors (2, 4, 3), (1, 2, 3), (1, 2, 0) are not linearly
independent.
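A Python sketch (ours) for checking linear independence: a collection of vectors is linearly independent exactly when the matrix having them as rows has rank equal to the number of vectors, and the rank can be computed by Gaussian elimination over exact rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

print(rank([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # 3: linearly independent
print(rank([(2, 4, 3), (1, 2, 3), (1, 2, 0)]))  # 2: dependent
```

Indeed (2, 4, 3) − (1, 2, 3) − (1, 2, 0) = 0, so the second collection is not linearly independent.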
Theorem 3.3
The set of feasible solutions of the standard LP problem defined by
a11x1 + a12x2 + ...+ a1nxn ≤ b1
a21x1 + a22x2 + ...+ a2nxn ≤ b2
...
am1x1 + am2x2 + ...+ amnxn ≤ bm
xj ≥ 0, j = 1, ..., n
is a convex polytope.
Proof.
By definition, a polytope is the intersection of finitely many
half-spaces. We have seen
a half-space is a convex set,
the intersection of any finite number of convex sets is a
convex set.
The fact that the polytope is convex follows.
Theorem 3.4
If a linear programming problem has exactly one optimal solution,
then this solution must be an extreme point of the feasible region.
Proof.
We shall prove this theorem by contradiction.
Assume, to the contrary, that the problem has exactly one optimal
solution, call it x, that is not an extreme point of the feasible
region.
Since x is not an extreme point, there are two distinct feasible
solutions, say y and z, which are not equal to x and a scalar λ,
0 < λ < 1, such that
x = λy + (1 − λ)z.
If we rewrite the objective function in terms of y and z rather than
x, we obtain
f(x) = f(λy + (1 − λ)z).
Hence
f(x) = Σ_{j=1}^n cj xj
     = Σ_{j=1}^n cj [λyj + (1 − λ)zj]
     = λ Σ_{j=1}^n cj yj + (1 − λ) Σ_{j=1}^n cj zj
     = λf(y) + (1 − λ)f(z).
Now, because 0 < λ < 1, there are only three possibilities
1. f(y) < f(x) < f(z)
2. f(z) < f(x) < f(y)
3. f(y) = f(x) = f(z)
Since x is an optimal solution, the first two possibilities cannot
occur (why?). Thus, the third case must hold.
But then y and z are also optimal, which contradicts the assertion
that x is the only optimal solution. Our initial assumption that x is
not an extreme point must be wrong.
As an exercise, prove the following:
Theorem 3.5
If a linear programming problem has more than one optimal
solution, it must have infinitely many optimal solutions.
Furthermore, the set of optimal solutions is convex.
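Theorem 3.5 can be illustrated numerically (our sketch) on the "infinitely many solutions" case above, where (10, 20) and (15, 10) both achieve z = 80 for the objective z = 4x1 + 2x2; exact fractions avoid rounding issues on the boundary.

```python
from fractions import Fraction

# (10, 20) and (15, 10) are both feasible for Example 2.1 and both give
# z = 4*x1 + 2*x2 = 80, so every convex combination of them should be
# feasible and optimal too.
y, z_pt = (10, 20), (15, 10)
for k in range(11):
    lam = Fraction(k, 10)
    p = tuple(lam * a + (1 - lam) * b for a, b in zip(y, z_pt))
    # feasibility in the region of Example 2.1:
    assert 2 * p[0] + p[1] <= 40 and p[0] + p[1] <= 30 and p[0] <= 15
    # optimality: the objective value is unchanged along the segment
    assert 4 * p[0] + 2 * p[1] == 80
print("all convex combinations are feasible and give z = 80")
```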
§3.3 – Standard Form
We would like to write our linear programs in a fixed format so
that it is easier to apply algorithms, such as the Simplex and
Interior Point methods. One such format is the standard form,
which means that
it must be a maximisation problem;
the constraints must be ≤;
bi ≥ 0 for all i (the RHS must be non-negative);
xi ≥ 0 for all i (all variables are non-negative).
In some cases, we are not able to achieve bi ≥ 0 and ≤
simultaneously, and therefore the LP cannot be written in standard
form. Later we will introduce the so-called canonical form, and
show that any LP can be written in this form.
For the standard form conversion you should familiarise yourself
with converting from:
a minimum objective function to a maximum;
the case where some RHS constants are negative;
some constraints are ≥;
some variables are unrestricted.
Definition 3.11
An LP problem is in standard form if it is of the form:
max_x z = Σ_{j=1}^n cj xj
a11x1 + a12x2 + ...+ a1nxn ≤ b1
a21x1 + a22x2 + ...+ a2nxn ≤ b2
...
am1x1 + am2x2 + ...+ amnxn ≤ bm
xj ≥ 0, j = 1, ..., n
where bi ≥ 0 for all i.
With
A =
⎛ a11 a12 ... a1n ⎞
⎜ a21 a22 ... a2n ⎟
⎜ ...             ⎟
⎝ am1 am2 ... amn ⎠
b = (b1, b2, ..., bm)ᵀ, c = (c1, c2, ..., cn)ᵀ, x = (x1, x2, ..., xn)ᵀ,
an LP in standard form can be written as:
max_x z = cᵀx
Ax ≤ b
x ≥ 0
with b ≥ 0.
The matrix A is called the coefficient matrix of the linear program.
Often we write A = (aij).
§3.4 – Canonical Form
If we have an LP in standard form, we can turn it into an LP in
canonical form by introducing new variables xj ≥ 0,
j = n+ 1, ..., n+m that turn the inequalities into equations:
max z = Σ_{j=1}^n cj xj
a11x1 + a12x2 + ...+ a1nxn + xn+1 = b1
a21x1 + a22x2 + ...+ a2nxn + xn+2 = b2
...
am1x1 + am2x2 + ...+ amnxn + xn+m = bm
xj ≥ 0, j = 1, ..., n+m
As in the standard form, bi ≥ 0 for all i.
The variables xj ≥ 0, j = n+ 1, ..., n+m are called slack
variables.
Physically, they denote the difference between the right hand side
and the left hand side of the inequalities in the standard form.
When a slack variable takes the value zero, the corresponding
inequality is satisfied with equality.
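The standard-to-canonical conversion can be sketched in a few lines of Python (ours, using the constraint data of Example 2.1): append one slack column per constraint and read off the trivial basic feasible solution.

```python
# Standard-form data of Example 2.1: Ax <= b with x >= 0.
A = [[2, 1], [1, 1], [1, 0]]
b = [40, 30, 15]

m = len(A)
# Append an m x m identity block: one slack column per constraint.
A_can = [row + [1 if i == j else 0 for j in range(m)]
         for i, row in enumerate(A)]
# Trivial basic feasible solution: original variables 0, slacks = b.
x = [0, 0] + b
for row, bi in zip(A_can, b):
    assert sum(a * v for a, v in zip(row, x)) == bi
print(A_can)  # each constraint now holds with equality
```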
Lemma 3.1
When bi ≥ 0 for all i, the canonical form
max z = Σ_{j=1}^n cj xj
a11x1 + a12x2 + ...+ a1nxn + xn+1 = b1
a21x1 + a22x2 + ...+ a2nxn + xn+2 = b2
...
...
am1x1 + am2x2 + ...+ amnxn + xn+m = bm
xj ≥ 0, j = 1, ..., n+m
has at least one feasible solution, x = (0, 0, 0, ..., 0, b1, b2, ..., bm).
This solution is obtained by:
Setting all the original variables to zero
Setting the slack variables to the respective right-hand side
values.
Definition 3.12
In general, an LP problem
max z = Σ_{j=1}^{n+m} cj xj
a11x1 + ...+ a1nxn + a1,n+1xn+1 + ...+ a1,n+mxn+m = b1
a21x1 + ...+ a2nxn + a2,n+1xn+1 + ...+ a2,n+mxn+m = b2
...
am1x1 + ...+ amnxn + am,n+1xn+1 + ...+ am,n+mxn+m = bm
xj ≥ 0, j = 1, ..., n+m
is in canonical form if
there exist m columns of A = (aij) which are
(1, 0, ..., 0)ᵀ, (0, 1, ..., 0)ᵀ, ..., (0, 0, ..., 1)ᵀ,
respectively.
bi ≥ 0 for all i.
Example 3.4
The LP problem
max z = 2x1 − 3x2 + 4x3 + x5
2x1 + 0x2 + 1x3 + 3x4 + 0x5 = 6
x1 + 1x2 + 0x3 + 0x4 + 0x5 = 2
2x1 + 0x2 + 0x3 + 6x4 + 1x5 = 2
x1, x2, x3, x4, x5 ≥ 0
is in canonical form because the 3rd, 2nd and 5th columns of the
coefficient matrix are, respectively,
(1, 0, 0)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ.
Note that (0, 2, 6, 0, 2) is a feasible solution to this LP problem.
Lemma 3.2
Suppose
max z = Σ_{j=1}^{n+m} cj xj
a11x1 + ...+ a1nxn + a1,n+1xn+1 + ...+ a1,n+mxn+m = b1
...
am1x1 + ...+ amnxn + am,n+1xn+1 + ...+ am,n+mxn+m = bm
xj ≥ 0, j = 1, ..., n+m
is in canonical form and columns j1, ..., jm are
(1, 0, ..., 0)ᵀ, (0, 1, ..., 0)ᵀ, ..., (0, 0, ..., 1)ᵀ,
respectively. Then
the LP has at least one feasible solution, which is given by:
Setting xj1 = b1, . . . , xjm = bm
Setting all other variables to zero.
§4 – Basic Feasible Solutions
Synopsis
What is a basic feasible solution (bfs)?
Extreme points correspond to basic feasible solutions, and vice
versa.
§4.1 – Basic Feasible Solutions
Consider the system of m linear equations with k variables, where
k ≥ m,
a11x1 + ... + a1kxk = b1
...
am1x1 + ... + amkxk = bm
and assume that the matrix A = (aij) has rank m.
Definition 4.1
A basic solution to the system is constructed by:
Selecting m columns whose coefficient vectors are linearly
independent;
Setting the k − m variables that correspond to the other
columns to zero;
Solving the resulting m × m system of equations to assign
values to the variables that correspond to the selected
columns.
The variables corresponding to the selected m columns are basic
variables, and the other variables are nonbasic variables.
Definition 4.2
A basic feasible solution of an LP problem
max / min z = cᵀx
Ax = b
x ≥ 0,
where rank(A) is equal to the number of rows of A, is a basic
solution of the linear equation system Ax = b that also satisfies
the non-negativity constraints xi ≥ 0 for all i.
Any basic feasible solution is a feasible solution, but the
converse is not true.
Example 4.1
Standard Form
The LP
max_x 3x1 + 4x2
2x1 + x2 ≤ 4
x1 + 2x2 ≤ 3
x1, x2 ≥ 0,
is in standard form.
Canonical Form
The corresponding canonical form is
max_x 3x1 + 4x2
2x1 + x2 + x3 = 4
x1 + 2x2 + x4 = 3
x1, x2, x3, x4 ≥ 0
The ‘trivial basic feasible solution’, derived by selecting variables
x3 and x4 to be basic is x = (0, 0, 4, 3).
What are the other basic feasible solutions?
Suppose we select x2 and x3 to be basic. Then, the reduced
system is
x2 + x3 = 4
2x2 = 3.
This yields the basic feasible solution
x = (0, 3/2, 5/2, 0).
If we select x1 and x2 to be the basic variables, the reduced system
is
2x1 + x2 = 4
x1 + 2x2 = 3.
This yields the basic feasible solution
x = (5/3, 2/3, 0, 0).
If we select x1 and x3 as basic variables, the reduced system is
2x1 + x3 = 4
x1 = 3
This yields the basic solution
x = (3, 0, −2, 0)
which is not feasible.
The two other basic solutions are (2, 0, 0, 1), which is basic
feasible, and (0, 4, 0, −5), which is not basic feasible.
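The enumeration just carried out by hand can be sketched in Python (ours): choose 2 of the 4 columns of the canonical system, solve the resulting 2 × 2 system by Cramer's rule, and flag which basic solutions are feasible.

```python
from fractions import Fraction
from itertools import combinations

# Canonical form of Example 4.1:
#   2x1 +  x2 + x3      = 4
#    x1 + 2x2      + x4 = 3
A = [[2, 1, 1, 0], [1, 2, 0, 1]]
b = [4, 3]

def solve_basis(j, k):
    """Values of the two basic variables, via Cramer's rule."""
    det = A[0][j] * A[1][k] - A[0][k] * A[1][j]
    if det == 0:
        return None  # chosen columns are not linearly independent
    xj = Fraction(b[0] * A[1][k] - b[1] * A[0][k], det)
    xk = Fraction(A[0][j] * b[1] - A[1][j] * b[0], det)
    return xj, xk

basic_solutions = []
for j, k in combinations(range(4), 2):
    sol = solve_basis(j, k)
    if sol is None:
        continue
    x = [Fraction(0)] * 4
    x[j], x[k] = sol
    basic_solutions.append((tuple(x), min(x) >= 0))

for x, feas in basic_solutions:
    print(x, "feasible" if feas else "not feasible")
```

This reproduces the slide's findings: six basic solutions, of which four are basic feasible.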
Lemma 4.1
(Lemma 3.2 restated) If
max z = Σ_{j=1}^{n+m} cj xj
a11x1 + ...+ a1nxn + a1,n+1xn+1 + ...+ a1,n+mxn+m = b1
...
am1x1 + ...+ amnxn + am,n+1xn+1 + ...+ am,n+mxn+m = bm
xj ≥ 0, j = 1, ..., n+m
is in canonical form and columns j1, j2, ..., jm are
(1, 0, ..., 0)ᵀ, (0, 1, ..., 0)ᵀ, ..., (0, 0, ..., 1)ᵀ,
respectively.
Then x = (x1, . . . , xn, xn+1, . . . , xn+m) ∈ Rn+m defined by
xj1 = b1, . . . , xjm = bm, all other xj = 0
is a basic feasible solution.
Example 4.2
The LP problem
max z = 2x1 − 3x2 + 4x3 + x5
2x1 + 0x2 + 1x3 + 3x4 + 0x5 = 6
x1 + 1x2 + 0x3 + 0x4 + 0x5 = 2
2x1 + 0x2 + 0x3 + 6x4 + 1x5 = 2
x1, x2, x3, x4, x5 ≥ 0
is in canonical form.
So x = (0, 2, 6, 0, 2) is a basic feasible solution, with x2, x3 and x5
basic variables.
§4.2 – Extreme Points and Basic Feasible Solutions
Why are we interested in basic feasible solutions?
It is because they correspond to extreme points.
Geometry           Algebra
Extreme Point  ↔  Basic Feasible Solution
Consider the standard form linear programming problem:
max_x z = Σ_{j=1}^n cj xj
such that
a11x1 + ... + a1nxn ≤ b1
a21x1 + ... + a2nxn ≤ b2
...
am1x1 + ... + amnxn ≤ bm
xj ≥ 0, j = 1, ..., n.
The feasible region is a convex polytope S, which is a subset of
Rn.
The canonical form of this linear program is
max_x z = Σ_{j=1}^n cj xj
such that
a11x1 + ... + a1nxn + ... + a1,n+mxn+m = b1
a21x1 + ... + a2nxn + ... + a2,n+mxn+m = b2
...
am1x1 + ... + amnxn + ... + am,n+mxn+m = bm
xj ≥ 0, j = 1, ..., n + m.
where the columns j = n + 1, ..., n + m of (aij) form the
m × m identity matrix.
The coefficient matrix (aij) for i = 1, . . .m and j = 1, . . . n+m
clearly has rank m (since the final m columns are linearly
independent).
By the definition of standard form, we also know that bi ≥ 0, for
all i.
Theorem 4.1
The vector x ≡ (x1, ..., xn) in Rn is an extreme point of S if and
only if the corresponding vector (x1, ..., xn, xn+1, ..., xn+m) in
Rn+m is a basic feasible solution of the set of equations
a11x1 + ... + a1nxn + ... + a1,n+mxn+m = b1
a21x1 + ... + a2nxn + ... + a2,n+mxn+m = b2
...
am1x1 + ... + amnxn + ... + am,n+mxn+m = bm
xj ≥ 0, j = 1, ..., n + m.
Proof:
This is an ‘if and only if’ theorem. Frequently we have to prove
theorems like this by dealing with the ‘if’ and ‘only if’ parts
separately. We do this here.
‘if’
In this part we have to show that if x = (x1, . . . , xn+m) is a basic
feasible solution of the equations then x is an extreme point of S.
We use ideas that we shall return to later in the subject.
Corresponding to any basic feasible solution x ∈ Rn+m, we can
partition the set of indices into those corresponding to basic
variables, which correspond to the m linearly independent columns
of the coefficient matrix, and those corresponding to non-basic
variables which are set to be zero. Thus we can write
x = [xB, 0].
A slight subtlety is that some of the components in xB can also
be zero.
We can also partition A according to basic and non-basic variables,
so that
A = [AB, ANB].
Then
Ax = b
is the same as
[AB, ANB] [xB, 0]ᵀ = b,
which implies that
ABxB = b.
By definition, the columns of AB are linearly independent and so
AB is nonsingular. Thus
xB = AB⁻¹ b.
Let y ≡ (y1, . . . , yn+m) be a feasible solution to the set of
equations. Because the final m columns of the coefficient matrix
correspond to the identity matrix, we know that, for any
i = 1, . . . ,m,
yn+i = bi − Σ_{j=1}^n aij yj.
This means the first n coordinates (y1, ..., yn) determine the full
vector y = (y1, ..., yn+m), and vice versa.
With this machinery, we can now prove the ‘if’ part. We use proof
by contradiction.
Let x ≡ (x1, . . . , xn+m) be a basic feasible solution to the set of
equations, and assume that x ≡ (x1, . . . , xn) in Rn is not an
extreme point of S.
Then there exist feasible y = (y1, ..., yn) and z = (z1, ..., zn),
not equal to x, and λ ∈ (0, 1) such that
x = λy + (1 − λ)z.
Since y and z are feasible, we can add nonnegative slack variables
(yn+1, . . . , yn+m) and (zn+1, . . . , zn+m), as on the previous slide,
such that
Ay = b
Az = b.
where y = (y1, . . . , yn+m) and z = (z1, . . . , zn+m).
Moreover, since, for any i = 1, . . . ,m,
xn+i = bi − Σ_{j=1}^n aij xj,
we know that
xn+i = λyn+i + (1 − λ)zn+i
and so we can write
x = λy + (1 − λ)z.
Partition y and z according to the basic/non-basic partition of x so
that y = [yB, yNB] and z = [zB, zNB].
Looking first at the non-basic variables, we observe that
0 = λyNB + (1 − λ)zNB.
Furthermore yNB and zNB are nonnegative. This implies that
yNB = zNB = 0.
It follows that
[AB, ANB] [yB, 0]ᵀ = b
and
[AB, ANB] [zB, 0]ᵀ = b,
which, because AB is nonsingular, implies that
yB = zB = AB⁻¹ b.
Thus x, y and z are all the same, which implies that (x1, . . . , xn),
(y1, . . . , yn) and (z1, . . . , zn) are all the same, and this contradicts
our assumption that y and z are not equal to x.
We conclude that x = (x1, . . . , xn) cannot be written as a convex
combination of two different feasible points. It must therefore be
an extreme point.
‘only if’
In this part we have to show that if x = (x1, . . . , xn) is an extreme
point of S, then x = (x1, . . . , xn+m), where
xn+i = bi − Σ_{j=1}^n aij xj,
is a basic feasible solution of the constraint equations in the
canonical form.
We start by partitioning x in a different manner from the ‘if’ part
of the proof.
Here we partition x into [x+, 0], where the components in x+ are
strictly positive.
We also partition A in the same way so that A = [A+, A0] and
[A+, A0] [x+, 0]ᵀ = b.
So we have
A+ x+ = b.
We want to show that the columns of A+ are linearly independent.
Again, we do this using a proof by contradiction.
Assume that the columns of A+ are not linearly independent.
Then there is a non-zero w+ (not necessarily nonnegative) such
that
A+w+ = 0.
Since x+ > 0, there exists ε > 0 such that x+ > εw+ and
x+ > −εw+. (These inequalities should be read coordinate-wise.)
We have
[A+, A0] [x+ + εw+, 0]ᵀ = A+(x+ + εw+) = A+x+ + εA+w+ = b.
Since x+ + εw+ > 0, (x+ + εw+, 0) is a feasible solution to the
constraint equations, and its first