Instructors - Selin, Xingyin
Component | Weightage (%) |
---|---|
Midterms | 32.5 |
Finals | 32.5 |
Quiz 1 | 15 |
Quiz 2 | 15 |
Class participation | 3 |
Course evaluation | 2 |
Week | Content |
---|---|
1 | Introduction, modeling with linear programming, geometry of linear programs. Reformulation tricks (absolute value, minimax, maximin). Canonical form of an LP and reduction to canonical form. |
2 | The simplex method in matrix form: simplex tableau, basic feasible solutions, reduced costs, pivoting. Phase I, Phase II. Anti-cycling rules. |
3 | Sensitivity analysis. Alternative optima. |
4 | Dual of a linear program, relationship to sensitivity analysis. Weak and strong duality theorem. |
5 | Two-player zero-sum games. The minimax theorem. |
6 | Introduction to networks and Dijkstra’s algorithm. |
7 | Recess week. |
8 | Max-flow / min-cut. Maximum matching. The min-cost flow problem. |
9 | Network simplex algorithm. Applications. |
10 | Integer programming formulations. Modeling with binary variables. LP relaxation. |
11 | Branch-and-bound. Cutting planes: knapsack covers, Gomory fractional cuts. |
12 | Dynamic programming: shortest paths, Bellman’s principle of optimality. |
13 | Traveling Salesman Problem |
14 | Review session. Final exam |
Procedure for solving a problem with linear programming.
1) Transform the problem into a linear program.
2) Transform the linear program into a standard linear program.
3) Ensure the standard linear program has linearly independent rows.
4) Obtain a canonical form of the linear program.
5) Obtain the optimal solution of the linear program.
6) Conduct sensitivity analysis and consider edge cases.
Consider edge cases as well [G] and multiple solutions [H]
“General” linear program (one in which the objective function is linear and the constraints are linear equalities or inequalities).
\[\begin{align*} max/min &\quad \vec{c}^T\vec{x} \\ s.t. &\quad \vec{a}_i^T\vec{x} \enspace \geq \text{or} = \text{or} \leq \enspace b_i \quad \forall i \\ \end{align*}\]Note that all inequalities are non-strict. Strict inequalities must first be transformed into non-strict ones.
Standard form of a linear program
\[\begin{align*} \max \quad \vec{c}^T \vec{x} \\ s.t. \quad A\vec{x} &= \vec{b} \geq \vec{0} \\ \vec{x} &\geq 0 \end{align*}\]Canonical form of a linear program - the variables and constraints can be permuted to be represented in this way
\[\begin{array}{r@{}cl} \max \quad c_N x_N + k & & \\ \quad x_B + A_N x_N &=& b \\ \quad x_B, x_N &\geq& 0 \end{array}\]One basic feasible solution can be read off the canonical form, with $x_B = b$ and $x_N = 0$, and with the objective value equal to $k$.
If $c_N$ is nonpositive, no nonbasic variable can improve the (maximising) objective, so the basic feasible solution is an optimal solution.
Please note that a linear program does not allow strict inequalities.
Least squares - cannot be converted into a linear program.
This includes the least-squares best-fit line.
Converting linear absolute residuals into a linear program.
The problem: Minimize $\sum_{i=1}^6 \vert \epsilon_i \vert$, where each $\epsilon_i$ is a linear expression of the decision variables (in this case, the intercept and the slope of the best-fit line).
Solution 1 Minimise $\sum_{i=1}^6 (r_i^- + r_i^+)$
where
\[\epsilon_i = r_i^- - r_i^+\]with
\[r_i^-, r_i^+ \geq 0\](At the optimum, at least one of $r_i^-$ and $r_i^+$ is zero, so $r_i^- + r_i^+ = \vert \epsilon_i \vert$.)
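A quick numerical check of this claim, with a brute-force grid search standing in for the LP (the step size and range are illustrative assumptions):

```python
def min_split_sum(eps, step=0.25, hi=5.0):
    """Smallest r_minus + r_plus over a grid of nonnegative pairs with r_minus - r_plus = eps."""
    vals = [i * step for i in range(int(hi / step) + 1)]
    return min(rm + rp for rm in vals for rp in vals
               if abs((rm - rp) - eps) < 1e-9)

# the minimum of the sum equals the absolute value of eps
print(min_split_sum(1.5), min_split_sum(-2.0))  # 1.5 2.0
```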
Solution 2 Minimise $\sum_{i=1}^6 z_i$ with
\[\begin{align} \epsilon_i &\leq z_i \\ -\epsilon_i &\leq z_i \end{align}\](This works because the two constraints force $z_i \geq \vert \epsilon_i \vert$, and the minimisation pushes each $z_i$ down to exactly $\vert \epsilon_i \vert$.)
Converting maximum absolute residual into a linear program.
The problem: Minimize $\max(\vert \epsilon_i \vert \enspace \forall i)$
Solution
Minimise $r$.
For all $i$,
\[\begin{align} r &\geq \epsilon_i \\ r &\geq -\epsilon_i \end{align}\]From the slides (notice that $\epsilon_i$ is an expression).
The linear program finds the optimal values of $b_0$, $b_1$, $b_2$ and $r$ that optimise the objective function. $L_i$, $P_i$ and $E_i$ are constants given by the data.
All linear programs can be transformed into a standard linear program. Here are the steps.
If the objective function is to minimise, flip the signs of all the coefficients $\vec{c}$ to get a maximising objective function instead.
If the constant of the equality or inequality is negative, flip all the coefficients (and the direction of the inequality):
\[a_1 x_1 + ... + a_n x_n \leq b < 0 \enspace \rightarrow \enspace -a_1 x_1 - ... - a_n x_n \geq -b > 0\]If the condition is a $\geq$ inequality, subtract a nonnegative surplus variable to turn it into an equality:
\[a_1 x_1 + ... + a_n x_n \geq b \enspace \rightarrow \enspace a_1 x_1 + ... + a_n x_n - s_g = b \enspace \text{and} \enspace s_g \geq 0\]If the condition is a $\leq$ inequality, add a nonnegative slack variable:
\[a_1 x_1 + ... + a_n x_n \leq b \enspace \rightarrow \enspace a_1 x_1 + ... + a_n x_n + s_l = b \enspace \text{and} \enspace s_l \geq 0\]If there are free variables like $x_f$, replace each with the difference of two nonnegative variables:
\(x_f \rightarrow x_f^+ - x_f^-\) for every appearance of $x_f$ in all the equality constraints and the objective function. Also, add the constraints $x_f^+, x_f^- \geq 0$.
PHASE I - Transforming the standard form to the canonical form \(\max\left\{ \vec{c}^T\vec{x} \mid A\vec{x} = \vec{b} \geq \vec{0}, \vec{x} \geq 0 \right\}\) $A$ is made up of $n$ linearly independent rows.
First, we add $n$ nonnegative artificial variables to create a different LP problem with a different objective function (minimising the sum of the artificial variables).
Begin with a canonical form of the new LP problem by managing the objective function. Carry out the Simplex Phase II iteration. Results:
Remove the artificial variables, and continue Phase II with the original objective function from the standard form.
PHASE II - Obtaining the optimal solution from the canonical form
We start with the canonical form.
We choose an entering basic variable (which is currently a nonbasic variable).
We also choose a leaving basic variable (which will become a nonbasic variable).
Then we pivot the basic variables. The LP problem should remain equivalent, and still in canonical form, after every pivot. Then we iterate until we reach one of the following conditions
interpreting-matrices.png
calculate-allowable-change.png
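The Phase II iteration can be sketched in code. Below is a minimal tableau simplex for problems of the form $\max c^T x$ s.t. $Ax \leq b$, $x \geq 0$ with $b \geq 0$ (the slack variables then give an immediate canonical form, so no Phase I is needed); names and the example LP are illustrative, and degeneracy/anti-cycling is not handled:

```python
def simplex(c, A, b):
    """Solve max c^T x s.t. Ax <= b, x >= 0 (with b >= 0) by tableau simplex."""
    m, n = len(A), len(c)
    # tableau rows: constraints with slack columns appended; last row: negated costs
    T = [A[i][:] + [1 if j == i else 0 for j in range(m)] + [b[i]] for i in range(m)]
    T.append([-ci for ci in c] + [0] * m + [0])
    basis = list(range(n, n + m))  # slacks form the initial basis
    while True:
        # entering variable: most negative reduced cost (Dantzig rule)
        col = min(range(n + m), key=lambda j: T[-1][j])
        if T[-1][col] >= -1e-9:
            break  # all reduced costs nonnegative in the tableau: optimal
        # leaving variable: minimum ratio test
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 1e-9]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, row = min(ratios)
        # pivot on (row, col)
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]
        for i in range(m + 1):
            if i != row and abs(T[i][col]) > 1e-12:
                f = T[i][col]
                T[i] = [a - f * r for a, r in zip(T[i], T[row])]
        basis[row] = col
    x = [0.0] * n
    for i, bv in enumerate(basis):
        if bv < n:
            x[bv] = T[i][-1]
    return x, T[-1][-1]

# illustrative example: max 3x + 5y s.t. x <= 4, 2y <= 12, 3x + 2y <= 18
x, val = simplex([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
print(x, val)  # [2.0, 6.0] 36.0
```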
Definitions
Given a “primal” LP (expressed in this form)
\[\begin{align} \max \enspace c^T x& \\ Ax& \leqslant b \\ x& \geqslant 0 \end{align}\]we can formulate a dual LP
\[\begin{align} \min \enspace b^T y& \\ A^{T} y& \geqslant c \\ y& \geqslant 0 \end{align}\]The dual of the dual is the primal. (You need to express the dual in primal form before you can take the dual of the dual.)
Complementary Slackness property
Let $\bar{x}$ and $\bar{y}$ be feasible solutions to the primal and dual problem, respectively. $\bar{x}$ and $\bar{y}$ are optimal solutions for the two respective problems if and only if:
\[\begin{align} \bar{y}_i (b_i - a_i^T \bar{x}) &= 0, \enspace \forall i, \quad \text{and} \\ (A_j^T \bar{y} - c_j) \bar{x}_j &= 0, \enspace \forall j \end{align}\][QUESTION] Do they imply one another?
In an optimal solution
This applies for every constraint-solution pair, and every solution-constraint pair.
Weak Duality Theorem
Condition
Result
One of the intuitions - Suppose that $x^*$ is optimal to the primal; then $x^*$ is feasible to the relaxation.
Strong Duality Theorem
Condition
Result
Fundamental Theorem of Linear Programming
For a pair of primal problem P and dual problem D, exactly one of the following is true
Formulating the dual of a general LP
Economic interpretation
Please try to understand this.
linear-programming-dual-formulation.png
Please also understand the optional component in W4L2 on restoring feasibility after adding a constraint.
Two-player zero-sum game
Guaranteed payoff
To maximise guaranteed payoff, row player would play row 3.
zero-sum-game-guaranteed-payoff.png
Dominating strategies
Saddle point
When the game has only one such outcome, neither player would choose any other outcome.
[QUESTION] Is the saddle point always an intersection of dominating strategies?
Mixed strategy
Made up of a set of probabilities. The row player plays different rows with probabilities that do not change. Objective - to maximise the guaranteed payoff. Intuition - the opponent will detect the strategy, so the payoff converges to the guaranteed payoff.
Solution:
\[p_1 = 7/18, \enspace p_2 = 5/18, \enspace p_3 = 1/3, \enspace z = 1/9\]The linear programming problem for the row player
Objective: maximise guaranteed payoff of the row player
\[\begin{align} \max \min\{E(C_1), E(C_2), E(C_3)\}& \\ p_1 + p_2 + p_3 &= 1 \\ p_1, p_2, p_3 &\geq 0 \end{align}\]where
\[\begin{alignat*}{4} E(C_1) & {}={} &-2p_1 & {}+{} &2p_2 &{}+{} &p_3 \\ E(C_2) & {}={} & p_1 & {}-{} & p_2 &{} {} \\ E(C_3) & {}={} & 2p_1 & {} {} & &{}-{} &2p_3 \end{alignat*}\]Formulating as a linear programming problem
\[\begin{align} \max z& \\ E(C_1) &\geq z\\ E(C_2) &\geq z\\ E(C_3) &\geq z\\ p_1 + p_2 + p_3 &= 1 \\ p_1, p_2, p_3 &\geq 0 \end{align}\](This is equivalent to the max-min formulation: since $z$ is maximised, at the optimum $z$ equals $\min\{E(C_1), E(C_2), E(C_3)\}$.)
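We can sanity-check the stated optimal strategy numerically, taking $p_2 = 5/18$ so that the probabilities sum to one:

```python
from fractions import Fraction as F

# stated optimal mixed strategy for the row player
p1, p2, p3 = F(7, 18), F(5, 18), F(1, 3)
assert p1 + p2 + p3 == 1

# expected payoff of the row player against each pure column strategy
E_C1 = -2*p1 + 2*p2 + p3
E_C2 = p1 - p2
E_C3 = 2*p1 - 2*p3

z = min(E_C1, E_C2, E_C3)  # guaranteed payoff
print(z)  # 1/9
```

All three expectations come out equal, which is what we expect at an optimal mixed strategy against a rational opponent.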
The linear programming problem for column player
Objective: minimise the guaranteed payoff of column player
\[\begin{align} \min \max\{E(R_1),E(R_2),E(R_3)\} \\ q_1 + q_2 + q_3 &= 1 \\ q_1, q_2, q_3 &\geq 0 \end{align}\]where
\[\begin{alignat*}{4} E(R_1) & {}={} &-2q_1 & {}+{} &q_2 &{}+{} &2q_3 \\ E(R_2) & {}={} & 2q_1 & {}-{} &q_2 &{} {} \\ E(R_3) & {}={} & 2q_1 & {} {} & &{}-{} &2q_3 \end{alignat*}\]Formulating as a linear programming problem
\[\begin{align}\min v& \\ E(R_1) &\leq v\\ E(R_2) &\leq v\\ E(R_3) &\leq v\\ q_1 + q_2 + q_3 &= 1 \\ q_1, q_2, q_3 &\geq 0 \end{align}\]This is a primal-dual pair. ([TODO] To show)
Graphical solution to small games
If there are only two strategies available, you can use a single variable with probabilities $\{p, 1-p\}$, and you do not need an LP to solve the problem.
You can use a graphical solution. The guaranteed payoff is a piecewise linear function with a maximum at (0.375, -0.125).
[TODO] Explain for the column player as well.
Formulating games as LP problems
Each player can decide on a combination of decisions before the game commences. (Example from cohort - combinations of decisions on whether to pass or bet on heads and tails.) The strategy is to assign a probability to each combination.
Directed graph and undirected graph
Differences between directed graph and undirected graph
Property | Undirected Graph | Directed Graph |
---|---|---|
Edges of nodes | Degree | Indegree, Outdegree |
Adjacency Matrix | Symmetric | Not necessarily symmetric |
Incidence Matrix | Columns sum to two | Columns sum to zero |
Terminologies
Special graphs
Theorems
Shortest path problem
The shortest path tree stores all shortest paths from the starting node.
Three sets of variables
Key steps of Dijkstra’s algorithm
Workings of Dijkstra’s algorithm
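The algorithm can be sketched with a priority queue. A minimal version, assuming the graph is given as an adjacency-list dict (the representation and example graph are illustrative):

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph = {node: [(neighbour, weight), ...]}.
    Assumes nonnegative weights."""
    dist = {source: 0}
    visited = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd  # found a shorter tentative distance
                heapq.heappush(heap, (nd, v))
    return dist

g = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2), ('t', 6)], 'b': [('t', 1)]}
print(dijkstra(g, 's'))  # {'s': 0, 'a': 1, 'b': 3, 't': 4}
```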
Taxonomy of network problems (and methods)
Linear programming (Simplex)
Min-cost flow problem (Network Simplex)
For our course we will assume all edges have infinite capacity.
Max-flow problem (Ford-Fulkerson)
This can be formulated as a min-cost flow problem by adding an arc from the tap back to the source, with infinite capacity and a cost of negative one.
(The dual problem is the min-cut problem)
Baseball elimination (example)
(The network only has four layers; the formulation attempts to exhaust all supply.)
Maximum cardinality matching
Transhipment problem
How to meet the demand while minimising delivery costs?
Can be formulated as a min-cost flow problem by adding dummy sources or taps with zero-cost arcs. This makes the total demand equal to the total supply.
Shortest path problem (Dijkstra)
You are transporting a unit supply from the source to the tap, at minimum cost.
Transportation problem
Transhipment problem without intermediate nodes.
Assignment problem
Transportation problem in which every supply and every demand is equal to one.
Given a directed network $G = (V,E)$ and a source node $s$, a sink node $t$, and a possibly infinite capacity $u_{ij}$ for each arc $(i,j)$
The max-flow problem is the problem of finding a value of flow for every arc in such a way that
Formulation of the Linear Program
\[\begin{align} &\text{Objective:} & \max \enspace q \enspace & \\ &\text{Outflow of the source:} & \sum_{(s,j) \in E} x_{sj} &= q\\ &\text{Inflow of the tap:} & \sum_{(i,t) \in E} x_{it} &= q\\ &\text{Conservation } (\forall v \in V, v \neq s, t): & \sum_{(v,j) \in E} x_{vj} - \sum_{(i,v) \in E} x_{iv} &= 0\\ &\text{Capacity } (\forall (i,j) \in E): & x_{ij} &\leq u_{ij}\\ &\text{Nonnegativity } (\forall (i,j) \in E): & x_{ij} &\geq 0 \end{align}\]Transforming into a max-flow problem
Problem parameters
These are the numbers that you need to keep track of
Initialise the current flow value as zero
Iterate with the following steps
Terminate the algorithm when there are no more directed paths from the source to the tap. The complexity of the algorithm depends on the choice of augmenting paths.
The backflow capacity needs to be decreased if used (from 10 to 9 in the example below).
backflow.png
My understanding of the intuition
when you assign the flow, you also gain flexibility through the backflow values
progress is made when edges from the source and tap are removed, eventually leaving no directed path from the source to the tap
The backflow is ignored in the calculation of the value of the min-cut.
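The augmenting-path iteration above, including the backflow bookkeeping, can be sketched as an Edmonds-Karp style implementation (BFS chooses the augmenting paths; the dict-based representation and example network are illustrative):

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp max flow; capacity is a dict {(u, v): cap}. Returns the flow value."""
    res = dict(capacity)          # residual capacities
    nodes = set()
    for (u, v) in capacity:
        nodes |= {u, v}
        res.setdefault((v, u), 0)  # backflow arcs start at zero
    adj = {n: [] for n in nodes}
    for (u, v) in res:
        adj[u].append(v)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual network
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximum
        # recover the path and its bottleneck capacity
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= bottleneck  # forward capacity decreases
            res[(v, u)] += bottleneck  # backflow capacity increases
        flow += bottleneck

cap = {('s', 'a'): 10, ('s', 'b'): 10, ('a', 'b'): 2, ('a', 't'): 8, ('b', 't'): 10}
print(max_flow(cap, 's', 't'))  # 18
```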
Each feasible solution of the dual problem provides an upper bound to the primal problem.
Natural bounds - if you partition right outside the source or the tap, you get trivial upper bounds. It is not possible to send more than the source can send, or more than the sink can take.
The optimal values of the primal and dual problems are the same.
The dual of max-flow problem is the min cut problem. Assigning the $d$-values provides a partition.
Wikipedia offers a good explanation: https://en.wikipedia.org/wiki/Max-flow_min-cut_theorem (the symbols differ).
This is a more general form of a max-flow problem, but still a specific class of linear programming problem (refer to taxonomy)
For each node, the outflow minus the inflow equals $b_i$. If $b_i > 0$, node $i$ is a supply node; if $b_i < 0$, node $i$ is a demand node.
Properties
Transforming into a min-cost flow problem
In the min-cost flow problem, the total demand should be equal to the total supply, $\sum_i b_i = 0$; otherwise this is not a min-cost flow problem (or there is no feasible solution). Therefore
Converting a max-flow problem into a min-cost problem
Every max-flow problem can be converted into a min-cost problem (refer to taxonomy of linear programming problems).
Add an arc from the sink to the source with infinite capacity. The objective function is to minimise the negative of the flow on the new sink-source arc. The costs of the rest of the arcs are zero.
A tree is a connected graph with no cycles. It can be directed or undirected, but we consider undirected trees.
A spanning tree of a connected graph is a subset of its arcs that forms an undirected tree touching every node. (Direction does not matter here.)
The basic feasible solutions of the LP formulation of a min-cost flow problem correspond to spanning trees.
The basic feasible solution is a spanning tree that allows all supplies or demands to be satisfied.
The capacity limits $u_{ij}$ will not be considered in this course.
These are the numbers that you need to keep track of
Initialising the network simplex
You start with a basic feasible solution: a spanning tree that allows all supplies and demands to be satisfied.
Compute the simplex multipliers starting from a leaf (chosen at random or as instructed). The leaf (which is a node) is set to zero.
Recurse until the solution is optimal
Compute the reduced costs. If all reduced costs are nonnegative, the solution is optimal.
Pivot to improve the BFS. “We try to send as much flow as possible along that arc, while staying feasible”. In other words, you redirect the flow to the nonbasic arc, from the (series of) basic arcs, until one of the flow of the basic arc is zero.
Update the simplex multipliers. You can repeat the same process, or use the following faster procedure. Then, repeat the cycle.
Assumptions
Phase I of network simplex
Do this to obtain a basic feasible solution for the network simplex.
(Following comment needs to be confirmed, I am not too sure.)
The flow on the actual arcs is zero, and the flow on the artificial arcs is equal to the demand or supply of the node.
To obtain a basic feasible solution, we want to redirect the flow on the artificial arcs to the actual arcs. We set the costs on the artificial arcs to one, and the costs on the actual arcs to zero.
If you cannot reduce the cost of this modified problem to zero, you cannot obtain a spanning tree from the actual arcs, and there is no feasible solution.
Comparison between simplex and network simplex
Memorising the solutions to each problem does not scale. You need to understand the design principle behind every solution.
Example - flipping the board of five. Takeaways
Example - IKEA with setup costs. Takeaways
Use of a big-M constraint to add a “setup cost” variable and constraint.
$x \leq My$
Linear programming can only solve convex problems with inclusive (non-strict) boundaries, and variables cannot be multiplied together.
Strategy - transform the problem into ORs and ANDs.
Modelling less-than-or-greater-than (non-convex) inequality
\[x\leq2 \quad \text{or} \quad x \geq 6\]You can turn off either constraint by adding a binary variable $w$ with a big constant $M$.
\[x \leq 2 + M w \quad \text{and} \quad x \geq 6 - M (1-w) \quad \text{and} \quad w \text{ binary}\]Consider the geometry of the formulation.
The yellow region is convex.
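A quick brute-force check that the big-M pair really models the either-or region (the value $M = 100$ is an illustrative choice, assumed larger than any relevant range of $x$):

```python
M = 100  # big-M: must dominate the range of x for the relaxation to be inactive

def feasible(x):
    """x satisfies (x <= 2) or (x >= 6) iff some binary w makes both big-M rows hold."""
    # w = 0 enforces x <= 2 and relaxes x >= 6; w = 1 does the opposite
    return any(x <= 2 + M * w and x >= 6 - M * (1 - w) for w in (0, 1))

print([x for x in range(9) if feasible(x)])  # [0, 1, 2, 6, 7, 8]
```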
Modelling with complement constraint
One and only one of the constraint and its complement can be fulfilled.
If the variables and coefficient are integers, we can formulate a complement. For the constraint $x_1 + 2x_2 \leq 10$, the violation is $x_1 + 2x_2 \geq 11$. (We cannot use strict inequalities for linear programming).
Implication statement into logical constraint
If $P \implies Q$, then $\overline{P} \lor Q$ must be true. $\overline{P}$ is the complement of $P$.
Piecewise linear functions
This is used mostly in the objective function, though it is possible in the constraints.
$x$ is replicated into three parts. $x_i$ will be set to zero if we are not using the $i$-th piece. $w_i$ indicates whether we are using the $i$-th piece.
In this approach, we solve the IP by solving its LP relaxations. The relaxation is tightened with new constraints until we obtain the solution of the IP.
You solve the LP relaxation. If the optimal solution is not all-integer, you find valid inequalities that cut off the fractional corner points (without cutting off any integer solutions).
You observe a subset $S$ with $\vert S \vert$ elements whose total weight is larger than the limit (a cover). The knapsack cover inequality prevents you from choosing all of these elements: you can choose at most $\vert S \vert - 1$ elements from $S$.
\[\sum_{j \in S} x_j \leq \vert S \vert - 1\]A minimal cover is a cover in which any $\vert S \vert - 1$ of its elements have total weight within the limit. Using minimal covers usually gives stronger inequalities.
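A tiny illustrative example (the weights, capacity, and item names are made up):

```python
# illustrative knapsack instance
weights = {"a": 6, "b": 5, "c": 4}
capacity = 10

S = {"a", "b"}  # total weight 11 > 10, so S is a cover
assert sum(weights[j] for j in S) > capacity

def satisfies_cover(x, S):
    """Cover inequality sum_{j in S} x_j <= |S| - 1 for a 0/1 solution x."""
    return sum(x[j] for j in S) <= len(S) - 1

print(satisfies_cover({"a": 1, "b": 1, "c": 0}, S))  # False: picks the whole cover
print(satisfies_cover({"a": 1, "b": 0, "c": 1}, S))  # True
```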
Suppose all variables are integers. You have an equality constraint with fractional (decimal) coefficients and right-hand side.
(incomplete and may be inaccurate)
For a (minimising) binary/integer program with $n$ variables, the brute-force or full enumeration takes $2^n$ computations. We do not want to do all of them.
Key idea - prune the tree
Properties
The IP optimum is the minimum of the children's IP optima (i.e. equal to one of them).
The LP optimum is smaller than or equal to the children's LP optima (and may not equal any of them).
Elements of a Dynamic Program
Stages. The decisions to be optimised are taken sequentially over a number of stages, which typically represent time periods.
States. At each stage, all the information required to take future decisions can be represented by a state, regardless of how we reached the current state.
Decisions. At each stage and state, we have a set of decisions or actions available, which bring us to some state in the subsequent stage.
Principle of optimality. Any optimal sequence of decisions has the property that, whatever the current stage, state and decision, the remaining decisions must be optimal with regard to the stage and state resulting from the current decision. (Is this the optimal substructure?)
Essentially, you need to be able to calculate for all the states in one direction until you get your required answer.
Construction
You need to define the stages and the states. This breaks down the problem into subproblems.
Each stage $k$ consists of all the states. Each stage may only have one state (126 match game) or many states (shortest path - the current node). At each stage, each state has a value $f_k(j)$ (126 match game - win or lose, shortest path - cost to reach)
You use the past stage(s) to compute the next stage. This is the recursion.
Your program needs to stop when you have obtained the answer, so you need to specify a stopping condition. The problem may be unbounded, and your program needs to be able to detect and report that.
Preflight checklist
Please provide the mathematical expression to compute the next stage.
Please remember to define the boundary conditions of all states.
Each stage $k$ is the number of games played.
There is only one state for each stage, so we do not specify the state.
The value function $f(k)$ indicates whether the player to move will win when $k$ matches remain.
The recursion $f(k) = 1-\min\lbrace 1, f(k-1), f(k-2), f(k-6) \rbrace$
The boundary condition $f(1) = f(2) = f(4) = f(5) = f(6) = 1$ and $f(3) = 0$.
The stopping condition when $f(n)$ is computed
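The recursion can be sketched directly in code. The game rules are an assumption read off the boundary values: take 1, 2, or 6 matches per turn, and the player who takes the last match wins (this reproduces $f(3) = 0$ and $f(1) = f(2) = f(4) = f(5) = f(6) = 1$):

```python
def match_game(n):
    """f(k) = 1 iff the player to move wins with k matches left.
    Assumed rules: take 1, 2, or 6 matches; taking the last match wins."""
    win = {0: 0}  # no matches left: the player to move has already lost
    for k in range(1, n + 1):
        options = [win[k - m] for m in (1, 2, 6) if k - m >= 0]
        win[k] = 1 - min(options)  # win iff some move leaves the opponent losing
    return win[n]

print([k for k in range(1, 15) if match_game(k) == 0])  # losing positions: [3, 7, 10, 14]
```

The losing positions repeat with period 7 (residues 0 and 3), so starting at 126 matches the first player loses.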
Each stage $k$ is the maximum number of arcs allowed in the path to each node.
Each state $j$ is the current node.
The value function $f_k(j)$ is the cost of the shortest path from the starting node to node $j$ using at most $k$ arcs.
The recursion $f_k(j) = \min \lbrace f_{k-1} (j), \min_{(i,j) \in A}\lbrace f_{k-1} (i) + c_{ij} \rbrace \rbrace$. You either do not move, or you move from another node and incur the arc cost.
The boundary condition $f_0(1) = 0$ and for other nodes $f_0(j) = \infty$
The stopping condition: $f_k(j) = f_{k+1}(j)$ $\forall j$. If the values still change once $k$ exceeds the number of nodes, there is a negative cycle.
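The recursion and stopping condition can be sketched as follows (a Bellman-Ford style pass; the node set and arcs are illustrative):

```python
def dp_shortest_paths(nodes, arcs, source):
    """f_k(j) = min(f_{k-1}(j), min over arcs (i, j) of f_{k-1}(i) + c_ij)."""
    INF = float("inf")
    f = {j: INF for j in nodes}
    f[source] = 0  # boundary condition: f_0(source) = 0, infinity elsewhere
    for k in range(len(nodes)):
        g = dict(f)
        for i, j, c in arcs:
            if f[i] + c < g[j]:
                g[j] = f[i] + c  # reach j more cheaply through i
        if g == f:
            return f  # stopping condition: no value changed
        f = g
    # values still changing after as many passes as there are nodes
    raise ValueError("negative cycle reachable from the source")

print(dp_shortest_paths([1, 2, 3], [(1, 2, 1), (2, 3, 2), (1, 3, 5)], 1))  # {1: 0, 2: 1, 3: 3}
```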
Each stage $k$ is the year.
Each state $j$ is the number of plants at the beginning of the year.
The value function $f_k(j)$ is the minimum cost to adhere to the requirement from stage $k$ until the end of the planning horizon, if $j$ plants have already been built.
The recursion $f_k(i) = \min \lbrace f_{k+1}(i), \enspace \min_{j=1,2,3} \lbrace 1.5 + c_k j + f_{k+1}(i+j) \rbrace \rbrace$, set to $\infty$ if $i < d_k$. You either do not build a plant, or you build $j$ plants, incurring a fixed and a variable cost.
The boundary condition $f_6(8) = 0$, $f_6(j) = \infty$ for $j \leq 7$.
The stopping condition if $f_0(0)$ has been computed. (This is an example of backward recursion.)
Find a tour in the graph that has the minimum cost.
To eliminate all subtours you need on the order of $2^n$ constraints. For moderately large $n$ you cannot even load all the constraints into your computer.
You solve without the subtour constraints initially. When the solution contains a subtour, add the corresponding constraint and solve again. You might need to do this many times. Nevertheless, each relaxed solution provides a lower bound on the optimum. (Any feasible tour provides an upper bound.)
Assume that we know the best tour from one node to another, excluding certain nodes. This can be the basis for iteration.
We say that subtour $T$ dominates $T'$ if:
A heuristic is an algorithm that will find some solution that may not be optimal.
Provable guarantees
Method | Runtime | $k$ | $D$ |
---|---|---|---|
Nearest Neighbour | $O(n^2)$ | NA | 1 |
Greedy | $O(n^2 log(n))$ | NA | 1 |
Insertion | $O(n^2)$ | NA | $(n-2)!$ |
Christofides | ? | 2* | 1 |
Local search (2-Opt) | $O(n^2)$ | NA | $(n-2)!$ |
Nearest Neighbour
Start from any node - this is random. (You can try all of them.) Then, always go to the node with smallest distance to the current node and that has not been previously visited. After visiting all nodes, go back to the starting node.
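A sketch of the nearest neighbour heuristic (the distance matrix is an illustrative assumption):

```python
def nearest_neighbour_tour(dist, start=0):
    """Always move to the closest unvisited node, then return to the start."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[tour[-1]][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)  # close the tour
    cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
    return tour, cost

# illustrative symmetric distance matrix for four nodes
D = [[0, 1, 4, 3],
     [1, 0, 2, 5],
     [4, 2, 0, 1],
     [3, 5, 1, 0]]
print(nearest_neighbour_tour(D))  # ([0, 1, 2, 3, 0], 7)
```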
Greedy Algorithm
You start with the cheapest arc, then repeatedly add the cheapest arc that does not create a small cycle (or give any node more than two incident arcs), until all nodes are visited.
Insertion Algorithm
Randomly start with two nodes; you have a 2-node subtour. Randomly choose a node to insert, and make a 3-node subtour by finding the best edge to replace. Repeat until you cover the entire map.
Christofides Algorithm
Start with the minimum spanning tree. Then do magic.
Assumes the triangle inequality. (The triangle inequality does not always hold in real life, however; a detour may be shorter than a direct path.)
Local search Algorithms (2-Opt)
Find a pair of edges that will have a lower cost when swapped. Repeat until you cannot find such a pair.
This improves any current solution. The resulting tour has no crossing edges, which makes it harder for observers to visually point out an easier improvement.
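A sketch of 2-Opt local search. The instance is an illustrative assumption: four points on a unit square, with an initial tour that crosses itself:

```python
def tour_cost(tour, dist):
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def two_opt(tour, dist):
    """Repeatedly reverse a segment whenever doing so shortens the tour."""
    tour = tour[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for k in range(i + 1, len(tour) - 1):
                # swap edges (i-1, i) and (k, k+1) for (i-1, k) and (i, k+1)
                old = dist[tour[i - 1]][tour[i]] + dist[tour[k]][tour[k + 1]]
                new = dist[tour[i - 1]][tour[k]] + dist[tour[i]][tour[k + 1]]
                if new < old - 1e-12:
                    tour[i:k + 1] = tour[i:k + 1][::-1]  # reverse the segment
                    improved = True
    return tour

# four corners of a unit square; the initial tour crosses itself
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts] for ax, ay in pts]
crossing = [0, 2, 1, 3, 0]
better = two_opt(crossing, dist)
print(better, tour_cost(better, dist))  # [0, 1, 2, 3, 0] 4.0
```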