Dynamic Programming
Jesus A. Gonzalez
July 17, 2019
Dynamic Programming
- Similar to divide and conquer
- Solves problems by combining the solutions to subproblems
- "Programming" refers to solving problems in a tabular way, not to writing computer code
- Dynamic programming applies when the subproblems are not independent
- They share subproblems
- Solves each subproblem only once
- Store the solution to reuse
- Saves time
- Generally used in optimization problems
Dynamic Programming
- Four steps
- Characterize the structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution in a bottom-up way
- Construct an optimal solution from the computed information
The Rod Cutting Problem
- The problem consists of deciding where to cut steel rods
- Serling Enterprises buys long steel rods, cuts them into shorter rods, and sells the pieces
- Each cut is free
- What is the best way to cut the rods?
- We know the price \(p_i\) (in dollars) for a rod of length \(i\) inches, where \(i\) is an integer
The Rod Cutting Problem
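- The price table for the example (the textbook's sample table; prices in dollars):

| length \(i\) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| price \(p_i\) | 1 | 5 | 8 | 9 | 10 | 17 | 17 | 20 | 24 | 30 |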
The Rod Cutting Problem
- The rod-cutting problem
- Given a rod of length \(n\) inches and a table of prices \(p_i\) for \(i = 1, 2, \dots, n\),
- Find the maximum revenue \(r_n\) obtainable by cutting up the rod and selling the pieces
- A possible solution is not cutting the rod at all
- For a rod with \(n = 4\) and the price table above, the best solution is to cut the rod into two pieces of 2 inches each, obtaining 10 dollars
The Rod Cutting Problem
- We can cut a rod of length \(n\) in \(2^{n-1}\) different ways
- e.g., there are 8 possible ways to cut a rod of length 4
- The notation \(4 = 2 + 2\) means that a rod of length 4 is cut into two pieces of length 2 each
- If an optimal solution cuts the rod into \(k\) pieces, \(1 \leq k \leq n\),
- then the decomposition \(n = i_1 + i_2 + \dots + i_k\) provides maximum revenue
- \(r_n = p_{i_1} + p_{i_2} + \dots + p_{i_k}\)
The Rod Cutting Problem
The Rod Cutting Problem
- Optimal revenue figures \(r_i\) for \(i = 1,2,\dots,10\) with corresponding optimal decompositions:
- \(r_1 = 1\) from solution 1 = 1 (no cuts)
- \(r_2 = 5\) from solution 2 = 2 (no cuts)
- \(r_3 = 8\) from solution 3 = 3 (no cuts)
- \(r_4 = 10\) from solution 4 = 2 + 2
- \(r_5 = 13\) from solution 5 = 2 + 3
- \(r_6 = 17\) from solution 6 = 6 (no cuts)
- \(r_7 = 18\) from solution 7 = 1 + 6 or 7 = 2 + 2 + 3
- \(r_8 = 22\) from solution 8 = 2 + 6
- \(r_9 = 25\) from solution 9 = 3 + 6
- \(r_{10} = 30\) from solution 10 = 10 (no cuts)
The Rod Cutting Problem
- More generally,
- \(r_n = \max(p_n,\ r_1 + r_{n-1},\ r_2 + r_{n-2},\ \dots,\ r_{n-1} + r_1)\)
- \(p_n\): making no cuts at all and selling the rod uncut
- \(r_i + r_{n-i}\): an initial cut into pieces of sizes \(i\) and \(n-i\), each then cut up optimally in turn, for each \(i = 1, 2, \dots, n-1\)
The Rod Cutting Problem
- To solve a problem of original size \(n\),
- we solve subproblems of the same type but of smaller sizes
- Rod-cutting problem exhibits optimal substructure
- Optimal solutions to a problem incorporate optimal solutions to related subproblems, which we may solve independently
The Rod Cutting Problem
- Another way to view the decomposition:
- A first piece of length \(i\) is cut off the left-hand end
- A right-hand remainder of length \(n-i\) is left
- Only the remainder may be further divided
- Every decomposition: a first piece followed by some decomposition of the remainder
- \(r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})\)
- When no cuts are made: the first piece has size \(i = n\) with revenue \(p_n\), and the remainder has size 0 with revenue \(r_0 = 0\)
- This formulation embodies the solution to only one related subproblem (the remainder) instead of two
The Rod Cutting Problem
- Recursive top-down implementation (a Python sketch follows)
- Input: array \(p[1 .. n]\) of prices and an integer \(n\), the rod length
- Output: the maximum revenue possible for a rod of length \(n\)
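A minimal Python sketch of the procedure, following the textbook's CUT-ROD pseudocode (the list `p` is assumed padded with a leading 0 so that `p[i]` is the price of a rod of length `i`):

```python
import math

def cut_rod(p, n):
    """Naive top-down CUT-ROD: maximum revenue for a rod of length n."""
    if n == 0:
        return 0
    q = -math.inf
    for i in range(1, n + 1):                 # size of the first piece
        q = max(q, p[i] + cut_rod(p, n - i))  # sell piece i, cut up the rest
    return q

# Price table from the example, padded so that p[i] is the price of length i
p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(cut_rod(p, 4))  # 10, from the decomposition 4 = 2 + 2
```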
The Rod Cutting Problem
- The recursive implementation takes a long time to run even for moderate-size inputs (e.g., for \(n = 40\), from several minutes to more than an hour)
- It calls itself with the same parameters many times
- Solves the same problem repeatedly
The Rod Cutting Problem
The Rod Cutting Problem
- Running time of CUT-ROD
- \(T(n)\) denotes number of calls made to CUT-ROD when called with second parameter equal to \(n\)
- This is the number of nodes in a subtree with root at \(n\) in the recursion tree
- Includes initial call at its root so \(T(0) = 1\)
- \(T(n) = 1 + \sum_{j=0}^{n-1} T(j)\).
- 1 for the call at the root node
- \(T(j)\) counts the number of calls (including recursive calls) due to the call CUT-ROD(\(p, n-i\)), where \(j = n-i\)
- Solving the recurrence, we find that \(T(n) = 2^n\): exponential in \(n\)
The Rod Cutting Problem
- CUT-ROD explores the \(2^{n-1}\) possible ways of cutting up a rod of length \(n\)
- The recursion tree has \(2^{n-1}\) leaves
The Rod Cutting Problem
- Using Dynamic Programming for Optimal Cutting
- The naive recursive solution is inefficient; we want CUT-ROD to be more efficient
- We want each subproblem to be solved only once
- We save the solution of those subproblems
- We just look for those solutions instead of recomputing them
- Dynamic programming uses additional memory
- The savings may be dramatic
- e.g., an exponential-time solution becomes polynomial
- This works when the number of distinct subproblems is polynomial in the input size
- and each of them can be solved in polynomial time
The Rod Cutting Problem
- Two ways to implement dynamic programming
- Top-down with memoization
- Recursive procedure
- Modified to save results to each subproblem (in array or hash table)
- The procedure checks whether the subproblem has been solved previously
- If it has, the procedure returns the saved value
- If it hasn't, the procedure computes and saves the value
- The recursive procedure has been memoized: it remembers previous results
The Rod Cutting Problem
- Two ways to implement dynamic programming
- Bottom-up method
- Depends on natural notion of size of a subproblem
- Solving a subproblem depends only on solving smaller subproblems
- Sort problems by size and solve them in size order, smaller first
- When we solve a subproblem, all smaller subproblems used to compute its solution have been solved
- Solutions are stored and computed only once
The Rod Cutting Problem
- These approaches yield algorithms with the same asymptotic running time
- Except when top-down approach doesn’t actually recurse to examine all possible subproblems
- Bottom-up approach has better constant factors
- Less overhead for procedure calls
The Rod Cutting Problem
- Memoized CUT-ROD version (a Python sketch follows)
- \(p\) holds the list of prices
- \(n\) is the size of the rod
- \(r\) is an array to store the solutions to subproblems
- \(- \infty\) means that subproblem hasn’t been solved yet
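A Python sketch of the memoized version, mirroring the textbook's MEMOIZED-CUT-ROD and its auxiliary procedure (the pseudocode line numbers referenced on the next slide are marked in comments):

```python
import math

def memoized_cut_rod(p, n):
    r = [-math.inf] * (n + 1)   # -inf marks subproblems not yet solved
    return memoized_cut_rod_aux(p, n, r)

def memoized_cut_rod_aux(p, n, r):
    if r[n] >= 0:               # lines 1-2: value already computed,
        return r[n]             # so just return it
    if n == 0:                  # lines 3-7: compute the previously
        q = 0                   # unknown value
    else:
        q = -math.inf
        for i in range(1, n + 1):
            q = max(q, p[i] + memoized_cut_rod_aux(p, n - i, r))
    r[n] = q                    # line 8: save the computed value
    return q
```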
The Rod Cutting Problem
- Lines 1-2 check whether the desired value has already been computed and, if so, return it
- Lines 3-7 compute the new (previously unknown) value
- Line 8 saves the computed value
The Rod Cutting Problem
- Bottom-up CUT-ROD version (a Python sketch follows)
- Stores subproblem solutions in \(r\)
- Solves subproblems of size \(j\), in order of increasing size: \(j=1,2, \dots, n\)
- In line 6 the algorithm directly references array entry \(r[j-i]\)
- Instead of making a recursive call to solve subproblem of size \(j-i\)
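A Python sketch of the bottom-up version, mirroring the textbook's BOTTOM-UP-CUT-ROD (pseudocode lines 5-6 are marked):

```python
import math

def bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)                    # r[0] = 0: an empty rod earns nothing
    for j in range(1, n + 1):            # subproblems in order of increasing size
        q = -math.inf
        for i in range(1, j + 1):        # lines 5-6: try every first piece i
            q = max(q, p[i] + r[j - i])  # line 6: r[j-i] is a table lookup,
        r[j] = q                         # not a recursive call
    return r[n]
```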
The Rod Cutting Problem
- Running time of the memoized and bottom-up versions of the CUT-ROD algorithm
- \(\Theta(n^2)\) for both of them
- BOTTOM-UP-CUT-ROD
- Doubly-nested loop structure
- The iterations of the inner for loop (lines 5-6) form an arithmetic series
- Hence it runs in \(\Theta(n^2)\)
- MEMOIZED-CUT-ROD
- Recursive call to solve previously solved subproblem returns immediately
- It solves a subproblem only once
- When solving a subproblem of size \(j\), the for loop of lines 6-7 iterates \(j\) times
- Summed over all subproblems, this gives an arithmetic series, for a total of \(\Theta(n^2)\)
The Rod Cutting Problem
- The arithmetic series (from the textbook)
- The summation \(\sum\limits_{k=1}^{n} k = 1 + 2 + \dots + n\)
- Is an arithmetic series and has the value
- \(\sum\limits_{k=1}^{n} k = \frac{1}{2} n(n+1) = \Theta(n^2)\).
The Rod Cutting Problem
- Subproblem graphs
- Subproblems involved and how subproblems depend on one another
- Directed graph, one vertex for each distinct subproblem
- Like a collapsed version of the recursion tree for the top-down recursive method
- Bottom-up dynamic programming algorithm
- Consider vertices in an order: reverse topological sort of the subproblem graph
- No subproblem is considered until all subproblems that it depends upon have been solved
- The size of the subproblem graph \(G = (V, E)\) helps us determine the running time of the dynamic programming algorithm
- The running time is the sum of the times needed to solve each distinct subproblem
The Rod Cutting Problem
The Rod Cutting Problem
- Reconstructing a solution
- Our Dynamic Programming (DP) algorithm
- Returns value of an optimal solution
- Doesn’t return an actual solution: a list of piece sizes
- Extend the DP approach to also record a choice that led to the optimal value
- With this information, we can print an optimal solution
The Rod Cutting Problem
- BOTTOM-UP-CUT-ROD, extended (a Python sketch follows)
- Computes for each rod size \(j\): maximum revenue \(r_j\) and optimal size of first piece to cut off \(s_j\)
- In line 8,
- We store in \(s[j]\) the optimal size \(i\) of the first piece to cut off when solving problem of size \(j\)
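A Python sketch of the extended version, mirroring the textbook's EXTENDED-BOTTOM-UP-CUT-ROD (line 8 of the pseudocode is marked):

```python
import math

def extended_bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)            # r[j]: maximum revenue for a rod of length j
    s = [0] * (n + 1)            # s[j]: optimal size of the first piece
    for j in range(1, n + 1):
        q = -math.inf
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i         # line 8: record the optimal first cut for size j
        r[j] = q
    return r, s
```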
The Rod Cutting Problem
- Printing the solution (a Python sketch follows)
- Needs the price table \(p\) and a rod size \(n\)
- Calls EXTENDED-BOTTOM-UP-CUT-ROD to compute \(s[1 .. n]\)
- Prints out the list of piece sizes for an optimal decomposition of a rod of length \(n\)
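A Python sketch of the printing procedure, mirroring the textbook's PRINT-CUT-ROD-SOLUTION:

```python
def print_cut_rod_solution(p, n):
    r, s = extended_bottom_up_cut_rod(p, n)
    while n > 0:
        print(s[n])   # size of the next piece in an optimal decomposition
        n -= s[n]     # continue with the remainder
```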
The Rod Cutting Problem
- A call to EXTENDED-BOTTOM-UP-CUT-ROD(\(p\), 10) returns the revenue array \(r\) and the first-cut array \(s\); from the optimal decompositions above, \(s[1..10] = \langle 1, 2, 3, 2, 2, 6, 1, 2, 3, 10 \rangle\)
- A call to PRINT-CUT-ROD-SOLUTION(\(p\), 10) prints just 10
- A call to PRINT-CUT-ROD-SOLUTION(\(p\), 7) prints 1 and 6
Matrix-chain Multiplication
- Given a sequence (chain) \(\langle A_1, A_2, \dots, A_n \rangle\) of \(n\) matrices
- Compute the product \(A_1 A_2 \dots A_n\)
- Matrix multiplication is associative
- All parenthesizations yield the same product
- Fully parenthesized product
- Either a single matrix
- Product of two fully parenthesized matrix products, surrounded by parentheses
Matrix-chain Multiplication
- Fully parenthesized product of \(A_1 A_2 A_3 A_4\):
- \((A_1(A_2(A_3 A_4)))\),
- \((A_1((A_2 A_3)A_4))\),
- \(((A_1 A_2)(A_3 A_4))\),
- \(((A_1(A_2 A_3))A_4)\),
- \((((A_1 A_2)A_3)A_4)\).
- The choice of parenthesization can have a dramatic impact on the cost of evaluating the product
Matrix-chain Multiplication
- Standard matrix-multiplication pseudocode (based on the SQUARE-MATRIX-MULTIPLY procedure of Section 4.2; a Python sketch follows)
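A Python sketch of that pseudocode (matrices as plain lists of lists; line 8, the scalar multiplication, is marked):

```python
def matrix_multiply(A, B):
    """Multiply a p x q matrix A by a q x r matrix B, giving a p x r matrix."""
    if len(A[0]) != len(B):
        raise ValueError("incompatible dimensions")
    p, q, r = len(A), len(B), len(B[0])
    C = [[0] * r for _ in range(p)]
    for i in range(p):
        for j in range(r):
            for k in range(q):
                C[i][j] += A[i][k] * B[k][j]  # line 8: one scalar multiplication
    return C
```

The triple loop performs exactly \(pqr\) scalar multiplications.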
Matrix-chain Multiplication
- We require \(A\) and \(B\) to be compatible
- If \(A\) is a \(p \times q\) matrix, then \(B\) must be a \(q \times r\) matrix
- The result \(C\) is a \(p \times r\) matrix
- The time to compute \(C\) is dominated by the number of scalar multiplications in line 8, which is \(pqr\)
- We will express costs as numbers of scalar multiplications
Matrix-chain Multiplication
- Different costs by different parenthesizations of a matrix product
- Chain of 3 matrices \(\langle A_1, A_2, A_3 \rangle\)
- Dimensions: \(10 \times 100\), \(100 \times 5\), and \(5 \times 50\)
- \(((A_1 A_2)A_3)\)
- We perform \(10 \cdot 100 \cdot 5 = 5000\) scalar multiplications to get \(A_1 A_2\)
- We perform \(10 \cdot 5 \cdot 50 = 2500\) scalar multiplications to get the final result
- Total of 7,500 scalar multiplications
- \((A_1(A_2 A_3))\)
- We perform \(100 \cdot 5 \cdot 50 = 25,000\) scalar multiplications to get \(A_2 A_3\)
- We perform \(10 \cdot 100 \cdot 50 = 50,000\) scalar multiplications to get the final result
- Total of 75,000 scalar multiplications
Matrix-chain Multiplication
- Matrix-chain multiplication problem
- “Given a chain \(\langle A_1, A_2, \dots, A_n \rangle\) of \(n\) matrices, for \(i = 1, 2, \dots, n\), matrix \(A_i\) has dimension \(p_{i-1} \times p_i\), fully parenthesize the product \(A_1 A_2 \dots A_n\) in a way that minimizes the number of scalar multiplications.”
- Goal: find the order of multiplying matrices with the lowest cost
Matrix-chain Multiplication
- Counting the number of parenthesizations
- \(P(n)\) is the number of alternative parenthesizations of a sequence of \(n\) matrices
- For \(n = 1\), we have 1 matrix and only 1 way to fully parenthesize
- For \(n \geq 2\), a fully parenthesized product is the product of 2 fully parenthesized subproducts
- The split may occur between the \(k\)th and \((k+1)\)st matrices for any \(k = 1, 2, \dots, n-1\)
- We obtain recurrence 15.6 of the textbook: \(P(1) = 1\) and \(P(n) = \sum_{k=1}^{n-1} P(k) P(n-k)\) for \(n \geq 2\), whose solution is \(\Omega(2^n)\)
- The number of parenthesizations is thus exponential in \(n\)
- A brute-force algorithm therefore performs poorly
Matrix-chain Multiplication
- Applying dynamic programming
- Optimally parenthesize a matrix chain
- Four steps
- Characterize the structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution
- Construct an optimal solution from computed information
Matrix-chain Multiplication
- Step 1: The structure of an optimal parenthesization
- Let \(A_{i..j}\) denote the matrix resulting from the product \(A_i A_{i+1} \dots A_j\), with \(i \leq j\)
- If we split the product between \(A_k\) and \(A_{k+1}\) for some \(i \leq k < j\),
- we then compute \(A_{i..k}\) and \(A_{k+1..j}\) separately
- The solution to each subproblem must be optimal in order to make the solution to \(A_{1..n}\) optimal
- Cost = Cost(\(A_{i..k}\)) + Cost(\(A_{k+1..j}\)) + cost of multiplying the two resulting matrices
- If there were a cheaper way to parenthesize either subchain, the overall solution wouldn't be optimal
Matrix-chain Multiplication
- Step 1: The structure of an optimal parenthesization
- We need to find the best place \(k\) to split the product
- We use this optimal substructure to construct an optimal solution from optimal solutions to subproblems
- We split the product into the two subproblems \(A_i A_{i+1} \dots A_k\) and \(A_{k+1} A_{k+2} \dots A_j\)
- When searching for the correct place to split the product, we consider all possible places, to make sure we find the optimal one
Matrix-chain Multiplication
- Step 2: A recursive solution
- We define the cost of an optimal solution
- Recursively in terms of the optimal solution to subproblems
- Subproblems
- The problem of determining the minimum cost of parenthesizing the product \(A_i A_{i+1} \dots A_j\) for \(1 \leq i \leq j \leq n\)
- Let \(m[i,j]\) be the minimum number of scalar multiplications needed to compute \(A_{i..j}\)
- The cost of the full product \(A_{1..n}\) is then \(m[1,n]\)
Matrix-chain Multiplication
- Step 2: A recursive solution
- We define \(m[i,j]\)
- If \(i = j\): \(m[i,j] = 0\); the problem is trivial, there is only one matrix, and no scalar multiplications are needed
- If \(i < j\): assume the optimal split is between \(A_k\) and \(A_{k+1}\), with \(i \leq k < j\)
- Then \(m[i,j]\) = cost to compute \(A_{i..k}\) + cost to compute \(A_{k+1..j}\) + cost to multiply \(A_{i..k} A_{k+1..j}\), i.e. \(m[i,j] = m[i,k] + m[k+1,j] + p_{i-1} p_k p_j\)
- But we don't know the optimal value of \(k\), so we have to try all \(j - i\) possibilities
Matrix-chain Multiplication
- Step 2: A recursive solution
- The recursive definition of the minimum cost of parenthesizing the product \(A_i A_{i+1} \dots A_j\) is equation 15.7 of the textbook: \(m[i,j] = 0\) if \(i = j\), and \(m[i,j] = \min_{i \leq k < j} \{ m[i,k] + m[k+1,j] + p_{i-1} p_k p_j \}\) if \(i < j\)
- We also define \(s[i,j]\) as the value of \(k\) at which we split the product \(A_i A_{i+1} \dots A_j\) in an optimal parenthesization
- That is, \(s[i,j]\) is a value of \(k\) such that \(m[i,j] = m[i,k] + m[k+1,j] + p_{i-1} p_k p_j\)
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
- We could use the recurrence to compute the minimum cost \(m[1,n]\) of multiplying \(A_1 A_2 \dots A_n\) directly
- But the resulting recursive algorithm is still exponential (no better than brute force)
- We can take advantage of the fact that there are relatively few distinct subproblems
- One for each choice of \(i\) and \(j\) with \(1 \leq i \leq j \leq n\), i.e.
- \({n \choose 2} + n = \Theta(n^2)\) subproblems
- A recursive algorithm encounters each subproblem many times
- This is the overlapping-subproblems property
- We use a bottom-up tabular method
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
- Bottom-up method
- We solve smaller subproblems first, so each larger subproblem is easy to solve from the stored results
- We define 2 arrays
- \(m[1..n, 1..n]\), to store the minimum costs of multiplying \(A_i \dots A_j\)
- \(m[1,n]\) stores the cost of multiplying \(A_1 A_2 \dots A_n\)
- \(s[1..n, 1..n]\), to store the optimal divisions, the index of \(k\)
- Used to construct the optimal solution
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
- Execution time of \(O(n^3)\) and requires \(\Theta(n^2)\) memory to store \(m\) and \(s\)
- Optimal substructure + Overlapping subproblems \(\rightarrow\) dynamic programming applies
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
- MATRIX-CHAIN-ORDER (a Python sketch follows) first computes \(m[i,i] = 0\) for \(i = 1, 2, \dots, n\)
- The first execution of the for loop in lines 5-13
- computes \(m[i,i+1]\) for \(i = 1, 2, \dots, n-1\) (the minimum costs for chains of length \(l = 2\))
- The second execution of the for loop in lines 5-13
- computes \(m[i,i+2]\) for \(i = 1, 2, \dots, n-2\) (the minimum costs for chains of length \(l = 3\))
- At all times, the cost \(m[i,j]\) computed in lines 10-13 depends only on the already-computed table entries \(m[i,k]\) and \(m[k+1,j]\)
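A Python sketch of the procedure, mirroring the textbook's MATRIX-CHAIN-ORDER (the pseudocode lines referenced above are marked; the tables are 1-indexed by padding index 0):

```python
import math

def matrix_chain_order(p):
    """p[0..n] holds the dimensions: matrix A_i is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][j]: minimum cost for A_i..A_j
    s = [[0] * (n + 1) for _ in range(n + 1)]  # s[i][j]: optimal split point k
    for l in range(2, n + 1):                  # lines 5-13: chain length l
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):              # lines 10-13: try every split k
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```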
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
- The procedure run on a chain of \(n = 6\) matrices
- \(m[i,j]\) is defined only for \(i \leq j\), so only the portion of \(m\) above the diagonal is used
- The matrix chain is listed at the bottom of the figure
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
Matrix-chain Multiplication
- Step 3: Computing the optimal cost
Matrix-chain Multiplication
- Step 4: Constructing an optimal solution
- MATRIX-CHAIN-ORDER determines the minimum number of scalar multiplications, but does not directly show how to multiply the matrices
- We construct the optimal solution from the information stored in \(s[1..n, 1..n]\)
Matrix-chain Multiplication
- Step 4: Constructing an optimal solution
- The final matrix multiplication computing \(A_{1..n}\) optimally is \(A_{1..s[1,n]}A_{s[1,n]+1 .. n}\)
- We can find the earlier matrix multiplications recursively (a Python sketch of the textbook's PRINT-OPTIMAL-PARENS follows this list)
- \(s[1,s[1,n]]\) determines the last matrix multiplication when computing \(A_{1..s[1,n]}\)
- \(s[s[1,n]+1,n]\) determines the last matrix multiplication when computing \(A_{s[1,n]+1..n}\)
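A Python sketch of the construction, using the `s` table returned by `matrix_chain_order` above:

```python
def print_optimal_parens(s, i, j):
    if i == j:
        print("A_%d" % i, end="")
    else:
        print("(", end="")
        print_optimal_parens(s, i, s[i][j])      # left factor A_{i..s[i,j]}
        print_optimal_parens(s, s[i][j] + 1, j)  # right factor A_{s[i,j]+1..j}
        print(")", end="")

# For the 3-matrix example above (dimensions 10x100, 100x5, 5x50):
#   m, s = matrix_chain_order([10, 100, 5, 50])
#   print_optimal_parens(s, 1, 3)   # prints ((A_1A_2)A_3)
```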
Elements of Dynamic Programming
- First element: Optimal substructure
- An optimal solution to the problem contains within it optimal solutions to subproblems
- If problem has this property, dynamic programming might apply
- Pattern for discovering optimal substructure:
- Solution to a problem consists of making a choice
- e.g., the initial cut in a rod, or the index at which to split a matrix chain
- Making this choice leaves one or more subproblems to solve
- You suppose that for a given problem, you are given the choice that leads to an optimal solution
- You are not yet concerned with how to determine this choice
- You assume it has been given to you
- Given this choice
- Determine which subproblems ensue and how to best characterize the resulting space of subproblems
- Show that solutions to subproblems used within an optimal solution to the problem must also be optimal
- Use “cut-and-paste” technique
- Assume one of those is not optimal and then derive a contradiction
- “Cutting out” the nonoptimal solution to each subproblem and “pasting” the optimal one
Elements of Dynamic Programming
- Try to keep the space of subproblems as simple as possible
- Expand if necessary
- Space for rod-cutting problem: problems of optimally cutting up a rod of length \(i\) for each size \(i\)
Elements of Dynamic Programming
- Running time of dynamic programming depends on the product of two factors
- Number of subproblems overall
- How many choices we look at for each subproblem
- Rod cutting had \(\Theta(n)\) subproblems overall and \(n\) choices to examine for each of them, for a total of \(O(n^2)\)
- Matrix-chain multiplication had \(\Theta(n^2)\) subproblems overall and in each we had at most \(n-1\) choices, for a total of \(O(n^3)\) running time (actually \(\Theta(n^3)\))
Elements of Dynamic Programming
- We can also use the subproblem graph to analyze the problem
- Each vertex corresponds to a subproblem
- The choices for a subproblem correspond to the edges leaving its vertex
- Rod cutting: the subproblem graph has \(n\) vertices and at most \(n\) edges per vertex, yielding an \(O(n^2)\) running time
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- The space of subproblems must be "small"
- i.e., a recursive algorithm for the problem solves the same subproblems over and over
- Means that the optimization problem has overlapping subproblems
- Typically the total number of distinct subproblems is polynomial in the input size
- Dynamic programming takes advantage of this property
- It solves subproblems only once
- Stores the solutions in a table
- Uses that table to look up solutions already computed (constant time per lookup)
Elements of Dynamic Programming
- Second element: Overlapping subproblems
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- In divide-and-conquer we don’t see the overlapping subproblems element
- Brand-new problems are generated at each step of the recursion
- The rod cutting problem
- Makes exponentially many calls to find solutions of smaller subproblems
- Dynamic programming solution takes the exponential-time of the recursive algorithm down to quadratic time
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- The recursive (and inefficient) matrix-chain multiplication algorithm, RECURSIVE-MATRIX-CHAIN (a Python sketch follows)
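A Python sketch of this inefficient recursion, mirroring the textbook's RECURSIVE-MATRIX-CHAIN:

```python
import math

def recursive_matrix_chain(p, i, j):
    """Minimum cost for A_i..A_j; recomputes the same subproblems many times."""
    if i == j:
        return 0
    best = math.inf
    for k in range(i, j):                      # try every split point k
        q = (recursive_matrix_chain(p, i, k)
             + recursive_matrix_chain(p, k + 1, j)
             + p[i - 1] * p[k] * p[j])
        best = min(best, q)
    return best
```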
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- Recursion tree for RECURSIVE-MATRIX-CHAIN(\(p\), 1, 4)
- Shaded subtrees denote repeatedly computed values; a memoized version would look them up in a table instead of recomputing them
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- Time to compute \(m[1,n]\) by recursive procedure is at least exponential in \(n\)
- \(T(n) \geq 1 + \sum\limits_{k=1}^{n-1} (T(k) + T(n-k) + 1)\) for \(n > 1\)
- \(T(n) \geq 1 + \sum\limits_{k=1}^{n-1} 1 + \sum\limits_{k=1}^{n-1} T(k) + \sum\limits_{k=1}^{n-1} T(n-k)\)
- \(T(n) \geq 1 + (n-1) + \sum\limits_{k=1}^{n-1} T(k) + \sum\limits_{k=1}^{n-1} T(k)\)
- \(T(n) \geq 2 \sum\limits_{i=1}^{n-1} T(i) + n\)
Elements of Dynamic Programming
- Second element: Overlapping subproblems
- We can prove that \(T(n) = \Omega(2^n)\), using the substitution method
- Show that \(T(n) \geq 2^{n-1}\) for all \(n \geq 1\)
- Base case: \(T(1) \geq 1 = 2^0\)
- For \(n \geq 2\):
- \(T(n) \geq 2 \sum\limits_{i=1}^{n-1} 2^{i-1} + n\)
- \(= 2 \sum\limits_{i=0}^{n-2} 2^i + n\)
- \(= 2(2^{n-1} - 1 ) + n\), by equation A.5
- \(= 2^n -2 + n\)
- \(\geq 2^{n-1}\)
- Time to solve the problem with Dynamic Programming
- \(\Theta(n^2)\)
- Solves repeated subproblems only once
Elements of Dynamic Programming
- Third element: Reconstructing an optimal solution
- We store the choices made in each subproblem in a table
- We use this table to reconstruct the optimal solution
Elements of Dynamic Programming
- Alternative to dynamic programming: Memoization
- Keeps a top-down strategy while offering the efficiency of the bottom-up dynamic-programming approach
- Maintain a table with subproblem solutions
- The control structure for filling in the table is more like the recursive algorithm
Elements of Dynamic Programming
- Alternative to dynamic programming: Memoization (a Python sketch of the textbook's MEMOIZED-MATRIX-CHAIN follows)
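A Python sketch of the memoized version, mirroring the textbook's MEMOIZED-MATRIX-CHAIN and LOOKUP-CHAIN:

```python
import math

def memoized_matrix_chain(p):
    n = len(p) - 1
    m = [[math.inf] * (n + 1) for _ in range(n + 1)]  # inf: not yet solved
    return lookup_chain(m, p, 1, n)

def lookup_chain(m, p, i, j):
    if m[i][j] < math.inf:        # already solved: constant-time lookup
        return m[i][j]
    if i == j:
        m[i][j] = 0
    else:
        for k in range(i, j):     # fill in m[i][j] by trying every split
            q = (lookup_chain(m, p, i, k)
                 + lookup_chain(m, p, k + 1, j)
                 + p[i - 1] * p[k] * p[j])
            m[i][j] = min(m[i][j], q)
    return m[i][j]
```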
Assignment
- For this assignment you will implement the following Dynamic Programming Algorithm:
- Find the Longest Repeated Subsequence (due: )
- This algorithm is similar to the longest common subsequence that is described in the book.