Jesus A. Gonzalez

July 17, 2019

- Similar to divide and conquer
- Solves problems by combining solutions
- Here, “programming” refers to solving problems in a tabular way
- Dynamic programming applies when subproblems are not independent
- **They share subproblems**
- Solves each subproblem only once
- Store the solution to reuse
- Saves time

- Generally used in optimization problems

- Four steps
- Characterize the structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution in a bottom-up way
- Construct an optimal solution from the computed information

- The problem consists of deciding where to cut steel rods
- Serling Enterprises buys long steel rods, cuts them and sells them
- Each cut is free

- What is the best way to cut the rods?
- We know the price \(p_i\) (in dollars) for rods of \(i\) inches (\(i\) is an integer number)

- Rod prices table

- The **rod-cutting problem**: Given a rod of length \(n\) inches and a table of prices \(p_i\) for \(i = 1, 2, \dots, n\),
- Find the maximum revenue \(r_n\) obtainable by cutting up the rod and selling the pieces
- A possible solution is to not cut the rod at all
- For a rod with \(n = 4\) and the prices table, the best solution is to cut the rod into two pieces of two inches each, obtaining 10 dollars

- We can cut a rod of length \(n\) in \(2^{n-1}\) different ways
- 8 possible ways to cut a rod of length 4
- e.g., the cut \(4 = 2 + 2\) shows that a rod of length 4 is cut into two pieces of length 2 each
- If an optimal solution cuts the rod into \(k\) pieces, \(1 \leq k \leq n\)
- \(n = i_1 + i_2 + \dots + i_k\) provides maximum revenue
- \(r_n = p_{i_1} + p_{i_2} + \dots + p_{i_k}\).

- Optimal revenue figures \(r_i\) for \(i = 1,2,\dots,10\) with corresponding optimal decompositions:
- \(r_1 = 1\) from solution 1 = 1 (no cuts)
- \(r_2 = 5\) from solution 2 = 2 (no cuts)
- \(r_3 = 8\) from solution 3 = 3 (no cuts)
- \(r_4 = 10\) from solution 4 = 2 + 2
- \(r_5 = 13\) from solution 5 = 2 + 3
- \(r_6 = 17\) from solution 6 = 6 (no cuts)
- \(r_7 = 18\) from solution 7 = 1 + 6 or 7 = 2 + 2 + 3
- \(r_8 = 22\) from solution 8 = 2 + 6
- \(r_9 = 25\) from solution 9 = 3 + 6
- \(r_{10} = 30\) from solution 10 = 10 (no cuts)

- More generally
- \(r_n = \max(p_n, r_1 + r_{n-1}, r_2+r_{n-2}, \dots, r_{n-1} + r_1)\)
- \(p_n\), making no cuts at all and selling the rod uncut
- \(r_i + r_{n-i}\), initial cut of size \(i\) and other of size \(n-i\), for each \(i = 1, 2, \dots, n-1\)

- \(r_n = \max(p_n, r_1 + r_{n-1}, r_2+r_{n-2}, \dots, r_{n-1} + r_1)\)

- To solve the problem of original size \(n\)
- We solve smaller problems of same type but smaller sizes
- The rod-cutting problem exhibits *optimal substructure*
- Optimal solutions to a problem incorporate optimal solutions to related subproblems, which we may solve independently

- Another way to view the decomposition,
- First piece of length \(i\) cut off, left-hand end
- Right-hand remainder of length \(n-i\)
- The remainder may be further divided
- A first piece followed by some decomposition of the remainder
- \(r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})\)
- When no-cuts at all: first piece has size \(i = n\) and revenue \(p_n\) and remainder has size 0 with revenue \(r_0=0\)
- This formulation embodies the solution to only *one* related subproblem (the remainder), instead of two

- Recursive top-down implementation
- Input: array \(p[1 .. n]\) of prices and integer \(n\), the rod length
- Output: maximum revenue possible for a rod of length \(n\)
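
The pseudocode itself is not reproduced here, so the following is a Python sketch of the naive recursive CUT-ROD. The price list `prices` is an assumption, inferred from the standard CLRS table that matches the revenue figures above (index 0 is an unused placeholder):

```python
def cut_rod(p, n):
    """Naive top-down rod cutting.

    p: list where p[i] is the price of a rod of length i (p[0] unused, set to 0).
    n: rod length.
    Returns the maximum revenue r_n.
    """
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):              # try every size i for the first piece
        q = max(q, p[i] + cut_rod(p, n - i))
    return q

# Assumed CLRS price table for lengths 1..10 (index 0 unused)
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(cut_rod(prices, 4))                  # 10 (cut into 2 + 2)
```
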

- The recursive implementation takes a long time to run for moderate-size inputs (e.g., \(n = 40\) takes from several minutes to more than an hour)
- It calls itself with the *same* parameters many times
- Solves the same problem repeatedly

- Running time of CUT-ROD
- \(T(n)\) denotes number of calls made to CUT-ROD when called with second parameter equal to \(n\)
- This is the number of nodes in a subtree with root at \(n\) in the recursion tree
- Includes initial call at its root so \(T(0) = 1\)
- \(T(n) = 1 + \sum_{j=0}^{n-1} T(j)\).
- 1 for call at root node
- \(T(j)\) counts number of calls (including recursive) due to CUT-ROD(\(p, n-i\)), where \(j=n-i\).

- Solving the recurrence we find that \(T(n) = 2^n\), *exponential* in \(n\)
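
The closed form can be checked numerically with a short sketch (a direct transcription of the recurrence, not code from the text):

```python
def count_calls(n):
    """Evaluate T(n) = 1 + sum_{j=0}^{n-1} T(j): the number of calls
    the naive CUT-ROD makes when invoked for a rod of length n."""
    return 1 + sum(count_calls(j) for j in range(n))

# count_calls(n) equals 2**n, matching the claimed solution
```
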

- CUT-ROD explores the \(2^{n-1}\) possible ways of cutting up a rod of length \(n\)
- The recursion tree has \(2^{n-1}\) leaves

- Using Dynamic Programming for Optimal Cutting
- Naive recursive solution is inefficient, we want CUT-ROD to be more efficient
- We want each subproblem to be solved only *once*
- We save the solutions to those subproblems
- We just look up those solutions instead of recomputing them

- Dynamic programming uses additional memory
- *Time-memory trade-off*

- Savings can be dramatic
- e.g., exponential \(\rightarrow\) polynomial
- Number of distinct subproblems is polynomial in the input size
- Those problems can be solved in polynomial time

- Two ways to implement dynamic programming
- *Top-down with memoization*
- Recursive procedure
- Modified to save the result of each subproblem (in an array or hash table)
- The procedure checks if the subproblem has been solved previously
- If previously solved, it looks up the value
- If it hasn’t been previously solved, it computes its value

- The recursive procedure has been *memoized*; it remembers previous results

- Two ways to implement dynamic programming
- *Bottom-up method*
- Depends on a natural notion of the *size* of a subproblem
- Solving a subproblem depends only on solving *smaller* subproblems
- Sort subproblems by size and solve them in size order, *smallest first*
- When we solve a subproblem, all the smaller subproblems used to compute its solution have already been solved
- Solutions are stored and computed only once

- These approaches yield algorithms with the same asymptotic running time
- Except when top-down approach doesn’t actually recurse to examine all possible subproblems
- Bottom-up approach has better constant factors
- Less overhead for procedure calls

- Memoized Cut Rod Version
- \(p\) holds the list of prices
- \(n\) is the size of the rod
- \(r\) is an array to store the solutions to subproblems
- \(-\infty\) means that the subproblem hasn’t been solved yet

- Lines 1-2 check if the desired value has already been computed and, if so, return it
- Lines 3-7 compute the new value (previously unknown)
- Line 8 saves the computed value
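
Since the pseudocode is not shown here, the following is a Python sketch of MEMOIZED-CUT-ROD following the line references above. The `prices` table is an assumption, the standard CLRS example:

```python
def memoized_cut_rod(p, n):
    """Top-down CUT-ROD with memoization; -inf marks unsolved subproblems."""
    r = [float("-inf")] * (n + 1)
    return _memoized_cut_rod_aux(p, n, r)

def _memoized_cut_rod_aux(p, n, r):
    if r[n] >= 0:                          # lines 1-2: already computed, return it
        return r[n]
    if n == 0:
        q = 0
    else:
        q = float("-inf")
        for i in range(1, n + 1):          # lines 3-7: compute the new value
            q = max(q, p[i] + _memoized_cut_rod_aux(p, n - i, r))
    r[n] = q                               # line 8: save the computed value
    return q

# Assumed CLRS price table (index 0 unused)
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
```
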

- Bottom-up Cut Rod Version
- Stores subproblem solutions in \(r\)
- Solves subproblems of size \(j\), in order of increasing size: \(j=1,2, \dots, n\)
- In line 6 the algorithm directly references array entry \(r[j-i]\)
- Instead of making a recursive call to solve subproblem of size \(j-i\)
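
A Python sketch of the bottom-up version, again assuming the standard CLRS price table:

```python
def bottom_up_cut_rod(p, n):
    """BOTTOM-UP-CUT-ROD: solve subproblems in order of increasing size."""
    r = [0] * (n + 1)                      # r[0] = 0; stores subproblem solutions
    for j in range(1, n + 1):              # sizes j = 1, 2, ..., n
        q = float("-inf")
        for i in range(1, j + 1):
            q = max(q, p[i] + r[j - i])    # direct lookup of r[j-i], no recursion
        r[j] = q
    return r[n]

# Assumed CLRS price table (index 0 unused)
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
```
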

- Running time of the memoized and bottom-up versions of the CUT-ROD algorithm
- \(\Theta(n^2)\) for both of them

- BOTTOM-UP-CUT-ROD
- Doubly-nested loop structure
- Iterations of the inner **for** loop (lines 5-6) form an arithmetic series
- Thus, it runs in \(\Theta(n^2)\)

- MEMOIZED-CUT-ROD
- Recursive call to solve previously solved subproblem returns immediately
- It solves a subproblem only once

- The loop of lines 6-7 iterates \(n\) times when solving a subproblem of size \(n\)
- Over all subproblems this forms an arithmetic series, for a total of \(\Theta(n^2)\)

- The arithmetic series (from the textbook)
- The summation \(\sum\limits_{k=1}^{n} k = 1 + 2 + \dots + n\)
- Is an *arithmetic series* and has the value
- \(\sum\limits_{k=1}^{n} k = \frac{1}{2} n(n+1) = \Theta(n^2)\).

- Subproblem graphs
- Subproblems involved and how subproblems depend on one another
- Directed graph, one vertex for each distinct subproblem
- Like a collapsed version of the recursion tree for the top-down recursive method
- Bottom-up dynamic programming algorithm
- Consider vertices in an order: a *reverse topological sort* of the subproblem graph
- No subproblem is considered until all subproblems that it depends upon have been solved
- The size of the subproblem graph \(G=(V,E)\) helps us determine the running time of the dynamic programming algorithm
- Sum running times of each different subproblem

- Reconstructing a solution
- Our Dynamic Programming (DP) algorithm
- Returns value of an optimal solution
- Doesn’t return an actual solution: *a list of piece sizes*
- Extend the DP approach to also record a *choice* that led to the optimal value
- With this information, we can print an optimal solution

- BOTTOM-UP-CUT-ROD, extended
- Computes for each rod size \(j\): maximum revenue \(r_j\) and optimal size of first piece to cut off \(s_j\)
- In line 8,
- We store in \(s[j]\) the optimal size \(i\) of the first piece to cut off when solving problem of size \(j\)

- Printing the solution
- Need price table \(p\) and a rod size \(n\)
- Calls EXTENDED-BOTTOM-UP-CUT-ROD to compute \(s[1 .. n]\)
- Prints out the list of piece sizes for an optimal decomposition of a rod of length \(n\)

- Call to EXTENDED-BOTTOM-UP-CUT-ROD(\(p\),10)
- Returns the arrays \(r\) and \(s\)

- Call to PRINT-CUT-ROD-SOLUTION(\(p\),10) prints just 10
- Call to PRINT-CUT-ROD-SOLUTION(\(p\),7) prints 1 and 6
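
The extended version and the printing procedure can be sketched in Python. The `prices` table is an assumption, the standard CLRS example; `print_cut_rod_solution` also returns the list of piece sizes for convenience:

```python
def extended_bottom_up_cut_rod(p, n):
    """Bottom-up CUT-ROD that also records, in s[j], the optimal size
    of the first piece to cut off a rod of length j."""
    r = [0] * (n + 1)
    s = [0] * (n + 1)
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i                   # record the winning choice (line 8)
        r[j] = q
    return r, s

def print_cut_rod_solution(p, n):
    """Print (and return) the piece sizes of an optimal decomposition."""
    _, s = extended_bottom_up_cut_rod(p, n)
    pieces = []
    while n > 0:
        pieces.append(s[n])                # first piece for the remaining rod
        n = n - s[n]
    print(pieces)
    return pieces

# Assumed CLRS price table (index 0 unused)
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
```
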

- Given a sequence (chain) \(\langle A_1, A_2, \dots, A_n \rangle\) of \(n\) matrices
- Compute the product \(A_1 A_2 \dots A_n\)
- Matrix multiplication is associative
- All parenthesizations yield the same product
- A *fully parenthesized* product is either a single matrix
- Or the product of two fully parenthesized matrix products, surrounded by parentheses

- Fully parenthesized product of \(A_1 A_2 A_3 A_4\):
- \((A_1(A_2(A_3 A_4)))\),
- \((A_1((A_2 A_3)A_4))\),
- \(((A_1 A_2)(A_3 A_4))\),
- \(((A_1(A_2 A_3))A_4)\),
- \((((A_1 A_2)A_3)A_4)\).

- The way a product is parenthesized has a dramatic impact on the cost of evaluating it

- Standard pseudocode (from SQUARE-MATRIX-MULTIPLY procedure, section 4.2)

- We require \(A\) and \(B\) to be compatible
- \(A\) is a \(p \times q\) matrix; \(B\) must then be a \(q \times r\) matrix

- The result \(C\) is a \(p \times r\) matrix
- Time to compute \(C\) is dominated by the scalar multiplications in line 8, of which there are \(pqr\)
- We will express costs as number of scalar multiplications

- Different costs by different parenthesizations of a matrix product
- Chain of 3 matrices \(\langle A_1, A_2, A_3 \rangle\)
- Dimensions: \(10 \times 100\), \(100 \times 5\), and \(5 \times 50\)
- \(((A_1 A_2)A_3)\)
- We perform \(10 \cdot 100 \cdot 5 = 5000\) scalar multiplications to get \(A_1 A_2\)
- We perform \(10 \cdot 5 \cdot 50 = 2500\) scalar multiplications to get the final result
- Total of 7,500 scalar multiplications

- \((A_1(A_2 A_3))\)
- We perform \(100 \cdot 5 \cdot 50 = 25,000\) scalar multiplications to get \(A_2 A_3\)
- We perform \(10 \cdot 100 \cdot 50 = 50,000\) scalar multiplications to get the final result
- Total of 75,000 scalar multiplications
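
The two costs above follow from the rule that multiplying a \(p \times q\) matrix by a \(q \times r\) matrix takes \(pqr\) scalar multiplications; a quick check in Python:

```python
# Chain dimensions: A1 is 10x100, A2 is 100x5, A3 is 5x50.
cost_left = 10 * 100 * 5 + 10 * 5 * 50     # ((A1 A2) A3): A1 A2 first, then times A3
cost_right = 100 * 5 * 50 + 10 * 100 * 50  # (A1 (A2 A3)): A2 A3 first, then A1 times it
print(cost_left, cost_right)               # 7500 75000, a factor-of-10 difference
```
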

- *Matrix-chain multiplication problem*: “Given a chain \(\langle A_1, A_2, \dots, A_n \rangle\) of \(n\) matrices, where for \(i = 1, 2, \dots, n\), matrix \(A_i\) has dimension \(p_{i-1} \times p_i\), fully parenthesize the product \(A_1 A_2 \dots A_n\) in a way that minimizes the number of scalar multiplications.”

- Goal: find the order of multiplying matrices with the lowest cost

- Counting the number of parenthesizations
- \(P(n)\) is the number of alternative parenthesizations of sequence of \(n\) matrices
- \(n=1\), we have 1 matrix and only 1 way to fully parenthesize
- \(n \geq 2\), a product is the product of 2 fully parenthesized matrix subproducts
- The split may occur between the \(k\)th and \((k+1)\)st matrices for \(k=1,2, \dots, n-1\)

- We obtain recurrence 15.6, with solution \(\Omega(2^n)\)

- There is an exponential number of parenthesizations \(\Omega(2^n)\), exponential in \(n\)
- Brute force algorithm performs poorly


- Applying dynamic programming
- Optimally parenthesize a matrix chain

- Four steps
- Characterize the structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution
- Construct an optimal solution from computed information

- Step 1: The structure of an optimal parenthesization
- Let \(A_{i..j}\) be the matrix resulting from the product \(A_i A_{i+1} \dots A_j\) with \(i < j\)
- If we divide the product between \(A_k\) and \(A_{k+1}\) for \(i \leq k < j\)
- We then compute \(A_{i..k}\) and \(A_{k+1..j}\) separately
- The solution to each subproblem must be optimal in order to make the solution for \(A_{1..n}\) optimal
- Cost = Cost(\(A_{i..k}\)) + Cost(\(A_{k+1..j}\)) + Cost for multiplying both matrices

- If there were another way to group the matrices with better cost, then this wouldn’t be optimal

- Step 1: The structure of an optimal parenthesization
- We need to find the best way to split the product, \(k\)
- We use this optimal substructure to construct an optimal solution from optimal solution to subproblems
- We split the product into two subproblems, \(A_i A_{i+1} \dots A_k\) and \(A_{k+1} A_{k+2} \dots A_j\)
- When we search for the correct place to split the product, we consider all possible places, to make sure we find the optimal one

- Step 2: A recursive solution
- We define the cost of an optimal solution
- Recursively in terms of the optimal solution to subproblems

- Subproblems
- Problem of determining the minimum cost of parenthesizing \(A_i A_{i+1} \dots A_j\) for \(1 \leq i \leq j \leq n\)
- Let \(m[i,j]\) be the minimum number of scalar multiplications to compute \(A_{i..j}\)
- The total cost to get \(A_{1..n}\) would be \(m[1,n]\)

- We define the cost of an optimal solution

- Step 2: A recursive solution
- We define \(m[i,j]\)
- If \(i = j\), \(m[i,j] = 0\): trivial problem, only one matrix, no scalar multiplications needed
- If \(i < j\), we assume optimal division between \(A_k\) and \(A_{k+1}\), (\(i \leq k < j\))
- \(m[i,j]\) = cost to compute \(A_{i..k}\) + cost to compute \(A_{k+1..j}\) + cost of multiplying \(A_{i..k}A_{k+1..j}\), i.e. \(m[i,j] = m[i,k] + m[k+1,j] + p_{i-1}p_kp_j\)
- But we don’t know the value of \(k\), so we have to try all \(j-i\) possibilities

- Step 2: A recursive solution
- The recursive definition of the minimum cost of parenthesizing the product \(A_iA_{i+1} \dots A_j\) is given in equation 15.7
- We still need to define \(s[i,j]\) as the value of \(k\) in which we split the product \(A_iA_{i+1} \dots A_j\) in an optimal parenthesization
- \(s[i,j]\) is the value of \(k\) such that \(m[i,j] = m[i,k] + m[k+1,j] + p_{i-1}p_kp_j\).
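
For reference, the recurrence (equation 15.7 in the textbook) combines the two cases above:

```latex
m[i,j] =
\begin{cases}
0 & \text{if } i = j, \\
\min\limits_{i \leq k < j} \left\{ m[i,k] + m[k+1,j] + p_{i-1} p_k p_j \right\} & \text{if } i < j.
\end{cases}
```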

- Step 3: Computing the optimal cost
- We can use the recurrence to compute the minimum cost of \(m[1,n]\) to multiply \(A_1A_2 \dots A_n\)
- But the algorithm is still exponential (no better than brute force)
- But we can take advantage that we have relatively few subproblems
- One for each selection of \(i\) and \(j\) where \(1 \leq i \leq j \leq n\), or
- \({n \choose 2} + n = \Theta(n^2)\)

- The recursion encounters the same subproblems repeatedly
- *Overlapping* subproblems
- We use a bottom-up tabular method

- Step 3: Computing the optimal cost
- Bottom-up method
- We solve the smaller subproblems first, so the larger ones become easier to solve

- We define 2 arrays
- \(m[1..n, 1..n]\), to store the minimum costs of multiplying \(A_i \dots A_j\)
- \(m[1,n]\) stores the cost of multiplying \(A_1 A_2 \dots A_n\)

- \(s[1..n, 1..n]\), to store the optimal divisions, the index of \(k\)
- Used to construct the optimal solution

- Step 3: Computing the optimal cost
- Execution time of \(O(n^3)\) and requires \(\Theta(n^2)\) memory to store \(m\) and \(s\)
- Optimal substructure + Overlapping subproblems \(\rightarrow\) dynamic programming applies

- Step 3: Computing the optimal cost
- First computes \(m[i,i] = 0\) for \(i = 1,2, \dots, n\)
- First execution of the **for** loop in lines 5-13
- Computes \(m[i,i+1]\) for \(i = 1,2, \dots, n-1\) (minimum costs for chains of length \(l=2\))

- Second execution of the **for** loop in lines 5-13
- Computes \(m[i,i+2]\) for \(i = 1,2, \dots, n-2\) (minimum costs for chains of length \(l=3\))

- At all times the \(m[i,j]\) cost computed in lines 10-13 depends only on table entries \(m[i,k]\) and \(m[k+1,j]\) already computed
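
Since the pseudocode is not reproduced here, the following is a Python sketch of the bottom-up MATRIX-CHAIN-ORDER procedure described above (function and variable names are transcriptions, not from the text):

```python
import math

def matrix_chain_order(p):
    """Bottom-up MATRIX-CHAIN-ORDER.

    p: list of n+1 dimensions; matrix A_i is p[i-1] x p[i].
    Returns (m, s): m[i][j] is the minimum number of scalar multiplications
    to compute A_i..A_j, and s[i][j] is the optimal split point k.
    Rows/columns are 1-indexed to match the text; index 0 is unused.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 for chains of length 1
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                  # l = chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):              # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

# Three-matrix example from earlier: 10x100, 100x5, 5x50
m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], s[1][3])                        # 7500 2: split as (A1 A2)(A3)
```
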

- Step 3: Computing the optimal cost
- Procedure on a chain of \(n = 6\) matrices
- \(m[i,j]\) defined for \(i \leq j\), only use portion of \(m\) above the diagonal
- Matrix chain listed at the bottom

- Step 3: Computing the optimal cost