Planning and Acting in the Real World
Jesus A. Gonzalez
July 2, 2016
Introduction
- Planners in the real world
- Planning and scheduling the operations
- Spacecraft
- Factories
- Military campaigns
Introduction
- Planners in the real world
- Require extensions
- Representation language
- Actions with duration and resource constraints
- Way to interact with the environment
- Construct plans organized hierarchically
- Experts communicate what they know about how to solve the problem
- Solve problems in an abstract level before working with details
Time, Schedules, and Resources
- Classical planning representation is about
- What to do
- In what order
- Cannot talk about time
- How long an action takes
- When the action occurs
Time, Schedules, and Resources
- Planning to produce schedules
- Assign initial and final times
- Schedule for an airline, assign planes to flights, departure and arrival times
- Resource constraints
- Limited number of staff in an airline (i.e. pilots)
- Staff cannot be in two places at the same time
Time, Schedules, and Resources
- Approach
- Plan first
- Planning phase:
- Select actions
- Consider ordering constraints
- Meet goals
- Schedule later
- Schedule phase:
- Add temporal information
- In order to meet resource and deadline constraints
Time, Schedules, and Resources
- Plan first, schedule later
- Common in real world manufacturing and logistical applications
- Planning usually done by human experts
- Could be done automatically
- Should produce minimal ordering constraints required for correctness
- i.e. GRAPHPLAN, SATPLAN, partial-order planners
- Search-based methods can also do it
- They produce totally ordered plans
- Can be easily converted to plans with minimal ordering constraints
Time, Schedules, and Resources
- Representing Temporal and Resource Constraints
- A job-shop scheduling problem
- Consists of a set of jobs
- Each job consists of a set of actions
- Actions have ordering constraints among them
- Actions (each of them) have a duration and a set of resource constraints
Time, Schedules, and Resources
Time, Schedules, and Resources
- Representing Temporal and Resource Constraints
- A job-shop scheduling problem
- Constraints specify
- A type of resource
- i.e. bolts, wrenches, pilots, and so on
- The required number of that resource
- Whether the resource is:
- Consumable
- i.e. the bolts will no longer be available for use
- Reusable
- i.e. a pilot will be available when the flight is over
- Resources can be produced by actions (with negative consumption)
- Manufacturing
- Growing
- Resupply
Time, Schedules, and Resources
- Representing Temporal and Resource Constraints
- A solution to a job-shop scheduling problem
- Must specify start times for each action
- Must satisfy temporal ordering constraints
- Must satisfy resource constraints
- Can be evaluated with a cost function
- Can be complex and could consider several factors
- Nonlinear resource costs
- Time dependent delay costs
- We assume that the cost function is the total duration of the plan: makespan
Time, Schedules, and Resources
- Representing Temporal and Resource Constraints
- Problem for the assembly of two cars
- Two jobs
- Of the form \([AddEngine, AddWheels, Inspect]\).
- Resources statement
- Four types of resources
- Gives the quantity of each that is available at the start
- 1 engine hoist
- 1 wheel station
- 2 inspectors
- 500 lug nuts
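The problem statement above can be written down directly as data. A minimal sketch in Python, where the action names, durations, and lug-nut counts are the usual illustrative values for this example and should be treated as assumptions:

```python
# Resource declaration: quantity of each resource available at the start.
RESOURCES = {"EngineHoists": 1, "WheelStations": 1, "Inspectors": 2, "LugNuts": 500}

# Each job is an ordered list of (action, duration in minutes, resource needs).
JOBS = {
    "Car1": [("AddEngine1", 30, {"EngineHoists": 1}),
             ("AddWheels1", 30, {"WheelStations": 1, "LugNuts": 20}),
             ("Inspect1",   10, {"Inspectors": 1})],
    "Car2": [("AddEngine2", 60, {"EngineHoists": 1}),
             ("AddWheels2", 15, {"WheelStations": 1, "LugNuts": 20}),
             ("Inspect2",   10, {"Inspectors": 1})],
}

# Lug nuts are consumed; all other resources are borrowed and returned.
CONSUMABLE = {"LugNuts"}
```

Note the aggregation technique in the data: `"Inspectors": 2` is a numerical quantity, not two named inspector entities.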
Time, Schedules, and Resources
- Problem for the assembly of two cars (continues…)
- Action schemas
- Give duration and resource needs for each action
- Lug nuts consumed as tires are added to car
- Other resources borrowed at start of action and returned when action ends
- Uses the aggregation technique
- Represents resources as numerical quantities
- \(Inspectors(2)\)
- Instead of as named entities: \(Inspector(I_1)\) and \(Inspector(I_2)\)
- Groups objects into quantities when they are undistinguishable for the problem at hand
- i.e. not important to make a distinction between inspectors
- Important to reduce complexity
Time, Schedules, and Resources
- Solving Scheduling Problems
- For now we only consider the temporal scheduling problem
- Ignore resource constraints
- Want to minimize makespan (the plan duration)
- We need to find the earliest start time for all actions consistent with ordering constraints of the problem
- Useful to represent ordering constraints as a directed graph, relating the actions
- We can apply the critical path method (CPM) to the graph
- To determine possible start and end times of each action
- A path through the graph representing a partial-order plan
- Is a linearly ordered sequence of actions beginning with \(Start\) and ending with \(Finish\)
- 2 action paths in partial-order plan of figure 2
- Critical path
- The path with longest duration
- Critical because it determines the duration of the whole plan
- Reducing other paths doesn’t affect the duration of the plan
- Delaying the start of any action of the critical path, slows down the plan as a whole
- Each action
- Has a window of time in which it can be executed
- Earliest possible start time \(ES\)
- Latest possible start time \(LS\)
- \(LS - ES\) is the slack of an action
Time, Schedules, and Resources
- Temporal constraints for the job-shop scheduling problem
- The whole plan takes 85 minutes
- Each action off the critical path has 15 minutes of slack
- Each action on critical path has no slack (by definition)
- The \(ES\) and \(LS\) times for all actions constitute a schedule for the problem
Time, Schedules, and Resources
- Formulas for the definition of \(ES\) and \(LS\)
- Also the outline of a dynamic-programming algorithm to compute them
- \(A\) and \(B\) are actions and \(A < B\) means that \(A\) comes before \(B\)
- \(ES(Start) = 0\)
- \(ES(B) = \max_{A<B} \left[ ES(A) + Duration(A) \right]\)
- \(LS(Finish) = ES(Finish)\)
- \(LS(A) = \min_{B>A} LS(B) - Duration(A)\)
Time, Schedules, and Resources
Time, Schedules, and Resources
- Start assigning \(ES(Start) = 0\)
- When we get an action \(B\) such that
- All actions that come immediately before \(B\) have \(ES\) assigned
- Set \(ES(B)\) as the maximum of the earliest finish times of those immediately preceding actions
- Earliest finish time of an action is the earliest start time + the duration
- Repeat until all actions have an \(ES\) assigned
- Then compute the \(LS\) values in a similar way
- Working backwards from the \(Finish\) action
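The outline above can be sketched directly in Python. The recurrences are the ones just given; the example data (actions, durations, orderings) is the illustrative two-car assembly problem:

```python
def critical_path_schedule(duration, order):
    """Critical path method: earliest (ES) and latest (LS) start times.
    `duration` maps action -> minutes; `order` lists (A, B) pairs, A < B."""
    preds = {a: [] for a in duration}
    succs = {a: [] for a in duration}
    for a, b in order:
        preds[b].append(a)
        succs[a].append(b)

    es = {}
    def ES(x):  # forward pass: ES(B) = max over A < B of ES(A) + Duration(A)
        if x not in es:
            es[x] = max((ES(p) + duration[p] for p in preds[x]), default=0)
        return es[x]
    makespan = max(ES(x) + duration[x] for x in duration)

    ls = {}
    def LS(x):  # backward pass: LS(A) = min over B > A of LS(B), minus Duration(A)
        if x not in ls:
            ls[x] = min((LS(s) for s in succs[x]), default=makespan) - duration[x]
        return ls[x]
    for x in duration:
        LS(x)
    return es, ls, makespan

dur = {"AddEngine1": 30, "AddWheels1": 30, "Inspect1": 10,
       "AddEngine2": 60, "AddWheels2": 15, "Inspect2": 10}
order = [("AddEngine1", "AddWheels1"), ("AddWheels1", "Inspect1"),
         ("AddEngine2", "AddWheels2"), ("AddWheels2", "Inspect2")]
es, ls, makespan = critical_path_schedule(dur, order)  # makespan is 85 minutes
```

With this data the critical path is the second job (60 + 15 + 10 = 85 minutes), its actions have zero slack, and each action of the first job has a slack of \(LS - ES = 15\) minutes.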
Time, Schedules, and Resources
- Finding a minimum-duration schedule given a partial ordering on actions and no resource constraints
Time, Schedules, and Resources
- Introducing resource constraints makes the problem more complicated
- In fact, it becomes NP-hard
- The two \(AddEngine\) actions begin at the same time (fig. 2), require the same \(EngineHoist\), and therefore cannot overlap
Time, Schedules, and Resources
- Complexity of critical path algorithm
- \(O(Nb)\)
- \(N\) is the number of actions
- \(b\) is the maximum branching factor (into or out of an action)
Time, Schedules, and Resources
- Solution with the fastest completion time
- 115 minutes
- 30 minutes longer than required for the schedule without resource constraints (85 minutes)
- There is no time at which both inspectors are required
- Can move 1 of the inspectors to a more productive position
Time, Schedules, and Resources
- Challenge problem posed in 1963 (Lawler et al., 1993)
- Find the optimal schedule for a problem involving
- 10 machines
- 10 jobs
- 100 actions each
- Unsolved for 23 years
- Many approaches tried
- Branch-and-bound
- Simulated annealing
- Tabu search
- Constraint satisfaction
Time, Schedules, and Resources
- The minimum slack heuristic (a popular approach)
- On each iteration
- Choose an unscheduled action that has all its predecessors scheduled and the least slack
- Schedule it for its earliest possible start time
- Update the \(ES\) and \(LS\) times for each affected action
- Repeat
- Works well in practice
- For the assembly problem, the plan takes 130 minutes instead of the 115 minutes shown before
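The heuristic can be sketched under simplifying assumptions: integer-minute time, reusable resources only, and slacks computed once from the resource-free schedule rather than updated on every iteration as the full heuristic prescribes. Names and data are the illustrative two-car problem:

```python
def min_slack_schedule(duration, order, needs, capacity):
    """Greedy minimum-slack scheduling sketch (minute granularity)."""
    preds = {a: [] for a in duration}
    succs = {a: [] for a in duration}
    for a, b in order:
        preds[b].append(a)
        succs[a].append(b)

    # Resource-free ES/LS (critical path method) used to rank actions by slack.
    es, ls = {}, {}
    def ES(x):
        if x not in es:
            es[x] = max((ES(p) + duration[p] for p in preds[x]), default=0)
        return es[x]
    horizon = max(ES(x) + duration[x] for x in duration)
    def LS(x):
        if x not in ls:
            ls[x] = min((LS(s) for s in succs[x]), default=horizon) - duration[x]
        return ls[x]
    slack = {x: LS(x) - ES(x) for x in duration}

    start = {}
    def load(r, t):  # units of resource r in use at minute t
        return sum(needs.get(a, {}).get(r, 0)
                   for a, s in start.items() if s <= t < s + duration[a])
    def fits(a, t):  # capacity respected over the whole interval [t, t+dur)
        return all(load(r, u) + amt <= capacity[r]
                   for r, amt in needs.get(a, {}).items()
                   for u in range(t, t + duration[a]))

    while len(start) < len(duration):
        ready = [a for a in duration if a not in start
                 and all(p in start for p in preds[a])]
        a = min(ready, key=lambda x: slack[x])  # least-slack schedulable action
        t = max((start[p] + duration[p] for p in preds[a]), default=0)
        while not fits(a, t):                   # earliest resource-feasible start
            t += 1
        start[a] = t
    return start

dur = {"AddEngine1": 30, "AddWheels1": 30, "Inspect1": 10,
       "AddEngine2": 60, "AddWheels2": 15, "Inspect2": 10}
order = [("AddEngine1", "AddWheels1"), ("AddWheels1", "Inspect1"),
         ("AddEngine2", "AddWheels2"), ("AddWheels2", "Inspect2")]
needs = {"AddEngine1": {"EngineHoists": 1}, "AddEngine2": {"EngineHoists": 1},
         "AddWheels1": {"WheelStations": 1}, "AddWheels2": {"WheelStations": 1},
         "Inspect1": {"Inspectors": 1}, "Inspect2": {"Inspectors": 1}}
capacity = {"EngineHoists": 1, "WheelStations": 1, "Inspectors": 2}
start = min_slack_schedule(dur, order, needs, capacity)
makespan = max(start[a] + dur[a] for a in dur)  # 130 minutes, as stated above
```

The greedy choice schedules the zero-slack critical-path actions first, forcing the other job to wait for the hoist, which is exactly how the heuristic ends up at 130 minutes instead of the optimal 115.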
Hierarchical Planning
- Many real world applications involve a huge set of actions
- i.e. a detailed motor plan contains about \(10^{10}\) actions
- Difficult to deal with that many actions
- AI does something that humans appear to do
- Make plans at higher levels of abstraction
Hierarchical Planning
- Example, a plan to take vacations in Hawaii
- Go to SF airport
- Take flight 11 to Honolulu by Hawaiian Airlines
- Do vacation stuff (for 2 weeks)
- Take flight 12 to SF by Hawaiian Airlines
- Go home
- These are abstract actions
- Go to SF airport decomposes into:
- Drive to long-term parking lot
- Park
- Take shuttle to terminal
Hierarchical Planning
- We can decompose abstract actions several levels
- Until we get actions that can be executed
- Planning can occur before and during the execution of the plan
- i.e. defer problem of planning route from parking spot to shuttle bus station
- Until a parking spot has been found during execution
- This action will remain abstract prior to the execution phase
Hierarchical Planning
- Hierarchical decomposition
- Attempts to manage complexity
- In a hierarchical structure
- At each level of the hierarchy, a task is reduced to a small number of activities at the next lower level
- Computational cost is reduced because the plan is done for a small set of activities
- Nonhierarchical methods reduce a task to a large number of actions, which is impractical
Hierarchical Planning
- High-level Actions
- Adopt formalism from the area of Hierarchical Task Networks, or HTN planning
- Assume
- Full observability
- Determinism
- Availability of set of actions, called primitive actions, with standard precondition-effect schemas
Hierarchical Planning
- High-level Actions
- Key additional concept
- High level action or HLA
- i.e. action “Go to San Francisco Airport” in the previous example
- An HLA has one or more refinements
- Into a sequence of actions, each of which could be an HLA or a primitive action
- Following figure shows:
- The \(Go(Home, SFO)\) action may have 2 refinements
- A recursive refinement for navigation in the vacuum world
Hierarchical Planning
- High-level Actions
Hierarchical Planning
- High-level Actions
- High level actions and refinements
- Embody knowledge about how to do things
- When an HLA refinement contains only primitive actions
- It’s called an implementation of the HLA
- i.e. in the vacuum world
- Sequences \([Right, Right, Down]\) and \([Down, Right, Right]\)
- Implement the HLA \(Navigate([1,3],[3,2])\)
Hierarchical Planning
- High-level Actions
- A high-level plan achieves the goal from a given state if at least one of its implementations achieves the goal from that state
- Not all implementations achieve a goal
- The agent decides which implementation it will execute
Hierarchical Planning
- Searching for Primitive Solutions
- HTN planning usually starts with a single “top level” action called \(Act\)
- The goal is to find an implementation of \(Act\) that achieves the goal
- This is a general approach
- Repeatedly choose an HLA in current plan
- Replace it with one of its refinements
- Until the plan achieves the goal
- Can be implemented with breadth-first tree search
Hierarchical Planning
- Searching for Primitive Solutions
Planning and Acting in Nondeterministic Domains
- Extend planning to handle environments such as
- Partially observable
- Nondeterministic
- Unknown
Planning and Acting in Nondeterministic Domains
- Methods for this
- For environments with no observations
- Sensorless planning, also known as conformant planning
- For partially observable & nondeterministic environments
- For unknown environments
- Online planning and replanning
Planning and Acting in Nondeterministic Domains
- These planners use a factored representation instead of atomic
- They represent belief states
- Set of possible physical states in which the agent might be
- For unobservable or partially observable environments
Planning and Acting in Nondeterministic Domains
- Example problem:
- We have a table and a chair
- The goal is to have both objects with the same color
- There are also two cans of paint
- The colors of the paint and objects are unknown
- Only the table is initially in the agent’s field of view
- \(Init(Object(Table) \land Object(Chair) \land Can(C_1) \land Can(C_2) \land InView(Table))\)
- \(Goal(Color(Chair,c) \land Color(Table,c))\)
- There are 2 possible actions:
- Remove lid from paint can
- Painting an object using paint from an open can
Planning and Acting in Nondeterministic Domains
- Now we allow preconditions and effects to contain variables that are not part of the action’s variable list
- i.e. \(Paint(x,can)\) doesn’t mention the color variable \(c\)
- This isn’t allowed in the fully observable case
- This is because in the partially observable case we might or might not know the color of the can
- \(Action(RemoveLid(can)\),
- PRECOND: \(Can(can)\)
- EFFECT: \(Open(can))\)
- \(Action(Paint(x,can)\),
- PRECOND: \(Object(x) \land Can(can) \land Color(can,c) \land Open(can)\)
- EFFECT: \(Color(x,c))\)
Planning and Acting in Nondeterministic Domains
- Now the agent has to reason about the percepts it will get when it executes the plan in order to solve the partially observable problem
- We add a new type of schema, the percept schema
- \(Percept(Color(x,c)\),
- PRECOND: \(Object(x) \land InView(x)\)
- \(Percept(Color(can,c)\),
- PRECOND: \(Can(can) \land InView(can) \land Open(can)\)
- \(1^{st}\) schema says that when an object is in view, the agent will perceive its color
- \(2^{nd}\) schema says that if an open can is in view, the agent perceives the color of the paint in the can
Planning and Acting in Nondeterministic Domains
- The agent also needs actions that cause objects to come into view, one at a time
- \(Action(LookAt(x)\),
- PRECOND: \(InView(y) \land (x \neq y)\)
- EFFECT: \(InView(x) \land \neg InView(y))\)
- In a fully observable environment, we wouldn’t need the preconditions of the \(Percept\) axiom
- In a sensorless agent we don’t have \(Percept\) axioms
Planning and Acting in Nondeterministic Domains
- Even the sensorless agent can solve the painting problem
- Open any can of paint and apply it to both chair and table
- A contingent planning agent
- Look at table and chair to obtain colors
- If they match, plan is done
- If not, look at paint cans, if paint matches one piece of furniture, apply that paint to the other piece
- Otherwise, paint both pieces of furniture with the same color
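The branching logic just described can be written as a small function. This is a sketch, not the book's algorithm: percepts are passed in as a plain dictionary, and the fluent strings and action tuples are illustrative names:

```python
def contingent_paint(percept):
    """Return a plan (list of action tuples) based on perceived colors.
    `percept` holds the observed furniture colors and, once the cans have
    been looked at, a {can: color} map (assumed non-empty when needed)."""
    table, chair = percept["Color(Table)"], percept["Color(Chair)"]
    if table == chair:
        return []                                   # colors match: done
    cans = percept["CanColors"]
    for can, color in cans.items():
        if color == table:                          # paint the chair to match
            return [("RemoveLid", can), ("Paint", "Chair", can)]
        if color == chair:                          # paint the table to match
            return [("RemoveLid", can), ("Paint", "Table", can)]
    can = next(iter(cans))                          # no match: paint both
    return [("RemoveLid", can),
            ("Paint", "Chair", can), ("Paint", "Table", can)]
```

Each `if` corresponds to one branch of the contingent plan; the percepts that select the branch come from the \(Percept\) schemas given earlier.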
Planning and Acting in Nondeterministic Domains
- An online planning agent
- Might initially generate a contingent plan with fewer branches
- Might ignore the possibility of no can matching any of the furniture
- Then, it deals with problems as they appear by replanning
- It can also deal with incorrectness of its action schemas
Planning and Acting in Nondeterministic Domains
- Sensorless planning
- Now we search in a belief-state space
- Convert sensorless planning problem into a belief-state planning problem
- The belief state is represented by a logical formula instead of an explicitly enumerated set of states
Planning and Acting in Nondeterministic Domains
- Sensorless planning
- Initial state can ignore the \(InView\) fluents
- Can also take as given the unchanging facts
- \(Object(Table) \land Object(Chair) \land Can(C_1) \land Can(C_2)\)
- These hold in every belief state
Planning and Acting in Nondeterministic Domains
- Sensorless planning
- The agent doesn’t know the color of the cans or whether they are open or closed, but it knows that everything has a color
- \(\forall x\) \(\exists c\) \(Color(x,c)\)
- After skolemizing (replacing existentially quantified variables with Skolem terms)
- \(b_0 = Color(x,C(x))\)
Planning and Acting in Nondeterministic Domains
- Sensorless planning
- We do not follow the closed-world assumption as we did in classical planning
- In closed-world assumption any fluent not mentioned in a state is false
- We switch to an open-world assumption
- States contain both positive and negative fluents
- If a fluent doesn’t appear, its value is unknown
- A possible solution
- \([RemoveLid(Can_1), Paint(Chair, Can_1), Paint(Table, Can_1)]\).
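A toy check that this plan works in the belief-state space. This sketch tracks only the known fluents of the belief state as a dictionary (open world: anything not listed is unknown), with the can's unknown color kept symbolic as the Skolem term \(C(Can_1)\):

```python
def apply_sensorless(belief, action):
    """Progress a belief state (dict of known fluents) through one action."""
    belief = dict(belief)
    if action[0] == "RemoveLid":
        belief["Open(" + action[1] + ")"] = True
    elif action[0] == "Paint":
        _, obj, can = action
        if belief.get("Open(" + can + ")"):
            # The object takes on whatever color the can holds: Skolem term C(can).
            belief["Color(" + obj + ")"] = "C(" + can + ")"
    return belief

b = {}  # open world: fluents not listed are unknown, not false
for a in [("RemoveLid", "Can1"),
          ("Paint", "Chair", "Can1"), ("Paint", "Table", "Can1")]:
    b = apply_sensorless(b, a)

# Goal: both objects have the same color c — here the same Skolem term.
achieves_goal = (b.get("Color(Chair)") is not None
                 and b.get("Color(Chair)") == b.get("Color(Table)"))
```

The agent never learns the actual color, yet the goal is guaranteed: both objects end up with the color \(C(Can_1)\), whatever it is.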
Planning and Acting in Nondeterministic Domains
- Contingent planning
- Generation of plans with conditional branching based on percepts
- For environments with partial observability, non-determinism, or both
- A possible contingent solution
Planning and Acting in Nondeterministic Domains
- Contingent-planning agent could maintain its belief state as a logical formula
- Evaluate each branch condition
- Determine if the belief state entails the condition formula or its negation
Planning and Acting in Nondeterministic Domains
- Online replanning
- A robot might seem to perform a repetitive task
- But it might actually have the capability to respond to unexpected incidents
- i.e. a robot in a manufacturing plant
- Replanning requires execution monitoring in order to know when a new plan is required
- The agent can be viewed as a contingent planner that continuously replans
- Some branches of a partially constructed contingent plan could just say Replan
- How much planning to do in advance and how much to leave for later is a tradeoff
- It depends on the costs and probabilities of the possible contingencies
Planning and Acting in Nondeterministic Domains
- Online replanning
- Replanning may also be required when the agent’s model of the world is incorrect
- Model for an action may have:
- Missing precondition
- i.e. Agent may not know it needs a screwdriver to open a paint can
- A missing effect
- i.e. Floor may get paint when painting an object
- A missing state variable
- i.e. The model may have no variable for the amount of paint in a can
Planning and Acting in Nondeterministic Domains
- Online replanning
- The online agent monitors the environment at three levels, checking before each action is executed
- Action monitoring: before executing an action, agent verifies that all preconditions still hold
- Plan monitoring: before executing an action, agent verifies that remaining plan will still succeed
- Goal monitoring: before executing an action, agent checks to see if there is a better set of goals it could be trying to achieve
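Action monitoring, the first level above, can be sketched as an execution loop. The interfaces (`perceive`, `execute`, `replan`) and the tiny paint-world simulation are illustrative assumptions, not the book's code:

```python
def run_with_action_monitoring(plan, perceive, execute, preconds, replan):
    """Before each action, check its preconditions against the currently
    perceived state; if any is violated, ask the replanner for a new plan."""
    while plan:
        action = plan[0]
        state = perceive()
        if not preconds[action] <= state:   # a precondition no longer holds
            plan = replan(state)            # repair from the current state
            continue
        execute(action)
        plan = plan[1:]

# Toy world: the initial plan forgot that the can must be opened first.
world = {"HasCan"}
preconds = {"RemoveLid": {"HasCan"}, "Paint": {"Open"}}

def execute(a):  # simulated action effects
    world.add({"RemoveLid": "Open", "Paint": "Painted"}[a])

def replan(state):  # naive repair: open the can before painting
    return ["RemoveLid", "Paint"] if "Open" not in state else ["Paint"]

run_with_action_monitoring(["Paint"], lambda: set(world),
                           execute, preconds, replan)
```

The monitor catches the violated \(Open\) precondition before executing `Paint`, replans to open the can first, and the repaired plan then runs to completion.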
Planning and Acting in Nondeterministic Domains
- Online replanning
- Schematic of action monitoring
Planning and Acting in Nondeterministic Domains