Final Math 454 Homework, Due 12/9/05

Project Parameters

• Select one problem from the list below, or propose your own problem, with approval.

• Work with up to 1 other person.  Turn in one solution per group.

• Use the ideas we discuss in class, such as for testing to see if a team can still win its division by the end of the season.

• Clearly describe the way you will convert the problem into a network flow problem.  Define all of your notation.

• Illustrate your construction with at least one, hopefully more examples.  Try to make the examples interesting -- such as for the winning team problem a situation in which it is a close matter as to whether the team will win the division.

• Take the "Application" section with a large grain of salt.  For those of you that quickly solve the above steps, you can consider this part.  For those of you who need more time to solve the above steps, you can skip this part.

• You can take liberties with the problem if there is a reasonable justification -- just don't change the problem to make it trivially true.

• If you haven't made progress by Wednesday, 12/7, I will need to meet with you on Wednesday afternoon.

A word on minimum cost flow

Minimum cost (maximum) flow means to assign a cost C(e) to each arc in the network, and compute the total cost of the flow in the network as
cost = (sum over all arcs e) C(e)*f(e)
(Recall that f(e) is the flow on arc e, and c(e) (lowercase c) is the capacity on arc e.)  Minimum cost flow simply means to find, over all possible maximum flows, the one with minimum cost.  Sometimes this is referred to as the Min cost max flow problem, or sometimes just the min cost flow problem.  If you choose a problem involving minimum cost flow, it should be clear where the costs are coming from.  You don't have to provide an algorithm which actually finds the min cost flow.  Usually you can start with any max flow and look for ways of redirecting flows around cycles in order to reduce the cost without changing the value of the flow.  But in general this is a linear programming problem, and not entirely trivial.

1. Eulerian Tours (by reversing arcs).
• Setting: A directed graph G is given in which every node has an even, nonzero number of arcs incident to it.  This means both arcs coming into the node and going out of the node.
• Eulerian Tour: An Eulerian Tour is a directed path on a directed graph such that every arc is traversed exactly once.  It is not difficult to prove that if, for every node, the number of arcs going into the node is equal to the number of arcs coming out of the node, then there is an Eulerian Tour for the directed graph.
• Creating an Eulerian Tour: If a directed graph has a node for which the number of arcs coming in is not equal to the number of arcs going out, then one may reverse one or more arcs in order to make the number in each direction equal.  In fact, as long as the total number of arcs for each node is even and nonzero, this can be done in such a way as to make the number of arcs in either direction equal for every node in the graph.  Therefore, in this case arcs can be switched in order to give an Eulerian Tour.
• Problem: Given a directed graph G where each node has an even, nonzero number of total arcs going in and coming out, determine a way of flipping as few arcs as possible so that the G has an Eulerian Tour.  Do this by converting the problem into a Minimum Cost Flow problem.
• Application: Find some data on the internet in order to construct an example G.  This data might come from plane, train, or bus routes; city street layouts; or perhaps communications networks.  The data should be interesting in that there is no Eulerian Tour before the problem is solved, but there is an Eulerian Tour after the problem is solved.
• Examples: In the original graph, at a given node you might measure the quantity: |#in arcs - #out arcs|.  Then you can define the arc deviation on the entire graph G as the sum over all nodes of this value |#in arcs - #out arcs|.  Construct example graphs where the arc deviation is fixed, but the minimum number of arcs needed to be flipped to create an Eulerian Tour could be very small or very large.
2. Eulerian Tours (by duplicating arcs).
• Setting: A connected digraph G is given.  This means every node has at least one incoming arc and at least one outgoing arc.
• Eulerian Tour: An Eulerian Tour is a directed path on a directed graph such that every arc is traversed exactly once.  It is not difficult to prove that if, for every node, the number of arcs going into the node is equal to the number of arcs coming out of the node, then there is an Eulerian Tour for the directed graph.
• Creating an Eulerian Tour: It is always possible in this setting to duplicate one or more arcs in order to create an Eulerian Tour in the directed graph.  An arc is duplicated by placing a second arc in parallel to the first; that is, both arcs have the same head node and tail node.
• Problem: Given a connected, directed graph G, convert G into a new directed graph G' which has an Eulerian Tour by duplicating as few arcs as possible.  Do this by converting the problem into a Minimum Cost Flow problem.
• Application: Find some data on the internet in order to construct an example G.  This data might come from plane, train, or bus routes; city street layouts; or perhaps communications networks.  The data should be interesting in that there is no Eulerian Tour before the problem is solved, but there is an Eulerian Tour after the problem is solved.
• Examples: In the original graph, at a given node you might measure the quantity: |#in arcs - #out arcs|.  Then you can define the arc deviation on the entire graph G as the sum over all nodes of this value |#in arcs - #out arcs|.  Construct example graphs where the arc deviation is fixed, but the minimum number of arcs needed to be duplicated to create an Eulerian Tour could be very small or very large.
3. Minimum Cost Messenger Problem.
• Setting: On a directed network N, there is a start s and a finish t, and each arc is labeled with a probability value between 0 and 1.  There is at least one directed path from s to t.  There will be many such paths for the more interesting examples.
• Messenger Routing: A messenger routing is a way of sending as many messengers through the network as possible, starting at s and finishing at t.  However, no two messengers may use the same vertex or the same arc in their path of travel, except for the start s and the finish t.
• Path Reliability: A single messenger's path has a probability of being "traversible"; i.e., a probability of the messenger's being able to travel the entire path from s to t.  The path is made non-traversible, or blocked, if and only if one of the arcs is blocked.  The probability value assigned originally to the arc denotes the probability of the arc being traversible, or non-blocked.  So the probability of a directed path in the network from s to t being traversible is simply the product of these probability values on each of the edges in the directed path.
• Problem: Find a way of assigning messengers to directed paths such that: 1) as many messengers as possible are assigned to paths which have no vertices or edges in common, 2) The probability of a single messenger getting through the network is as high as possible.  This must be done by converting the problem to a Minimum Cost Flow problem.
• Application: On the internet, try to find some network reliability data, which would include the directed network plus the probability values on the arcs.  Solve the problem using this data.  Network reliability data is often found associated with communications networks.  For instance, a telephone company might collect data on the failure probabilities of each link in one of its communications networks.
4. Aircraft Maintenance.
• Setting: An airline must make important decisions as to how to schedule aircraft maintenance.  A given pool of aircraft must be repaired or expanded in order to satisfy the number of flights that the company has pre-determined.
• Demand for Planes: The airline company has a fixed demand in terms of number of airplanes in a give day, where we are looking over a fixed period of days.
• Plane Maintenance: All planes are assumed to need no maintenance at the beginning of the time period.  The model assumes that a fixed amount of maintenance must be done after each flight in order for a plane to be ready for the next flight.  There are three ways in which to process aircraft in order to keep the demand satisfied.  The airline company can either buy new planes, repair planes on a regular maintenance time frame, or repair planes on an accelerated maintenance time frame at a higher cost.
• Problem: Find a way of satisfying the total demand for aircraft such that the cost is minimized.  Do this by converting the problem to a Minimum Cost Flow problem.  Relevant quantities will include the number of days  q needed for quick maintenance, the number of days r needed for a normal maintenance, the cost c of a new plane, the cost d of a quick maintenance, and the cost e of a normal maintenance.
• Application: Incorporate real airline data into your model.  One way to do this is to pick an airline and collect data like the total number of planes in the fleet at the beginning of a period, and the demand for planes during each day of the period.  You may have to adjust your data because serious maintenance is not done after every day of flight for a plane.  For instance, to convert to the model described here, you could multiply the size of the airline's fleet but the average number of days a plane flies without serious maintenance.  Part of the project requires understanding the available data well enough to make assumptions to fit the data to the model.
• Examples: If the real data does not give situations in which significantly different strategies of combining the three ways of providing aircraft arise, then create these examples yourself.  Try to vary the input as little as possible, while demonstrating that significant changes occur in the number of planes that are purchased or the speed at which the maintenance is done.
5. Non-Parallel Job Processing.
• Setting: A number k of identical processors are available as computational resources.  A list of n jobs is given with corresponding processing time requirements, release dates, and deadlines.
• Completing a Job: Processing may begin on a job J as soon as it is released, but the job must be completed by its deadline.  At most one processor may work on the job at any given time, and that processor is completely occupied during that time with the job.  Obviously, for reasonable input the processing time requirement must not exceed the length of the time interval between the release date and the deadline, or else a single dedicated processor can't even complete the job.  (Note that we do not allow parallel computation on a job.)
• Job Preemption: It is allowable to stop processing on a job J in order to work on a separate job K.  Job J may be completed by a processor at a later time.
• Problem: Is it possible to schedule all n jobs on the k processors in such a way that all jobs are completed by their respective deadlines?  If so, give such a solution schedule.  This problem must be solved by converting to a Maximum Flow problem.
• Relevant Quantities: If all of the release dates and deadlines are sorted into one big list with duplicates removed, in order to solve the problem it is sufficient to say which job each processor is working on during the intervals defined by this sorted list.  (A very good project will include a correct argument as to why this is so.)  It is starting from this description of the problem that the corresponding maximum flow problem should be obtained.
• Application: Interesting data for this problem might be obtainable, for instance, from the Super Computer Center (SDSC).  The data would be from non-parallel Crays with multiple processors, or from a set of Crays with similar characteristics on which jobs are scheduled.  If you are interested in trying to get data like this, you might send an email to SDSC (browse through their site to find the appropriate recipient).  Otherwise you may search the Internet for data.  In order to make the data fit the model, you may have to make simplifying assumptions.  This is fine, but try to make as few assumptions as possible.
• Examples: You should construct your own pathological examples - that is, examples with as little processor time required as possible, but that do not allow for a scheduling in which all jobs are completed by their respective deadlines.  Another way to look at this is that the processors are occupied as little as possible, but there is still not a way to do the scheduling.  Try to do this in such a fashion that the time interval of every job overlaps with either the first job or the last job.  You should also construct highly non-pathological examples in which it is very easy to schedule a large number of jobs.  In this case all of the processors will be almost constantly running, but it is still possible to do the scheduling.  It is likely that there are a number of highly symmetric examples in the second category, and you should attempt to display one of these, and give the general form of a family of such examples.