Friday, March 21, 2008

Monte Carlo simulation

Sometimes, figuring out something deterministically is hard.  Sometimes, it's just plain impossible.

Let's say we have a model and we want to make sure that model is correct.  The best way to do that is to mathematically prove that it is correct.  This works well for simple model, but when you have insanely completed models it become a huge pain to try to prove it right.  What can you do to validate it?  We can use something called a Monte Carlo simulation.  This is also known as statistical sampling.
Here's how wikipedia describe the Monte Carlo method:
  1. Define a domain of possible inputs.
  2. Generate inputs randomly from the domain, and perform a deterministic computation on them.
  3. Aggregate the results of the individual computations into the final result.
I'll use the simplest of example to show how this can be done.  I want to write a test to know if a number is even or odd (told you it was going to be simple).  I'll write some code to test it:

def is_even(number):
  if number % 2 == 0:
    return True
  else:
    return False

This is Python code by the way, but it should be simple enough even for non programmer.  It's a function called is_even that takes an integer and returns true if the number modulo 2 is 0 and false otherwise.

Let's run a monte carlo simulator on that function to make sure it's right.  We'll generate n random integer between minus one billion and a billion and we expect that around 50% of them will be even.  We'll set n at a reasonable 100 000 runs.  The higher the number of runs, the more accurate the result will be, or rather the higher the confidence in the results will be.  It should look like this:

import random
n = 100000
number_even = 0
for i in range(n):
  random_number = random.randint(-1000000000, 1000000000)
  if is_even(random_number):
    number_even = number_even + 1

print '%f%% of numbers are even' % (float(number_even) / float(n))

When I run this I get this message back:
0.501340% of numbers are even

And without any kind of mathematical proof we have validated the model.  Though it's important to realize I haven't actually proved anything, I've simply shown that with a high probability (I'm too lazy here but we could actually find the confidence interval relatively easily) the model acts a certain way.


This example was, of course, way too trivial.  But this kind of reasoning is very useful when a system has too many different components which makes it difficult to accurately predict, such as a traffic flow system (think about elevator planning for example).

Red lights cameras

Think red lights cameras will pay for themselves with the fines they bring? Think again. It seems the city of Dallas has been having some issues with it's camera problem.

They figured the cameras would bring in 14.8M$ in revenue last year but they only brought 6.2M$. Why? People figured out that going through a red light with an automated system that fined you wasn't a very good idea, and so they stopped doing it.

What is Dallas doing? They're shutting down some of the cameras because they're not bringing in as much money as expected. This brings up an important issue: are the cameras there to increase public safety or are they there to increase revenue?

They're also thinking about idling some of them. An idle camera is very cheap to maintain and the desired safety effect will still be there (as long as they rotate which camera are idle).

(Found through Freakonomics.)

Monday, March 17, 2008

Airplane boarding

A year ago, I was 22 and I had only taken a plane once. Since then, I've taken six trips using an airplane. As an amateur optimizer, the whole process of boarding the plane always seemed inefficient to me. But much work has been put into finding efficient boarding algorithm, this page seems to describe the different methods pretty clearly.

So I won't talk about getting in the plane but rather about getting out. Unfortunately, airline companies don't really have any kind of control over the passengers so the process is rather chaotic. Let's examine how it work:

People at the front row of the plane get up first and leave, followed by the people in the second row, third row, etc. until the plane is empty. This is roughly how it goes. The big problem here is the presence of overhead bins. Let's use a plane with 3 seats on each side of the single aisle as an example. Only the two persons on the aisle seats of each row can actually get up and access the overhead bins. What that means is that when it's (finally) time for your row to leave the plane, four people will have to take the time to take their carry-on luggage from the overhead bins.

Since the aisle space is the bottleneck here, every second you block it is a second that everyone behind you lose. When you're in the last row you lose a lot of time because of that! Why is that happening? It's pretty easy to understand. If it takes you five seconds in the aisle to take your stuff out, that's only five seconds to you, even if it might be multiplied by the hundred people waiting behind you. I guess it's the "I waited this long, now it's my turn, I'll take as much time as I want" phenomenon we can see in whatever other situation where there is a long wait time. Once you're not waiting anymore you don't realize lots of other people are still waiting.

What can you do to help? Well, if you're in the aisle seat, offer to bring down other people's luggage from the bins. That should help tremendously. Otherwise, when it's your turn, try to avoid staying immobile in the aisle, try getting your stuff and then go back to a seat if you need to adjust your bag or whatnot.