Sunday, October 9, 2011

The Sieve of Eratosthenes in Python

Whilst working on the 10 Io one liners to impress your friends post I felt I needed to turn to python to complete number 10 - the Sieve of Eratosthenes. My intention was to understand it in python as best I could, then simplify the python code until I had one line I could try to translate into Io. This post is about that attempt.
We start off by trying to translate the description in the wikipedia article into python line by line. Which gives us the following.
def esieve(n):
    primes = range(2, n+1)
    p = 2
    while p < n:
        for i in range(p, n+1):
            if p*i in primes:
                primes.remove(p*i)
        p += 1
    return primes
9 Lines isn't bad for a start, but that if statement can be cleaned up. What if instead of looking for items in our list one at a time. We make a new list of items to be removed, and remove them all at the end? We could use a set to hold our lists. This lets us use the minus (-) operator to give us a new set of items not in our marked set.
def shorter_esieve(n):
    marked = set()
    p = 2
    while p < n:
        for i in range(p, n+1):
            marked.add(p*i)
        p += 1
    return sorted(set(range(2, n+1)) - marked)
We only removed one line in that last attempt. Not great. But it looks like we're using a while loop and incrementing each step. Why don't we just do a for?
def shorter_esieve(n):
    marked = set()
    for p in range(2, n+1):
        for i in range(p, n+1):
            marked.add(p*i)
    return sorted(set(range(2, n+1)) - marked)
6 lines, getting better. Now here is the magic. We're using two for loops to generate a set of values. So we can just use a list comprehension to build our list, which we then use to make our marked set.
def shorter_esieve(n):
    marked = set([p* i for p in range(2, n+1) for i in range(p, n+1)])
    return sorted(set(range(2, n+1)) - marked)
And moving the assignment inline
def much_shorter_esieve(n):
    return sorted(set(range(2, n+1)) - set([p*i for p in range(2, n+1) for i in range(p, n+1)]))
And there we have it. The Sieve of Eratosthenes in one line of python. If you'd rather watch the refactoring happening step by step. Here's a video, set to suitable music.

No comments:

Post a Comment