Thursday, February 23, 2012

NeoBlog - My Neo4j Challenge Entry

It's been an interesting start to the year. Towards the end of January I purchased a copy of Seven Databases in Seven Weeks to give myself a boost in the world of NoSQL databases. Within a week I had discovered the neo4j challenge. It seemed too good an opportunity to miss, so I embarked on writing an application for the competition. This is my write-up of how it went.
Some Design

I decided early on that my focus was:
  1. Learning about neo4j
  2. Learning about writing web apps in python
  3. Submitting an entry

Getting Started

With this in mind I spent a week playing around with neo4j via Nigel Small's excellent py2neo library. I started off by modelling the London Underground in neo4j and playing around with finding routes around it. Here's a tweet I made with a photo of part of the network. This was a great learning activity: I found a bug in the py2neo library, which I fixed, and Nigel was good enough to pull the fix into his repo. You can see the commit here. This was my first real contribution to an open source project, which I was pretty pleased with.


The Blog Idea

Despite having lots of fun playing, I couldn't get it working quite the way I wanted, so I decided to keep it on the back burner and try out another idea I had for the competition, one I knew wasn't going to take much. The idea was for each node to be a post. Instead of using tags to connect similar posts, you would just connect them with edges. It seemed simple enough - so off I went.


The Doing

I'd not done any real web applications in python before - just a few toy django applications, but django (and rails for that matter) always feels a bit heavyweight for my liking (that's a topic for another post), so I was looking forward to using flask. The application took shape quite quickly. I spent a few hours adding users and admin pages - but I felt this began to detract from the aim. My intention from the start was to keep the application simple. I felt that an application that would be shared for other people to clone from should be as small as possible. I wanted others to be able to understand the application in under ten minutes. By removing admin pages and users I managed to get rid of about half the code, until I was down to an application that could do 3 things:

  • Add a post
  • Link a post to another one
  • View all posts
Not especially ground breaking or shippable - but ok I believe as an example.

What would I do differently next time?

When I started, I wasn't totally sure how everything was going to end up, so I decided to play safe and use a language I was familiar with. Looking back at it, I wish I had taken the chance and written it in clojure. I think this would have been an ideal opportunity to play more with clojure.

Something that didn't occur to me until after I deployed, and which I'm considering adding (I really should at some point), is that when two people link the same two posts, two edges are created. I think instead an edge should have a weighting, and each user that creates a connection adds to the weight. You could then display similar posts in order of similarity. This idea is playing into some other work I'm doing - but I should really add it to this one.
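The weighted-edge idea can be sketched in a few lines of plain python. This is only an in-memory illustration, not code from NeoBlog and not the py2neo API: the names weights, link and similar_posts and the post titles are all made up for the example.

```python
from collections import defaultdict

# weights[a][b] counts how many times posts a and b have been linked.
# (Hypothetical in-memory stand-in for a weight property on a graph edge.)
weights = defaultdict(lambda: defaultdict(int))

def link(a, b):
    """Record one user linking two posts; repeat links add to the weight."""
    weights[a][b] += 1
    weights[b][a] += 1

def similar_posts(post):
    """Return linked posts, heaviest edge first (ties broken by title)."""
    neighbours = weights[post]
    return sorted(neighbours, key=lambda other: (-neighbours[other], other))

link("intro", "neo4j")   # first user links the two posts
link("intro", "neo4j")   # second user makes the same link
link("intro", "flask")   # a different, weaker connection

print(similar_posts("intro"))  # ['neo4j', 'flask']
```

In a graph database the counter would presumably live as a property on a single edge rather than in a dictionary, but the ordering logic would stay the same.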

Summary

I had lots of fun doing this. I learned a bit more about writing web apps, a bit more about git, and how to use neo4j. All in all, not bad for a few days' work.

References:

Neo4j Challenge

Neo Blog Entry

Neo Blog Source Code


Wednesday, January 4, 2012

Under: A new Idiom from the J language

Thanks to this post I've started the year discovering a language I never knew existed - and a cool little feature in it.

Imagine you have a function called g

g(5) #returns 10

Now imagine you have another function which undoes whatever happened in g.
undo-g(10) #returns 5

Not too impressive on the face of it. You could guess that g just multiplies by 2 and undo-g divides by 2. The J language comes with some of these built in, which it calls obverse functions.
   4 + 4
8
   4 +^:_1 (4)
0
In this function +^:_1 effectively means apply the + function -1 times. You could do it twice:
   4 +^:_2 (4)
_4
(In J, _4 means -4.)
Seem crazy? Stay with it...

How many times do you see this sort of pattern in your code?
OpenFile
    ReadData
CloseFile

OpenSocket
    SendData
CloseSocket
Look familiar? Well, because J has the idea of obverse functions, you get a lovely little syntax that J calls Under, which covers this pattern. In J it looks like this:
f&.g x
This means: apply g to x, then apply f to the result, then apply the inverse of g.
    obverse_of_g(f(g(x))) #J calls functions verbs
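The same composition can be sketched in python. Python has no notion of a built-in obverse, so this sketch assumes the caller passes the inverse explicitly; the helper name under and its argument order are my own choice, not anything from J.

```python
import math

def under(f, g, g_inverse):
    """Build a function that applies g to each argument, then f, then g's inverse."""
    def wrapped(*args):
        return g_inverse(f(*(g(a) for a in args)))
    return wrapped

# Multiplication as addition "under" the logarithm: exp(log(a) + log(b)) == a * b
multiply = under(lambda a, b: a + b, math.log, math.exp)

# Addition as multiplication "under" exp: log(exp(a) * exp(b)) == a + b
add = under(lambda a, b: a * b, math.exp, math.log)

print(multiply(3, 4))  # close to 12, up to floating point rounding
print(add(3, 4))       # close to 7, up to floating point rounding
```

The open/read/close pattern fits the same shape if g opens a resource and g_inverse closes it, although in python a context manager is the more usual way to write that.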
The J documentation lists loads of cool definitions you can build using under.
Here's the idea in clojure, using the under pattern to construct a new definition of multiplication and addition. It's a cool idea to start the year with. I wonder how many places I'll start seeing this pattern? For more information about J, check out the excellent J for C Programmers (Rich 2007).

Tuesday, January 3, 2012

2011: A Retrospective


What goals did I set myself last year?

1) Publish a blog entry or video that explains monads, teach someone at work how to use them.

I've not done this. I think I understand monads, but I'm looking for someone who knows more than me to confirm I've got it right.

2) Contribute to an open source project related to the arduino.

I didn't do this. I spent the first 3 months playing with the arduino before I moved on to other things. I designed a simple messaging system in google app engine, and then my interest in the arduino tailed off. I could open the source for this, which might be an interesting idea.

3) Finish one of my articles and submit it to some publishers for consideration to publish.

Not really sure what I meant by this. But I spoke at two conferences this year.

What else did I achieve in 2011?

Touch Typing

I committed myself to learning to touch type during the year. Thanks to a lot of support from others and from a wide range of freely available tools, I'm now typing comfortably above 50WPM - and hoping this will increase as I continue practising. All my blog posts are now proudly touch typed!

Test Driven Development

I started work on a new product this year at work, and from the outset everyone on the team was encouraged to do TDD. We're ending the year with the product deployed with 80% test coverage - not perfect but not bad. We've also written our own automated acceptance test suite, which tests all of the things that are above the level of our unit tests. We don't yet have a way of measuring test coverage here, but it seems reasonable that including the automated suite will push our actual coverage above 80%.

Functional Programming

I've played with functional programming at various points in the year. The end of the year has seen me focus more on clojure, and all that lisp languages offer, but Haskell is still there in the background. I need to get some experience building medium sized applications in Clojure or Haskell to increase my confidence.

Emacs & Vim

This year I added Emacs and Vim to my list of editors I am comfortable with. I think I'm more on the side of sticking with emacs. But time will tell.

Summary

I think it's clear that I deviated from my goals in some pretty dramatic ways. Looks like I need to examine my priorities more often.

Plans for 2012

Programming in Schools (Codemanship Teacher-Practitioner Exchange)

Help teach Ryan enough stuff so he can teach a class in programming.

Back to Basics: Algorithms and Data Structures

It's become clear this year that this is an area of my knowledge that needs some attention. I've signed up to Tim Roughgarden's Design and Analysis of Algorithms course. By the end of the year I need to have blogged at least once about an algorithm and once about a data structure.

Functional Programming in Clojure

By the end of the year I need to be comfortable enough to do a project in clojure.

DSLs

I need to have written a dsl and used it for something.

Review the Retrospective

I should review this post half way through the year and update if needed.

Friday, December 2, 2011

String equality, identity and interning in Python

In a list of things I should have already known comes this: the difference between using 'is' and == on strings in Python.

Let's look at two strings. One is unicode (u"unicode string") and one is not ("not unicode string").

Python 2.7.2+ (default, Oct  4 2011, 20:03:08) 
>>> type("foo")
<type 'str'>
>>> type(u"foo")
<type 'unicode'>
>>> u"foo" == "foo"
True
>>> u"foo" is "foo"
False

So using == shows the two strings as equal, and 'is' doesn't. What's going on here?

CPython interns some of its strings, including short literals like these, which means only one copy of each such string is stored. You can see this by using the built-in function id() to see the identity of our strings.

>>> a = "foo"
>>> b = "foo"
>>> c = u"foo"
>>> print id(a)
3074129864
>>> print id(b)
3074129864
>>> print id(c)
3074128400
You can see our normal strings have the same id because they are the same object. Our unicode string has a different id to our two 'normal' strings. Using the == operator asks python to compare the values of our two strings. Using 'is' compares their identity. As our unicode and normal strings are different objects, comparing with 'is' returns False.
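The same distinction shows up without mixing string types. Here is a minimal sketch, assuming CPython 3 (where every str is unicode and print is a function) rather than the 2.7 session above. The variable names are only illustrative, and identity results for strings are an implementation detail, not something the language guarantees.

```python
import sys

a = "foo"                  # literal: CPython interns identifier-like constants
b = "".join(["f", "oo"])   # built at runtime, so it is a separate object

print(a == b)              # equality compares the characters
print(a is b)              # identity compares the objects themselves
print(sys.intern(b) is a)  # interning maps equal strings to one shared object
```

On CPython this prints True, False and True, which is why == is the right way to compare string values.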

I wonder how many of us are guilty of misusing 'is' on strings?

Tuesday, November 29, 2011

Heroku and Ubuntu

So, I've just spent an hour or two trying to make a new heroku app on my ubuntu machine, and I was getting nowhere.
$ git push heroku master
Agent admitted failure to sign using the key.
Permission denied (publickey).
fatal: The remote end hung up unexpectedly
I'd followed all the usual help on stackoverflow and heroku's excellent help section but was making no progress. Until I discovered this. The link to the bug report didn't work for me, but it's clear that I needed to set this SSH_AUTH_SOCK=0 environment variable first:
$ export SSH_AUTH_SOCK=0
$ git push heroku master
And now we're off!

Sunday, October 9, 2011

The Sieve of Eratosthenes in Python

Whilst working on the 10 Io one liners to impress your friends post I felt I needed to turn to python to complete number 10 - the Sieve of Eratosthenes. My intention was to understand it in python as best I could, then simplify the python code until I had one line I could try to translate into Io. This post is about that attempt.
We start off by trying to translate the description in the wikipedia article into python line by line, which gives us the following.
def esieve(n):
    primes = list(range(2, n+1))  # list() so remove() also works on Python 3
    p = 2
    while p < n:
        for i in range(p, n+1):
            if p*i in primes:
                primes.remove(p*i)
        p += 1
    return primes
9 lines isn't bad for a start, but that if statement can be cleaned up. What if, instead of looking for items in our list one at a time, we make a collection of items to be removed and remove them all at the end? We could use a set to hold the marked items. This lets us use the minus (-) operator to give us a new set of items not in our marked set.
def shorter_esieve(n):
    marked = set()
    p = 2
    while p < n:
        for i in range(p, n+1):
            marked.add(p*i)
        p += 1
    return sorted(set(range(2, n+1)) - marked)
We only removed one line in that last attempt. Not great. But it looks like we're using a while loop and incrementing each step. Why don't we just do a for?
def shorter_esieve(n):
    marked = set()
    for p in range(2, n+1):
        for i in range(p, n+1):
            marked.add(p*i)
    return sorted(set(range(2, n+1)) - marked)
6 lines, getting better. Now here is the magic. We're using two for loops to generate a set of values. So we can just use a list comprehension to build our list, which we then use to make our marked set.
def shorter_esieve(n):
    marked = set([p* i for p in range(2, n+1) for i in range(p, n+1)])
    return sorted(set(range(2, n+1)) - marked)
And moving the assignment inline
def much_shorter_esieve(n):
    return sorted(set(range(2, n+1)) - set([p*i for p in range(2, n+1) for i in range(p, n+1)]))
And there we have it. The Sieve of Eratosthenes in one line of python. If you'd rather watch the refactoring happening step by step, here's a video, set to suitable music.
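One thing the one-liner does is generate a lot of products bigger than n, which are thrown away by the set difference. Here is a sketch of a tighter variant, assuming Python 3.8+ for math.isqrt. The name bounded_esieve is my own, and the one-liner is repeated only so the two can be compared.

```python
import math

def much_shorter_esieve(n):
    # The one-liner from the post, unchanged.
    return sorted(set(range(2, n+1)) - set([p*i for p in range(2, n+1) for i in range(p, n+1)]))

def bounded_esieve(n):
    """Primes up to n, marking only multiples that can actually fall in range."""
    # Only p up to sqrt(n) is needed, and marking can start at p*p and step by p.
    marked = {m for p in range(2, math.isqrt(n) + 1) for m in range(p * p, n + 1, p)}
    return sorted(set(range(2, n + 1)) - marked)

print(bounded_esieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(bounded_esieve(30) == much_shorter_esieve(30))  # True
```

It is still a set difference rather than a true in-place sieve, but it no longer does work proportional to n squared.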

Saturday, October 8, 2011

10 Io one liners to impress your friends

It's been ages since I've done anything with the Io language. But after seeing this spate of 10 [language] one liners to impress your friends in Davide Varvello's post I thought I would spend the evening trying it with Io. Here's the gist: According to Davide's post the others in this category so far are:
Ruby
Scala
CoffeeScript
Haskell
Clojure
Python
Groovy
I'd love to see more. It's a great little activity to get started with a language.