Tea is awesome.

The ramblings and bloggings of Sam Phippen

Building Tiny Web Services Lightning Fast With Heroku and Python

I often build scripts that need some kind of network persistence layer, or tiny web services that munge files or JSON or whatever. When I have to do this I don’t immediately reach for Rails or any of the other super heavyweight frameworks. I don’t need all the extra superpowers those frameworks come with, and I can deal with a little more of the manual stuff myself because I’m not going to be spending much time on it anyway. This article is a guide to setting up tiny Python projects on Heroku, using the Notely server as an example.

Set up your project

Make sure you’ve got the heroku gem, foreman and virtualenv installed, then run the following commands:

#create directory
mkdir app_name && cd app_name
#create a virtual python environment that won't screw with your global one
virtualenv venv --distribute
#use python environment and install dependencies
source venv/bin/activate
pip install flask
pip install psycopg2
#create a base app.py file
wget http://samphippen.com/app.py -O app.py
#create files necessary for heroku to run
pip freeze > requirements.txt
echo "web: python app.py" > Procfile
#add everything into git
wget http://samphippen.com/pyapp.gitignore -O .gitignore
git init
git add .
git commit -m "Initial commit" 
#setup heroku
heroku create --stack cedar
heroku addons:add shared-database     
#push to heroku and open in a browser
git push heroku master
heroku open

Change something

In app.py, you can see a route that matches “/” and returns the text ‘Hello World!’. This is the starting point for our app. Use the Flask docs to change something, then run the server with foreman start and see what it’s doing locally before pushing back to Heroku.
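The downloaded app.py isn’t reproduced in this post, so here is a minimal sketch of what a Flask app of that shape might look like. The function name `hello` and the fallback port 5000 are my assumptions; reading the port from the `PORT` environment variable is how Heroku tells a web process where to listen.

```python
import os

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    # The one route described above: match "/" and return plain text.
    return "Hello World!"


if __name__ == "__main__":
    # Heroku passes the port to bind to in the PORT variable;
    # 5000 is only a local fallback.
    port = int(os.environ.get("PORT", 5000))
    app.run(host="0.0.0.0", port=port)
```

Adding another `@app.route(...)` function next to `hello` is the quickest way to change something and see it locally.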

Persist stuff

When you ran the giant blob of commands up there, you added a Postgres database to your Heroku app. You can talk to this database through a psycopg2 connection. To create one you can use the following Python snippet:

import os
import psycopg2
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse

# DATABASE_URL looks like postgres://username:password@host:port/dbname
url = urlparse(os.environ["DATABASE_URL"])
conn = psycopg2.connect(dbname=url.path.lstrip("/"), user=url.username, password=url.password, host=url.hostname, port=url.port)

Once you’ve got a database connection you can query it using Psycopg’s interface.
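As a rough sketch of what querying looks like, here is the usual cursor pattern. To keep it runnable without a Postgres server I’ve swapped in the standard library’s sqlite3 module as a stand-in; psycopg2 follows the same DB-API shape (cursor, execute, fetchall, commit), but uses `%s` placeholders where sqlite3 uses `?`. The `notes` table is made up for illustration.

```python
import sqlite3

# Stand-in connection so the sketch runs anywhere. On Heroku this would be
# the psycopg2 connection created above.
conn = sqlite3.connect(":memory:")

cur = conn.cursor()
cur.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
# sqlite3 uses "?" placeholders; psycopg2 uses "%s" in the same position.
cur.execute("INSERT INTO notes (body) VALUES (?)", ("buy tea",))
conn.commit()

cur.execute("SELECT id, body FROM notes")
rows = cur.fetchall()
cur.close()
```

Passing values as a separate tuple, rather than formatting them into the SQL string, lets the driver handle quoting for you.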

Conclusions

This is, I’m pretty sure, the fastest way in existence at the moment to get from nothing to a running web service with a database that you can use to build stuff. For me it’s been incredibly useful to be able to throw these services up. I wouldn’t have been able to do that without Heroku.

Let me know if you’ve done something cool with this by mailing me.

Some Thoughts About Redis

Redis is a key/value store that I’ve recently used for the Student Robotics competition. I really like it, but I think it’s got some flaws.

Redis has a bunch of datatypes: Strings, Hashes, Lists, Sets and Sorted Sets. Firstly, you’ll note that there’s a lack of the integer data type, but redis has an INCR command. This command operates on redis’s string data type, and if that string is actually an integer, that integer will be atomically incremented. Whilst I know that you can store integers (and floats) in strings, it doesn’t seem to me to be a good way of storing these commonly used data types. Additionally if you’re using a redis binding and you do something like this:

>>> redis.set("my_key",0)
True
>>> redis.get("my_key")
'0'

The binding has no way of figuring out whether the data it gets back should be an integer or the string “0”. This means that whenever you write code that sets integer values for keys, you have to add extra code where the data is pulled back out of redis so that it can be treated as integers. Alternative key/value stores and databases have been able to store values in integer data types for a long time. (Redis does the same thing with booleans and nulls: they are stored as strings too.)
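The “extra code” is usually a small wrapper like the one sketched below. `FakeStringStore` is a made-up stand-in for a redis binding so the example runs without a server, and `get_int` is a hypothetical helper, not part of any redis library.

```python
class FakeStringStore:
    """Stand-in for a redis binding: every value comes back as a string."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = str(value)
        return True

    def get(self, key):
        return self._data.get(key)


def get_int(store, key, default=None):
    """Fetch a key and convert it back to an int, since only strings are stored."""
    raw = store.get(key)
    if raw is None:
        return default
    return int(raw)


store = FakeStringStore()
store.set("my_key", 0)
raw_value = store.get("my_key")       # the string "0"
int_value = get_int(store, "my_key")  # the integer 0
```

The catch is that the caller has to know in advance which keys hold integers; nothing in the stored value says so.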

You’d think that with redis’s more advanced data structures (hashes and lists for example), you’d be able to do some nesting, so that for example you could have a list of hashes. Unfortunately this is not the case. When we were working with redis we spent a little while looking for a solution and came up with two alternatives:

  1. A list of JSON strings: redis’s list structure can only store strings (or intish strings), so we nested our data structures using JSON strings. This meant that items had to be JSON-encoded going into the list and JSON-parsed coming out. This wasn’t too much of a pain, but it wasn’t particularly elegant.

  2. Make keys hierarchical: For Student Robotics we decided that we’d namespace our keys in the same way, prefixed with “org.srobo”. For our list of teams we had keys of the form “org.srobo.teams.n.thing” where n was the team number. This meant that we could nest our data structures by using a tree of variables, storing things in some nodes and nothing in others.

Of these solutions I tend to prefer the first one. Whilst it’s slightly more horrible it does mean that all your data is conceptually stored in one place in redis. Redis makes no distinction between keys, so there’s nothing in redis that allows it to interact directly with our structured hierarchy; instead that was dealt with in Python scripts.
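Both alternatives can be sketched in a few lines. Plain Python containers stand in for redis here, and the team data and the `team_key` helper are invented for illustration; only the “org.srobo.teams.n.thing” key shape comes from our actual setup.

```python
import json

# Alternative 1: a list whose items are JSON strings.
# A plain Python list stands in for a redis list here.
teams_list = []
teams_list.append(json.dumps({"number": 1, "name": "Example Team"}))
first_team = json.loads(teams_list[0])


# Alternative 2: hierarchical key names under a shared prefix.
def team_key(team_number, thing, prefix="org.srobo"):
    """Build a key of the form org.srobo.teams.<n>.<thing>."""
    return "{}.teams.{}.{}".format(prefix, team_number, thing)


flat_store = {}  # stand-in for redis's flat key space
flat_store[team_key(1, "name")] = "Example Team"
flat_store[team_key(1, "score")] = "0"
```

In the second version the “tree” only exists in the key names, which is why any walking of it has to happen in the client code.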

Redis has a publish/subscribe mechanism which is extremely useful. The basic idea is that you can subscribe to or publish on a “channel”. There isn’t anything that particularly relates the data you’ve got stored in redis to the output on any given channel; in fact you could store no data in redis at all and just use it as a publish/subscribe mechanism. I can think of many strategies for combining uses of variables and keys, but for our project we came up with a pretty good solution.

In our solution we use the redis command MONITOR, which sends an update any time a redis command is executed. We read the output of that, and any time a variable is modified we publish a message on a channel with the same name as the variable, letting any subscribed programs know that the variable has been updated. We don’t publish the value, just the fact that an update has occurred.
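Here is a rough sketch of that loop’s core, not our actual script. It assumes MONITOR lines look like `<timestamp> [<db> <client>] "cmd" "arg" ...` (the exact shape varies between redis versions), uses a small hand-picked list of write commands, and takes `publish` as any callable accepting a channel and a message.

```python
import re
import shlex

# A small, non-exhaustive set of commands that modify a key (an assumption).
WRITE_COMMANDS = {"SET", "INCR", "DECR", "DEL", "LPUSH", "RPUSH", "HSET", "SADD", "ZADD"}

# Assumed line shape: <timestamp> [<db> <client address>] "cmd" "arg" ...
MONITOR_LINE = re.compile(r'^\S+\s+(?:\[[^\]]*\]\s+)?(?P<command>".*)$')


def modified_key(line):
    """Return the key a MONITOR line modifies, or None if it is not a write."""
    match = MONITOR_LINE.match(line.strip())
    if match is None:
        return None
    parts = shlex.split(match.group("command"))
    if len(parts) < 2 or parts[0].upper() not in WRITE_COMMANDS:
        return None
    return parts[1]


def notify(line, publish):
    """Publish an update notice on the channel named after the modified key."""
    key = modified_key(line)
    if key is not None:
        # Only the fact of the update is sent, not the new value.
        publish(key, "updated")
    return key
```

Subscribers that care about a variable then fetch its current value themselves when the notice arrives.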

Redis is a very cool piece of technology, and I think it’s definitely worth having a play around with. We used it for a production system over the weekend with about 20 updates a second and it seemed to work fairly stably. I’m not convinced I prefer the system over SQL or other key/value stores (like MongoDB), but I’ve met people who use it in production, and they all say that they love it.

Notely, Now With Sync

So yesterday (trust me, it was yesterday from my point of view) I created an application called notely (github). I’ve just finished creating the sync component of the notely software. You can now “pair” your notely instance with another notely instance and sync it by typing “notely sync”. The notely server is a tiny python app which I plan to blog a little about the construction of at some time in the future. It’s hosted on heroku because heroku’s cool. You can get the source here: github

edit 23:36:10 UTC+1: there was an error which has now been fixed.

Airport Hacking

I’m at an airport with nothing but a laptop and wifi, so I built a little command line utility to allow me to quickly save small text notes for myself. I’ll probably extend this to have note sections. The tool is called notely and it has a really simple command line interface. I mostly made this because I often want to have a list of a few text items for things like to-do lists or to leave reminders for myself. I think something super fast like this is exactly what I need. Todo: sync and a webpage that makes the data available on tablets/phones/whatever. The code’s available here

Lighting London

London is a great place for tourists, and people often talk about the “hottest” places in London. I wanted to build some way to visualise this, and I’m really happy with the results.

By mining the 4square API I was able to determine where people were within London. Places like coffee shops, tourist attractions, clubs and gig venues are all unified into 4square. By getting a map from Open Street Map and overlaying the data from 4square using a simple lighting equation, I got really nice results.
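I haven’t written the lighting equation out here, so treat the following as an illustration of the general idea rather than the exact code behind the map. It assumes each venue acts as a point light whose brightness is its check-in count, fading with the square of the distance; the venue data and the `falloff` parameter are made up.

```python
def light_intensity(pixel, venues, falloff=1.0):
    """Sum an inverse-square-style contribution from every venue at one pixel."""
    px, py = pixel
    total = 0.0
    for (vx, vy, checkins) in venues:
        distance_sq = (px - vx) ** 2 + (py - vy) ** 2
        total += checkins / (1.0 + falloff * distance_sq)
    return total


def render(width, height, venues):
    """Return a grid of brightness values scaled to the range 0..1."""
    grid = [[light_intensity((x, y), venues) for x in range(width)] for y in range(height)]
    peak = max(max(row) for row in grid) or 1.0
    return [[value / peak for value in row] for row in grid]


# Hypothetical venues: (x, y, number of check-ins) in pixel coordinates.
venues = [(1, 1, 10), (3, 2, 4)]
image = render(5, 4, venues)
```

The resulting grid can then be blended over the map tiles, with brighter cells marking busier areas.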

Check the map out here and the source on github. Feel free to fork me :)

Licenseme

Every time you write an open source project, you should license it so that other people can use the code. I’m not going to tell you which license you should use because that’s a matter of great flamewars. I’ve written a very simple utility called licenseme to put a license in your projects.

It’s really easy to run: just pass it a list of the names of contributors and their email addresses and it’ll generate a license of the specified type for you. I use this in all my projects that I open source because, like I said, people need to know under what terms they can re-use your code.

Command Line Apm

In Starcraft we measure “apm”: the number of actions you perform in a minute, which tells you how fast you’re playing. I wanted to do the same thing on my command line, so I wrote a tool called command-line-apm. It’s a really simple Python script which you can put in your PS1 to show you how many actions (commands) you’re running in a minute. I max out at about 18, see if you can do better.
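The counting itself is tiny. This is a sketch of the idea rather than the script’s real source: it assumes you have a list of timestamps, one per command run, and counts the ones that fall inside the last 60 seconds. The function name and sample data are hypothetical.

```python
import time


def actions_per_minute(timestamps, now=None, window=60.0):
    """Count how many commands were run in the last `window` seconds."""
    if now is None:
        now = time.time()
    return sum(1 for stamp in timestamps if 0 <= now - stamp < window)


# Hypothetical command times, in seconds since the epoch.
history = [100.0, 130.0, 155.0, 159.0]
apm = actions_per_minute(history, now=160.0)
```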