Time and time again, developers stumble upon APIs using OAuth. I've recently added Fitbit integration to an application I'm working on (details soon).

Fitbit's API uses OAuth v1 for authentication, and using OAuth with Django was really straightforward. Here's what I did:

Prerequisites

You'll need the following packages:

requests
requests-oauthlib

Overview

Before I dive into the code, I'll give an overview. My application has urls.py entries under /fitbit/ for requesting the request token and for storing the OAuth credentials. I store the credentials in a FitBitAPI model (a ForeignKey to a Django User and CharFields for the OAuth key and OAuth secret). Whenever I need to make authenticated API calls, I can just pull the key and secret for each user right from the database.

urls.py

You just need 2 entries for OAuth v1 to work:

from django.conf.urls import patterns, url
from fitbit_api import views

urlpatterns = patterns('',
    url(r'^request_request_token', views.request_request_token, name='fitbit_api_request_request_token'),
    url(r'^store_credentials', views.store_credentials, name='fitbit_api_store_credentials'),
)

models.py

Again, really simple:

from django.db import models
from django.contrib.auth.models import User


class FitBitAPI(models.Model):
    user = models.ForeignKey(User)
    access_token = models.CharField(max_length=128, default='')
    access_token_secret = models.CharField(max_length=128, default='')

    def __unicode__(self):
        return self.user.email

views.py

This is where the action happens.

from django.shortcuts import redirect
from django.conf import settings
from django.contrib import messages
from fitbit_api.models import FitBitAPI
from requests_oauthlib import OAuth1Session


def request_request_token(request):
    oauth = OAuth1Session(settings.FITBIT_KEY, client_secret=settings.FITBIT_SECRET)
    fetch_response = oauth.fetch_request_token('https://api.fitbit.com/oauth/request_token')
    resource_owner_key = fetch_response.get('oauth_token')
    resource_owner_secret = fetch_response.get('oauth_token_secret')
    credentials = FitBitAPI.objects.create(user=request.user, access_token=resource_owner_key, access_token_secret=resource_owner_secret)
    return redirect('https://www.fitbit.com/oauth/authorize?oauth_token=%s' % resource_owner_key)


def store_credentials(request):
    oauth = OAuth1Session(settings.FITBIT_KEY, client_secret=settings.FITBIT_SECRET)
    oauth_response = oauth.parse_authorization_response(request.build_absolute_uri())
    verifier = oauth_response.get('oauth_verifier')
    # Load the request token saved by request_request_token(); the callback echoes it back as oauth_token.
    credentials = FitBitAPI.objects.get(user=request.user, access_token=oauth_response.get('oauth_token'))
    oauth = OAuth1Session(settings.FITBIT_KEY,
                client_secret=settings.FITBIT_SECRET,
                resource_owner_key=credentials.access_token,
                resource_owner_secret=credentials.access_token_secret,
                verifier=verifier)
    oauth_tokens = oauth.fetch_access_token('https://api.fitbit.com/oauth/access_token')
    resource_owner_key = oauth_tokens.get('oauth_token')
    resource_owner_secret = oauth_tokens.get('oauth_token_secret')
    # Overwrite the temporary request token with the long-lived access token.
    credentials.access_token = resource_owner_key
    credentials.access_token_secret = resource_owner_secret
    credentials.save()
    return redirect('/')  # all done!

That's all there is to it! Just make sure when you register your application you set the callback URL to be one that makes store_credentials() run, in this case /fitbit/store_credentials/.
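
Once the access token is stored, making an authenticated call could look roughly like the sketch below. This is an illustration rather than code from my application: the helper names (profile_url, fitbit_session, fetch_profile) are made up for this example, and the profile endpoint URL is an assumption you should check against Fitbit's API documentation.

```python
FITBIT_API_BASE = "https://api.fitbit.com/1"  # assumed API base URL


def profile_url(user_id="-"):
    """Build the profile endpoint URL; "-" stands for the authorized user."""
    return "%s/user/%s/profile.json" % (FITBIT_API_BASE, user_id)


def fitbit_session(client_key, client_secret, credentials):
    """Return an OAuth1Session signed with one user's stored token pair."""
    # Imported here so the URL helper above works without the package installed.
    from requests_oauthlib import OAuth1Session

    return OAuth1Session(
        client_key,
        client_secret=client_secret,
        resource_owner_key=credentials.access_token,
        resource_owner_secret=credentials.access_token_secret,
    )


def fetch_profile(client_key, client_secret, credentials):
    """Fetch the authorized user's profile as a dict (hypothetical helper)."""
    session = fitbit_session(client_key, client_secret, credentials)
    response = session.get(profile_url())
    response.raise_for_status()  # surface 401s from revoked or stale tokens
    return response.json()
```

In a view, credentials would be the FitBitAPI row for request.user, and the key and secret would come from settings.FITBIT_KEY and settings.FITBIT_SECRET.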

LXC - Linux containers - are a relatively new technology available on Linux. LXC is similar to virtualization (VMware, KVM, Parallels...), but it is much closer to the concept of BSD "jails". There are some advantages to using LXC over virtualization:

  1. Almost no overhead. LXC is just a container, isolating users, processes, and files, but not emulating a processor, network cards, sound, etc. Processes in a container run directly on the host's kernel, so there is no emulation layer to pay for.

  2. Instant-on, instant-off, instant-setup. Starting a container takes less than a second, as does shutting down. Once you download the initial OS image (see below), setting up new containers takes seconds. No installation procedures to go through!

  3. Extremely easy to set up, use, and expand on. On Ubuntu 12.04 (and later), installation consists of one command. Setting up your first container is also 1 command. Starting that container - also 1 command. No installation, setting up users, restarting your machine, kernel modules, downloading ISOs, etc.

  4. Docker uses LXC under the hood. I haven't used Docker much, but it's becoming really popular.

I use LXC all the time for development work: whenever I need a clean Ubuntu installation to run tests on (great for making sure your setup process actually works!), to try things out (different databases, ideas), or to install things I know I won't need for long (as soon as I'm done with school, TeXLive is going to be removed with 1 command!).

Here's how to get started on Ubuntu 12.04 and later.

Installing LXC

sudo apt-get install lxc lxc-templates debootstrap

Setting up an Ubuntu Container

sudo lxc-create -t ubuntu -n my-first-container -- -r precise

You probably guessed - "my-first-container" is the name of the container, running Ubuntu Precise (12.04).

This command will download the latest Ubuntu 12.04 packages and install them. It will also cache the image for instant creation later (just use the same command for more containers). It will also set up simple NAT networking.

Using the Container

sudo lxc-start -n my-first-container

That's all! You'll be asked to log in. The Ubuntu container has a default username "ubuntu" and password "ubuntu". Check the machine's IP address with the command: ip addr.

Stopping the Container

sudo shutdown -h now

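Run this inside the container. Maybe "halt" or "poweroff" work, but I've aliased this and it's muscle-memory for me. From the host, "sudo lxc-stop -n my-first-container" stops it as well.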

Deleting (Destroying) a Container

sudo lxc-destroy -n my-first-container

That's all there is to it! The image for creation of future containers won't be erased, but this container will be.

Enjoy!

I spent a bit of time in my Introduction to Parallel Programming (CS 498DP) class a while ago working on a "cache-conscious" version of Radix sort. "Cache-conscious" algorithms are ones that take into account the size of the CPU's cache.

The algorithm is an implementation of a paper presented in class - available here.

Radix sort works by partitioning the input into "buckets" based on the digit at each significant position, one position at a time.

The "cache-conscious" implementation of the algorithm performs a traditional radix sort if all of the numbers in the current bucket fit into the CPU's cache - otherwise, the numbers are further partitioned.
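
To make the idea concrete, here is a minimal Python sketch of that structure. It is not the project's C code and not necessarily the paper's exact method: the function names, the 8-bit digits, the 32-bit non-negative keys, and the cache_capacity parameter (counted in elements, standing in for "fits in the CPU's cache") are all assumptions for illustration.

```python
def lsd_radix_sort(values, bits_per_digit=8, key_bits=32):
    """Traditional least-significant-digit radix sort for non-negative ints."""
    mask = (1 << bits_per_digit) - 1
    for shift in range(0, key_bits, bits_per_digit):
        buckets = [[] for _ in range(mask + 1)]
        for v in values:
            buckets[(v >> shift) & mask].append(v)
        # Concatenating in bucket order keeps each pass stable.
        values = [v for bucket in buckets for v in bucket]
    return values


def cache_conscious_sort(values, cache_capacity=4096, bits_per_digit=8, key_bits=32):
    """Partition by the top digit until a bucket is small enough, then sort it."""
    # Small enough (or no digits left to split on): plain radix sort.
    if len(values) <= cache_capacity or key_bits <= 0:
        return lsd_radix_sort(values, bits_per_digit, key_bits)

    # Otherwise split on the most significant remaining digit.
    shift = max(key_bits - bits_per_digit, 0)
    mask = (1 << bits_per_digit) - 1
    buckets = [[] for _ in range(mask + 1)]
    for v in values:
        buckets[(v >> shift) & mask].append(v)

    # Buckets are already ordered by their top digit, so sort each one
    # on the remaining low bits and concatenate.
    result = []
    for bucket in buckets:
        result.extend(cache_conscious_sort(bucket, cache_capacity, bits_per_digit, shift))
    return result
```

Calling cache_conscious_sort(data, cache_capacity=64) forces at least one partitioning pass for any input longer than 64 elements, which makes the two phases easy to step through in a debugger.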

The code is written in C, and compiles using GCC, Clang, or GCC with OpenMP. See the project here.

I've just updated my fork of Tweepy, a nice Python wrapper for the Twitter API.

My fork adds proxy support to Tweepy. I've kept the usage almost exactly the same, with only one user-facing change (to set the proxy URL).

Using a proxy is as simple as:

api = tweepy.API(auth, proxy_url="example.com:8888")

That's all! I've tested it with TinyProxy.

Check it out on Github.

Now that summer is over and school is back in session, I'll be posting less and working more on schoolwork. I'm learning C++, C, MIPS Assembly, and Verilog in courses this semester and a lot more outside of class.

Working at SimpleRelevance was awesome and a great experience! I'm going to miss the team and 1871. I hope to be back at some point, but if not, it was everything I could have asked for, and more.

This summer I've been developing and hacking away at SimpleRelevance, in "Research and Development". SimpleRelevance is a recommendation system that's really easy to set up and use. At its core, it pairs customers with products or services they'd like, and makes it easy for online stores or marketers to use that data.

This kind of personalization involves plenty of mathemagical calculations. We use Celery to distribute these tasks to several worker boxes.

We've used RabbitMQ in the past to queue up these tasks and serve them to the Celery "workers", which then returned the results.

However, we decided to switch to Redis after reading all about it and how it would speed up our infrastructure immensely.
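
On the Celery side, swapping brokers is mostly a matter of pointing it at a different URL. The sketch below is not our actual configuration: the hosts, ports, and credentials are placeholders for a local install, the celery_settings helper is made up for this example, and the uppercase setting names are the older Celery style (newer releases use lowercase names).

```python
# Broker URLs for a local install; host, port, and credentials are assumptions.
BROKER_URLS = {
    "redis": "redis://localhost:6379/0",
    "rabbitmq": "amqp://guest:guest@localhost:5672//",
}


def celery_settings(broker):
    """Return Celery settings (old-style uppercase names) for the chosen broker."""
    if broker not in BROKER_URLS:
        raise ValueError("unknown broker: %r" % broker)
    settings = {"BROKER_URL": BROKER_URLS[broker]}
    if broker == "redis":
        # Redis can also hold task results, so reuse it as the result backend.
        settings["CELERY_RESULT_BACKEND"] = BROKER_URLS[broker]
    return settings
```

The returned dict can be merged into a Django settings module or passed to a Celery app's conf.update().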

RabbitMQ is huge, takes up a lot of RAM, takes forever to push tasks (more on this later), and is just...well, big. Redis is a simpler system that integrates caching and is extremely fast and much more lightweight than RabbitMQ.

Redis as a Celery broker distributed tasks extremely quickly, got the results back, and even cached them nicely for us. Operations that took 10 seconds with a RabbitMQ backend now took 1 second. RabbitMQ as a Celery broker distributed tasks more slowly and returned results more slowly.

The obvious choice at this point would be to use Redis and get it over with, right?

Well, queuing up 500,000+ tasks on Redis degraded its performance immensely. RAM usage was still fine and the CPU wasn't doing much work, but tasks just took much longer to get out and return - that same 1 second single task took 30 seconds when there were 500,000 other identical tasks sitting in the queue.

RabbitMQ was indifferent when we threw hundreds of thousands of tasks on it and performance was just the same as if we only gave it 10 tasks.

So we're left with a problem - running 10 tasks on Redis is extremely fast, but running 500,000 tasks is extremely slow. Running 10 tasks with RabbitMQ yields okay performance, and running 500,000 tasks on it yields okay performance.

So right now, I'm still looking for other options or for a way to combine the two - we'll figure it out.