Thomas James Just a geek.

Technology, Gadgetry, Photography and Software Development

Unwind: Sequences in the cloud

by

When starting to build the url shortener, the first thing i did was research how others had solved the problem before me. This research turned up a lot of the same thing, a simple database where the autonumber/identity column auto generated value was then used as the link id and turned into something non-numeric like base62 for display to the user.

This approach is all well-and-good for a single relational database, but trying to replicate this with CouchDB just felt wrong. This problem needed a webscale solution, one that would scale and not be at risk of conflict. I needed an eventually consistent solution to the problem and auto-numbers weren’t it.

Now, yes for this little prototype piece of software i didn’t technically need a web scale solution, but i needed to build one to learn how to do it and more importantly to fail at it.

The best idea i could come up with at the time was assigning a unique id to each web node in the group and generating a byte array that was then hashed and cut down to a smaller value before being encoded as base62. This worked pretty well, it was easy to check for the existence of a document and the likelihood of a collision was pretty minimal.

<code>value = base62(node-id(1 byte) + hash(url)(4 bytes) + random-bytes(1 byte))
</code>

** This solution just didn’t feel right **

So i started to investigate how to replicate the sequence number solution in a way that wouldn’t clash with the web-scale approach, in that it needed to be able to work for a n-node scenario, but would still provide a sequential, unique number.

The research took me to a pretty obvious place, implement the sequence number provider as a service that each node can request a sequence number from, this way the problem of generating a unique sequence of sequential numbers can be solved without needing to worry about the rest of the solution.

Within a single application the implementation of the solution becomes pretty simple

public class Counter
{
    private long _counter = 0;

    public Counter(long seed)
    {
        this._counter = seed;
    }

    public long NextValue()
    {
        return Interlocked.Increment(ref this._counter);
    }
}

The next problem is then persisting the value of the counter. Depending on the solution this could either be performed with some sort of journalling system that proxies the counter, or simply persisting the value to a text file that is read on startup.

comments powered by Disqus
  • profile for Thomas James on Stack Exchange, a network of free, community-driven Q&A sites

4g android appharbor apple asp.net atrix aws barcamp beer blog brew-smith coding conferencing deployfu docpad ec2 fusion-garage grid10 homebrew hp-mini information insertdatahere ipad joojoo kernel knowroaming linux macbook-air moncai mono n900 ndc netduino netmf open-source opinion pcengines-apu poor-experience professional public-speaking rant ravendb reflection router singapore talks travel ubuntu unconference vagrant vps wifi windows yubikey