Housing Source Code on the Cheap

We're a very tiny company and don't have a ton of money to throw around.  Back at our old jobs, the IT and Finance departments wouldn't even bat an eye at requests for new computers, 24" monitors, fancy new video cards, gigabit network backbones, you name it!

Things are now quite different.  We know what we want, but we don't have the time or the money to invest in doing things the "big company" way.  At the same time, there are some things we simply can't live without.  This article is about our requirements for handling our source code, and how we do it on the cheap.

 

We have pretty basic needs:

 


1.  Source code must live in an SCM system.

It's hard to say it more clearly than Alexandrescu and Sutter do in C++ Coding Standards, Item 3:  "The palest of ink is better than the best memory:  Use a version control system."

Perforce has served us both well in the past.  However, with our current setup, we would need at least 3 users and 6 clientspecs, which kicks us up into the $800/license category.  Let's briefly talk about how much money three $800 licenses add up to for a fledgling startup.  $2400 is way more than we spent on our 3 brand new computers (we even splurged on big monitors!).  $2400 is another couple months of face-rocking game development at Power of Two Games, and that's counting lunches!  No, we won't be using Perforce just yet.  Yes, it's time to look for a different SCM solution.

Subversion (svn), on the other hand, is totally free.  It's here to put CVS out of its misery and offers pretty much everything Perforce does, at least insofar as we're going to use it.  It's simple and straightforward.  It also has the added bonus that editing, diffing, and reverting files work the same whether we're connected or disconnected (only commits and updates need the server), so we can very easily work on our laptops when there's no internet access available.  Like, say, down the block from us at the E St. Cafe.  Looks good to us!

 

2.  Our SCM server must always be running and must always be fast.

We strive to check in as frequently as possible.  During these first weeks of fast-and-loose gameplay prototyping, we're checking in code as frequently as every 5 to 10 minutes or so!  This means that our code server always has to be up and always has to be very responsive.

Originally, we had our Subversion repository hosted offsite, on the same shared host machine that serves our web site.  This machine is owned by our hosting company and provides services like www, ftp, and ssh.  Our assets and code are very small right now, so bandwidth isn't really an issue.  Things were working fine but we started to get occasional 'blackouts', 1-5 minute periods during which our server wouldn't respond to ssh.  During these blackout periods, our automated build server would die and require manual intervention to recover.  Noel fixed it by telling the build script to ignore connection errors, but we still weren't very comfortable having our primary code server live offsite in someone else's hands.
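
For the curious, here is roughly what "ignore connection errors" can look like.  This is a minimal sketch in Python, not the actual change to our build script: it swaps in a retry-and-carry-on loop, and the function name, the retry count, and the delay are all made up for illustration.

```python
import subprocess
import time


def run_with_retries(command, attempts=5, delay_seconds=60,
                     runner=subprocess.run, sleep=time.sleep):
    """Run command, retrying on failure instead of killing the build loop.

    Returns True as soon as one attempt exits with status 0, and False if
    every attempt fails.  The caller decides what to do with a False
    (for example, skip this build cycle and try again on the next one).
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return True
        # A blackout lasted 1-5 minutes for us, so wait before retrying,
        # but don't bother sleeping after the final attempt.
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

A build step would call something like `run_with_retries(["svn", "update"])` and simply skip the cycle when it returns False, rather than falling over.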

We moved the primary Subversion repository to our local build server "LeChimp".  Subversion makes doing this kind of thing very easy with its dump and load commands.  Now our svn server lives on the same local subnet as our development workstations, so operations are fast and totally reliable.  No more mystery blackouts and no more crippled upload speeds!
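
The move itself boils down to three svnadmin steps: dump the old repository to a file, create an empty repository in the new location, and load the dump into it.  Here's a small Python sketch of that sequence.  The helper names and paths are hypothetical, and in a real move like ours the dump runs on the old machine and the load on the new one, with the dump file copied across in between.

```python
import subprocess


def migration_steps(old_repo, new_repo, dump_file):
    """Return the svnadmin steps as (argv, stdin_path, stdout_path) tuples."""
    return [
        # svnadmin dump writes the full history to stdout.
        (["svnadmin", "dump", old_repo], None, dump_file),
        # The target must exist (and be empty) before loading.
        (["svnadmin", "create", new_repo], None, None),
        # svnadmin load reads the dump stream from stdin.
        (["svnadmin", "load", new_repo], dump_file, None),
    ]


def run_steps(steps):
    """Execute each step in order, wiring up file redirection as needed."""
    for argv, stdin_path, stdout_path in steps:
        stdin = open(stdin_path, "rb") if stdin_path else None
        stdout = open(stdout_path, "wb") if stdout_path else None
        try:
            subprocess.run(argv, stdin=stdin, stdout=stdout, check=True)
        finally:
            for handle in (stdin, stdout):
                if handle is not None:
                    handle.close()
```

Because the dump contains every revision, the new repository ends up with the complete history, not just the latest files.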

Unfortunately, doing this stripped us of our next requirement, which is...

 

3.  We must have automated backups of our SCM server to a reliable off-site location.

What happens if the building goes up in flames?  What happens if the office is robbed?  More realistically, what happens if poor LeChimp's hard drive dies in a puff of smoke and sparks?  This wasn't a big deal back when the repository was up on our hosting provider's computer.  They do regular backups, store our data on RAID drives, keep an eye on things, and we're happy to pay them for it.  Now, though, we're on our own!

Well, the newest version of Subversion (1.4) comes with a spiffy new feature called svnsync.  This lets you create 'mirror' repositories for read-only or backup purposes.  These mirrors aren't just the current head of the repo, either.  They're the real deal; a full repository copy complete with version history and metadata!  We set up a read-only repository hidden away on our host's server and the ever-faithful cron will svnsync our primary repo into our backup repo every 6 hours.  Excellent, this is exactly what we're looking for.
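
To make that concrete, here's a Python sketch that builds the two svnsync commands involved and the crontab entry for a 6-hour schedule.  The URLs and helper names are placeholders, not our real setup.  One thing to know before trying this: the mirror repository needs a pre-revprop-change hook that permits revision property changes, or svnsync will refuse to initialize it.

```python
import shlex

ALLOWED_INTERVALS = (1, 2, 3, 4, 6, 8, 12)  # hour steps that divide a day evenly


def svnsync_init_command(mirror_url, source_url):
    """One-time setup: tie the (empty) mirror to the primary repository."""
    return ["svnsync", "init", mirror_url, source_url]


def svnsync_sync_command(mirror_url):
    """Recurring step: copy any new revisions into the mirror."""
    return ["svnsync", "sync", "--non-interactive", mirror_url]


def crontab_line(command, every_hours=6):
    """Format a crontab entry that runs command at minute 0 every N hours."""
    if every_hours not in ALLOWED_INTERVALS:
        raise ValueError("every_hours must be one of %r" % (ALLOWED_INTERVALS,))
    shell_command = " ".join(shlex.quote(arg) for arg in command)
    return "0 */%d * * * %s" % (every_hours, shell_command)


# Hypothetical locations, for illustration only.
MIRROR_URL = "file:///home/example/svn-mirror"
SOURCE_URL = "svn+ssh://office.example.org/srv/svn/game"

if __name__ == "__main__":
    print(" ".join(svnsync_init_command(MIRROR_URL, SOURCE_URL)))
    print(crontab_line(svnsync_sync_command(MIRROR_URL)))
```

The init command runs once by hand; only the sync command goes into cron, and it only transfers revisions the mirror doesn't have yet.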

Except... wait a sec... LeChimp is on our local office subnet, hidden behind our router.  Our router has a dynamic IP that comes from a DHCP lease from our ISP.  It changes regularly, which means our backup can't connect just by using the naked IP.  Our router also doesn't have a name, unless we want to pay a lot more (we don't!).  So how can our backup server connect to our repository if it doesn't know where it is?

This brings us to our final requirement...

 

4.  We have to be able to get to the SCM server from the Internet over an encrypted connection.

For a couple of reasons, really.  First off, the off-site backup needs to hit our repository.  Also, our office is tiny and we'd love to work from our homes, trendy coffee shops, park benches, maybe even the beach if we can swing the wireless! 

The encryption part is pretty easy.  We connect to our svn repository on LeChimp over ssh, using svnserve in tunnel mode.  Since WinXP doesn't come with an ssh server, we chose to use sshd on Cygwin.  Ssh tunneling has no noticeable overhead for us, so we kept things simple and chose to use it from both inside and outside our network.
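
In practice, tunnel mode just means pointing the svn client at an svn+ssh:// URL: the client opens an ssh connection to the host and starts svnserve on the far end for the duration of the command.  Here's a tiny Python sketch of how such a URL is put together.  The host name, user, and repository path below are invented for illustration; the path part is the repository's location on the server's filesystem.

```python
def svn_ssh_url(host, repo_path, user=None):
    """Build an svn+ssh:// URL for a repository at repo_path on host."""
    if not repo_path.startswith("/"):
        raise ValueError("repo_path must be an absolute path on the server")
    login = user + "@" if user else ""
    return "svn+ssh://%s%s%s" % (login, host, repo_path)


def checkout_command(url, working_copy):
    """Return the svn client command that checks out url into working_copy."""
    return ["svn", "checkout", url, working_copy]


if __name__ == "__main__":
    # Hypothetical values, not our real server details.
    url = svn_ssh_url("lechimp.example.org", "/srv/svn/game", user="dev")
    print(" ".join(checkout_command(url, "game")))
```

Since the URL is the only thing the client needs, the same working copy behaves identically from inside or outside the office as long as the host name resolves.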

What we need now is some way to map a hostname to the IP address of our router, some name that won't change.  It's easy enough to set our router up to forward external ssh traffic (over port 22) to LeChimp, but getting from the internet to our router is the hard part.  We'd heard about totally free services like DynDNS but had never tried them before. 

Well, it turns out that our spiffy new 802.11g router actually knows about DynDNS in its firmware!  We tell our router our DynDNS username, password, and host name, and it automatically updates DynDNS whenever our DHCP lease is renewed!

Here's a screenshot of it in action:

[Screenshot: router firmware showing the DynDNS settings]

Now we have a "permanent" name we can use to ssh into LeChimp from the internet, and all of our interactions with our repository are strongly encrypted.  We only need our router to forward ssh port traffic to our server, and we're up and running.  Now we have a nice uniform way to get to our repository from both inside and outside.
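
We can't see inside the router's firmware, but the idea behind a dynamic DNS client is simple enough to sketch in Python: remember the last address you reported and only report again when it changes.  Everything here is an assumption for illustration, including the class name and the `publish` callback, which stands in for whatever the real update request to the DNS service looks like.

```python
class DynamicDnsUpdater:
    """Report the public IP to a DNS service only when it has changed."""

    def __init__(self, publish):
        # publish is a hypothetical callable that takes the new IP address
        # and tells the dynamic DNS service about it.
        self._publish = publish
        self._last_ip = None

    def on_lease_renewed(self, current_ip):
        """Call after each DHCP renewal.  Returns True if an update was sent."""
        if current_ip == self._last_ip:
            return False
        self._publish(current_ip)
        self._last_ip = current_ip
        return True


if __name__ == "__main__":
    updater = DynamicDnsUpdater(publish=lambda ip: print("updating to", ip))
    updater.on_lease_renewed("203.0.113.10")  # first lease: update sent
    updater.on_lease_renewed("203.0.113.10")  # same address: nothing to do
    updater.on_lease_renewed("203.0.113.42")  # address changed: update sent
```

Skipping the update when nothing changed keeps the client from pestering the service on every renewal.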

The final rollout looks like this:

And we now have everything we want.


 

So how much did this all cost us?  It depends on how you count it.

  • Subversion: $0
  • ssh (for clients) and sshd (for LeChimp): $0
  • Backup repository storage space: $0
  • DynDNS account and IP mapping: $0

We already needed to cough up dough for the web hosting, so we like to think that coercing it into double duty as an SCM backup is something of a freebie.  Similarly, we had already committed ourselves to buying a standalone build server for running automated builds using CruiseControl.NET (more on that soon...), so we see moving the primary svn repo there as another freebie! :)

However, if you absolutely must count those other multitaskers, here's the tally:

  • SCM server computer (runs our build server software and now hosts our svn repository): ~$500
  • Web hosting with 20GB storage and 2.1TB/month bandwidth (yes, terabytes; serves web pages and now holds our backup repository):  ~$100/year
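
If you'd like the arithmetic spelled out, here's a quick Python tally using only the figures from this article.  The server and hosting prices are the rough "~" numbers above, and the comparison is loose: the Perforce figure covers licenses alone.

```python
# Figures quoted in this article (the last two are approximate).
PERFORCE_LICENSES = 3
PERFORCE_PRICE_PER_LICENSE = 800

FREE_PIECES = {
    "Subversion": 0,
    "ssh and sshd": 0,
    "Backup repository storage space": 0,
    "DynDNS account and IP mapping": 0,
}

SHARED_PIECES_FIRST_YEAR = {
    "SCM/build server computer": 500,
    "Web hosting (one year)": 100,
}


def total(costs):
    """Sum the dollar amounts in a name -> cost mapping."""
    return sum(costs.values())


perforce_total = PERFORCE_LICENSES * PERFORCE_PRICE_PER_LICENSE
strict_total = total(FREE_PIECES)
generous_total = strict_total + total(SHARED_PIECES_FIRST_YEAR)

if __name__ == "__main__":
    print("Perforce licenses alone: $%d" % perforce_total)
    print("Our setup, SCM-only pieces: $%d" % strict_total)
    print("Our setup, counting the multitaskers (year one): $%d" % generous_total)
```

Even under the least charitable accounting, the first year comes in well under the license bill we were trying to avoid.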

All in all, not too shabby.