Version Control Tips

February 6th, 2009 Krister Schwertfuehrer, Consultant

I’m sure you’re saying “Who needs this in this day and age?”  Well, as recently as last year, I encountered a development team that was still not using ANY version control.  I couldn’t believe it!

Throughout my career I’ve encountered several different version control philosophies, from using *.BAK files on a production server (maybe with a date/timestamp added to the filename to know when the change was made) to systems where automated builds let developers know within minutes whether or not their latest check-in broke something (and most everything in between).

This article contains the nuggets that I’ve gleaned from my experiences and from reading various other items.  It is not intended to be the be-all and end-all, but a starting point (perhaps just a starting point for discussions… and that’s why I’ve decided to write this).  Searching for “Version Control Best Practice” in Google yields some 20 million results.  So, I’d at least like to put this somewhere I can point people to when I want to explain version control practices to them.

So, without further ado, here are my tips:

1. Pick a System

The first thing an organization must do is decide what system to use.  There are so many good, free systems that companies should NOT ever be without something.  I must say, though, that although Visual SourceSafe is still free and available with Microsoft tools, you should avoid it!  The version control in TFS is good, however.  Personally, I like CVS or Subversion.  CVS is pretty much dead and replaced by Subversion, but they both accomplish the same thing.  And once a team that is used to checkout/checkin gets comfortable with the idea of concurrently working on files, these two systems are easy to use.

Some other systems (and tools that work with them) are:

TortoiseSVN – not a version control system, but a tool to use with Subversion
TortoiseCVS – again, not a version control system, but a tool to use with CVS
Team Foundation Version Control – part of the Microsoft Team Foundation Server
Git – an open source version control system especially for distributed teams.
Mercurial – an open source version control system
PVCS – (not free) a web-based version control system that is evidently being replaced by another Serena product named Dimensions Express
ClearCase – (not free) IBM’s version control system.

2. What goes in?

Next an organization must decide what to put in their version control system.  I’ve read different things about this, but I believe you keep only the things you actually control the source of.  Sounds crazy, I know.  Think about it, though.  Lots of groups will put in DLLs or JARs or other libraries, as well as images and generated documentation (like javadocs).  I don’t think you should put these things in a version control system. Someone else is responsible for maintaining the source for those things (or in the case of generated documentation, you should be able to use your version control system to recreate any particular revision of the project and re-generate that documentation; so, why would you want to store something that is going to get out of date and not be maintainable?).

I can understand wanting to keep the project in such a state that you can easily rebuild a particular version of it, but by putting all of these things in your version control system, you suddenly have different versions of the same JAR, for instance, scattered in various places throughout your system.  So, which one do I end up using?

I buy in to the idea that you keep things that you control the source of in your version control system.  Then, you let something else manage dependencies.  That said, you should keep any documentation that you create as a team in version control.  You should also keep source code and build scripts in version control, of course.

I don’t think that images should be stored in the version control system for the same reason that I feel DLLs or JARs don’t belong.  However, if you happen to be a graphic designer and need to store your work in a revisionable system, then you should store your image source files in a version control system.
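To make that rule a little more concrete, here is a minimal Python sketch of a filter that decides whether a file belongs in the repository.  The extension lists, the `javadoc` folder name, and the function name `belongs_in_version_control` are my own assumptions for illustration, not a standard; every team would tune them (a graphic designer, for example, would take image source files off the excluded list).

```python
from pathlib import PurePosixPath

# Hypothetical lists for illustration only; adjust them to your project.
THIRD_PARTY_BINARIES = {".dll", ".jar"}    # libraries someone else maintains
RENDERED_IMAGES = {".png", ".gif", ".jpg"}  # images you do not control the source of
GENERATED_FOLDERS = {"javadoc"}             # output you can always re-generate

def belongs_in_version_control(path: str) -> bool:
    """Return True for things we control the source of, False otherwise."""
    p = PurePosixPath(path)
    # Anything under a generated-output folder can be rebuilt from source.
    if any(part in GENERATED_FOLDERS for part in p.parts):
        return False
    excluded = THIRD_PARTY_BINARIES | RENDERED_IMAGES
    return p.suffix.lower() not in excluded

for candidate in ["src/Main.java", "build.xml", "lib/log4j.jar", "doc/javadoc/index.html"]:
    verdict = "keep" if belongs_in_version_control(candidate) else "leave out"
    print(f"{candidate}: {verdict}")
```

The excluded files still have to come from somewhere at build time, which is the job of whatever you pick to manage dependencies.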

3. Organize your system

Of course, there are different ways to organize your version control system.  I think that each development project that can be worked on individually should be stored in its own high level project folder.  You can then reference one or more projects’ output (DLL or JAR or whatever) in another project’s build script.  I agree with the Subversion book (“Version Control with Subversion”) about setting up the high level folders for a project:  “we recommend that you create a trunk directory to hold the ‘main line’ of development, a branches directory to contain branch copies, and a tags directory to contain tag copies.”

Inside the trunk directory, you can put folders like doc (for any documentation you create that should be revisioned) and src (which contains your build script and actual source code/packages/namespaces)… just make it such that each development team follows the same pattern.
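The layout described above can be sketched in a few lines of Python.  This only creates the folders on local disk so you can import them into whatever system you picked; the function name `create_project_skeleton` and the project name `billing` are made up for the example.

```python
import tempfile
from pathlib import Path

def create_project_skeleton(root, project):
    """Create trunk/branches/tags for one project, with doc and src under trunk."""
    created = []
    for relative in ("trunk/doc", "trunk/src", "branches", "tags"):
        folder = Path(root) / project / relative
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created

# Example: build the skeleton in a throwaway directory and list it.
with tempfile.TemporaryDirectory() as scratch:
    for folder in create_project_skeleton(scratch, "billing"):
        print(folder.relative_to(scratch))
```

Running the same function for every project is one cheap way to make sure each team really does follow the same pattern.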

I plan to expound upon the dependency management theme in a future blog post.

4. Teach your developers how to use it

Here’s where lots of well-intentioned plans go awry.  You must teach your developers to update and commit regularly.  Updating keeps the code on their machines in sync with everyone else’s changes, and committing gets their own changes out to the other developers.

By not updating regularly, developers risk working on an old version of the code by accident, which can lead to all sorts of issues.  And by not committing regularly, they set themselves up for merge problems when they finally do decide to commit.

As far as committing goes, developers should commit small, logical chunks of code as often as possible.  Of course, they should do so after ensuring that what they’ve developed at least compiles!  If a large chunk of work is going on and cannot be committed, perhaps that work should be done in a branch, rather than worrying about breaking the trunk or current development branch.  It is understandable that sometimes many files get touched for small changes and there is a need to commit all of those files at one time.  I’m not saying you should NOT do that, but try to make it an exception, if possible.

Another part of committing code that should become second nature to developers is entering a useful comment about the code they are committing.  Such comments come in handy later, when someone is searching a project’s history for the change that added a particular piece of functionality.
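What counts as a “useful” comment is a team decision, but the idea can be sketched as a simple check that a team might run before accepting a commit.  The list of vague messages, the three-word minimum, and the function name `is_useful_commit_message` are all assumptions I made up for this example.

```python
# Hypothetical examples of comments that tell the reader nothing.
VAGUE_MESSAGES = {"fix", "fixes", "update", "changes", "wip", "misc"}

def is_useful_commit_message(message: str, min_words: int = 3) -> bool:
    """Reject empty, one-word-filler, or very short commit comments."""
    text = message.strip()
    if not text:
        return False
    if text.lower().rstrip(".!") in VAGUE_MESSAGES:
        return False
    return len(text.split()) >= min_words

for message in ["", "fix", "Add date validation to the order form"]:
    print(repr(message), is_useful_commit_message(message))
```

A check like this cannot tell whether a comment is accurate, only whether someone bothered to write one, so treat it as a nudge rather than a guarantee.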

5. Tagging and Branching are GOOD

Branching, specifically, seems to have a bad name.  I’m not condoning going overboard on branching or tagging.  However, a well placed tag or branch could really be useful!

For instance, when a project is going to QA, it might be useful to get the latest code from the repository, build it (and ensure it actually builds), apply a tag along the lines of QA_20090129, and then deploy the compiled code to QA.  Now you know how to get to the code that you deployed on January 29, 2009.  You can just get all of the code tagged with that tag.  Now, the file with that bug that was introduced in the next deployment to QA can be compared to the working version deployed on January 29 and we can see what happened (and perhaps who did it).  That’s just a small example of the usefulness of tags.
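A naming scheme like QA_20090129 is easy to generate and, because the date is written year first, the tags sort in deployment order.  Here is a small Python sketch of both ideas; the function names and the extra tag dates are hypothetical, and actually applying the tag is left to your version control system.

```python
from datetime import date

def qa_tag_name(deploy_date: date, prefix: str = "QA") -> str:
    """Build a tag name such as QA_20090129 from the deployment date."""
    return f"{prefix}_{deploy_date:%Y%m%d}"

def previous_tag(tags, current):
    """Return the tag deployed just before `current`, or None if it was the first."""
    ordered = sorted(tags)  # YYYYMMDD sorts chronologically as plain text
    index = ordered.index(current)
    return ordered[index - 1] if index > 0 else None

print(qa_tag_name(date(2009, 1, 29)))  # QA_20090129

# Hypothetical deployment history: find the last deployment before the buggy one.
history = ["QA_20090212", "QA_20090129", "QA_20090205"]
print(previous_tag(history, "QA_20090205"))  # QA_20090129
```

Once you know the two tag names, comparing the file between them is a job for your system’s diff command.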

Branches, on the other hand, give you a chance to do so much more.  Say you want to set a group of developers ahead on a complete redesign of your current product, while that current product is continuing to be maintained.  You could create a new project and copy all of the code into it, and then your redesign and the current code lose any connection. OR, you could just make a “redesign” branch off the current trunk, and then fixes in the current code could be merged into the branch (and vice-versa).

I recommend at the very least branching for each release.  As such, the trunk is the current development and the 1.0.0 branch (created when you released version 1.0.0) is for fixing bugs found in the 1.0.0 release of your product.  Therefore, current development towards version 1.5 or 2.0, for instance, goes on in the trunk.  Bug fixes made in either the trunk or the 1.0.0 branch can be merged into the other, and everyone is happy.  Subsequent “bugfix” releases of the 1.0 version (1.0.1, 1.0.2, etc.) of your product can be made from the branch.
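The numbering in this scheme can be sketched as follows.  This assumes plain three-part version numbers like 1.0.2 and nothing fancier; the function names are my own, not part of any tool.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.0.2' into (1, 0, 2) so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def next_bugfix_version(released) -> str:
    """Given the versions already released from a branch, name the next bugfix release."""
    major, minor, patch = max(parse_version(v) for v in released)
    return f"{major}.{minor}.{patch + 1}"

def belongs_on_branch(version: str, branch: str) -> bool:
    """A release comes from a branch when its major and minor numbers match."""
    return parse_version(version)[:2] == parse_version(branch)[:2]

print(next_bugfix_version(["1.0.0", "1.0.1"]))  # 1.0.2
print(belongs_on_branch("1.0.2", "1.0.0"))      # True
print(belongs_on_branch("1.5.0", "1.0.0"))      # False
```

Comparing the parts as numbers rather than text matters once you get past 1.0.9, since the text “1.0.10” would otherwise sort before “1.0.9”.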

6. Invest in an Automated Build tool

The last piece of advice comes from bad experiences of going too long between full builds of an entire project.  Automated build tools allow you to have all of the latest code compiled, unit tested, deployed, and tagged, among other things, whenever code is checked in.  There are quite a few FREE automated build tools.  It takes a little bit of time to configure and set them up, but once you are set up, that time investment pays off with better control of your source code.  And, you can use these automated builds as games:  whoever causes the build to break buys donuts or coffee :-)
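Real build tools are configured rather than hand-coded, but the core idea (run the steps in order and stop at the first one that breaks) can be illustrated with a toy Python sketch.  The step names and stand-in functions below are hypothetical; this is not how any particular build tool is implemented.

```python
def run_build(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Returns (succeeded, names_of_completed_steps).
    """
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as error:
            print(f"Build broken at '{name}': {error}")
            return False, completed
        completed.append(name)
    return True, completed

# Stand-ins for real work; a real setup would call the compiler, test runner, etc.
def compile_code():
    pass

def run_unit_tests():
    raise AssertionError("1 test failed")

def deploy():
    pass

succeeded, completed = run_build([
    ("compile", compile_code),
    ("unit tests", run_unit_tests),
    ("deploy", deploy),
])
print(succeeded, completed)  # False ['compile']
```

Because the run stops at the failing step, nothing broken gets deployed or tagged, and the team knows who owes the donuts.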

I’ve used the two Automated Build tools below.  Both accomplish the same general thing, and I’m sure there are others that work very much like them:

Continuum – open source Automated Build tool from Apache
Hudson – open source Automated Build tool from Sun

Summary

Whatever you decide to do, PLEASE at least decide to use version control and follow as much of a company/team/group standard as possible.  At least if your company/team/group all does the same thing, it will be easier for you all to keep each other honest and work in a way that is best for you.  It also allows for you to easily teach someone else who may join you about how to do things your way and be productive.

Hopefully the tips I’ve provided make sense and work for you.

Please feel free to comment, as I know even such a seemingly simple topic as this can have so many different camps of usage.  I’d like to keep this list up to date and publish a Best Practices document at some point.

Happy coding!

5 Comments

  • 1. Brian Gray  |  February 16th, 2009 at 9:29 am

    Great tips, Krister. I especially like what you said about how important usage is. Version control is only as good as how well it’s used!

    Another really interesting point you brought up is not storing anything you don’t control the source of. I like this a lot. One point I might disagree with you on, though, is on images. Maybe I misunderstand the context you are talking about, but take for example a web application where you have a bunch of images. Say you don’t control the actual design of those images, but every once in a while, a designer or a business person says “let’s change the graphic on page X.” I prefer that changed image to be in version control with the change in source that points to the new image.

  • 2. krister.schwertfuehrer  |  February 17th, 2009 at 10:27 am

    Thanks, Brian. I appreciate your feedback.

    I can understand your concern about maintaining versions of a web site, including the images on it. I have struggled with it, also. However, I have come to the conclusion that storing images in version control ends up being messy.

    It is similar to the issue I have with external dependencies — having them in version control results in multiple copies of the same image or similarly named images all over the place. Then it is a manual process to view/compare images and see what images are where.

    If a team/group can all agree on how to name images and can keep them straight, it may not be an issue. You will still have copies of images throughout your version control system, but disk space is cheap, so that’s not a big deal.

    On the other hand, if a team/group cannot agree on naming images (at the very least), I would propose another mechanism for storing images. Why not put them in their own repository that is meant for images? There are some good open source image repositories that allow you to tag and compare images appropriately. Also, if you use Adobe CS, you have access to Adobe Version Cue, which allows you to store, track and compare versions of images.

    Then, your build script would reference the appropriate image from its repository location…

    These are just my thoughts. I know there are other ways to accomplish this, but I hope you see what I mean.

  • 3. Brian Gray  |  February 18th, 2009 at 4:42 pm

    Ah I see. You are advocating that teams should still keep a revision history of images, just not in source control. Is that correct?

    A lot of this seems like it might depend on the system being developed (how many images? how often do they change?). But it does seem like that’s generally a good idea.

  • 4. krister.schwertfuehrer  |  February 19th, 2009 at 9:37 am

    That’s correct, Brian. Like you said, it depends on the system being developed and the team/group involved.

  • 5. Jeroen Wenting  |  March 26th, 2009 at 4:43 am

    I’d not choose a system before deciding how to use it. Different systems use different paradigms (optimistic vs. pessimistic locking being at the core), and the decision for one or the other should not be trivialised.

    As to being surprised at still finding teams that don’t use it, I’m no longer surprised at pretty much anything.
    I’ve seen version control implemented in such ways that the teams would be better off without it, and I’ve seen teams with no effective version control get by with impromptu schemes like backing up the source tree in timestamped directories at regular intervals.
    I’ve seen teams fail that had comprehensive version control policies because they were poorly designed policies that made the version control repository effectively useless (if there’s no way to find out which version of a file is in which release, having version control provides a false sense of security).

© 2010-2012 Summa All Rights Reserved -- Copyright notice by Blog Copyright