The Code Whisperer

Practical, on the ground advice for efficient software development

Archive for April, 2008


Backing Up

I really hope this is a topic you can ignore, but I’m going to cover it anyway.

In the late 1990s I was involved in a major project for a prestigious motor manufacturer. Having very good cashflow, they weren’t averse to spending a bit of money, so I was horrified when I found out that the source control server, with six months’ worth of design and development work on it, firstly wasn’t being backed up and secondly wasn’t in the server room but in the development team’s office, situated under an air conditioning pipe that had developed a leak.

Let’s have a look at some data loss scenarios and how they could be recovered from, assuming the appropriate backup policy was in place:

Developer checks in the wrong code
Roll back to the previous version in the source control system. No loss of work.

Developer loses or corrupts checked out code
Revert to the previous night’s backup, loses up to one day of work.

Developer machine corrupt
Re-image the developer machine, get the latest source from the source control and restore the previous night’s backup, loses up to one day of work.

Source control server corrupt
Restore the source control from last night’s or a later backup. Code checked in since the backup may still reside on developers’ machines. The worst case is that each developer loses one day of work, although this is highly unlikely.

Office destroyed
Build a new source control server and restore the source control content from the last off-site backup. Build new developers’ machines using the image of a developer’s machine held off site. Get the latest source on to the developers’ machines. The amount of work each developer loses depends on how often backups are sent off site, but this could be as little as one day.

This last scenario is drastic but does happen. To be honest if the office has been destroyed there will be other factors to consider, but with the right backup strategy and a trip to the computer retailer you could be up and running really quickly.

With backing up across the Internet now available and very cheap, it’s possible to build a backup strategy that offers maximum protection with the minimum of fuss. Backups are only important to people when things go wrong, so generating them must be as effortless as possible.

Here is our backup strategy:

  • Source control exists on a server in our office. Code is checked in as soon as possible provided it compiles
  • The source control repository is backed up across the Internet to Data Deposit Box. The Data Deposit Box software runs on the server and detects changes to the repository, extracts, compresses and encrypts them, then securely transfers them to the Data Deposit Box data centre, where they are stored in encrypted form
  • Source code changes not checked in are backed up at the end of the day to a file server in our office. The developer’s machine has backup software that backs the files across the network and then shuts the machine down. This backup also includes Outlook PST files
  • Other files such as documents and images are held on network file servers, never on individual workstations. Documents and selected other files are backed up to the Data Deposit Box servers using the same technique as for the source code repository
  • A disk image of a developer’s machine has been taken and is kept on a file server
  • Copies of virtual machines are kept on the file server and backed up from developers’ machines weekly
  • File servers are backed up completely to external hard drives, which are stored off site
  • USB memory sticks are used to hold backups of this blog, software and licence information for developer products and the latest release code for our products. These are backed up to a file server and tend to go everywhere with me
  • Software installation CDs are kept in a fire safe in the office
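
The end-of-day step in the list above (copying work that hasn’t been checked in to a file server) could be sketched roughly as follows. This is only an illustration, not the backup software we actually use: the function name and the folder locations in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def backup_changed_files(source: Path, destination: Path, since: float) -> list[Path]:
    """Copy files under `source` modified after `since` (epoch seconds).

    The folder layout under `source` is preserved under `destination`.
    Returns the relative paths of the files that were copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.stat().st_mtime > since:
            relative = path.relative_to(source)
            target = destination / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also keeps the modification time
            copied.append(relative)
    return copied
```

A scheduled task could call something like `backup_changed_files(Path("C:/work"), Path("//fileserver/backups/dev1"), time.time() - 86400)` before shutting the machine down; both paths are made up for the example.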

I believe this is a pretty comprehensive backup strategy and with the use of Internet backup I’m confident that we could be up and running in a different office in a matter of hours if total disaster did strike.

Key Non-technical Aspects of Good Software Development

In this post I’ve thought about some of the non-technical aspects outside of the normal development flow that support good software development. Their adoption doesn’t guarantee a successful software product, but they do help.

  1. Solves a problem the user of the software has, whether it’s maintaining a list of their DVD collection or processing the accounts of an international company.
  2. The user doesn’t have to substantially alter their process to accommodate the software. There is a limit, however. If the process the user is following is substantially flawed then I wouldn’t expect the software to be that flexible. Sometimes for simplicity the software has to impose a process on the user, otherwise it offers too much choice - a bit like trying to choose a toothbrush. Any application has to tread this fine line between too much flexibility and being too rigid. I believe as software designers we need to assess that flexibility, make a decision and ensure all potential users of the application are aware of it. As a result your application may not appeal to as many people, but those it does appeal to will be much happier with it.
  3. The software is designed for real users, not all and sundry, so it has the features required to perform a task, not what the software designer thinks might possibly be useful.
  4. The software designers and the development team understand and are passionate about the business and process that the software is aimed at. This is really important and is often overlooked. If the designers and developers don’t have knowledge of the business, there is a likelihood that the software will miss the real needs of the user, and will offer a raft of functionality that is of little or no use to the vast majority of users of the software. Word processing software is a case in point. Most people want to just type and be able to format what they type. I agree features like mail merge and style sheets are very useful, but why would anyone want software to AutoSummarize a document they have been working on? If the company developing the software are actually users of the applications they produce, they will know what works and what doesn’t and, most importantly, the functionality required to get the job done as efficiently as possible. To help with this, get the development team out of the office and into the environment their software supports. Don’t look at it as a day wasted; it’s a day invested, as the experience will substantially boost their knowledge of the business.
  5. Usability studies. There’s only one way to really evaluate software and that’s to watch people use it, either visually or electronically. Don’t rely on what people say they do, rely on what they actually do - it could well be different. Early prototypes, which won’t have the full functionality behind the scenes, are a good way of testing new applications or functionality. If possible use recording software on the computer or a video camera. Using a video camera also allows the user to be interviewed as they go along, which is crucial if they get stuck as it gives an insight into their thought process. It never ceases to amaze me the different ways people use software to achieve the same result. We released a beta version of Restoration Manager for a cover CD-ROM on Practical Classics magazine. When the software started for the first time, it asked the user to specify some preferences, one of which was date format. This was achieved by a combo box with a handful of different formats to satisfy the European and US markets. However, despite the fact I knew you had to choose a date format from the list, and did so every time I tested this part of the application, the readers of Practical Classics had other ideas and decided to type in a date format. This caused the application to fail, albeit with a nice message box giving you the option of emailing the failure direct to us. Over the first weekend that the magazine was on sale, hundreds of potential users did just that. Quite frankly I was distraught: it was the biggest promotion the application would have and it fell at the first fence. We quickly produced an update and set up an autoresponder so anyone reporting the problem would instantly have a solution. On the bright side, however, it did prove the software was being used, and we still get the occasional error report from the magazine CD-ROM two years after it appeared in the newsagents.

He or She Who Shouts Loudest Isn’t Always Right

I’ve seen this so many times. A design or requirements document is produced for a system and shown to a number of people. And one person doesn’t like it, so they shout loud and long. Often this is someone in a senior position who won’t use the software in any case.

The most damaging case I’ve experienced was in a meeting with a photocopier manufacturer, who wanted to expand their system to handle rental of fax machines and had employed the consultancy I worked for to carry out a feasibility study. Early on in the meeting one of the participants suddenly blurted out “we can’t do this, it’s against the business model”, went into a five minute rant and stormed out. Being slightly worried about the political implications of this event, I asked what position they held in the organisation. It turned out to be the secretary to the sales director and not someone who could determine the future direction of the company.

My point is quite straightforward. When evaluating feedback, look at every comment and, initially, ignore the source so that the outwardly passionate comments don’t swamp comments made by others, who may be just as passionate but not so vocal. In the cold light of day you would value the comment of someone who uses the software every day over a conflicting comment from someone who is a casual user but pays your wages. In practice this is hard to do, but there is something you can do to make a decision based on quantifiable data: usability studies. Use a mixed group of users, different systems and a video camera; with websites, a product like crazyegg will provide additional analysis. A usability study takes some setting up (for a start there need to be two user interfaces to test), but it’s better to make a decision now that creates a superior user interface that people will use than to have a product that people don’t like (and will be reluctant to use) or to have to release a revised UI in a short space of time.

Creating Software Prototypes

It’s often said that people don’t know what they want until they see something, then they realise they don’t want that.

If it’s a software development project then finding this out at user acceptance testing is far too late in the development cycle. Enter the prototype. A survey in Rapid Prototyping and Software Quality: Lessons from Industry by V. Scott Gordon and James M. Bieman carried out in 1991 revealed that prototyping can lead to better designs, better matches with user needs and improved maintainability.

Accordion Hero started as this prototype

If you’ve not come across prototypes before, they are a cheap, hacked, dirty (in terms of not worrying about coding standards) and disposable representation of how the software will look and what it will do. The prototype is not a full working system - it can be a series of screens without any functionality, or can mimic some functionality with hard coded values. For example a list could be populated with values, but these would be hard coded, not derived from the database.

The First Golden Rule of Prototypes

Don’t use the same code for the real development of the application

Why? It introduces potentially poor quality code into your application. I’ve seen it many times: six or seven years on, some poor developer new to the company and application is really struggling to understand code simply because it was prototype code and didn’t conform to the standards. In addition it may be inefficient and just plain ugly, and may inadvertently become the standard for the application.

The Second Golden Rule of Prototypes

Don’t make it look too good

Why? People who don’t develop software generally have no idea what goes on behind the scenes to make an application work, in the same way I have no idea how a car is manufactured. Just as I can see a concept car at a motor show and expect to be able to buy it from the dealer the next day, users of a system see screens and expect the application to be up and running in no time at all. The more perfect the prototype looks and behaves, the greater this expectation.

One of the first websites I developed was for the local primary school. I took along a design containing a section of text from their prospectus. The aim of the meeting was to agree on a design, but having understandable text diverted their attention away from the layout. Interestingly the school picked up on several grammatical errors in the text and were horrified when the source was revealed.

The Third Golden Rule of Prototypes

Be aware of feature-creep

Feature-creep is the blight of any software development project and can be added by the development team or users. One of the dangers of prototyping is that the users and development team start to think more about the application and the surrounding processes. This can lead to features being added that have little benefit, sometimes at great cost. When additional features are requested look at the cost-benefit of adding the feature - there’s little point in spending four weeks developing a function that saves one person 10 minutes a year, unless there’s a substantial risk if the work carried out in the 10 minutes goes badly wrong!
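
As a rough worked version of that cost-benefit check, the sketch below counts only time spent against time saved. The 40-hour working week is my assumption, and the function name is just for illustration; it deliberately ignores the risk factor mentioned above, which may well outweigh the raw numbers.

```python
def years_to_break_even(dev_hours: float, minutes_saved_per_year: float) -> float:
    """Years of use before the time saved equals the time spent developing."""
    return dev_hours * 60 / minutes_saved_per_year


# Four weeks of development, assuming a 40-hour week (an assumption).
dev_hours = 4 * 40
print(years_to_break_even(dev_hours, minutes_saved_per_year=10))  # 960.0
```

On time alone the function would take 960 years to pay for itself, which is why the risk of the task going wrong is the only thing that could justify it.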

If I sound negative about prototypes that really isn’t the case. I believe it is a beneficial tool that can lead to a much better application, provided it’s managed correctly and the three golden rules are followed.

Gunning Fog Index for Software Development

Software Development Metrics

In an earlier post we looked at measuring software development and finished by listing metrics suggested by Steve McConnell in Code Complete. Here’s the list again.

  • Lines of code
  • Work hours spent
  • Number of times routine changed
  • Cost
  • Defect location
  • Defect severity
  • Defect originator
  • Number of lines affected by a defect
  • Time to correct a defect
  • Number of attempts to fix a defect
  • New defects resulting from corrected code

Here are a few examples of how the metrics can be used, assuming that defects are spotted in system testing by the test team and not as part of development:

Number of defects / number of lines of code (often multiplied by 1,000)
An indication of the quality of the code released to system test. The defect could be due to the interpretation of the requirements or the interpretation of the technical design, but this metric can be used to judge the effect of implementing code reviews or a change to the technical design layout or content.

Number of times routine changed
High value may indicate insufficient or changed functional requirements, poor quality technical design or coding. Again this metric can be used to judge the effect of implementing code reviews or a change to the technical design layout or content.

Number of attempts to fix a defect and Time to correct a defect
A high value may indicate insufficient information on the defect, the creation of new bugs when fixing (unless this is being recorded in New defects resulting from corrected code) or could simply be a hard defect to correct. These metrics can help measure the impact of changing the information provided by the test team on each defect, although to have a balanced view it’s necessary to know how much more time is spent by the test team in adding the additional information.

Defect location
This identifies the location of the defect in the code and can be recorded at as high or low a level as you want, possibly identifying a poor technical specification or interpretation, or a complex part of the application.

Defect originator
The important factor with this metric is not to start a witch-hunt for the developer, but to determine why they are producing a high number of defects. It could be they are working on a more complex part of the system, or they require additional training. The good thing is that, having used the metric to determine the problem, you can use the same metric to see if the situation responds to the changes you make.
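
To make a couple of these concrete, here is a minimal sketch of defects per thousand lines of code and a count of defects by location. The `Defect` record, its fields and the sample data are all hypothetical; a real defect tracking system will have its own structure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Defect:
    location: str      # module or routine where the defect was found
    originator: str    # developer who wrote the code
    fix_attempts: int  # number of attempts needed to fix the defect


def defects_per_kloc(defect_count: int, lines_of_code: int) -> float:
    """Number of defects / number of lines of code, multiplied by 1,000."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defect_count * 1000 / lines_of_code


def defects_by_location(defects: list[Defect]) -> Counter:
    """Count defects per location to highlight troublesome areas."""
    return Counter(d.location for d in defects)


# Made-up sample data for illustration only.
defects = [
    Defect("invoicing", "dev_a", 1),
    Defect("invoicing", "dev_b", 3),
    Defect("reports", "dev_a", 1),
]
print(defects_per_kloc(len(defects), 12_000))        # 0.25
print(defects_by_location(defects).most_common(1))   # [('invoicing', 2)]
```

Recording the same figures before and after a process change, such as introducing code reviews, is what makes the comparison meaningful.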

Be careful comparing metrics across two different projects. There is a golden rule - no metric or combination of metrics can compare projects of different complexity, without clarification. Whilst defects per thousand lines of code could be used to compare two projects, it takes no account of the complexity of either project, hence the need for clarification. Metrics such as lines of code and work hours spent should never be used to compare projects with the slightest difference.

Free Windows Search Tool

Free Windows search tool Agent Ransack in action

If, like me, you’ve been frustrated with Windows search not handling all file types then take a look at the oddly named Agent Ransack.

In addition to simple searching, Agent Ransack can search using regular expressions, making it substantially more powerful than the Windows standard search.

A highly recommended addition to your developer’s toolkit.
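
For anyone who hasn’t used regular expression searching before, the idea can be sketched in a few lines of Python. This is not how Agent Ransack works internally and has nothing to do with its interface; the function name and the example pattern are my own, purely to show what a regular expression search across files does.

```python
import re
from pathlib import Path


def search_files(root: Path, pattern: str, glob: str = "*") -> list[tuple[Path, int, str]]:
    """Return (file, line number, line) for each line under `root` matching `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(root.rglob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip files that cannot be read
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

For example, `search_files(Path("."), r"TODO|FIXME", "*.py")` would list every line in the Python files below the current folder that mentions either word, something a plain text search can’t do in one pass.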

Measuring Software Development

In order to evaluate anything, from a sponge cake to a Bugatti Veyron there needs to be a purpose for the evaluation and a frame of reference. Evaluating a cake on taste or a car on styling is very subjective and only of value in larger numbers. Evaluating on size or speed is objective and the information easily understood and acted upon.

Measurement is the cornerstone of successful marketing, with a constant repeating cycle of try - measure - adjust, covering all media from Google AdWords to television advertising. There are a number of techniques such as A/B split and multivariate testing that can be performed on a website to maximise its potential. Measuring a direct marketing campaign or an ecommerce website is straightforward - number of responses and number/value of sales.

With software development, measurement can be rather more tricky. Just as with marketing, it’s important to measure the effect of change to the development process: if you use a new tool or amend a process, it’s vital to establish whether it is a valuable addition or not.

Steve McConnell in his excellent Code Complete book discusses various metrics that could be used to measure software development. He makes the observation that “Any way of measuring the process is superior to not measuring at all”. I don’t completely agree. Introducing metrics into a development process that didn’t have metrics before is a good thing, but they must relate to the measurement of something tangible so as not to lead to misunderstanding. It’s very easy to give a developer a hard time because he’s producing half the number of lines of code a day as someone else, but if those lines contain fewer bugs or the code is more complex the comparison is unjust.

Let’s look at the Bugatti Veyron again. It’s a very, very quick car with a top speed of 253 miles per hour. At full throttle it covers 2.46 miles per gallon of petrol, emptying the 21 gallon petrol tank in under 13 minutes. As we’re restricted to a more leisurely 70 mph in the UK, with a combined fuel economy of 11 mpg, to cover 600 miles the Veyron would have to fill up twice, adding 15-20 minutes to the journey. My Peugeot 306 diesel returned 50 mpg and could cover 650 miles on its 13 gallon tank, so on our 600 mile journey it would be 15-20 minutes quicker. So purely on speed the Veyron easily wins, on fuel economy it is a clear loser, and it loses on our contrived journey time.

Now I’ve skewed these metrics, and that’s really the point. Speed and fuel consumption are absolute metrics with no subjectivity, yet a 10 year old Peugeot is quicker than an £810,000 supercar, because of the legal speed limit and ignoring the small matter of acceleration and braking.
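
The journey arithmetic can be checked with a short sketch. The figures are the ones quoted above; the 10 minutes per fuel stop is my assumption (the text only gives 15-20 minutes for two stops), and the function names are just for illustration.

```python
import math


def fuel_stops(distance_miles: float, mpg: float, tank_gallons: float) -> int:
    """Refuelling stops needed when setting off with a full tank."""
    range_miles = mpg * tank_gallons
    return max(0, math.ceil(distance_miles / range_miles) - 1)


def journey_minutes(distance_miles: float, speed_mph: float, mpg: float,
                    tank_gallons: float, minutes_per_stop: float) -> float:
    """Driving time at a constant speed plus time spent at fuel stops."""
    driving = distance_miles / speed_mph * 60
    stops = fuel_stops(distance_miles, mpg, tank_gallons)
    return driving + stops * minutes_per_stop


# Veyron flat out: minutes to empty a 21 gallon tank at 2.46 mpg and 253 mph.
minutes_to_empty = 21 * 2.46 / 253 * 60  # a little over 12, so under 13 minutes

print(fuel_stops(600, mpg=11, tank_gallons=21))  # 2 (Veyron at 70 mph)
print(fuel_stops(600, mpg=50, tank_gallons=13))  # 0 (Peugeot 306)

# Assumed 10 minutes per stop; both cars are driven at 70 mph.
veyron = journey_minutes(600, 70, 11, 21, minutes_per_stop=10)
peugeot = journey_minutes(600, 70, 50, 13, minutes_per_stop=10)
```

With those assumptions the Veyron’s two stops account for the whole difference in journey time, since the driving time is identical.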

Back to our development metrics. I believe it’s essential to choose metrics with care so they don’t become misleading and give the wrong impression. Consider what you are trying to prove or disprove before making a final decision. Using total number of lines of code written over time might be the perfect metric for evaluating an IDE tool, but make sure the two projects compared are of a similar complexity. As a mate of mine says, compare apples with apples, not apples with pears. Also consider using more than one metric, just as you are unlikely to choose a car on one statistic. Choosing the Veyron as a car to commute in would clearly be wrong, as journey time and fuel economy are important, and the 306 isn’t the best choice for a track car, as top speed is important.

Steve McConnell lists a number of metrics in Code Complete. Here’s the list boiled down a bit:

  • Lines of code
  • Work hours spent
  • Number of times routine changed
  • Cost
  • Defect location
  • Defect severity
  • Defect originator
  • Number of lines affected by a defect
  • Time to correct a defect
  • Number of attempts to fix a defect
  • New defects resulting from corrected code

Don’t forget you can combine these to produce, say, number of defects per thousand lines of code.

Additionally Steve lists metrics aimed at measuring maintainability. These are:

  • Number of parameters passed to each routine
  • Number of local variables used by each routine
  • Number of routines called by each routine
  • Number of decision points in each routine
  • Control flow complexity in each routine
  • Number of lines of code in each routine
  • Number of lines of comments in each routine
  • Number of blank lines in each routine
  • Number of data declarations in each routine
  • Number of gotos in each routine
  • Number of input/output statements in each routine
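
Several of these counts can be gathered automatically. As a rough illustration, the sketch below uses Python’s standard `ast` module to count parameters, decision points, calls and lines for each function in a piece of Python source. What counts as a “decision point” here is my own simplification, the calls figure counts call sites rather than distinct routines, and nested functions are also counted as part of their parent.

```python
import ast

# Node types treated as decision points in this sketch (a simplification).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.BoolOp, ast.IfExp, ast.ExceptHandler)


def routine_metrics(source: str) -> dict[str, dict[str, int]]:
    """Return a few per-routine counts for every function defined in `source`."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            metrics[node.name] = {
                # *args and **kwargs are not counted
                "parameters": len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs),
                "decision_points": sum(isinstance(n, DECISION_NODES) for n in ast.walk(node)),
                "calls": sum(isinstance(n, ast.Call) for n in ast.walk(node)),
                "lines": node.end_lineno - node.lineno + 1,
            }
    return metrics


sample = '''
def classify(age, member):
    if age < 18 and member:
        return "junior member"
    for _ in range(2):
        print(age)
    return "other"
'''
print(routine_metrics(sample))
```

For the sample routine this reports 2 parameters, 3 decision points (the `if`, the `and` and the `for`), 2 calls and 6 lines.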

In a future blog I’ll look at how some of these metrics can be applied and delve into the maintainability metrics in detail.

Mainframe Techniques in a PC World

I started in 1981 on a Honeywell Level-64 mainframe computer. The mainframe was 20 miles away and the office I was in was connected by a leased BT line, which operated at a low speed compared to ADSL standards today.

The systems consisted of green screens for instant lookup and update, batch data entry updated overnight and batch jobs to produce reports and print various documents. To have a job run it was necessary to fill in a form, which was sent by courier to the machine room in the afternoon and the output returned the following morning. There were separate development, test and live environments and promoting from one environment to another required more forms and sign-off.

This level of formality was the norm for mainframe environments, and there were plenty of people stopping you from doing anything silly, or at least trying to stop you.

The PC environment in the early 90s was the complete opposite. If an application didn’t work, the developer made a change, stuck it on a floppy disk, walked to the user’s office, copied the amended application and off the user went. No controls, no process, different versions of the application on different machines and total anarchy. The worst I have ever experienced was a large installation developed by a software house for a bank. There were 40 workstations running a VB3 application on Windows 3.11 and no two of the workstations had identical software installations. This was partly because the development team had been moved to the live site, removing any barriers or process to releasing the application. In addition some workstations had the VB3 IDE installed, so code was being moved from developer’s machine to workstation and back again. With developers working on different bits of the application it became an absolute bun-fight, with the integrity of the workstations paying the price. It took a week to sort out this mess.

My experience over the last few years shows that PC development has matured and best practices have brought about sensible development processes, somewhere between “wander round with a floppy disk” and “mainframe form filling”. If you’re setting up a development environment I do have a few suggestions:

  • Build a developer’s environment, with the tools required to do the job on a freshly installed and updated operating system. Take an image of this disk and use it to create environments for new developers. If you need to update the operating system or add new tools make sure you update this master disk too. Assuming you are using source control and backups of personal files, restoring a developer’s environment in case of disaster will be a breeze. Alternatively consider virtualisation to achieve the same result. Obviously take account of licensing issues and machine naming when doing this.
  • Build a system test environment in a similar manner to above.
  • Code and executables flow in one direction. From developer to source control to build to test to customer. System testing must only be carried out on executables installed using the distribution method, never in debug mode in the IDE. It may be necessary to use an IDE to debug code in a system test environment, but the change should never flow the other way.
  • Don’t skip a step in the flow unless everyone knows what’s going on. It’s too easy for a rogue executable to be distributed or for the code not to be fed into source control and then be omitted from the next release.
  • Have approval for releasing to the next stage in the flow. Between development and system test this would be a peer review, and between system test and release it would be acceptance by the test team or the customer. I’m not suggesting a return to the mainframe form filling and signature marathon, but a double check by another team member is invaluable.
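
The one-direction flow with approval at each step can be expressed as a very small rule. This is only a sketch of the idea, not a tool we use: the stage names come from the list above, while the function and its parameters are hypothetical.

```python
from typing import Optional

# Stages in the order code and executables are allowed to flow.
STAGES = ["developer", "source control", "build", "test", "customer"]


def promote(current: str, target: str, approved_by: Optional[str]) -> str:
    """Move a release to `target`; only the next stage is allowed, with approval."""
    if STAGES.index(target) != STAGES.index(current) + 1:
        raise ValueError(
            f"cannot move from {current!r} to {target!r}: "
            "stages must not be skipped or reversed"
        )
    if not approved_by:
        raise PermissionError(
            f"promotion to {target!r} needs approval from another team member"
        )
    return target
```

So `promote("build", "test", approved_by="peer reviewer")` succeeds, while trying to go from "test" back to "developer", or promoting without an approver, raises an error.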

Create a Wiki to Share Information

One of the things that always bugs me is a lack of information needed to get a job done, be it the location of the source control system or whether you just help yourself to the hotel’s breakfast or wait to be seated and served.

With software development it’s often the case that an important piece of information is known or noted by a couple of people and everyone else asks. Worse still if only one person knows, and they are away the day that information is crucial.

Fortunately it’s easy to set up and maintain a Wiki. For those who haven’t come across a Wiki, it’s basically a web based tool promoting the sharing of information between people. One option is MediaWiki, which uses PHP and MySQL and will run on either a Windows or Linux server. Setting up will take a couple of hours and it’s flexible enough to adapt as you add new information. Make sure your team use it as the first port of call, and if the information required isn’t there, it needs to be added, either by the person looking for the information or, if applicable, the person with the information.

Be careful what information is added though. One Wiki I set up for a development company had been expanded to include To-Do lists in addition to various project planning and time capturing systems, which I think is a poor use of this facility. Also avoid duplicating information. We talked about coding standards a few postings ago, which should be a document. I wouldn’t expect to see the standards duplicated on a Wiki, however having a link to the standards document is a good thing.