Does Size Matter?

Recently the topic of code size and its effect on the productivity and maintainability of a code base has been brought up in a number of forums and blogs across the internet. One of the more widely read posts is the following (I won’t go into the issue I have with the author only ‘talking’ to young or inexperienced programmers):

Code’s Worst Enemy – Steve Yegge, December 19th 2007

But what exactly is the problem being raised in these comments? Is the size of the code base itself the issue, or is it everything else that has contributed to a problematic increase? (A lot of the comments lean towards the difference between strongly and weakly typed languages, but more on that later.)

A Symptom Not A Problem?
Take, for example, the attack on design patterns (see ‘Design Patterns Are Not Features’):

A Factory isn’t a feature, nor is a Delegate nor a Proxy nor a Bridge. They “enable” features in a very loose sense, by providing nice boxes to hold the features in. But boxes and bags and shelves take space. And design patterns – at least most of the patterns in the “Gang of Four” book – make code bases get bigger.

When it comes down to it, every single feature of a language ‘enables’ a set of features, as do linked lists, rendering systems, input handling and anything else that comes to mind. But a correctly implemented pattern using these features is much more suitable for a large system than an individual solution that crams as much functionality into as small a space as possible. God help the person who then has to come in and fix that solution once the original author has moved on, whereas a simple comment of ‘This uses a proxy pattern to do…’ would make the code instantly more understandable to any half-decent programmer.
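To make that concrete, here is a minimal C++ sketch of the kind of code I mean. It is not taken from any real project: `Texture`, `LoadedTexture` and `TextureProxy` are made-up names, and the width value is a placeholder. The point is only that the one-line comment tells the next programmer what the extra class is for.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface; every name in this sketch is illustrative only.
class Texture {
public:
    virtual ~Texture() = default;
    virtual int width() = 0;
};

// The "real" object, assumed here to be expensive to construct.
class LoadedTexture : public Texture {
public:
    explicit LoadedTexture(const std::string& path) : path_(path) { ++loads; }
    int width() override { return 256; }  // placeholder value for the sketch
    static int loads;                     // counts how many real loads happened
private:
    std::string path_;
};
int LoadedTexture::loads = 0;

// This uses a proxy pattern to defer loading until the texture is first used.
class TextureProxy : public Texture {
public:
    explicit TextureProxy(std::string path) : path_(std::move(path)) {}
    int width() override { return real().width(); }
private:
    LoadedTexture& real() {
        assert(!path_.empty());  // a proxy with no path cannot load anything
        if (!real_) real_ = std::make_unique<LoadedTexture>(path_);
        return *real_;
    }
    std::string path_;
    std::unique_ptr<LoadedTexture> real_;
};
```

Yes, the proxy adds a class and a dozen lines, but callers still just see a `Texture`, and the real load only happens on first use.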

I wouldn’t have such a problem if the issue came from the over-use of patterns, which leads to a large number of very small classes that scatter functionality across the entire code base. But it seems as though the growth has come through ‘simple’ implementations, so the question begging to be asked is “what has happened to make it so big?”.

I really feel that the main problem being missed here is confusion between code bloat and duplication (and I don’t just mean simple copy and paste duplication but implementation duplication). Steve has mentioned this in his original post but seems to quickly skip past it.

However, copy-and-paste is far more insidious than most scarred industry programmers ever suspect. The core problem is duplication, and unfortunately there are patterns of duplication that cannot be eradicated from Java code.

Now I will admit that it has been a few years since I worked closely with Java, and I am sure someone of Steve’s experience with the language is not wrong, but there are problems with every language and solutions need to be found for the problems you encounter (not everyone is fortunate enough to be able to say “From now on, we are working with X”).

As an example, take the FTL. On previous titles I’ve worked on, the mantra was that if you needed something, you implemented it. It didn’t matter if someone else had already written a linked list; you needed something slightly different, so you created what you needed (one of the first titles I ever worked on professionally actually had 3 different list classes used throughout the project). But now that we are moving forward and using the FTL to avoid this, the code base has been reduced, and it’s clear that it wasn’t the size of the code that was the problem, but the way in which the code base had been approached.
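As a rough illustration of the difference (this is not the FTL itself, and `SortedList` and `ByLength` are names invented for the sketch), the ‘slightly different’ requirement can usually be expressed as a parameter on one shared container rather than as a second or third hand-written list class:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shared container: written once, reused with different orderings.
template <typename T, typename Compare = std::less<T>>
class SortedList {
public:
    explicit SortedList(Compare cmp = Compare()) : cmp_(cmp) {}

    // Insert while keeping the elements ordered according to cmp_.
    void insert(const T& value) {
        auto pos = std::upper_bound(items_.begin(), items_.end(), value, cmp_);
        items_.insert(pos, value);
    }

    // First element in the ordering; the list must not be empty.
    const T& front() const {
        assert(!items_.empty());
        return items_.front();
    }

    std::size_t size() const { return items_.size(); }
    const std::vector<T>& items() const { return items_; }

private:
    std::vector<T> items_;
    Compare cmp_;
};

// The "slightly different" need becomes a small comparator, not a new list class.
struct ByLength {
    bool operator()(const std::string& a, const std::string& b) const {
        return a.size() < b.size();
    }
};
```

A `SortedList<int>` and a `SortedList<std::string, ByLength>` then share every line of the container code, which is where the reduction in code base size comes from.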

Language Choices

A lot of the discussion (as mentioned above) has come down heavily on the strongly/weakly typed language divide. I’m fortunate that while I work primarily with C++, I spend a good chunk of my time working with Ruby. And I will be the first to admit that my Ruby tools are written with less code than the equivalent would be in C++ (or, more likely, C#). But why?

The simplest explanation is not that the language is weakly typed (I would personally say that has very little to do with it), but the amazing level of support from externally developed libraries. I don’t have the code to hand, but recently I needed to create an FTP interface for a particular script I was working with. With Ruby, this was simply a case of requiring the relevant module and away we go. And if I had been writing the tool in C#, and there was an equivalent library, it would have taken just as many lines of code. The only difference is that (in my opinion) the C# version would be easier for the majority of developers to understand without running it than the Ruby one.

An interesting quote from one of the developers of Bugzilla earlier this year:

Since 1998 there have been many advances in programming languages. PHP has decent object-oriented features, python has many libraries and excellent syntax, Java has matured a lot, and Ruby is coming up in the world quickly. Nowadays, almost all of our competitors have one advantage: they are not written in Perl. They can actually develop features more quickly than we can…

Full article can be read here – The Problems of Perl: The Future Of Bugzilla

This accurately sums up the point I just made: these newer languages are easier to develop for (and indirectly decrease the size of the code base) because of the number of modules and contributors they have developing for them. Without these (and it’s a shame Java and C++ don’t have such pluggable modules), the languages would be used to create programs that are just as big as those written in any other language.


Don’t get me wrong on any of this: a large code base is difficult to maintain and develop, and care must always be taken not to bloat something when it isn’t necessary. But rather than look at a code base and say “This is big, before I move on I need to make it smaller to make my life easier”, you need to approach it from a different direction. Examine the complexity from all angles. If trimming code makes it easier to read or work with, then go for it, but if refactoring increases the size while making the code more structured and maintainable, then that is just as valid as any other solution.
