Why to use string.format() over concatenation (Real world experience)

So today I was working on reverse engineering some old code. I will strip this down and remove proprietary info but still let you see the bug. This is a post for newbies but experienced coders might get a laugh out of this as well.

Here is the error code:

var contract = CustomerId + ProductGroupId + EndDate.ToString("yy-MM");

Now, imagine these values:

CustomerId = 12345
ProductId = 4
Date = 8/7/2014

A new developer would assume that this would return the following:

12345415-08

They would be wrong. See, CustomerId and ProductGroupId are integers. So they don’t concatenate as strings, they add as integers. The real value is this:

1234915-08

12345 + 4 = 12349. This is math, not string concatenation.

How would a developer resolve this bug? There are two solutions:

  • Use ToString() and concatenation (DO NOT USE!)
  • Use string.Concat() (USE FOR SIMPLE CONCATENATION OR PERFORMANCE!)
  • Use string.Format() (USE FOR CONCATENATION WITH FORMATTING!)

Here is the another interesting fact. I know exactly when this bug was introduced into the system. This code was written by the developer using the first solution. The developer had originally added .ToString() to the end of these objects. The developer didn’t write buggy code. He wrote code that worked.

The code used to look like this:

var contract = CustomerId.ToString() + ProductGroupId.ToString() + EndDate.ToString("yy-MM");

So what happened? If a developer didn’t break this code, who did?

I’ll tell you what happened and who did it. Resharper.

Today, developers aren’t the only ones messing with our code. We use tools that are pretty awesome to do masses of code automation for us. One of these tools is Resharper. Well, Resharper detected the ToString() methods as “Redundant Code.” Oh, in this case, it wasn’t redundant code.

The only reason I found this bug is because I was rewriting this piece of code and had to reconstruct a ContractId and I began reverse engineering this code. I noticed that the first part of the ContractId was the CustomerId up until a few months ago. After that, it seemed to be slightly off. I was able to check source from back then and see the difference and see why this bug started. Sure enough, when resharper cleaned up my redundant code, it removed the ToString() methods.

However, while this is a Resharper bug, let’s not jump so fast to blaming Resharper. I’ll submit a bug to them, but are they really at fault? I say not completely. Why? Because the developer chose to use ToString() and concatenation instead of string.Format(). This was the wrong choice. Always use string.format(). This has been preached by many, but just as many don’t listen.

What would have happened if the developer had followed the best practice to use string.Format() or string.Concat?

var contract = string.Concat(CustomerId, ProductGroupId, EndDate.ToString("yy-MM"));

or

var contract = string.Format("{0}{1}{2}", CustomerId, ProductGroupId, EndDate.ToString("yy-MM");

The bug would never had occurred. With either string.Concat() or string.Format(). In those uses, the ToString() methods really would be redundant. With or without them, the code acts the same.

But surely avoiding a Resharper bug isn’t the only reason to use string.Concat or string.Format()? Of course there are more reasons. This error was due to the ambiguous + operator. Using string.Format() or string.Concat() eliminates such ambiguities. Also, with string.Format(), the developer’s intentions are much more clear. The format decided upon is explicitly included. When I type “{0}{1}{2}”, there is no question about what the intended format is. This is why I chose string.Format() for this piece of code. It is self-documenting and unambiguous.

Conclusion

Best practices are best  practices for a reason. Use them. Using string.Format() is always preferable to string concatenation.

P.S. This old project also could have alerted us to this error if anyone would have bothered to write a Unit Test. But this code was buried deep in a method that itself was hundreds of lines long. Following the 100/10 rule and testing would have alerted us when Resharper introduced this error.

4 Comments

  1. nold says:

    hum I disagree with myself 🙂 string.format seems to use stringbuilder 🙂

  2. nold says:

    I disagree !
    1->3 strings please use operator +
    4->8 string please user string.format (or string.concat cf previous post)
    8 and more please use string builder

  3. Nemo says:

    Only use String.Format if you cannot use String.Concat which solves all the issues you mentioned and yields better performance and readability, general rule of thumb is: Use String.Format only if you need the actual formatting options provided, otherwise use String.Concat.

Leave a Reply

How to post code in comments?