Your App Is Slow: Boiling the Frog
Sebastian Good

This is a multi-part series:

1. boiling the frog, 2. speed is a feature, 3. that darn demo dataset, 4. measure early, measure often

How many times has a customer come and told you your product was “slow”? In this multi-part series, we will discuss how “slow” happens, and how you can fix it.

It’s easy to not notice your app getting slower and slower while you’re developing it. Page comes up in 2 seconds instead of 1? Probably that new guy working on the database. Page comes up in 3? Probably backups happening in Europe. 4 seconds? Good reason to go nibble away on your unit tests. 5 seconds? Wow, that [insert component you don’t understand very well] must really be working hard. By the time you get to the end of a sprint or epic, that little page takes 6 seconds, and no one’s quite sure why. Especially when by all rights it should come up in more like 1/10th of a second. Like the proverbial frog who doesn’t realize he’s being boiled until it’s too late, it’s amazingly easy to watch performance slip away a few percent at a time. Performance problems build up as yet another form of technical debt.

What can we do to avoid being frogs? The first and best thing is for everyone on the team to start development work with a keen understanding of how fast different operations should be, and to take ownership of those expectations.

Build profiling into the product

In many applications, there are a few key factors that drive performance. In a web application, it’s usually payload size and database query latency. In a graphics-intensive application, it’s the triangle count and the cost of turning data into triangles. In a simulation, it’s probably network latency, memory bus bandwidth, or CPU pipeline efficiency. Given your architecture, can you continuously measure these things and highlight them to developers in their private builds?

We once worked on an ASP.NET WebForms application, where “view state” is a classic problem. We put a big ticker on the top left of every page that showed how much you were using. If it got to be “big” (over a few dozen kilobytes, if I recall correctly), then we rendered it in an enormous font. It was annoying. We also made it useful: if you clicked on the annoying number we’d pop up a window showing what was actually in your view state so you could figure out how to skinny it down.
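To make the idea concrete, here is a minimal sketch of the decision behind that ticker. It is not the code we shipped. The function name `view_state_badge`, the 32 KB budget, and the font sizes are all assumptions made up for illustration (the original threshold was only "a few dozen kilobytes").

```python
# Hypothetical sketch: decide how loudly to display a payload size.
# The 32 KB default budget and the pixel sizes are illustrative assumptions.

def view_state_badge(size_bytes: int, budget_bytes: int = 32 * 1024) -> dict:
    """Return the text and font size for an on-page size ticker."""
    over_budget = size_bytes > budget_bytes
    return {
        "text": f"{size_bytes / 1024:.1f} KB",
        # An enormous font once the budget is blown, a discreet one otherwise.
        "font_px": 48 if over_budget else 12,
        "over_budget": over_budget,
    }
```

A page template in a developer build could call this with the serialized size of whatever state it is about to send and render the result in a corner of the page. The annoying part is the point: nobody ignores a 48-pixel number for long.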

Graphics programmers often have a little overlay showing their frame rate. If it gets too low, it’s time to pop open some diagnostics and figure out why. In your web application, if any of your service calls takes longer than, say, 100ms plus your built-in ping latency, could you flash the name of the service on the screen, or send the developer an email?
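One way to get that kind of alarm is to wrap each service call in a timer. The sketch below is an assumption about how you might do it, not a description of any particular framework: `flag_if_slow`, `SLOW_CALLS`, and the `clock` parameter are hypothetical names, and appending to a list stands in for whatever "flash it on the screen" or "send an email" means in your application.

```python
import functools
import time

# Hypothetical sink for slow calls; a real build might feed an on-screen overlay.
SLOW_CALLS = []  # list of (service_name, elapsed_ms)

def flag_if_slow(name, budget_ms=100.0, ping_ms=0.0, clock=time.perf_counter):
    """Record any call that takes longer than budget_ms plus ping_ms."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Measured even when the call raises, since failures are often slow too.
                elapsed_ms = (clock() - start) * 1000.0
                if elapsed_ms > budget_ms + ping_ms:
                    SLOW_CALLS.append((name, elapsed_ms))
        return wrapper
    return decorator
```

Decorating a client function with something like `@flag_if_slow("orders", ping_ms=30)` would then leave a named entry in `SLOW_CALLS` whenever that call blows its budget. The `clock` argument is only there so the timer can be faked in tests.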

This helps in the all-too-common case where a component works perfectly well early on but starts to suck wind later. That nice query that used to execute off cached data in 20ms in the database now suddenly requires a full table scan and takes 500ms. Why? If database queries are a critical performance driver (and they usually are), the sooner you put your finger on the problem, the better. Now when that page takes 2 seconds instead of 1, there’s evidence as to why, and someone can get to work on it.
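If you already record per-query timings, spotting the 20ms-to-500ms kind of jump can be as simple as comparing against a saved baseline. This is a rough sketch under that assumption: `regressed_queries`, the query names, and the 3x factor are hypothetical choices, not a recommendation for any specific database or tool.

```python
# Hypothetical sketch: compare the latest query timings against a baseline.
# The default factor of 3.0 is an arbitrary illustrative threshold.

def regressed_queries(baseline_ms: dict, latest_ms: dict, factor: float = 3.0) -> list:
    """Return the names of queries now slower than factor times their baseline."""
    return sorted(
        name
        for name, elapsed in latest_ms.items()
        # Queries with no baseline are skipped rather than guessed at.
        if name in baseline_ms and elapsed > factor * baseline_ms[name]
    )
```

With a baseline of `{"orders_by_customer": 20, "user_lookup": 15}` and a latest run of `{"orders_by_customer": 500, "user_lookup": 18}`, only `orders_by_customer` is reported. That gives someone a name to start from instead of a vague sense that the page feels slow.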