Undocumented Languages: Configuration Files
Sebastian Good

It starts innocently enough. You need a database connection string and to know which tables are safe to cache, and there’s just no sense in putting that in your source code. Right? I mean, why put hard-coded stuff in your programming language?


Well done! Actually, hold on, it turns out you need a few substitutions there. I mean, when we deploy to production, we can’t assume the database is on localhost, right? No problem, you write yourself a little custom configuration file parser that lets you define settings per target:

[Database: dev]
connectionString=Server=localhost;Database=someDB
cache=PERSON_TYPE,CODES,MONKEYS

[Database: prod]
connectionString=Server=prod-abc;Database=someDB
cache=PERSON_TYPE,CODES

That’s sorted it out. Your deployment scripts just take a target name (‘dev’, or ‘prod’) and off you go.

Um. Except someone added a table to cache in development and forgot to add it to production. Did they do that on purpose? Nah, we should obviously DRY up this configuration file. No one should tolerate slow monkeys in production. So back to the configuration file parser, SASS style.

[Database]
cache=PERSON_TYPE,CODES,MONKEYS

  [Database: dev]
  connectionString=Server=localhost;Database=someDB

  [Database: prod]
  connectionString=Server=prod-abc;Database=someDB

Amirite? Sure, but it turns out some of your programs don’t support using localhost as the target name; it’s better to use the actual hostname. No problem, back to the configuration parser. You’d better allow a quick substitution. (Hey, parsers are fun! Did you escape the brackets and equals signs correctly?)

[Database]
cache=PERSON_TYPE,CODES,MONKEYS

  [Database: dev]
  connectionString=Server={HOSTNAME};Database=someDB

  [Database: prod]
  connectionString=Server=prod-abc;Database=someDB

That’ll do the trick. Though, actually, you know what? It turns out when we run the program during database patches, we don’t want to cache anything, right? And that’s an orthogonal concept to the target. Second argument to the deployment script?

[Database]
cache=PERSON_TYPE,CODES,MONKEYS

  [Database: dev]
  connectionString=Server={HOSTNAME};Database=someDB

  [Database: prod]
  connectionString=Server=prod-abc;Database=someDB

[Database: @cache]
  cache=

That’ll do it. Now it all works great.
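For the record, here is roughly what that "little" parser has turned into by now. This is a sketch in Python, not anyone's real implementation: the names resolve, flags, and variables are made up for illustration, and the precedence order (plain section, then target, then @flags) is an assumption, since nothing so far ever wrote those rules down.

```python
import re

# Matches "[Database]", "[Database: dev]" and "[Database: @cache]".
SECTION = re.compile(r"^\[(\w+)(?::\s*(@?\w+))?\]$")

CONFIG = """
[Database]
cache=PERSON_TYPE,CODES,MONKEYS

  [Database: dev]
  connectionString=Server={HOSTNAME};Database=someDB

  [Database: prod]
  connectionString=Server=prod-abc;Database=someDB

[Database: @cache]
  cache=
"""


def resolve(text, target, flags=(), variables=None):
    """Hypothetical resolver: merge base, target and @flag layers, then substitute."""
    variables = variables or {}
    layers = {}  # (section, qualifier) -> settings declared under that header
    current = None
    for raw in text.splitlines():
        line = raw.strip()  # indentation is decorative in this sketch
        if not line:
            continue
        header = SECTION.match(line)
        if header:
            current = layers.setdefault((header.group(1), header.group(2)), {})
            continue
        if current is None:
            raise ValueError(f"setting outside any section: {line!r}")
        # Split on the first '=' only; connection strings contain more of them.
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()

    # Assumed precedence: later layers override earlier ones.
    order = [None, target] + ["@" + flag for flag in flags]
    result = {}
    for qualifier in order:
        for (section, declared), settings in layers.items():
            if declared == qualifier:
                result.setdefault(section, {}).update(settings)

    # Replace {NAME} placeholders; an unknown name raises KeyError.
    for settings in result.values():
        for key, value in settings.items():
            settings[key] = re.sub(
                r"\{(\w+)\}", lambda m: variables[m.group(1)], value
            )
    return result


print(resolve(CONFIG, "dev", variables={"HOSTNAME": "devbox"}))
print(resolve(CONFIG, "prod", flags=["cache"]))
```

Every decision in there (which layer wins, what an unknown {NAME} does, whether indentation means anything) is a language rule that lives only in this code.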

But reflect for just a moment on what actually happened here. To avoid putting “hard-coded values” into a programming language you … wrote a custom programming language with no strictly documented semantics, no debugging or diagnostic tools, and no users outside your team.

Sometimes you might just consider the heretical thought of doing your configuration up front in a little piece of code that re-uses concepts from your primary programming language. If you’re a dependency injection zealot, you’ve already got a bunch of code in your composition root that sets up your basic objects. Would it be a stretch to call some code there to get these string values you need? And then have a debugger to make sense of all of this as it happened?
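As a rough illustration, the same decisions could sit in a few lines of ordinary code. This is only a sketch in Python: DatabaseConfig, database_config, and the patching parameter are hypothetical names, and it assumes {HOSTNAME} was meant to be the name of the machine running the program.

```python
import socket
from dataclasses import dataclass

ALL_CACHED_TABLES = ("PERSON_TYPE", "CODES", "MONKEYS")


@dataclass(frozen=True)
class DatabaseConfig:
    connection_string: str
    cached_tables: tuple


def database_config(target, patching=False, hostname=None):
    """Hypothetical composition-root helper: build settings for one target."""
    if target == "dev":
        # Assumption: dev talks to a database on the current machine.
        server = hostname or socket.gethostname()
    elif target == "prod":
        server = "prod-abc"
    else:
        raise ValueError(f"unknown target: {target!r}")
    return DatabaseConfig(
        connection_string=f"Server={server};Database=someDB",
        # Cache nothing while database patches are running.
        cached_tables=() if patching else ALL_CACHED_TABLES,
    )


print(database_config("prod"))
print(database_config("dev", patching=True, hostname="devbox"))
```

The if statement and the default argument do the work of the section qualifiers and the @cache flag, and you can set a breakpoint on either.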

In languages that are fundamentally dynamic, this whole conversation must seem surreal. Why invent a language for your data, when you can just load a file and run it to get your values?
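In Python, for instance, "load a file and run it" might look something like the sketch below. The file name settings.py, the load_settings helper, and the convention of keeping only upper-case names are all assumptions made for the example; runpy.run_path is the standard library call doing the actual work.

```python
import runpy
import tempfile
from pathlib import Path

# Stand-in for a settings file that would normally live next to the program.
SETTINGS_SOURCE = '''
TARGET = "prod"
SERVER = "localhost" if TARGET == "dev" else "prod-abc"
CONNECTION_STRING = f"Server={SERVER};Database=someDB"
CACHE = ["PERSON_TYPE", "CODES", "MONKEYS"]
'''


def load_settings(path):
    """Run a Python file and keep its upper-case globals as settings."""
    namespace = runpy.run_path(str(path))
    return {name: value for name, value in namespace.items() if name.isupper()}


with tempfile.TemporaryDirectory() as folder:
    settings_file = Path(folder) / "settings.py"
    settings_file.write_text(SETTINGS_SOURCE)
    settings = load_settings(settings_file)

print(settings["CONNECTION_STRING"])
print(settings["CACHE"])
```

The usual caveat applies: running a file means trusting whoever can edit it, so this suits settings that ship with the program rather than ones supplied by strangers.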

Mini languages can be great. Maybe they’re just what the doctor ordered. But they’re languages all the same: make sure you know that you’re making them, and be sure to support them.