Not All Infra Deserves Code

Some parts of your infrastructure don’t need fancy IaC pipelines, complex tf modules, or daily deployments—they just need to be set up once, left alone, and kept simple to age gracefully.

Sep 24, 2025

We’ve mostly won the battle: Infra-as-Code is the default. Terraform, Pulumi, CDK, CloudFormation—pick your flavor. It’s how we define infrastructure these days, and rightfully so. Declarative, testable, reproducible infrastructure? Yes, please.

But after a few years in the trenches, a funny pattern emerges: some parts of your infrastructure change constantly, while others are barely touched. And yet, we often treat them the same way—codifying everything with the same rigor, versioning it, deploying it through CI/CD pipelines.

The result? Some of our infra-as-code ages like milk.

The Problem with Code That Outlives Its Usefulness

Let’s start with the obvious: code ages. Infrastructure code is no exception. What once felt clean and elegant becomes un-runnable eighteen months later:

A Pulumi stack fails because a minor SDK update changed a default value.
A Terraform state file becomes incompatible with a newer binary version.
A resource is removed or renamed in the cloud provider’s API, and your IaC provider doesn’t catch up in time.
Resource replacements sneak through because no one checks the deployment preview anymore (it always* works).
An old module relies on a provider that was last maintained during the Kubernetes 1.18 era.

This is the entropy of software systems. It’s not a bug; it’s a feature of time.[1]

Hyrum’s Law reminds us that all observable behaviors will eventually be depended on. The moment your IaC becomes the interface for your team, it becomes part of their expectations, and strict reliability expectations at that, since everyone is hopefully used to near-100% availability by now. Change becomes risk. Meanwhile, some of this code hasn’t been touched in years. Maybe someone only needs to change a trivial setting once in a while (the admin email, say), but now the whole pipeline has to run successfully.

Code is Most Useful Where Change is Inevitable

That’s why one of the most useful heuristics in infrastructure is this:

If something changes frequently and is painful to do manually, codify it.

Applications, environments, databases, CI/CD stacks—these are high-churn systems. They benefit enormously from IaC because:

They’re deployed often
Their configuration evolves
They’re tightly integrated with other systems

IaC gives you confidence, observability, and drift detection. It's living, code-like documentation of the current state. It's the only reasonable approach when you're touching things daily or weekly. There's code all around the application, and your IaC is just one part of it, so the effort of keeping it running is shared with everything else.

But on the other end of the spectrum...

The Long Tail of Rarely Touched Infrastructure

Here’s a short list of things you might set up once and never touch again:

SAML or OIDC configuration for a SaaS provider
Bootstrap of the first AWS/GCP project in a new org
Setting up root credentials or creating a billing account
Uploading a domain validation file to an S3 bucket
Creating an AWS Organization structure or the GCP folder/project tree
Defining that one vaultId that you need in your CI environment.

These are one-and-done steps, often done before your platform exists. Most of the time you need at least a few manual steps anyway (like creating the first credentials). And yet, we sometimes bake these into IaC components on day one of the product, or worse, shove them into a dedicated repo managed by the DevOps team with the kind of rigor you’d reserve for production DB migrations.

What good does that do? The project hierarchy will barely change. The credentials setup is going to be run once, manually reviewed, and probably never modified again. But now you’re maintaining a stack, tracking Terraform state, dealing with CI credentials, or praying a Pulumi dependency hasn’t broken.

Sometimes a shell script and a README are the right move. They’re not elegant. They’re not DRY. But they’ll still run in two years, when Terraform throws a tantrum about a deprecated attribute.

And hey, sometimes you don't even need CI for that. No high-privilege pipelines with extra GitHub Actions code to babysit. Just run it with admin credentials on your own machine, once in a lifetime. Why treat it like code that needs to survive daily pipelines and onboarding flows?

Deferred Complexity and the Platform Tax

Codifying everything comes with a cost. You need CI pipelines, secret management, state handling, version pinning, linting. It’s not just YAML anymore—it’s a full software stack.

This isn’t free.

We see infra teams over-engineer one-time flows because the default instinct is "IaC is the best form of automation." But instead of value, we get:

Terraform modules wrapping a single account-level toggle
A Pulumi component to do one-time email validation for a domain
A CI job that runs a shell command once, ever, and no one knows whether it would work if executed again

We pay an ongoing maintenance cost for temporary logic—and it just doesn’t balance out.

Aging Gracefully with Boring Tools

Sometimes, the best infrastructure artifacts are simple, explicit, and low-tech:

A bash script[2] with clear comments running in a dead simple Docker container
A README.md that tells you what to do and why
A minimal set of steps using tools like jq, aws, or gcloud

These tools age better. They don’t rely on external state, they rarely break compatibility, and they can be reasoned about easily. Bonus: these scripts are often run locally, by a human, using admin credentials they already have. You’re not building a pipeline. You’re not securing secrets. You’re just pressing a button—and then forgetting about it for two years. That’s OK.

But remember: the goal isn’t to make them timeless. It’s to make them easy to revisit. And if a command in there no longer works, you can still perform that step by hand in a disaster recovery situation.

What Makes a Script Resilient?

For scripts to age well and serve their purpose (especially in disaster recovery situations), they need to be dead simple to:

Understand, and replace with manual steps if necessary
Re-run without guessing
Update if something external changes

This means:

The script should be short and easy to read.
It should be well-documented and clear about its purpose.
It should avoid cleverness and minimize dependencies.
It should be tested once in a while, especially if it plays a critical role.

You don’t want to find out that your org creation bootstrap script doesn’t work during a real outage.
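To make that concrete, here is a minimal sketch of what such a bootstrap script could look like. It is an illustration under stated assumptions, not a recommended implementation: the file name, the DRY_RUN variable, and the run helper are made up for this example, and the only real commands are three AWS CLI calls (aws sts get-caller-identity, aws organizations create-organization --feature-set ALL, and aws organizations describe-organization). It is written in plain POSIX sh, so it also runs under bash, and it defaults to a dry run so it can be exercised once in a while without touching anything.

```shell
#!/bin/sh
# bootstrap-org.sh (hypothetical name): one-time creation of an AWS Organization.
# Run locally, by a human, with admin credentials. No pipeline, no state file.
# Dry run is the default: set DRY_RUN=0 to really execute the commands.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'would run:'
    printf ' %s' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# Step 1: show which identity is in use, so the operator can stop if it is wrong.
run aws sts get-caller-identity

# Step 2: create the organization unless one already exists (safe to re-run).
# Manual fallback: create it from the AWS Organizations page in the console.
if [ "$DRY_RUN" != "1" ] && aws organizations describe-organization >/dev/null 2>&1; then
  echo "Organization already exists; nothing to create."
else
  run aws organizations create-organization --feature-set ALL
fi

# Step 3: print the result for a human to verify.
run aws organizations describe-organization
```

Running it as-is only prints the three commands it would execute; running it with DRY_RUN=0 performs them. Every step is a single command with a comment, so anyone can follow the same steps by hand if the script itself stops working.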

And to decide whether it’s a script or a proper IaC artifact, consider these three limiting factors:

State complexity: If the state is hard to infer from reading the script, consider going declarative. IaC shines here.
Script length/complexity: The longer and more complex, the higher the maintenance burden. Beyond a certain point, it might as well be IaC.
High churn: If the thing changes regularly, version it, test it, maintain it. That makes it real code, and it should be treated as such.

Reliable Foundations

This isn’t about being lazy or cutting corners. It’s about being deliberate: finding out what needs more reliability and investing in that. Not every piece of infrastructure deserves the full IaC treatment. Some things are better left as simple, durable scripts.

Recognize the low-churn layer in your infra stack. Keep it boring on purpose. Let the high-churn systems earn the complexity of abstraction.

Use the right tool for the job—and sometimes, that job is best served with just a bash script and a bit of judgment.

Simple is not simplistic. It’s just... simple. And that’s sometimes exactly what you need.

Cover illustration by Mila Aguiar

[1] One could argue that infra code suffers even more, as it relies on the underlying cloud provider's ecosystem: a chaotic layer of huge APIs operated by big tech, with all sorts of weird behaviors and inconsistencies.

[2] Of course this is also code, but the stack is an order of magnitude simpler. You can write a reliable bash script that works on your M4 and on an arm32 Debian 7 box in exactly the same way. Could you say the same for a TypeScript (Pulumi) project? Or a Terraform monorepo?

Vinny’s Substack
