Treat Automation Like a Product, Not a Script

Jun 5, 2026 · 7 min read

Scripts break and nobody notices. Products have states, failure handling, and audit trails. Your automations need the same.

Most automations break in ways you never find out about

The worst failure mode in automation is not the crash. The crash is obvious — something stops running and you get an alert, or you notice the silence. The worst failure mode is silent corruption: jobs that half-complete, data that gets partially written, leads that get enriched with stale or wrong values, and nothing in the system tells you it happened.

I have built automations that ran cleanly in testing and then fell over in production in ways I could not have predicted. RocketAPI returning throttle errors mid-batch. Gmail rate limits cutting outreach sequences short. Schema changes in an upstream source breaking extraction silently. The automation kept running. The outputs were wrong. Nobody knew.

This is why I treat automation like a product, not a script. Scripts are written, run, and forgotten. Products have observability, predictable states, failure handling, and audit trails. If you want automations that you can actually trust, they need the same rigour.


Constrain what it is allowed to do

The first layer of control is limiting the action surface. An AI component inside an automation should have a narrow, explicit brief — not open-ended agency.

In practice, this means:

  • Use only provided fields. The automation must not reach outside the data you give it. If you pass CRM records with five fields, the output must be based on those five fields and nothing it infers or invents.
  • Banned claims list. For outreach automation specifically: no fake personal details, no implied history of a relationship that does not exist, no promises the business cannot keep.
  • Tone rules. Define them explicitly. "Professional and direct" is not enough. Write the rules: no more than two sentences of opener, no overfamiliarity, no marketing buzzwords.

These constraints do not reduce quality. They define the floor. Outputs that stay within them are consistent and safe. Outputs that drift outside them are where damage happens.
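
As a rough illustration, the banned-claims and tone rules above can be checked mechanically before a draft moves on. This is a minimal sketch: the phrase lists, the `check_draft` name, and the way sentences are counted are all assumptions made for the example, not a real policy.

```python
import re

# Hypothetical rule set; the phrases and limits are illustrative only.
BANNED_PHRASES = ["as we discussed", "great to see you again", "guaranteed"]
BUZZWORDS = ["synergy", "game-changing", "revolutionary"]
MAX_OPENER_SENTENCES = 2


def check_draft(opener: str, body: str) -> list[str]:
    """Return a list of rule violations; an empty list means the draft passes."""
    violations = []
    text = f"{opener} {body}".lower()
    for phrase in BANNED_PHRASES:
        if phrase in text:
            violations.append(f"banned claim: {phrase}")
    for word in BUZZWORDS:
        if word in text:
            violations.append(f"buzzword: {word}")
    # Count opener sentences by splitting on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", opener) if s.strip()]
    if len(sentences) > MAX_OPENER_SENTENCES:
        violations.append("opener longer than two sentences")
    return violations
```

A draft with any violations would be rejected or regenerated rather than sent. Substring matching is crude, but it is cheap and catches the obvious drift.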


Constrain the output

The second layer is schema enforcement. If an AI component in your automation can return anything, it eventually will.

Strict output schemas do three things. They make outputs machine-parseable — the next step in the pipeline can extract fields reliably. They surface failures early, because a malformed output is immediately detectable rather than silently wrong. And they reduce the blast radius of a bad run, because a short, structured output has far less room for nonsense than a free-form paragraph.

Defaulting to short outputs is the pattern I apply everywhere. If a step is producing 800 words when 80 would do, that is a constraint problem in the prompt, not a quality problem in the model.
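
One way to enforce a strict output schema is to parse the model's response as JSON and reject anything that deviates. The sketch below assumes a hypothetical lead-scoring step; the field names, the allowed priorities, and the 280-character cap are invented for the example.

```python
import json
from dataclasses import dataclass

ALLOWED_PRIORITIES = {"low", "medium", "high"}  # assumed vocabulary
MAX_SUMMARY_CHARS = 280  # assumed cap to keep outputs short
EXPECTED_KEYS = {"lead_id", "priority", "summary"}


@dataclass(frozen=True)
class LeadScore:
    lead_id: str
    priority: str
    summary: str


def parse_lead_score(raw: str) -> LeadScore:
    """Parse model output strictly; raise ValueError on any deviation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"unexpected or missing keys: {sorted(set(data) ^ EXPECTED_KEYS)}")
    if not all(isinstance(data[key], str) for key in EXPECTED_KEYS):
        raise ValueError("all fields must be strings")
    if data["priority"] not in ALLOWED_PRIORITIES:
        raise ValueError(f"invalid priority: {data['priority']!r}")
    if len(data["summary"]) > MAX_SUMMARY_CHARS:
        raise ValueError("summary exceeds length cap")
    return LeadScore(**data)
```

A chatty, free-form response fails at the first check, so the bad run is caught at the step that produced it instead of three steps downstream.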


Add gates at the right points

Not every action should run automatically. The question is where the cost of being wrong is high enough that human review is cheaper than fixing the mistake.

My rule: gate anything customer-facing, brand-sensitive, or legally adjacent.

For first-touch outreach to high-value partners, the automation can draft, score, and prioritise — but the send decision is mine. The cost of a bad first message to a key relationship is not worth the time saved by removing the approval step.

For internal ops tasks — classifying inbound emails, summarising logs, flagging anomalies, scoring leads — you can automate aggressively. The failure mode is recoverable, and the volume is high enough that manual review would swamp the time savings.

Confidence thresholds are the other gate mechanism. If the automation hits a low-data situation — a record where key fields are missing or inputs are ambiguous — it should escalate rather than guess. This is not a failure; this is the system working as designed. You want it to surface uncertainty rather than hide it.
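
The gating rules in this section can be expressed as a small routing function. This is a sketch under assumptions: the required fields, the 0.7 threshold, and the idea that an upstream step supplies a numeric confidence score are all hypothetical.

```python
REQUIRED_FIELDS = ("name", "company", "email")  # hypothetical key fields
MIN_CONFIDENCE = 0.7  # assumed threshold; tune it against real data


def route(record: dict, confidence: float, customer_facing: bool) -> str:
    """Return 'escalate', 'review', or 'auto' for one record."""
    missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
    if missing or confidence < MIN_CONFIDENCE:
        return "escalate"  # surface uncertainty instead of guessing
    if customer_facing:
        return "review"    # a human makes the send decision
    return "auto"          # recoverable internal task, run unattended
```

Escalation is checked first on purpose: a record with missing data should not reach the human approval queue looking like a finished draft.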


Make it auditable

The third layer is visibility. Automation you cannot inspect is automation you cannot trust.

Every job that runs should produce a record: which prompt version was used, what the inputs were, what the output was, when it ran, and whether it succeeded. Not because you will read every record — you will not — but because when something goes wrong, you need to be able to reconstruct what happened and why.
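
A per-job audit record can be as simple as one JSON line appended to a log. The sketch below is one possible shape, not a prescribed format; the function names and the extra `input_hash` field are assumptions added for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(job_id, prompt_version, inputs, output, succeeded):
    """Build one audit entry: prompt version, inputs, output, time, outcome."""
    canonical = json.dumps(inputs, sort_keys=True)
    return {
        "job_id": job_id,
        "prompt_version": prompt_version,
        "inputs": inputs,
        # Hash of the canonical inputs makes identical runs easy to spot.
        "input_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "output": output,
        "succeeded": succeeded,
        "ran_at": datetime.now(timezone.utc).isoformat(),
    }


def append_audit_line(stream, record):
    """Write the record as one JSON line to an open text stream."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")
```

In use, `stream` would typically be a file opened in append mode. One line per job keeps the log greppable when you need to reconstruct a single run.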

Failure categories are more useful than failure counts. I log not just whether something failed, but why: off-format output, incorrect field extraction, missing data, rate limit hit, schema mismatch. Over time, patterns emerge. If you keep seeing schema mismatch errors from a particular data source, that tells you where to invest in defensive handling.
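
Logging the category as a fixed enum rather than free text is what makes the patterns countable. A minimal sketch, assuming each failure is logged as a dict with hypothetical `source` and `category` keys:

```python
from collections import Counter
from enum import Enum


class FailureCategory(str, Enum):
    OFF_FORMAT = "off_format_output"
    BAD_EXTRACTION = "incorrect_field_extraction"
    MISSING_DATA = "missing_data"
    RATE_LIMIT = "rate_limit_hit"
    SCHEMA_MISMATCH = "schema_mismatch"


def failures_by_source(failures):
    """Count (source, category) pairs from an iterable of failure records."""
    return Counter((f["source"], f["category"]) for f in failures)
```

Calling `most_common()` on the result puts the noisiest source and category pair at the top, which is usually where defensive handling pays off first.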

Dead-letter queues are essential. Failed jobs should not vanish. They should land somewhere reviewable, with enough context to understand what went wrong and decide whether to retry or discard.
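
A dead-letter queue does not need dedicated infrastructure to be useful. The sketch below uses a SQLite table as a stand-in; the table layout, column names, and function names are assumptions for the example.

```python
import json
import sqlite3


def open_dlq(path=":memory:"):
    """Open (or create) the dead-letter table and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS dead_letters (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               job_id TEXT NOT NULL,
               category TEXT NOT NULL,
               error TEXT NOT NULL,
               payload TEXT NOT NULL,
               attempts INTEGER NOT NULL,
               resolution TEXT
           )"""
    )
    return conn


def dead_letter(conn, job_id, category, error, payload, attempts):
    """Park a failed job with enough context to review it later."""
    conn.execute(
        "INSERT INTO dead_letters (job_id, category, error, payload, attempts) "
        "VALUES (?, ?, ?, ?, ?)",
        (job_id, category, error, json.dumps(payload), attempts),
    )
    conn.commit()


def pending(conn):
    """Return unresolved entries, oldest first."""
    rows = conn.execute(
        "SELECT job_id, category, error, payload, attempts FROM dead_letters "
        "WHERE resolution IS NULL ORDER BY id"
    ).fetchall()
    return [(j, c, e, json.loads(p), a) for j, c, e, p, a in rows]


def resolve(conn, job_id, resolution):
    """Record the reviewer's decision: 'retry' or 'discard'."""
    if resolution not in ("retry", "discard"):
        raise ValueError("resolution must be 'retry' or 'discard'")
    conn.execute(
        "UPDATE dead_letters SET resolution = ? WHERE job_id = ? AND resolution IS NULL",
        (resolution, job_id),
    )
    conn.commit()
```

Resolved rows are kept rather than deleted, so the queue doubles as a history of what failed and what was decided about it.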


Core reliability patterns

The reliability layer is what separates an automation you trust from one you babysit.

Idempotency is the foundation. Rerunning a job should not create duplicates or corrupt state. Every pipeline I build is designed to be safely re-executable. If it ran successfully, running it again has no effect. If it failed halfway, running it again picks up from where the failure occurred.

Dedupe keys are what make idempotency enforceable. Every entity in the system (leads, partners, creators, messages) has a unique identifier, and the pipeline checks for it before writing. This eliminates the category of bugs where restarts or retries produce duplicate records.
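
Here is a minimal sketch of an idempotent write keyed on a dedupe key. It swaps the explicit check-before-write for a database uniqueness constraint, which does the same job atomically. Using a normalised email as the key, and the table and function names, are assumptions for the example.

```python
import sqlite3


def open_store():
    """Create an in-memory store whose primary key is the dedupe key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE leads (dedupe_key TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def dedupe_key(email: str) -> str:
    """Normalise the identifier so trivial variations map to one key."""
    return email.strip().lower()


def write_lead(conn, email: str, name: str) -> bool:
    """Insert a lead once; return True if written, False if it already existed."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO leads (dedupe_key, name) VALUES (?, ?)",
        (dedupe_key(email), name),
    )
    conn.commit()
    return cur.rowcount == 1
```

Rerunning the same batch calls `write_lead` again for every record, and every call that hits an existing key is a no-op. That is the property that makes a retry after a half-finished run safe.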

Job state machines make pipeline state visible. A job is always in exactly one of six states: queued, running, success, failed, retried, or escalated. There is no ambiguous middle ground. You can query the jobs table and know exactly where every job is.
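
The six states can be pinned down in code so that illegal jumps fail loudly. The states come from the list above; the transition map is an assumption about how they connect and is one plausible reading, not the only one.

```python
from enum import Enum


class JobState(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    RETRIED = "retried"
    ESCALATED = "escalated"


# Assumed transition map: success and escalated are terminal.
ALLOWED = {
    JobState.QUEUED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.SUCCESS, JobState.FAILED},
    JobState.FAILED: {JobState.RETRIED, JobState.ESCALATED},
    JobState.RETRIED: {JobState.RUNNING},
    JobState.SUCCESS: set(),
    JobState.ESCALATED: set(),
}


def transition(current: JobState, target: JobState) -> JobState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Storing `state.value` in the jobs table keeps the column queryable as plain text while the code works with the enum.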

Retry with backoff handles flaky dependencies. Most external APIs will occasionally fail. The question is whether your automation degrades gracefully or falls over. Exponential backoff with a retry limit handles the overwhelming majority of transient failures without manual intervention.
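
A minimal sketch of exponential backoff with a retry limit follows. `TransientError` is a hypothetical marker exception that the caller raises for retryable failures such as timeouts or throttling. The random jitter is an addition not described above, included because it is a common way to keep retries from arriving in lockstep.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying."""


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func until it succeeds or max_attempts is reached.

    The wait doubles after each failure (1s, 2s, 4s, ...) up to max_delay,
    plus up to 50% random jitter. The final failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of retries: let the caller dead-letter the job
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

Only `TransientError` is retried. Anything else, such as a schema mismatch, propagates immediately, because retrying a deterministic failure just burns time. The `sleep` parameter exists so tests can substitute a fake and avoid real waiting.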

The goal is this: when an automation breaks, it should break safely, leave clear evidence of what happened, and be resumable from the last clean state. That is not a luxury for complex systems. It is the minimum standard for any automation that runs in production.


The operational mindset

The difference between teams that trust their automations and teams that fear them is not sophistication. It is discipline.

Automations built like scripts — quick, undocumented, no failure handling, no audit trail — will eventually cause a problem you cannot diagnose or recover from easily. Automations built like products fail in known ways, recover gracefully, and give you the information you need to improve them.

The investment in gates, schemas, audit logs, and state machines pays back quickly. The first time you need to diagnose a production issue, you will be grateful you can see exactly what happened. The first time an external API goes flaky, you will be grateful your retry logic handles it without waking you up.

Build it right once. Trust it permanently.