AI Automation for SMEs: The 90-Day Pilot as a Playbook
Not an AI strategy paper but a concrete week-by-week plan: who does what, which artefacts get produced, what decision lands on day 90 — and how to structure a pilot contract cleanly.

There are enough articles explaining why AI in mid-sized companies should start with a small pilot. This one explains the how — as a concrete playbook.
If you are still on the why, first read "Why AI-ready companies start with decisions, not models" and "Why many SMEs talk about AI but few scale it". This article assumes the decision to run a pilot has been made, and answers the next question: what happens over the next 90 days, concretely, week by week?
Why 90 days and not 9 months
According to Bitkom (March 2026), 41 percent of German companies with at least 20 employees already use AI, and another 48 percent are planning or discussing it. The barrier is no longer interest — it is execution.
McKinsey's analysis of digital manufacturing "pilot purgatory" describes the pattern: many companies run pilots, few move them into regular operations. Ninety days is a deliberate window: long enough to change a real process, short enough not to drown in endless concept mode. DORA's 2024 Accelerate State of DevOps Report also names unstable priorities as one of the strongest productivity killers — a fixed 90-day frame protects against exactly that.
The three roles a pilot fails without
Before the weekly plan: a pilot needs three named people. Not three departments — three names.
- Decision Owner. The business person whose decision should improve. Not IT. Whoever owns the process today.
- Data Owner. The person who can say where the data lives and whether it can be trusted.
- Approver. Whoever signs off the AI-assisted action in practice.
If one of these roles is missing, the pilot is not ready to start — no matter how good the model is.
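That readiness check is simple enough to write down. A minimal Python sketch, where the names are placeholders and `PilotRoles` is an illustrative structure, not a prescribed artefact:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PilotRoles:
    decision_owner: str  # business person whose decision should improve
    data_owner: str      # knows where the data lives and whether to trust it
    approver: str        # signs off the AI-assisted action in practice

    def ready_to_start(self) -> bool:
        # Three names, not three departments: every role must be filled.
        return all(name.strip() for name in
                   (self.decision_owner, self.data_owner, self.approver))
```

A pilot with an empty slot in this record is, by the rule above, not ready to start.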
The 90-day playbook
Weeks 1–2: one decision, one KPI
No tool, no model, no platform in these two weeks. Just three outputs:
- One decision is named — not "AI in procurement" but "we want to flag risky inbound invoices before they are posted".
- One business KPI with a baseline. Example: "Today 6 percent of faulty invoices are caught only after posting. Target: under 2 percent."
- One stop criterion is agreed. What has to be true for us not to scale the pilot? That is not pessimism — it is the condition that separates a pilot from an uncancellable project.
Artefact at the end: a one-page pilot charter. One page, not twenty.
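What the one-pager must pin down fits in a handful of fields. A sketch, assuming the invoice case used later in this article; the stop-criterion wording is an illustrative assumption:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PilotCharter:
    decision: str        # one decision, concretely named
    kpi: str             # one business KPI
    baseline: float      # where the KPI stands today
    target: float        # where it must land by day 90
    stop_criterion: str  # what would make us not scale

charter = PilotCharter(
    decision="Flag risky inbound invoices before they are posted",
    kpi="share of faulty invoices caught only after posting (%)",
    baseline=6.0,
    target=2.0,
    stop_criterion="approvers override more than half of the AI's flags",
)
```

If any field cannot be filled in, the charter is not done and the pilot should not start.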
Weeks 3–6: the minimum context model
Now data — but only what this one decision needs.
The most common mistake is to start with a data-lake project. Instead: which five to twelve fields does the model need to support the decision? Where do they live? Who owns them? Are they reliable?
In parallel, fix the permission level — this is a design decision, not a detail:
- Does the AI only read and show a suggestion (read-only)?
- Does it suggest an action a human approves (assisted action)?
- Does it write back into a system of record (automated write-back)?
For a first pilot the answer is almost always "assisted action". Artefact: a data contract (fields, source, owner) plus the documented permission decision.
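Such a data contract can itself be a small, machine-checkable artefact. A minimal sketch in Python, where the `PermissionLevel` enum and the field bounds are this article's recommendations, not a standard schema:

```python
from dataclasses import dataclass
from enum import Enum

class PermissionLevel(Enum):
    READ_ONLY = "read-only"              # AI only shows a suggestion
    ASSISTED_ACTION = "assisted"         # AI suggests, a human approves
    AUTOMATED_WRITE_BACK = "write-back"  # AI writes into a system of record

@dataclass
class DataField:
    name: str    # e.g. "supplier_id"
    source: str  # system of record the field lives in
    owner: str   # named Data Owner for this field

@dataclass
class DataContract:
    decision: str
    fields: list
    permission: PermissionLevel

    def validate(self) -> None:
        # The playbook recommends five to twelve fields for one decision.
        if not 5 <= len(self.fields) <= 12:
            raise ValueError("context model should stay between 5 and 12 fields")
```

The `validate` check encodes the anti-data-lake rule: a contract that needs more than a dozen fields is trying to support more than one decision.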
Weeks 7–10: the workflow with a human in the loop
Only now does the AI-assisted flow get built. The key: the AI does not produce a final state; it produces a suggestion with a confidence score and a reason. The approver decides.
Three things must exist by the end of this phase:
- A workflow where AI suggests and a human approves — embedded in the real daily work, not a separate demo tool.
- Traceability: which data was used, what was suggested, who approved, what the outcome was. Without that trail, AI never becomes part of an auditable business process.
- An escalation path for the cases where the AI is uncertain. Making uncertainty visible is a feature, not a defect.
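The suggest, approve, escalate pattern can be sketched in a few lines. Everything here is an assumption for illustration, including the `ESCALATION_THRESHOLD` value; real thresholds belong in the data contract, not in code:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

ESCALATION_THRESHOLD = 0.7  # assumed cut-off; below it the case is escalated

@dataclass
class Suggestion:
    inputs: dict        # which data was used
    proposal: str       # what the AI suggests
    confidence: float   # 0..1, always shown to the approver
    reason: str         # human-readable justification

@dataclass
class AuditEntry:
    suggestion: Suggestion
    approver: str
    outcome: str        # "approved", "rejected" or "escalated"
    timestamp: str

def route(suggestion: Suggestion, approver: str, human_decision: str,
          trail: list) -> str:
    """Assisted action: the AI never finalises anything on its own."""
    if suggestion.confidence < ESCALATION_THRESHOLD:
        outcome = "escalated"     # uncertainty is made visible, not hidden
    else:
        outcome = human_decision  # the approver's call stands
    trail.append(AuditEntry(suggestion, approver, outcome,
                            datetime.now(timezone.utc).isoformat()))
    return outcome
```

Note that the audit trail grows on every path, including escalation; that is what makes the process auditable end to end.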
Weeks 11–12: measure and decide
No "looks good". The KPI defined in weeks 1–2 is measured against its baseline. Then exactly one of three decisions is made:
- Scale to another department, line or region.
- Iterate: the approach holds but needs another round.
- Stop, because the stop criterion was reached. A cleanly stopped pilot is not a failure — it prevented an expensive misinvestment.
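The day-90 decision rule is deliberately mechanical, which means it can be written down. A sketch assuming a lower-is-better KPI (such as the error rate after posting); the function name is illustrative:

```python
def pilot_decision(baseline: float, measured: float, target: float,
                   stop_criterion_met: bool) -> str:
    """Returns exactly one of "scale", "iterate" or "stop".

    Assumes a lower-is-better KPI, e.g. the share of faulty
    invoices caught only after posting.
    """
    if stop_criterion_met:
        return "stop"     # a cleanly stopped pilot prevented a misinvestment
    if measured <= target:
        return "scale"    # the KPI beat the target set in the charter
    return "iterate"      # the approach holds but needs another round
```

The point of writing it this mechanically: if the team cannot agree on these inputs in week 12, the problem sits in weeks 1–2, not in the model.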
What belongs in the pilot contract
A frequently overlooked point: the structure of the contract itself shapes the pilot's success. A well-structured 90-day pilot fixes upfront:
- The specific KPI and its baseline as the success measure — not "introduce AI".
- Data access, hosting region and deletion rules — with EU exposure this is not an afterthought.
- Ownership of code, model artefacts and data after the pilot ends.
- An exit clause: what happens to data and results if it does not scale?
Whoever starts a pilot without these four points does not have a pilot — they have the uncontrolled beginning of a project.
Example: invoice checking in a 70-person company
- Decision: flag risky inbound invoices before posting.
- KPI: share of errors caught only after posting, from 6 % to under 2 %.
- Context model: supplier, amount, VAT category, PO reference, payment term, historical anomaly.
- Permission: assisted — AI flags, finance decides.
- Outcome after 90 days: a measured KPI plus a clear scaling decision.
Not spectacular. That is exactly why it works.
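For illustration only, here is a rule-based stand-in over the six context fields. The thresholds and rules are pure assumptions; in a real pilot the flagger would be a trained model, but the assisted pattern stays the same: the function returns a flag plus reasons, and finance decides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Invoice:
    supplier: str
    amount: float
    vat_category: str
    po_reference: Optional[str]  # a missing PO reference is a classic risk signal
    payment_term_days: int
    historical_anomaly: bool     # anomaly flag from the supplier's history

def flag_risky(inv: Invoice) -> Tuple[bool, List[str]]:
    """Returns (flagged, reasons); the human approver makes the actual call."""
    reasons = []
    if inv.po_reference is None:
        reasons.append("no PO reference")
    if inv.historical_anomaly:
        reasons.append("historical anomaly for this supplier")
    if inv.amount > 10_000:  # illustrative review threshold
        reasons.append("amount above review threshold")
    return bool(reasons), reasons
```

Swapping these rules for a model changes nothing about the workflow, the audit trail or the permission level; that separation is what makes the later scaling decision cheap.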
Frequently asked questions
Does a 90-day pilot need a large AI team? No. It needs three named roles (Decision Owner, Data Owner, Approver) and an implementation partner. Size is not the success factor — focus is.
What if our data is not clean? Then that is exactly the output of weeks 3–6 — and a valuable finding. A pilot that surfaces a data problem has already paid for itself.
Can we pilot several use cases in parallel? Possible, but not recommended. One pilot, one decision, one KPI. Parallelism dilutes the signal and collides with the DORA finding on unstable priorities.
What happens after day 90 if it scales? The pilot becomes a product: the same workflow with a broader scope, and hardened requirements for operations, monitoring and permissions.
Conclusion
An AI pilot rarely fails on the model. It fails on a missing decision, an unclear KPI, missing roles and a frame that never ends.
Ninety days, three roles, one KPI, one stop criterion and a clean contract beat any large AI strategy paper — because at the end there is a measurable business decision, not a demo.
Next step
You have a decision in mind but no weekly plan? Start with an AI readiness check. We define the KPI, the roles and the minimum context model, and design a controlled 90-day pilot with a clear stop criterion.
Sources
- Bitkom, Digitalisierung der Wirtschaft: Fast jedes Unternehmen beschäftigt sich mit KI (2026) — bitkom.org
- McKinsey, How digital manufacturing can escape 'pilot purgatory' — mckinsey.com
- DORA, Accelerate State of DevOps Report 2024 — dora.dev
- European Commission, AI Act — digital-strategy.ec.europa.eu