The AI marketing operations playbook that's built for reality

Xander Sebastian Published 10 May 2026

The honest version of an AI marketing operations playbook is not a list of fifty agents to deploy by Q4. It is a triage. There are roughly a hundred things a services firm could build into an AI-first marketing function in 2026, and ninety of them will quietly waste the team's time. The work is choosing which ten are worth the engineering, in which order, and what gets killed if the first one underperforms.

The frame underneath all of this: humans direct, AI executes. Strategy, taste, voice, the editorial call on what is worth shipping, the spend decision: these stay with people. The AI handles the production work inside rails the team has drawn. Every workflow that flips this posture, where the AI decides strategy and humans rubber-stamp, fails inside ninety days. The playbook below assumes the human-in-the-driver-seat frame throughout; if a project requires the AI to make strategic calls, it is on the wrong list.

We work mostly with services firms in the £500k to £20M revenue band, sweet spot £1M to £10M: professional services, agencies, B2B consultancies, and the occasional founder-led services business at the smaller end where the "head of marketing" is also the head of everything else. Across these engagements the same pattern keeps showing up. The firms that ship choose a strange-looking first project, and the firms that stall all choose the same wrong one. This piece is the playbook we actually run, ordered the way we actually run it.

It's opinionated, and the opinions are earned by counting which builds survived the first 90 days and which got quietly abandoned.

Key Takeaways

Start with a research and briefing pipeline, not a content engine
A good first AI project is narrow, measurable in days, and owned by one person
Human direction + AI execution is the posture that survives the first 90 days
The 90-day shape: briefing (weeks 1–3) → drafting (weeks 4–6) → distribution (weeks 7–9) → measurement (weeks 10–12)
Name the three failure modes before they arrive: brand voice drift, citation rot, tooling churn

The wrong first project (and why almost every team picks it)

Most marketing teams adopting AI start with content. They commission a "content engine": a prompt, a CMS integration, maybe a Notion database, and they ask it to draft blog posts at 5x the previous cadence. It looks like the obvious win because content is the thing the team wants more of.

It's the wrong place to start. Three reasons.

First, content quality is the hardest signal to measure. The team will only know a content engine is working three to six months after launch, when ranking, citations, and pipeline begin to move. By then the team has lost confidence in the project. They can't tell whether the drafts are good or simply more numerous, so the editor pulls everything back into manual mode and the engine becomes shelfware.

Second, content engines touch the most people. They sit between strategy, SEO, design, and sales enablement. Every shipped piece needs strategic sign-off, SEO input, design, and sales-enablement context. At this revenue band that's either three or four people, or one person wearing all the hats. Either way, a "first AI project" that requires that many cycles to align isn't a first AI project. It's a transformation programme dressed up as one.

Third, by the time the engine is producing usable drafts, the team has internalised one lesson: AI is unreliable, slow to integrate, and politically expensive. That lesson sticks, and it kills the next eight projects on the roadmap.

A good first project is the opposite of all of this: narrow, measurable in days not months, and owned by one person.

What we build first

The first build, almost always, is a research and briefing pipeline that compresses 4 to 6 hours of pre-work into 25 to 40 minutes. The shape of it:

A marketer pastes a topic, an ICP description, and three to five competitor URLs into an interface. Twenty-five minutes later they have a content brief: the search intent, the angle, the structure, a list of internal entities the piece should cover, the live SERP, the AI Overview citation pattern, and a draft headline set. We usually run this as a Claude Skill, sometimes as a Notion form wired to a webhook, sometimes as a small internal tool. The form factor matters less than the shape of the workflow.
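The "one input, one output" shape can be written down as two small data types before any model is involved. This is a minimal sketch, not the form we ship: the class names, field names, and the `validate` and `missing_sections` helpers are all hypothetical, chosen only to mirror the elements listed above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BriefRequest:
    """What the marketer pastes in. Field names are illustrative assumptions."""
    topic: str
    icp_description: str
    competitor_urls: tuple[str, ...]

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is usable."""
        problems = []
        if not self.topic.strip():
            problems.append("topic is empty")
        if not self.icp_description.strip():
            problems.append("ICP description is empty")
        if not 3 <= len(self.competitor_urls) <= 5:
            problems.append("expected three to five competitor URLs")
        return problems


@dataclass
class ContentBrief:
    """The single output: one field per element of the brief described above."""
    search_intent: str = ""
    angle: str = ""
    structure: list[str] = field(default_factory=list)
    entities_to_cover: list[str] = field(default_factory=list)
    live_serp: list[str] = field(default_factory=list)
    ai_overview_citations: list[str] = field(default_factory=list)
    draft_headlines: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Names of fields still empty, so an editor can see gaps at a glance."""
        return [name for name, value in vars(self).items() if not value]
```

Writing the types first forces the team to agree on what a finished brief contains, whichever form factor ends up producing it.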

This works as a first project because it meets four conditions:

It is narrow. One input, one output, one workflow.
It is measurable on Friday. The team can count briefs produced, time per brief, and the ratio of briefs that survive editorial without major rework. Within two weeks they know whether the system pays for itself.
It is owned by one person. Usually the senior content marketer or head of marketing, and at the smaller end of the band, often the founder. No cross-functional coordination required.
It teaches the team how to specify AI workflows. The skill of writing a Claude Skill (or equivalent specification document) transfers to every subsequent build. Spending the first six weeks on this skill is the highest-impact thing a services firm's marketing function can do in year one. It is also the moment the team learns the human-direction posture: writing rails the AI runs inside, rather than asking the AI for an opinion.
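The Friday count needs nothing more than a log of briefs. The sketch below assumes each brief is recorded with the minutes it took and whether it survived editorial without major rework. The `BriefRecord` type and the 300-minute manual baseline (a midpoint of the 4 to 6 hours quoted above) are assumptions for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class BriefRecord:
    minutes_spent: float
    survived_editorial: bool  # True if the brief passed without major rework


def friday_metrics(records: list[BriefRecord], manual_baseline_minutes: float = 300.0) -> dict:
    """Summarise a week of briefs.

    The 300-minute default is an assumed midpoint of 4 to 6 hours of manual
    pre-work; replace it with the team's own measured baseline.
    """
    if not records:
        return {"briefs": 0, "mean_minutes": None, "survival_ratio": None, "hours_saved": 0.0}
    survived = sum(r.survived_editorial for r in records)
    saved_minutes = sum(manual_baseline_minutes - r.minutes_spent for r in records)
    return {
        "briefs": len(records),
        "mean_minutes": round(mean(r.minutes_spent for r in records), 1),
        "survival_ratio": round(survived / len(records), 2),
        "hours_saved": round(saved_minutes / 60, 1),
    }
```

If the survival ratio stays low after two weeks, the specification needs work before anything else is built on top of it.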

The brief pipeline is unglamorous. It doesn't look like the AI marketing future the vendor decks promised. That is precisely why it works.

What NOT to build, in priority order

The "do not build" list is at least as important as the build list. If a team avoids the following five projects in their first six months, their odds of shipping a real AI marketing function go up substantially.

1. A fully autonomous campaign manager. Every quarter someone asks us to wire a frontier model into the ad accounts so it can adjust budgets, pause underperformers, and write new creative on its own. Nobody who has run this in production for a real spend account thinks it works yet. The failure modes are nasty. The model misreads a seasonality dip as a creative problem, kills a campaign that was about to recover, and the team finds out on Monday. Strategy and budget stay with the human. The AI doesn't decide spend; it executes the call the human has made.

Any workflow where AI makes spend decisions autonomously is high-risk. Budget pauses and creative swaps should require human approval, even when the automation is otherwise running end-to-end.
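In code, the approval rule is a gate that sits between what the AI proposes and what actually runs. A minimal sketch follows. The action names, the `ProposedAction` type, and the `route` function are hypothetical and not tied to any ad platform's API.

```python
from dataclasses import dataclass
from typing import Optional

# Action kinds that touch spend or live creative. These names are hypothetical.
NEEDS_HUMAN_APPROVAL = {"adjust_budget", "pause_campaign", "swap_creative"}


@dataclass(frozen=True)
class ProposedAction:
    kind: str
    campaign: str
    reason: str


def route(action: ProposedAction, approved_by: Optional[str] = None) -> str:
    """Decide whether an AI-proposed action may run.

    Spend-affecting actions are held until a named human approves them;
    everything else runs automatically.
    """
    if action.kind in NEEDS_HUMAN_APPROVAL and not approved_by:
        return "held_for_approval"
    return "execute"
```

The point of the gate is that the model can still propose a pause, with its reasoning attached, but the decision is recorded against a person.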

2. A multi-agent "marketing department in a box." The marketing-multi-agent-system framework is genuinely interesting research. It isn't yet a product to run against a live services-firm pipeline. The orchestration overhead, the debugging cost, and the cascading hallucination risk all compound. Build single-purpose agents and chain them later, manually, when each link has earned its place. Keep the human as the orchestrator, not another agent in the graph.

3. A real-time chat agent on the homepage. These look impressive in a demo. In our experience the conversion lift from a homepage chat agent in B2B services is small, the brand-voice risk is large, and the ongoing tuning eats more attention than the leads it produces. Revisit in 18 months.

4. An "AI for personalisation" workstream. Personalisation is the consultant-deck answer to AI marketing. The honest answer: most services firms between £500k and £20M don't have the data infrastructure to personalise meaningfully, and AI doesn't fix that. Build the data layer first. Most firms in this band then find they don't actually need personalisation.

5. A custom-trained model on your brand voice. Fine-tuning is rarely the right tool for marketing brand voice in 2026. A well-written voice document, a small set of reference examples, and a frontier model with prompt caching will outperform a fine-tune for less money and more flexibility. We haven't seen a case in the last year where a fine-tune was the right answer for a services firm of this size.

The 90-day shape

The timeline that survives contact with reality looks roughly like this.

Weeks 1 to 3. Build the research and briefing pipeline. One person, one workflow, daily use. The marketing team learns to write a Claude Skill (or equivalent specification document) with enough rigour that the output is consistent. By week 3 the team knows what "good" looks like for AI workflow specification, and that knowledge becomes the foundation for everything after.

Weeks 4 to 6. Add the second workflow: usually a content drafting pipeline that takes the brief and produces a first draft against a brand-voice rubric. Crucially, the drafter is scored against the same brief specification format the team learned in weeks 1 to 3. The editor still edits. The AI isn't deciding what's good; it's drafting against rails the team has set, and the editorial call stays with a person.

Weeks 7 to 9. Add the distribution layer. Take a published piece and produce LinkedIn, newsletter, and Reddit-friendly variants from it. This is one of the workflows where the cost-benefit math is the cleanest, because distribution work is high-volume and low-judgement.

Weeks 10 to 12. First measurement. By week 12 the team has produced enough output to begin measuring against the previous baseline. The honest question is not "is the output better" but "are we producing more, of comparable quality, for less money, with the same headcount?" If the answer is yes, the function has earned its next sprint. If not, the team stops.
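The week-12 question can be made explicit as a four-part test against the baseline. This sketch rests on two assumptions not stated above: that "comparable quality" is approximated by the editorial pass rate within a small tolerance, and that "less money" means cost per shipped piece. The `Quarter` type and `earned_next_sprint` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quarter:
    pieces_shipped: int
    editorial_pass_rate: float  # share accepted without major rework; a proxy for quality
    total_cost: float
    headcount: int

    @property
    def cost_per_piece(self) -> float:
        return self.total_cost / self.pieces_shipped


def earned_next_sprint(baseline: Quarter, current: Quarter, quality_tolerance: float = 0.05) -> bool:
    """True only if all four parts of the week-12 question hold."""
    return (
        current.pieces_shipped > baseline.pieces_shipped                     # producing more
        and current.editorial_pass_rate
        >= baseline.editorial_pass_rate - quality_tolerance                  # comparable quality
        and current.cost_per_piece < baseline.cost_per_piece                 # for less money
        and current.headcount == baseline.headcount                          # same headcount
    )
```

Agreeing on the tolerance and the cost definition in week 1, rather than week 12, keeps the stop decision from becoming a negotiation.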

The 90-day shape is deliberately conservative. We have run faster timelines and we have run slower ones. The teams that try to compress this into 30 days almost always over-build the wrong thing in week one and unwind by week three. The teams that stretch it past six months lose the political tailwind that made the project possible to begin with.

Where AI breaks (and why every playbook should name this)

Three failure modes show up in every implementation, and any honest playbook names them.

Brand voice drift. The drafter slowly converges on AI-default formatting unless someone is actively pulling it back. Over-bulleted prose, dead-phrase headers, perfectly balanced tri-bullets, dashes used like commas. The fix is mechanical: a rubric document, a check stage, and a willingness to reject drafts that fail it.
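Part of the check stage can be automated with plain pattern matching before an editor sees the draft. The sketch below is a crude illustration, not a complete rubric: the dead-phrase list is a placeholder, the 40% bullet threshold is an assumed default, and "a run of exactly three bullets" is only a rough proxy for a balanced tri-bullet.

```python
import re

# Placeholder dead phrases; a real rubric would list the team's own.
DEAD_PHRASES = ("in today's fast-paced world", "unlock the power of", "game-changer")
BULLET = re.compile(r"^\s*[-*]\s+")
# An em-dash (U+2014) or a spaced hyphen sitting between two words.
DASH_AS_COMMA = re.compile(r"\w\s?\u2014\s?\w|\w - \w")


def check_draft(text: str, max_bullet_share: float = 0.4) -> list[str]:
    """Return the rubric failures found in a draft; an empty list means it passes."""
    failures = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    bullets = [ln for ln in lines if BULLET.match(ln)]

    if lines and len(bullets) / len(lines) > max_bullet_share:
        failures.append("over-bulleted")

    lowered = text.lower()
    if any(phrase in lowered for phrase in DEAD_PHRASES):
        failures.append("dead phrase")

    if DASH_AS_COMMA.search(text):
        failures.append("dash used as comma")

    # Tri-bullets: a run of exactly three consecutive bullet lines.
    run = 0
    for ln in lines + [""]:  # the empty sentinel closes a run at the end of the draft
        if BULLET.match(ln):
            run += 1
            continue
        if run == 3:
            failures.append("tri-bullet block")
            break
        run = 0
    return failures
```

A non-empty result sends the draft back to the drafter rather than on to the editor, which is what makes the rejection gate cheap to enforce.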

Citation rot. The model is confident and wrong about specific numbers. Statistics, dates, attribution windows, and tool capabilities all need a verification step before they go to press. We treat any specific number in a draft as untrusted until a human checks it against a primary source. It isn't negotiable.
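Treating every number as untrusted is easy to enforce mechanically: pull each sentence that contains a figure into a checklist, marked unverified. A minimal sketch, assuming a deliberately broad regular expression for figures and a naive sentence splitter; the `unverified_claims` function is hypothetical.

```python
import re

# Matches figures such as 40, 3.5, 1,200, 25% and amounts prefixed with a currency sign.
# Deliberately broad: a false positive costs a glance, a miss costs a correction.
NUMBER = re.compile(r"[£$€]?\d+(?:,\d{3})*(?:\.\d+)?%?")


def unverified_claims(draft: str) -> list[dict]:
    """List every sentence containing a figure, marked unverified until a human signs it off."""
    claims = []
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    for sentence in sentences:
        figures = NUMBER.findall(sentence)
        if figures:
            claims.append({"sentence": sentence, "figures": figures, "verified": False})
    return claims
```

The draft does not go to press until every entry has been checked against a primary source and flipped to verified by a person.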

Tooling churn. The model and tooling landscape moves faster than internal training. A workflow built against one model version in March is not automatically optimal against the next one in May. Budget engineering time for re-tuning. We re-audit every workflow every 90 days.

A team that names these three failure modes in advance is unsurprised when they happen, and they happen in every implementation.

Run a 90-day tooling audit as a standing calendar event. The question isn't "is this still working?" but "is this still the best way to do this given what's changed in the last quarter?"
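If the calendar event gets ignored, a small script can list which workflows are overdue. This sketch assumes the team keeps a simple record of when each workflow was last audited; the `audits_due` function and the workflow names used with it are hypothetical.

```python
from datetime import date, timedelta

AUDIT_INTERVAL = timedelta(days=90)


def audits_due(last_audited: dict[str, date], today: date) -> list[str]:
    """Return workflow names whose last audit is 90 or more days old, oldest first."""
    overdue = [
        (when, name)
        for name, when in last_audited.items()
        if today - when >= AUDIT_INTERVAL
    ]
    return [name for when, name in sorted(overdue)]
```

Sorting oldest first puts the workflow most likely to have drifted at the top of the list.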

Decision criteria, not a recipe

This playbook isn't a recipe. We don't believe there's a single AI marketing operations playbook that works identically across professional services, SaaS, and agencies. There are universal principles (start narrow, measure in days or weeks not months, name the failure modes) and there are firm-specific decisions (which workflow first, which model, which tools, which review cadence).

The decision criteria worth keeping handy:

If a project requires more than two functions to align, it isn't a first project.
If a project's success can only be measured in months, it isn't a first project.
If the team can't tell whether the output is good without three reviewers, the brief is wrong, not the model.
If a workflow isn't used daily by week three, kill it.
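The four criteria are simple enough to run as a checklist over the whole roadmap. A sketch under stated assumptions: "measured in months" is read here as more than 30 days to a first signal, and the `Project` fields and `triage` function are hypothetical names.

```python
from dataclasses import dataclass

SIGNAL_LIMIT_DAYS = 30  # assumed cut-off for "only measurable in months"


@dataclass(frozen=True)
class Project:
    name: str
    functions_to_align: int      # teams that must sign off
    days_to_first_signal: int    # how long until success is measurable
    reviewers_needed: int        # people needed to judge one output
    used_daily_by_week_three: bool = True


def triage(p: Project) -> list[str]:
    """Apply the four criteria in order and return every verdict that fires."""
    verdicts = []
    if p.functions_to_align > 2:
        verdicts.append("not a first project: too many functions to align")
    if p.days_to_first_signal > SIGNAL_LIMIT_DAYS:
        verdicts.append("not a first project: success only measurable in months")
    if p.reviewers_needed >= 3:
        verdicts.append("fix the brief, not the model")
    if not p.used_daily_by_week_three:
        verdicts.append("kill it: not in daily use by week three")
    return verdicts
```

An empty list does not mean the project should be built, only that none of the four disqualifiers applies.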

Most of what's on the AI marketing roadmap of any £500k to £20M services firm in 2026 shouldn't be built this year. The playbook is mostly about choosing well from a long list and having the discipline to leave most of that list alone. It also takes discipline to keep the human in the strategist seat while the AI takes the executor seat, not the other way around.

We're happy to be wrong on any specific call here. The general shape (start with research and briefing, avoid autonomous spend, keep humans on strategy and AI on execution, name the failure modes, measure on Friday) has held across every services firm we've worked with, and we haven't yet found a counter-example.

Frequently asked

What's the best first AI project for a services firm? A research and briefing pipeline that compresses 4–6 hours of pre-work into 25–40 minutes. It is narrow (one input, one output), measurable within days, owned by one person, and teaches the team how to specify AI workflows correctly, a skill that transfers to every subsequent build.
Why is a content engine the wrong first project? Content quality takes 3–6 months to show up in rankings, so the team can't tell if it's working. Content engines also require cross-functional coordination (strategy, SEO, design, sales enablement), making them transformation programmes disguised as first projects. The early struggle teaches teams that AI is unreliable, which kills downstream adoption.
What does the 90-day AI marketing build look like? Weeks 1–3: research and briefing pipeline (one person, daily use). Weeks 4–6: content drafting pipeline against a brand-voice rubric. Weeks 7–9: distribution layer (LinkedIn, newsletter, and Reddit variants from published pieces). Weeks 10–12: first measurement against the previous baseline.
What are the three AI marketing failure modes to name in advance? Brand voice drift (the model converges on AI-default formatting without an active rubric and rejection gate), citation rot (the model is confident and wrong about specific numbers; every stat needs human verification against a primary source), and tooling churn (workflows built against one model version aren't automatically optimal against the next; re-audit every 90 days).
Should we build a fully autonomous campaign manager? No. The failure modes are severe: the model misreads a seasonality dip as a creative problem, pauses a campaign that was about to recover, and the team finds out on Monday. Strategy and spend decisions stay with humans. AI executes the call a human has made, not the other way around.
How do we stop AI from drifting into generic, over-bulleted content? A rubric document that names the specific failure modes (dead-phrase headers, perfectly balanced tri-bullets, dashes used as commas), a dedicated check stage in the workflow, and a standing commitment to reject drafts that fail it. The fix is mechanical, not prompting harder.
