Stage 03 · AI as an Operating Layer

A · Landscape

The 2026 toolkit

Agents you can actually deploy

Pick the surface that matches the work

Tool	Best for
Claude Code	Terminal-native agent. Writes, runs, ships code. Best for eng/data/repos.
Claude in Chrome	Browser agent. Navigates web apps, fills forms, takes actions in SaaS.
Cowork	Desktop agent for non-devs. Manages files and tasks across apps.
ChatGPT Agent	Browser actions, scheduled runs, deep research over 5–30 min.
Gemini Agents	Acts inside Gmail, Docs, Sheets. Pulls Drive context natively.

The Playbook

Build & pilot in 5 steps

01 · PICK NARROW: One workflow, <10 steps. Runs 5+ times a week. Owner can describe it in 1 page. Low stakes if it goes wrong.

02 · MAP THE STEPS: Shadow the human. Document every input, output, judgment call. Note what's automatable vs. what's not.

03 · SCOPE TIGHT: Permissions matter. Read-only first. Write actions need explicit approval. Dedicated credentials, never admin.

04 · PILOT 2 WEEKS: One owner, daily standup. Measure cycle time, error rate, sentiment. Compare to baseline.

05 · DECIDE: Scale, kill, or harden. Day 14: written decision. Document either way. Share the post-mortem.
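
Step 04's "compare to baseline" can be sketched as a small calculation. This is a minimal illustration, assuming you record cycle time and a reviewer's error flag for each run; the `Run` type, the function names, and the numbers are hypothetical, not measurements.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    minutes: float   # cycle time for one run of the workflow
    had_error: bool  # True if a reviewer found a mistake in the output


def summarize(runs: list[Run]) -> dict[str, float]:
    """Average cycle time and error rate for a set of runs."""
    return {
        "avg_minutes": mean(r.minutes for r in runs),
        "error_rate": sum(r.had_error for r in runs) / len(runs),
    }


def compare(baseline: list[Run], pilot: list[Run]) -> dict[str, float]:
    """Pilot minus baseline; negative numbers mean the pilot improved."""
    b, p = summarize(baseline), summarize(pilot)
    return {key: p[key] - b[key] for key in b}


# Illustrative numbers only.
baseline = [Run(30, False), Run(34, True), Run(26, False), Run(30, False)]
pilot = [Run(6, False), Run(8, False), Run(4, True), Run(6, False)]
delta = compare(baseline, pilot)
print(delta)
```

Sentiment is left out here because it is usually gathered by asking the owner, not computed.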

B · Prescriptive

See a real agent. Pilot your first.

14-day commitment

SEE Pre-Meeting Brief Agent

A starter agent that actually works.

Trigger: 60 minutes before any external meeting on the calendar.

What it does:

Workflow
1. Read calendar invite. Identify external attendees.
2. For each, web search for recent news (last 90 days).
3. Check LinkedIn for role/company changes.
4. Pull last 3 email threads from inbox.
5. Synthesize: who, why they matter, talking points, 2 questions to ask.
6. Deliver as a 1-page brief in Slack DM.

What it can't touch: No outbound communications. No CRM writes. No scheduling. Read-only across the board.
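
The trigger and the first workflow step can be sketched in a few lines. This is a minimal illustration, assuming attendees arrive as plain email addresses and that "external" means "not on your own domain"; `COMPANY_DOMAIN` and both function names are hypothetical, and no calendar API is called.

```python
from datetime import datetime, timedelta

COMPANY_DOMAIN = "example.com"  # assumption: your own email domain
LEAD_TIME = timedelta(minutes=60)


def external_attendees(attendees: list[str]) -> list[str]:
    """Step 1: keep only attendees outside the company domain."""
    suffix = "@" + COMPANY_DOMAIN
    return [a for a in attendees if not a.lower().endswith(suffix)]


def brief_is_due(meeting_start: datetime, now: datetime) -> bool:
    """Trigger: true once the meeting is 60 minutes away or less, and not yet started."""
    return timedelta(0) <= meeting_start - now <= LEAD_TIME


invite = ["ana@example.com", "raj@client.io"]
print(external_attendees(invite))
```

Both functions only read data, which matches the read-only scope above.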

What it saves: 30 min of manual prep per external meeting. At roughly 12 external meetings a week, that is 6 hours/week back.

DO Pilot one in 14 days

Your 2-week pilot plan.

Day 1: Pick the workflow. Write the 1-page scope (trigger, inputs, steps, outputs, what it can't touch).
Day 2-3: Build it. Use Cowork, Claude in Chrome, or ChatGPT Agent. Start read-only.
Day 4-10: Run it daily. Review every output. Note errors, log fixes.
Day 11-13: Tighten. Add the write actions you trust. Keep the rest read-only.
Day 14: Decide. Scale to the team, kill it, or harden for production. Write the post-mortem either way.
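
The Day 1 scope can also be kept as structured data so it is easy to check before anyone builds. This is one possible shape, not a required format; the `PilotScope` class and its `problems` checks are hypothetical and simply restate rules from this stage (under 10 steps, a written "can't touch" list, read-only first).

```python
from dataclasses import dataclass, field


@dataclass
class PilotScope:
    trigger: str
    inputs: list[str]
    steps: list[str]
    outputs: list[str]
    cannot_touch: list[str]
    write_actions: list[str] = field(default_factory=list)  # empty means read-only

    def problems(self) -> list[str]:
        """Return reasons this scope is not ready to build; empty means go."""
        issues = []
        if len(self.steps) >= 10:
            issues.append("10 or more steps; pick a narrower workflow")
        if not self.cannot_touch:
            issues.append("no 'cannot touch' list; write it before building")
        if self.write_actions:
            issues.append("write actions present; start read-only")
        return issues


brief_scope = PilotScope(
    trigger="60 minutes before any external meeting",
    inputs=["calendar invite", "web search", "LinkedIn", "inbox"],
    steps=["read invite", "search news", "check LinkedIn",
           "pull email threads", "synthesize", "deliver brief"],
    outputs=["1-page brief in Slack DM"],
    cannot_touch=["outbound communications", "CRM writes", "scheduling"],
)
print(brief_scope.problems())
```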

The point isn't the agent. The point is the muscle of piloting one. Once you've done one, the next ten are 10x faster.

C · Recipes

6 workflows worth piloting

Ranked from easiest to hardest. Start at the top. Don't skip ahead until each one is real.

Easy · Read-only

Pre-Meeting Brief

SCOPE

Trigger: 60 min before external meetings. Pull attendee context (news, LinkedIn, email history). Deliver 1-page brief via Slack DM. Read-only, no outbound actions.

ROI

6 hours/week back. Better-prepared meetings.

Easy · Read-only

Competitive Intel

SCOPE

Weekly: scan named competitors' news, blog posts, hiring, pricing pages, podcast appearances. Summarize material moves. Deliver as Monday-morning digest. Read-only.

ROI

Strategic awareness without a full-time analyst.

Medium

Lead Enrichment

SCOPE

New CRM leads: enrich with company size, funding, tech stack, role context. Score against ICP. Write enrichment to CRM. Surface top 10 daily.

ROI

Sales focuses on top 10, not all 200.
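
"Score against ICP" and "surface top 10" can be sketched as a scoring function plus a sort. The criteria and weights below are invented for illustration; your ICP will differ, and the field names (`employees`, `funding_stage`, `tech_stack`) are assumptions about what enrichment returns.

```python
def icp_score(lead: dict) -> int:
    """Hypothetical ICP: mid-size company, early-growth funding, uses a target tool."""
    score = 0
    if 50 <= lead["employees"] <= 1000:
        score += 2
    if lead["funding_stage"] in {"Series A", "Series B"}:
        score += 1
    if "salesforce" in lead["tech_stack"]:
        score += 1
    return score


def top_leads(leads: list[dict], n: int = 10) -> list[dict]:
    """Highest-scoring leads first, capped at n."""
    return sorted(leads, key=icp_score, reverse=True)[:n]


leads = [
    {"name": "A", "employees": 200, "funding_stage": "Series A", "tech_stack": ["salesforce"]},
    {"name": "B", "employees": 5000, "funding_stage": "Seed", "tech_stack": []},
    {"name": "C", "employees": 80, "funding_stage": "Seed", "tech_stack": ["hubspot"]},
]
print([lead["name"] for lead in top_leads(leads, n=2)])
```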

Medium

Month-End Close Prep

SCOPE

Day -3: pull subledgers, run reconciliation checks, flag variances, draft month-end narrative. Human approves before close. No journal entries.

ROI

Close cycle 2 days faster. Fewer Friday-night closes.
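
A reconciliation check that flags variances might look like the sketch below. It assumes balances are already exported as account-to-amount mappings; the function name, the account names, and the one-cent tolerance are hypothetical. It only reports differences and never posts entries, in line with the scope.

```python
from decimal import Decimal


def flag_variances(
    subledger: dict[str, Decimal],
    general_ledger: dict[str, Decimal],
    tolerance: Decimal = Decimal("0.01"),
) -> dict[str, Decimal]:
    """Return subledger-minus-GL differences that exceed the tolerance."""
    flagged = {}
    for account in sorted(subledger.keys() | general_ledger.keys()):
        diff = subledger.get(account, Decimal("0")) - general_ledger.get(account, Decimal("0"))
        if abs(diff) > tolerance:
            flagged[account] = diff
    return flagged


subledger = {"AR": Decimal("1000.00"), "AP": Decimal("250.00")}
general_ledger = {"AR": Decimal("1000.00"), "AP": Decimal("245.50"), "Cash": Decimal("10.00")}
flagged = flag_variances(subledger, general_ledger)
print(flagged)
```

`Decimal` is used instead of floats so that cent-level comparisons stay exact.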

Hard

Customer Health Monitor

SCOPE

Daily scan of usage, support tickets, NPS, exec communication. Flag accounts trending down. Draft proactive outreach (human sends). Never sends directly.

ROI

Catches churn risks 2-3 weeks earlier.
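
"Trending down" needs a concrete definition before an agent can flag it. One simple possibility, shown here for usage only, compares the latest two weeks with the two before; the function name and the 20% threshold are assumptions, and a real monitor would weigh tickets, NPS, and exec communication as well.

```python
def trending_down(weekly_usage: list[float], drop_threshold: float = 0.2) -> bool:
    """True if the latest two-week average fell by more than drop_threshold
    relative to the two weeks before it."""
    if len(weekly_usage) < 4:
        return False  # not enough history to judge
    earlier = sum(weekly_usage[-4:-2]) / 2
    recent = sum(weekly_usage[-2:]) / 2
    if earlier == 0:
        return False  # no prior usage to compare against
    return (earlier - recent) / earlier > drop_threshold


print(trending_down([100, 100, 80, 60]))
print(trending_down([100, 100, 95, 90]))
```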

Hard

Contract Review

SCOPE

Incoming MSAs/NDAs: compare to our playbook. Flag non-standard clauses. Score risk. Draft redlines. Human approves all changes. Legal owns final review.

ROI

Cycle time on standard contracts: days to hours.

D · Discipline

Watch-outs & what stays human

Watch-outs

Where pilots go wrong

The trap: Agent works well. Team wants it to do more. You expand scope without re-scoping permissions.

The fix: Every new capability requires explicit permission expansion, signed off by the owner. Dedicated service account. Never admin. Audit log on.
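
The fix above amounts to an allowlist that only grows with a named sign-off. A minimal sketch, assuming actions are identified by simple strings; `ScopedAgent`, `PermissionDenied`, and the action names are hypothetical and not part of any agent product.

```python
class PermissionDenied(Exception):
    """Raised when the agent attempts an action outside its approved scope."""


class ScopedAgent:
    def __init__(self, allowed_actions: set[str]):
        self._allowed = set(allowed_actions)
        self.approvals: list[tuple[str, str]] = []  # (action, approved_by)

    def expand(self, action: str, approved_by: str) -> None:
        """Add a capability only with an explicit owner sign-off."""
        if not approved_by:
            raise ValueError("permission expansion needs a named owner")
        self._allowed.add(action)
        self.approvals.append((action, approved_by))

    def perform(self, action: str) -> str:
        if action not in self._allowed:
            raise PermissionDenied(f"'{action}' is not in the approved scope")
        return f"performed {action}"


agent = ScopedAgent({"read_calendar", "read_email"})
print(agent.perform("read_calendar"))
try:
    agent.perform("write_crm")
except PermissionDenied as err:
    print(err)
```

Keeping the approvals list alongside the allowlist gives the audit a record of who expanded what.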

The trap: Agent ran. Output happened. No one knows what it touched.

The fix: Log every action, every input, every output. Daily review for the first month. Weekly thereafter.
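
Logging every action can be as simple as one JSON line per step. This sketch assumes inputs and outputs are JSON-serializable; the `log_action` helper and the example action names are hypothetical, and the in-memory buffer stands in for whatever file or log store you actually use.

```python
import io
import json
from datetime import datetime, timezone
from typing import Any, TextIO


def log_action(log: TextIO, action: str, inputs: Any, output: Any) -> None:
    """Append one JSON line per agent action so a reviewer can replay the run."""
    record = {
        "at": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "inputs": inputs,
        "output": output,
    }
    log.write(json.dumps(record) + "\n")


# In-memory log for illustration only.
log = io.StringIO()
log_action(log, "read_calendar", {"event_id": "evt-1"}, {"external_attendees": 2})
log_action(log, "web_search", {"query": "Acme Corp news"}, {"results": 5})

entries = [json.loads(line) for line in log.getvalue().splitlines()]
print([entry["action"] for entry in entries])
```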

The trap: Agent fails partially. Output looks plausible. No alarm goes off.

The fix: Build explicit success criteria into every run. If the criteria are not met, the agent fails loudly. Human review for the first 30 days regardless.
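
"Fails loudly" can be sketched as a check that raises instead of returning a partial result. The criteria below are taken from the Pre-Meeting Brief (attendees, talking points, 2 questions); the `check_brief` function, the `RunFailed` exception, and the dictionary keys are hypothetical.

```python
class RunFailed(Exception):
    """Raised when a run does not meet its written success criteria."""


def check_brief(brief: dict) -> None:
    """Raise RunFailed listing every unmet criterion; return None if all pass."""
    failures = []
    if not brief.get("attendees"):
        failures.append("no external attendees identified")
    if not brief.get("talking_points"):
        failures.append("no talking points")
    if len(brief.get("questions", [])) != 2:
        failures.append("expected exactly 2 questions")
    if failures:
        raise RunFailed("; ".join(failures))


good_brief = {
    "attendees": ["raj@client.io"],
    "talking_points": ["recent funding round"],
    "questions": ["What changed since Q3?", "Who else should join?"],
}
check_brief(good_brief)  # passes silently

try:
    check_brief({"attendees": ["raj@client.io"], "questions": []})
except RunFailed as err:
    print(f"run failed: {err}")
```

Raising an exception means a plausible-looking but incomplete brief never reaches Slack unnoticed.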

The trap: Worked in month 1. Quality slipped by month 4. No one noticed.

The fix: Quarterly recalibration. Sample outputs. Compare to baseline. The teammate from Stage 2 needed Friday iteration; the agent needs quarterly review.
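
A quarterly recalibration can be sketched as "draw a random sample, have a human mark errors, compare to the pilot's baseline." The function names, the sample size, and the 5-point tolerance are assumptions for illustration.

```python
import random


def sample_outputs(outputs: list[str], k: int, seed: int) -> list[str]:
    """Draw a reproducible random sample of past outputs for human review."""
    rng = random.Random(seed)
    return rng.sample(outputs, min(k, len(outputs)))


def quality_slipped(
    baseline_error_rate: float,
    review_found_error: list[bool],
    tolerance: float = 0.05,
) -> bool:
    """True if the sampled error rate exceeds baseline by more than the tolerance."""
    current = sum(review_found_error) / len(review_found_error)
    return current > baseline_error_rate + tolerance


past_outputs = [f"brief-{i}" for i in range(200)]
to_review = sample_outputs(past_outputs, k=20, seed=2026)
print(len(to_review))

# Suppose reviewers found errors in 4 of the 20 sampled outputs.
reviews = [True] * 4 + [False] * 16
print(quality_slipped(baseline_error_rate=0.05, review_found_error=reviews))
```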

Non-negotiable

What stays human

Always. No exceptions, no pilots.

Pricing: any change to what you charge
People: hiring, firing, comp, performance reviews
Legal: final contracts, settlements, regulatory filings
Customer escalations: anything labeled "urgent"
Public statements: press, social, investor relations
Financial transactions: anything that moves money
Irreversible deletes: data, accounts, records
Security responses: any incident, any breach

The principle: Agents handle the boring, repeated, low-stakes work. Humans handle the consequential, novel, or irreversible. Don't blur the line because the agent is "ready."

E · Self-check

Are you ready for Stage 3?

Answer honestly. If you can't say yes to most of these, go back to Stage 2 and build a few more teammates first.

If no: build more teammates first. The intuition for what AI can and can't do reliably comes from doing Stage 2 work. Skipping it is the #1 reason pilots fail.

If no: agents are bad investments for one-off work. The ROI comes from frequency. If the workflow doesn't recur, a Stage 2 teammate is the right answer.

If no: don't start. Pilots without owners drift. Owners without time produce theater, not learning.

If no: write it down before you build. The list of "never" is more important than the list of "can." Put it in the spec. Reference it in the audit.

If no: pick a different workflow. Pilots should be embarrassing if they fail, not catastrophic. Build the muscle on something boring before you try something consequential.

If no: don't start. Pilots that go indefinitely are zombie projects. The decision discipline is half the value of the exercise.

Do this

Try this quarter

Find your 4-hour workflow. One thing the team does 5+ times a week that costs at least 4 hours total. Boring beats glamorous.
Build a 2-week pilot. Read-only first. One owner. Daily standup. Written success criteria.
Decide on day 14. Scale, kill, or harden. Write the post-mortem. Tell your peers what you learned.
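
The "4-hour workflow" test is simple arithmetic: frequency times time per run. A minimal sketch, with a hypothetical function name and illustrative candidates:

```python
def qualifies(runs_per_week: int, minutes_per_run: float) -> bool:
    """True if the workflow runs 5+ times a week and costs at least 4 hours total."""
    weekly_hours = runs_per_week * minutes_per_run / 60
    return runs_per_week >= 5 and weekly_hours >= 4


candidates = {
    "pre-meeting prep": (12, 30),  # 12 runs x 30 min = 6 hours
    "weekly report": (1, 240),     # 4 hours, but only once a week
    "inbox triage": (5, 30),       # 5 runs x 30 min = 2.5 hours
}
shortlist = [name for name, (runs, minutes) in candidates.items() if qualifies(runs, minutes)]
print(shortlist)
```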

AI as an Operating Layer.
Let it run the workflow.
