AI Adoption SME Wiki

The shift from tools to operations

AI adoption is not about adding new software. Aaron Levie, chief executive of Box, describes a foundational transition: from read-only AI that helps workers retrieve or create information to read-write AI that autonomously executes complete workflow tasks. Chat interfaces and analysis tools are one stage; agents that can act inside your systems and approve decisions are another.

Because the shift is still early, organisational readiness is not optional. Levie places agents in 2024 at the stage cloud computing had reached in 2007: immature technology with no ceiling on enterprise impact once mature. The competitive value is real now, but the integration work is substantial. Upgrading IT systems, providing agents with business context, redesigning workflows around human-agent collaboration, and managing adoption are not afterthoughts. Both Anthropic and OpenAI have launched new initiatives to help enterprises deploy agents, signalling that deployment infrastructure is becoming a core market for the AI labs themselves.

For SMEs, the stakes are different from those of large enterprises. Smaller firms have historically lacked access to the specialist skills (legal expertise, sales-team scale, complex operations management) that larger competitors can afford. But Levie argues that agents could give companies of all sizes access to resources that were once available only to large organisations. The playbook, however, requires treating AI as an operational input, not a feature. That requires foundational work inside your organisation.

Making your company legible

AI systems cannot work with knowledge that is hidden. The foundational readiness condition is making your organisation's knowledge and workflows legible to agents—legible meaning stored, recorded, and accessible in a form the models can learn from and reason about.

The most concrete form this takes is recording. Tom Blomfield, a general partner at Y Combinator, argues that recorded company meetings become the system of record for unstructured knowledge. David Haber, an investor at Andreessen Horowitz, observes that AI agents learn company culture through participation in meetings with higher fidelity than through documentation, the same way new employees do. Haber predicts the default will flip from opt-in to recording-by-default with explicit exemptions for sensitive meetings, driven by competitive advantage. Verbal-culture companies like Shopify, where important context historically evaporated in conversation, will compound an advantage fastest.

Beyond recording, legibility means centralisation. Pete Koomen, a general partner at Y Combinator who built the firm's internal AI infrastructure, describes a shared architecture: one centralised database holding every company funded, founder relationship, financial transaction, and internal note. When agents can access this unified context, they can answer arbitrary business questions. Koomen reports that removing the back-and-forth cost prompted staff to ask far more complex questions.

Skill libraries are a third form of legibility. This section draws on Hiten Shah's published work on skill-driven AI adoption. Shah argues that SMEs should prioritise building a documented repository of the ways experienced staff actually approach repeated work, including procedures, judgment calls, edge cases, checklists, and rules of thumb. Agents are only useful when they understand more than the task itself; they need to understand the method behind it. Your company's AI advantage will come from the work it teaches the model to do well, rather than from the model it chooses, and that advantage accrues fastest for company-specific methods. Your skills are sitting "in old docs, Slack threads, customer calls, review rituals, onboarding notes, and the heads of the people who know how the work really gets done."

Queryability is the readiness condition Diana Hu, a Y Combinator partner, calls essential. She describes AI-native companies as running as closed-loop systems where every important action produces an artifact—a ticket, Slack thread, meeting recording, or customer feedback feed—that agents can learn from. This requires minimal DMs, embedded recording and note-taking across all channels, and centralised dashboards spanning revenue, sales, engineering, hiring, and operations. The payoff is striking: engineering sprints halved in duration and output roughly tripled at companies that exposed full context to agents using a single unified agent system.

None of this is a one-time project. Satya Nadella, chief executive of Microsoft, argues that organisations must build human capital and "token capital" (owned AI capability) as complementary, compounding assets. Competitive advantage comes from building agentic systems that learn from internal workflows and domain expertise, refine that learning in private evals and reinforcement loops, and retain control when swapping underlying models. This loop becomes the new intellectual property of the firm.

The verification bottleneck

Capability alone does not produce value. Researchers at MIT Sloan, led by Christian Catalini, argue that AI's productivity gains do not translate to economic value without reliable verification of its outputs. This is urgent because the gap is widening: code generation jumped from 4.4% to 71.7% accuracy on standard benchmarks in a year, but human verification bandwidth remains scarce and fixed.

Verification has become a value-creation function, separate from compliance. Firms that understand and underwrite AI risks will profit; those treating outputs as reliable without checking accumulate technical debt. The temptation to use AI to verify AI is strong, but it fails when both systems share the same assumptions and errors, creating false confidence.

A structural risk looms: as AI displaces entry-level and junior roles, it erodes the training ground where future senior staff develop verification skills. The researchers call this the "missing junior loop": the loss of the junior-level training that produced experienced auditors. Firms must scale automation only as fast as it can be trusted; competitive pressure to deploy unverified systems creates hidden risk and unobserved failures that can eventually cascade.

For an SME, this means building verification into your adoption strategy from the outset, not bolting it on after deployment. Readiness requires having staff with both the time and the judgment to audit agent outputs in your domain, and training a pipeline of people who will develop that judgment.

The data and access-control problem

Agents cannot be deployed safely when data permissions are tangled and unclear. Levie describes the bind as the "Bob and Sally problem": Bob has too much access, Sally has too little, and the agent either bounces off an entitlement wall or, worse, answers questions using data it should not have seen. Until companies clean up identity, permissions, and data scoping for every system the agent touches, agents cannot be deployed safely outside narrow use cases.
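The scoping step can be illustrated with a minimal sketch. It assumes each document carries a set of groups entitled to read it and that the agent's context is assembled only after filtering by the requesting user's groups. The Document type, the group names, and both functions are hypothetical and stand in for whatever identity system a company actually runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    allowed_groups: frozenset  # groups entitled to read this document
    text: str


def visible_documents(user_groups, documents):
    """Return only the documents the requesting user is entitled to read."""
    groups = frozenset(user_groups)
    return [d for d in documents if groups & d.allowed_groups]


def build_agent_context(user_groups, documents):
    """Assemble agent context from entitled documents only.

    Filtering happens before the model sees anything, so the agent cannot
    answer from data the user should not have seen.
    """
    return "\n".join(d.text for d in visible_documents(user_groups, documents))


# Hypothetical data: two documents with different entitlements.
DOCS = [
    Document("payroll-2024", frozenset({"finance"}), "Payroll summary"),
    Document("roadmap-q3", frozenset({"product", "finance"}), "Q3 roadmap"),
]

# Hypothetical users: Sally is in product only, Bob is in finance.
sally_context = build_agent_context({"product"}, DOCS)
bob_context = build_agent_context({"finance"}, DOCS)
```

The design choice worth noting is that the filter sits in front of the agent, not inside the prompt: an instruction telling the model to ignore restricted data is not an access control.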

The second problem is data definition and scope. Data is the bottleneck. Contracts live in five places, roadmaps in thirty, and there is no consistent definition of metrics like net retention or FX-adjusted growth. When only data scientists needed to answer those questions, humans could compensate by understanding context implicitly. But when every employee gets a connection to the system, bad definitions become a company-wide problem. The same applies to unstructured content across email, documents, and customer interactions.

This is why both Anthropic and OpenAI have launched new initiatives to help enterprises deploy agents: the implementation gap is real and large. Readiness requires upgrading IT systems, provisioning agents with business context, defining what data sources they can access, and deciding which decisions or actions they can take without human approval. Pedro Franceschi, CEO of the fintech company Brex, describes building CrabTrap, an HTTP proxy that audits all agent requests and uses LLMs as judges for ambiguous ones, achieving 98% auto-approval and 2% human review—a visible model of how to scale agent permission enforcement without blocking every transaction.
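The general pattern of auditing every request, clearing most of them automatically, and escalating only the ambiguous remainder can be sketched as follows. This is not CrabTrap's implementation, which the source does not describe in detail. The rule prefixes, the example URLs, and the stub_judge function are all assumptions made for illustration; a real judge would call a model where the stub uses a string check.

```python
from collections import Counter

# Hypothetical deterministic rules; real rules would be far richer.
ALLOW_PREFIXES = ("GET https://api.internal.example/",)
DENY_PREFIXES = ("DELETE ",)


def rule_verdict(request_line):
    """Deterministic first pass: 'allow', 'deny', or 'ambiguous'."""
    if request_line.startswith(DENY_PREFIXES):
        return "deny"
    if request_line.startswith(ALLOW_PREFIXES):
        return "allow"
    return "ambiguous"


def gate(request_line, judge, audit_log):
    """Audit every request; escalate what neither rules nor judge can clear."""
    verdict = rule_verdict(request_line)
    if verdict == "ambiguous":
        # judge is any callable returning 'allow', 'deny', or 'unsure'.
        judged = judge(request_line)
        verdict = judged if judged in ("allow", "deny") else "human_review"
    audit_log.append((request_line, verdict))
    return verdict


def stub_judge(request_line):
    """Stand-in for an LLM call, used here only to make the sketch runnable."""
    return "allow" if "/reports/" in request_line else "unsure"


log = []
requests = [
    "GET https://api.internal.example/customers",
    "POST https://vendor.example/reports/export",
    "POST https://vendor.example/payments",
    "DELETE https://api.internal.example/customers/7",
]
verdicts = [gate(r, stub_judge, log) for r in requests]
summary = Counter(verdicts)
```

Counting verdicts in the audit log is what lets a team report a figure such as the share of requests approved automatically versus sent to a person.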

For an SME, this is not trivial work. It may require overhauling how you store data, cleaning up user roles and entitlements, and defining what data agents can see in each workflow. It belongs on the readiness roadmap.

Organisational redesign, not integration

The most consequential readiness mistake is treating AI as a feature to bolt onto existing workflows. Real adoption requires asking: if you started today, how would you structure this process differently?

Pedro Franceschi, CEO of Brex, puts it starkly: the CEO must be the chief AI officer, because only they have organisation-wide context and authority to redesign core processes. Redesigning entire processes yields the largest gains. Brex redesigned Know Your Customer onboarding end-to-end and discovered that offering free KYC at the lead stage enabled risk-based funnel qualification—a gain that would not have appeared if the team had automated the existing process.

Blomfield describes the structural change this represents. Hierarchical companies designed for human information flow become redundant when organisational knowledge is legible to AI. Loops that self-improve transcend the coordination problem. To enable such loops, every important action must produce an artifact the AI can learn from: office hours, Slack, decisions, telemetry. Then, record everything, diarize it, aggregate it, and synthesise it into context the models can use.

The self-improving loop has five layers: sensors (data in), policy (what the AI can do without asking), tools (deterministic APIs), quality gate (checks and human review for high-risk actions), and learning (feedback and iteration). YC built a monitoring agent that watches queries, diagnoses failures, proposes fixes to tools and context, commits code, and deploys overnight. The next morning, the same query succeeds. That is not just a productivity gain; it is self-improvement.
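One way to picture how the five layers fit together is the small sketch below. It is an illustration under stated assumptions, not YC's system: the Loop class, the action names, and the lambda tools are hypothetical, and the learning layer here only records outcomes for later review instead of proposing fixes on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    auto_approved: set   # policy: actions the agent may take without asking
    tools: dict          # tools: deterministic callables keyed by action name
    feedback: list = field(default_factory=list)  # learning: recorded outcomes

    def run(self, event, human_approves):
        # Sensors: the event is the data coming in.
        action, payload = event["action"], event["payload"]
        # Quality gate: anything outside policy needs human sign-off.
        if action not in self.auto_approved and not human_approves(event):
            self.feedback.append((action, "rejected"))
            return None
        result = self.tools[action](payload)
        # Learning: record the outcome so tools and context can be refined.
        self.feedback.append((action, "done"))
        return result


# Hypothetical configuration: tagging is low risk, refunds need approval.
loop = Loop(
    auto_approved={"tag_ticket"},
    tools={
        "tag_ticket": lambda ticket: f"tagged:{ticket}",
        "refund": lambda ticket: f"refunded:{ticket}",
    },
)

def never_approve(event):
    return False

tagged = loop.run({"action": "tag_ticket", "payload": "T-1"}, never_approve)
refund = loop.run({"action": "refund", "payload": "T-2"}, never_approve)
```

In this sketch the tagging action runs without review while the refund is held back, and both outcomes land in the feedback list that a later improvement step could read.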

This requires flattening hierarchies. Middle management as a coordination layer is over. Every human must be an individual contributor; AI handles routing and decisioning. The constraint becomes token budget, not headcount; YC companies are seeing 5x revenue per employee as token spending scales.

For an SME, this does not mean overthrowing your org chart overnight. It means identifying a critical process (sales handoff, customer support ticket triage, contract review) and redesigning it as a human-agent loop where the agent handles legible work and humans handle judgment, exception, and escalation. Then iterate: as the agent improves and your staff learns what it can do, expand its scope and responsibility.

Diana Hu points out that startups have a structural advantage: no legacy systems, no entrenched org charts, no installed base to retrain. But SMEs can still move faster than large enterprises. The constraint is conviction. Founders must develop conviction on agent capability firsthand by sitting with coding agents until their priors on what is buildable break; outsourcing this judgment to another person defeats the purpose.

People and skills

AI adoption does not happen invisibly to your workforce. The readiness work includes building skills and managing changing roles.

From optional to required. Tobi Lütke, CEO of Shopify, declared in April 2025 that AI usage is no longer optional; reflexive use of AI as a thought partner, researcher, critic, or pair programmer is now an expectation built into performance reviews. His memo grounds this in Shopify's growth rate (20–40% year-on-year) and the argument that using AI well requires deliberate practice and learning by repeated use, not one-off experimentation. The practical changes include mandatory AI exploration in the prototype phase, formal feedback on AI usage in peer reviews, and a standing requirement that teams justify hiring requests by demonstrating why AI cannot do the work instead. This is readiness at the culture level: the organisation treats AI as a baseline competency, not a specialist tool.

New hyper-technical roles. The internal FDE (Forward Deployed Engineer) is now the highest-demand hire in enterprise tech. A hyper-technical person embedded in a business function maps workflows, wires agents, and manages the data, permissions, and skills the agent needs. This is permanent work, not a one-time setup: each model upgrade creates fresh work to capture gains or redeploy scaffolding. Most enterprises lack this talent and must hire or retrain. For SMEs, hiring a pure FDE may not be feasible, but identifying someone with both technical capability and domain knowledge, and investing in their development as an internal AI engineer, is increasingly essential.

Andrew Ng, who co-founded Coursera and led the founding of Google Brain, points out that AI Engineer roles will far outnumber FDE roles, and that demand is surging for generalist AI Engineers who combine LLM prompting, agentic frameworks, evals, and AI coding agents. Most companies will hire more of their own staff than they will accept embedded vendor engineers. The role will likely fragment into specialisations—LLMOps, Evals, AI Data, Harness Engineering—similar to how Software Engineering split into frontend, backend, mobile, and data engineering.

Avoiding lock-in. Ng also flags a readiness principle: tight vendor integration reduces flexibility while the best platform choice is still in flux. If you hire people skilled in Claude, Copilot, or a bespoke agentic framework, you are investing in their relationship with that vendor. Readiness means hiring people who can work across layers and models, and structuring your agent work to avoid deep coupling to any one vendor.

Sustained learning. Levie's broader observation is important: the AI job-doomers are wrong, for three structural reasons. First, knowledge work has an irreducible last-mile human loop: a lawyer must attest that an agent-drafted contract is valid. Second, when one engineer becomes five times more productive, firms take on five times the projects, not one-fifth the staff. Third, SMEs that could not afford a designer or marketer before can now hire the first one, because the agent handles bulk work and the human supervises. Readiness means building your workforce from the assumption that staff will shift to higher-value and higher-judgment work, and that you will need to hire people and retain them.

Token costs and budgets

AI agents cost money in a new and unfamiliar way. This is a readiness problem because it changes budgeting authority and approval flows.

Levie outlines the shift: token costs are breaking the subscription model and escaping IT budgets into line-of-business spending. A single agent run can cost $1,000, far above the $20-per-user-per-month ceiling that worked for chatbots. AI labs have pricing power because a 10-year hardware cycle compressed into 18 months has created a capacity squeeze, and frontier token prices are rising, not falling. The result: AI spending must flow out of the capped 3–7% IT budget and into line-of-business allocations (marketing, sales, operations), creating new friction between finance, IT, and business owners over compute spend.

Y Combinator's guidance is blunt: you have to be willing to spend $10,000 to $100,000 a year on tokens. But if you invest in skills and operate in an open way, you can leapfrog incumbents. Token budget, not headcount, becomes the constraint.

Franceschi's observation at Brex: token spend will become the largest company expense, so you need internal cost attribution by product, customer, and use case. Brex built a system to measure AI ROI and optimise spend. But the broader insight is this: founders should experiment aggressively at scale, because frugal budgets create false limits on what you can learn.

For an SME, readiness means building AI token spending into your financial model from the outset, allocating real budget (not a "pilot" amount that gets starved), and establishing cost attribution so you know which workflows, products, or customers the spending serves. This is not IT capital expenditure or training spend; it is ongoing operational expense.
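Cost attribution of this kind can start as a simple aggregation. The sketch below assumes each agent call is logged with a token count, a workflow, and a customer. The record fields and the price of 10.0 per million tokens are placeholders chosen for the example, not real vendor prices. The last line restates the comparison from the text above: one $1,000 agent run against a $20 per-user monthly subscription.

```python
from collections import defaultdict


def attribute_costs(usage_records, price_per_million_tokens):
    """Sum token spend by (workflow, customer)."""
    totals = defaultdict(float)
    for rec in usage_records:
        cost = rec["tokens"] / 1_000_000 * price_per_million_tokens
        totals[(rec["workflow"], rec["customer"])] += cost
    return dict(totals)


# Hypothetical usage log; field names are assumptions for this sketch.
usage = [
    {"workflow": "contract_review", "customer": "acme", "tokens": 2_000_000},
    {"workflow": "contract_review", "customer": "acme", "tokens": 500_000},
    {"workflow": "ticket_triage", "customer": "globex", "tokens": 1_000_000},
]

# Placeholder price, not a quote from any vendor.
costs = attribute_costs(usage, price_per_million_tokens=10.0)

# Using the figures quoted above: one $1,000 agent run expressed in
# months of a $20 per-user subscription.
run_cost_in_seat_months = 1000 / 20
```

Keying the totals by workflow and customer is one possible cut; the same loop works for product or use case by changing the key.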

Getting started

Readiness work can feel overwhelming because the scope is organisational. Where should an SME start?

Start with skills, not data access. Hiten Shah argues that SMEs should prioritise building a skill library—a documented repository of proven ways of working—as their foundational AI strategy, rather than focusing first on data. His reasoning: AI agents need to understand how a company does its work, not just have access to data. Without that understanding, agents can read all available information but still miss the shape of a decision. The most valuable skills are private and company-specific because the most valuable methods are specific to each organisation. SMEs should start by mapping repeated work where experienced people consistently outperform others—particularly tasks involving judgment rather than effort alone—then package those approaches as reusable skills.

This is concrete work. Your company already has these skills. They are sitting in old docs, Slack threads, customer calls, review rituals, onboarding notes, and the heads of the people who know how the work really gets done. Documenting them produces a foundation for learning loops and a starting point for agents.
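A skill entry can be as simple as a structured record that renders to plain text for an agent. The sketch below is one possible shape, assuming the fields named earlier in this wiki (procedures, judgment calls, edge cases, checklists). The Skill class, its field names, and the contract-review example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    when_to_use: str
    procedure: list
    judgment_calls: list = field(default_factory=list)
    edge_cases: list = field(default_factory=list)
    checklist: list = field(default_factory=list)
    sources: list = field(default_factory=list)  # where the method was found

    def to_prompt(self):
        """Render the skill as plain text an agent can be given as context."""
        lines = [
            f"Skill: {self.name}",
            f"When to use: {self.when_to_use}",
            "Procedure:",
        ]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.procedure, 1)]
        sections = (
            ("Judgment calls", self.judgment_calls),
            ("Edge cases", self.edge_cases),
            ("Checklist", self.checklist),
        )
        for title, items in sections:
            if items:  # skip sections nobody has documented yet
                lines.append(f"{title}:")
                lines += [f"  - {item}" for item in items]
        return "\n".join(lines)


# Hypothetical example entry.
review = Skill(
    name="Supplier contract review",
    when_to_use="Any new supplier agreement before signature",
    procedure=["Check payment terms", "Check termination clause"],
    judgment_calls=["Escalate if liability is uncapped"],
    sources=["onboarding notes", "legal review ritual"],
)
prompt_text = review.to_prompt()
```

Keeping a sources field makes it possible to trace each skill back to the document, thread, or person it came from when the method changes.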

Find the first outcome. Bob McGrew, who pioneered the FDE model at Palantir and later served as chief research officer at OpenAI, emphasises demo-ready solutions. The playbook is simple: identify a problem where agents could deliver a visible outcome (resolved customer ticket, shipped feature, closed lead), build a solution for it with real staff involved (not in isolation), measure the result, and then generalise from it. Outcome pricing, not per-seat or installation pricing, aligns incentives: startups sell the successful result of solving a problem, not just software. For SMEs, this means finding one workflow where an agent could deliver a measurable result, funding a first attempt (whether through internal engineering or with help from an FDE), and learning from it.

Own your learning loop. Nadella's framing is crucial: organisations must build human capital and token capital as complementary, compounding assets. A firm that encodes its workflows, domain knowledge, and judgment into agentic systems that improve continuously, while keeping the ability to swap out underlying models, will outpace competitors who treat AI as external vendor software. This means starting with the assumption that you will build and own your own learning loops, and hiring or training the people who will do that work.

There is an honest counterpoint. McGrew also observes that if you can avoid embedding external FDEs, you should. Bringing in external specialists to do on-site discovery and prototyping is expensive and creates dependency. It is valuable when you cannot staff the work internally, but it is not a permanent solution. Readiness means building the internal capacity to own the work.

Verify the value. Before scaling broadly, confirm that verification is feasible. The MIT Sloan researchers warn that companies must scale automation only as fast as it can be trusted. For your first workflow, build in explicit verification steps: someone with domain judgment checks the agent's output before it affects a customer or a contract. If verification becomes a bottleneck or the agent's error rate is unacceptable, the workflow is not ready to automate. Use that learning to iterate.
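The verification step for a first workflow can be sketched as a review pass that releases only checked outputs and reports whether the error rate is low enough to keep automating. The function name, the max_error_rate threshold, and the sample reviewer below are assumptions for illustration; in practice the reviewer is a person with domain judgment, not a string check.

```python
def review_batch(outputs, reviewer, max_error_rate):
    """Check every output and decide whether the workflow is ready to scale.

    reviewer is a callable returning True when an output is acceptable.
    """
    if not outputs:
        raise ValueError("nothing to review")
    approved = [o for o in outputs if reviewer(o)]
    error_rate = 1 - len(approved) / len(outputs)
    return {
        "released": approved,  # only checked outputs reach a customer
        "error_rate": error_rate,
        "ready_to_scale": error_rate <= max_error_rate,
    }


# Hypothetical batch of agent drafts; one still contains a placeholder.
drafts = [
    "Reply: refund issued",
    "Reply: TODO confirm address",
    "Reply: order shipped",
    "Reply: invoice attached",
]

def sample_reviewer(text):
    """Stand-in for a human check, used only to make the sketch runnable."""
    return "TODO" not in text

report = review_batch(drafts, sample_reviewer, max_error_rate=0.1)
```

Here one draft in four fails review, so the error rate is 0.25 and the workflow would not yet be considered ready to scale under the assumed 0.1 threshold.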