Building AI copilots that actually fit business workflows

Most AI integrations we've seen in production fall into one of two failure modes. Either they are demos — technically functional, impressive in a presentation, but not connected to any real business workflow. Or they are LLM wrappers — a chat interface on top of a language model, with no integration into the data the business actually operates on.

Neither is useful. A copilot that cannot access your business data cannot tell you anything about your business. A copilot that can access your data but cannot take actions cannot change anything. Closing the gap between an AI demo and an AI system is a full engineering project.

What a real copilot needs

—Data access: the ability to read from the business's actual data sources — databases, documents, APIs, spreadsheets
—Permission awareness: the copilot should only access data the current user is allowed to see
—Tool calling: the ability to take actions — create records, update states, trigger processes, send notifications
—Output validation: every response should be validated before it reaches the user or triggers an action
—Memory and context: the ability to maintain context across a conversation, and optionally across sessions
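
As a rough illustration of the last item, memory and context, here is a minimal Python sketch of a bounded conversation history that can optionally be saved and reloaded between sessions. The class name `ConversationMemory`, the `max_turns` limit, and the JSON file format are assumptions made for this example, not a description of any particular framework.

```python
import json
from collections import deque
from pathlib import Path


class ConversationMemory:
    """Keeps the most recent turns of a conversation (hypothetical helper)."""

    def __init__(self, max_turns: int = 20):
        # A deque with maxlen silently drops the oldest turn once full.
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        """Return the retained turns, oldest first, ready to send to a model."""
        return list(self.turns)

    def save(self, path: Path) -> None:
        """Persist the retained turns so a later session can resume."""
        path.write_text(json.dumps(self.context()))

    @classmethod
    def load(cls, path: Path, max_turns: int = 20) -> "ConversationMemory":
        """Rebuild memory from a saved file, or start empty if none exists."""
        memory = cls(max_turns)
        if path.exists():
            for turn in json.loads(path.read_text()):
                memory.add(turn["role"], turn["content"])
        return memory


memory = ConversationMemory(max_turns=2)
memory.add("user", "Show me open orders for Acme.")
memory.add("assistant", "Acme has 3 open orders.")
memory.add("user", "Which one is oldest?")
print(len(memory.context()))  # only the 2 most recent turns are kept
```

A fixed turn limit is the simplest policy. Summarizing older turns, or retrieving them on demand, are common alternatives when conversations run long.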

The permission problem

LLMs do not understand permissions. If you give an LLM access to your database, it will query whatever it can reach. This is fine in a demo with synthetic data. In a production system, it means a sales representative could query compensation data, or a junior analyst could access confidential reports.

Every copilot we build has a permission layer that mirrors the main application's permission system. When a user asks a question, the system first determines what data sources are accessible to that user, then constructs a query scoped to those sources. The LLM operates within a constrained data environment, not the full database.
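
The two steps described above (resolve the user's accessible sources, then scope the request to them) can be sketched roughly as follows. This is a simplified illustration, not our production code: the `User` type, the `SOURCE_ROLES` table, and the source and role names are all hypothetical, and a real system would read permissions from the main application's own permission store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset[str]


# Hypothetical mapping from data source to the roles allowed to read it.
# In a real system this would mirror the application's permission system.
SOURCE_ROLES: dict[str, frozenset[str]] = {
    "crm_accounts": frozenset({"sales", "analyst", "hr_admin"}),
    "sales_pipeline": frozenset({"sales", "analyst"}),
    "compensation": frozenset({"hr_admin"}),
}


def accessible_sources(user: User) -> list[str]:
    """Return the data sources this user may read, in a stable order."""
    return sorted(
        source
        for source, allowed in SOURCE_ROLES.items()
        if user.roles & allowed  # non-empty intersection means access
    )


def build_scoped_request(user: User, question: str) -> dict:
    """Build the request handed to the model, limited to permitted sources."""
    sources = accessible_sources(user)
    if not sources:
        raise PermissionError(f"{user.name} has no accessible data sources")
    return {"question": question, "sources": sources}


rep = User(name="dana", roles=frozenset({"sales"}))
request = build_scoped_request(rep, "What is the average salary on my team?")
print(request["sources"])  # ['crm_accounts', 'sales_pipeline']
```

The point of the design is that the compensation source never appears in the request, so the model cannot reach it no matter how the question is phrased. The scoping happens before the model is involved, not inside the prompt.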

The copilot should know what you know. Not more, not less.

The tool calling architecture

The most powerful copilots are not just query interfaces — they can take actions. A copilot that can only answer questions is a search engine with a conversational wrapper. A copilot that can take actions — creating records, changing states, sending messages, generating documents — is an operational tool.

Tool calling is the mechanism by which LLMs take structured actions. The language model is given a set of defined functions it can invoke, with clear input schemas and output formats. The key engineering challenge is making sure the tool definitions are precise enough that the model uses them correctly, and that the calls the model proposes are validated before they produce side effects.
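
To make this concrete, here is a minimal sketch of a tool registry with a validation step in front of dispatch. The `Tool` dataclass, the `create_ticket` tool, and the shape of the `call` dictionary are assumptions for illustration. Real model APIs define their own tool-call formats, and most teams would describe arguments with JSON Schema rather than plain Python types.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    params: dict[str, type]  # required argument name -> expected type
    handler: Callable[..., Any]


def create_ticket(title: str, priority: int) -> dict:
    # Stand-in for a real side effect such as a database insert.
    return {"status": "created", "title": title, "priority": priority}


TOOLS = {
    "create_ticket": Tool(
        name="create_ticket",
        description="Create a support ticket with a title and a priority from 1 to 3.",
        params={"title": str, "priority": int},
        handler=create_ticket,
    ),
}


def validate_call(call: dict) -> list[str]:
    """Return the problems with a model-proposed call; empty means valid."""
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return [f"unknown tool: {call.get('name')!r}"]
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    errors = []
    for param, expected in tool.params.items():
        if param not in args:
            errors.append(f"missing argument: {param}")
        elif not isinstance(args[param], expected):
            errors.append(f"{param} must be {expected.__name__}")
    errors += [f"unexpected argument: {a}" for a in args if a not in tool.params]
    return errors


def dispatch(call: dict) -> dict:
    """Run the tool only if the call validates, so bad calls cause no side effects."""
    errors = validate_call(call)
    if errors:
        return {"status": "rejected", "errors": errors}
    return TOOLS[call["name"]].handler(**call["arguments"])


good = {"name": "create_ticket", "arguments": {"title": "Login fails", "priority": 2}}
bad = {"name": "create_ticket", "arguments": {"title": "Login fails", "priority": "high"}}
print(dispatch(good))
print(dispatch(bad))
```

The rejected call returns its errors instead of raising, which makes it easy to feed them back to the model for a corrected attempt.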

Output validation and confidence thresholds

Language models produce plausible-sounding outputs. They do not always produce correct ones. For query tasks, a wrong answer is an inconvenience. For action tasks, a wrong action can corrupt data, trigger incorrect workflows, or send unintended communications.

Every action-capable copilot we build has an output validation layer. The model generates a proposed action. The validation layer checks that the action is structurally valid — correct format, valid identifiers — then optionally routes high-confidence actions to automatic execution and lower-confidence actions to a human review queue. The confidence threshold is tunable per action type.
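
A simplified sketch of that routing logic might look like the following. The action types, threshold values, and `KNOWN_IDS` set are invented for the example. The sketch also assumes a `confidence` score between 0 and 1 is already available, whether from the model or from a separate scoring step, and how that score is produced is out of scope here.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    EXECUTE = "execute"  # run automatically
    REVIEW = "review"    # send to the human review queue
    REJECT = "reject"    # structurally invalid, never runs


@dataclass(frozen=True)
class ProposedAction:
    action_type: str
    target_id: str
    confidence: float  # assumed to be supplied upstream, in [0, 1]


# Hypothetical per-action thresholds: riskier actions need more confidence.
AUTO_EXECUTE_THRESHOLDS = {
    "add_note": 0.70,
    "update_status": 0.90,
    "send_email": 0.98,
}

# Stand-in for an identifier lookup against the real system of record.
KNOWN_IDS = {"ORD-1001", "ORD-1002"}


def route(action: ProposedAction) -> Route:
    """Check structure first, then route by the action type's threshold."""
    threshold = AUTO_EXECUTE_THRESHOLDS.get(action.action_type)
    if threshold is None or action.target_id not in KNOWN_IDS:
        return Route.REJECT
    if not 0.0 <= action.confidence <= 1.0:
        return Route.REJECT
    if action.confidence >= threshold:
        return Route.EXECUTE
    return Route.REVIEW


print(route(ProposedAction("add_note", "ORD-1001", 0.80)))    # Route.EXECUTE
print(route(ProposedAction("send_email", "ORD-1001", 0.80)))  # Route.REVIEW
print(route(ProposedAction("add_note", "ORD-9999", 0.99)))    # Route.REJECT
```

The same confidence of 0.80 executes a low-risk note automatically but sends an outbound email to review, which is what a per-action threshold buys you.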

The right way to think about AI in a business context

AI is not a product feature. It is an acceleration layer on top of existing processes. The question to ask before building any AI integration is not "what can the AI do?" but "what workflow does this accelerate, and what is the current bottleneck?"

If the bottleneck is data access — finding information that is hard to locate — a query copilot makes sense. If the bottleneck is repetitive decisions, an automation layer makes sense. If the bottleneck is document generation, an output copilot makes sense. The technology choice should follow the workflow analysis, not precede it.