Context engineering for finance AI

Context engineering for finance AI is the deliberate choice of which information, in which form, reaches the model through which channel — flat context in the prompt, documents as attachments, RAG through a knowledge base, or direct integration with the accounting system. For finance teams this choice determines whether the answer becomes usable or stays generic finance talk with a high hallucination risk.

The quality of an AI answer to a finance question is decided more often by the context than by the prompt. A three-sentence prompt with the right chart of accounts and the right KPI definition produces usable output; the same prompt without that context produces a generic answer that is far more likely to contain hallucinated details.

Why this matters even more in finance

Models are trained on the general internet. They know what a month-end close is, how IFRS disclosures flow, how a variance analysis is structured. They know nothing about your chart of accounts, your customer list, your KPI definitions, your close manual, or your numbers from last month. Every time you want the model to do something finance-specific, you have to bridge that gap.

For finance this is especially critical, for two reasons. First, numbers don't tolerate vagueness: an "approximately right" commentary that cites the wrong ledger account is wrong, not an approximation. Second, the output often lands in a formal context (board pack, annual report, VAT return) where audit evidence is needed for what the output is based on.

Four forms of context — plus one finance-specific

1. Flat context

Paste text straight into the prompt: a 50-line ledger extract, three customer examples, a KPI definition. This is the simplest form and the fastest route to a first result, and it works well for short briefs. Its limits: it is manual work, and everything has to fit inside the context window.
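As a minimal sketch of flat context, the snippet below pastes labeled blocks above the question. The function name, the labels, and the sample ledger line are illustrative assumptions, not part of any platform's API.

```python
def build_flat_prompt(question: str, context_blocks: dict[str, str]) -> str:
    """Assemble a prompt with labeled context blocks pasted above the question."""
    parts = []
    for label, text in context_blocks.items():
        # Label each block so the model can tell the sources apart.
        parts.append(f"### {label}\n{text.strip()}")
    parts.append(f"### Question\n{question.strip()}")
    return "\n\n".join(parts)


# Invented sample data, for illustration only.
prompt = build_flat_prompt(
    "Explain the April variance on account 4000.",
    {
        "KPI definition": "Gross margin = (revenue - cost of sales) / revenue",
        "Ledger extract": "4000;Revenue;April;-125000.00",
    },
)
print(prompt)
```

Labeling the blocks costs a few tokens but makes it easier to check afterwards which pasted source an answer leaned on.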

2. Upload documents

One or more PDF, Word, or Excel files. The model reads or scans them. Under the hood this can mean two things: full context (everything loaded into the context window) or RAG (chopped into chunks and searched). Which one you get depends on document size and platform, and the platform does not always tell you.

3. RAG — Retrieval Augmented Generation

For large finance knowledge bases (a complete close manual, a tax reference, years of historical reports) full context is impossible. RAG splits all documents into chunks, indexes them semantically, and fetches the few most relevant chunks for each question. It scales to millions of pages while staying affordable and fast.

Downside for finance: RAG misses structure and overview. A tax clause that can only be understood in combination with a definition three chapters earlier can get lost. For exploratory Q&A that is fine. For careful close reading of a transfer-pricing report where every clause matters, it is risky.
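The retrieval step and its blind spot can be sketched in a few lines. This is a toy under stated assumptions: it ranks chunks by plain word overlap as a stand-in for the semantic index a real RAG system would use, and the three sample passages are invented.

```python
import re


def chunk(text: str, size: int = 40) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def tokens(text: str) -> set[str]:
    """Lowercased word set used for the overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks that share the most words with the question."""
    q = tokens(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:k]


# Invented passages; each one plays the role of a chunk from a larger report.
docs = [
    "Definition: an associated enterprise is any entity in which the group "
    "holds at least 25 percent of the voting rights.",
    "Invoices are archived for seven years.",
    "Transactions with an associated enterprise must be priced at arm's length.",
]
question = "How must transactions be priced?"
top = retrieve(question, docs, k=1)
print(top)
```

With k=1 the retriever returns only the pricing clause. The definition it depends on shares no words with the question, so it never reaches the model, which is the structural weakness described above.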

4. Full context injection

The entire document goes into the context window. The model reads everything. Best quality for close reading — nothing missed, context intact. Expensive and slow at large document sizes, bounded by the model's context window.

5. Live accounting integration (MCP) — finance-specific

A variant on top of the first four and often the most valuable for finance: the model pulls what it needs directly from Exact (or another accounting package) on demand. No export, no Excel intermediate step, no stale knowledge — it's always current. For finance questions ("which open debtors are older than 60 days") this is the reason platforms like Saldus exist.
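To make the debtor question concrete, here is a sketch of the kind of query a live integration might run on the model's behalf. It is plain Python over in-memory records, not the MCP SDK or the Exact API; the `OpenInvoice` type, the function name, the sample invoices, and the choice to measure age from the invoice date are all assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OpenInvoice:
    debtor: str
    invoice_date: date
    amount: float


def debtors_older_than(invoices: list[OpenInvoice], min_days: int, today: date) -> dict[str, float]:
    """Sum the open amount per debtor for invoices strictly older than `min_days`."""
    totals: dict[str, float] = {}
    for inv in invoices:
        age_days = (today - inv.invoice_date).days
        if age_days > min_days:
            totals[inv.debtor] = totals.get(inv.debtor, 0.0) + inv.amount
    return totals


# Invented sample records standing in for a live read from the books.
invoices = [
    OpenInvoice("Acme BV", date(2025, 11, 1), 1200.0),   # 91 days old
    OpenInvoice("Acme BV", date(2026, 1, 10), 300.0),    # 21 days old
    OpenInvoice("Beta NV", date(2025, 12, 2), 500.0),    # exactly 60 days old
]
result = debtors_older_than(invoices, min_days=60, today=date(2026, 1, 31))
print(result)
```

The point of the design is that filtering happens at the source: the model receives only the matching rows, computed on current data, instead of a full export it has to search through.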

Context limits per model — early 2026

How much finance data fits in one request?

GPT-5 Fast: ~32K tokens, around 24,000 words or 50 pages. Suitable for a KPI definition + one report.
GPT-5 Thinking: ~196K tokens, around 147,000 words or 300 pages. Enough for a large reporting cluster.
Claude Opus 4.5 / Sonnet 4.5: 200K tokens standard, 1M on enterprise — enough for a full contract portfolio or annual-report package.
Gemini 3 Pro / Flash: 1M tokens (~750,000 words), over 1,500 pages. The largest standard window in this list.
Saldus Q&A agent: connects directly via MCP; only the relevant data is fetched, so the full dataset never has to fit in the window.
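The figures above imply roughly 0.75 words per token and about 500 words per page. Under those two assumptions, a rough fit check can be sketched as follows; the 20% reserve for the prompt and the answer is an additional assumption, and real token counts vary by tokenizer, language, and layout.

```python
WORDS_PER_TOKEN = 0.75  # implied by the list: ~32K tokens is about 24,000 words
WORDS_PER_PAGE = 500    # implied by the list: 750,000 words is about 1,500 pages


def estimate_tokens(pages: int) -> int:
    """Rough token estimate for a document of `pages` pages."""
    return round(pages * WORDS_PER_PAGE / WORDS_PER_TOKEN)


def fits(pages: int, window_tokens: int, reserve: float = 0.2) -> bool:
    """True if the document fits while keeping `reserve` of the window free."""
    return estimate_tokens(pages) <= window_tokens * (1 - reserve)


# A 144-page document against a ~32K and a 200K window.
print(estimate_tokens(144))
print(fits(144, 32_000), fits(144, 200_000))
```

Treat the result as a first filter only. Tables and dense numeric extracts usually consume more tokens per page than running text.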

Practical rule of thumb for finance:

Under 50 pages: upload directly, pick a model that can full-context read it.
50-500 pages: Claude Opus 4.5 or Gemini 3 Pro. Full context stays possible; output is more reliable than RAG for formal work.
Above 500 pages: RAG or a hybrid approach. Indicate in the prompt which sections you suspect are relevant.
Questions on current data from the books: use a live integration via MCP rather than trying to keep a weekly export fresh.
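The rule of thumb can be written down as a small decision function. The function name and the returned labels are illustrative; the page thresholds are the ones from the list above.

```python
def choose_channel(pages: int, needs_live_data: bool = False) -> str:
    """Map document size and data freshness to a context channel."""
    if needs_live_data:
        # Current balances and open items come from the books, not from files.
        return "live integration (MCP)"
    if pages < 50:
        return "upload, full context"
    if pages <= 500:
        return "full context, large-window model"
    return "RAG or hybrid"


for pages in (30, 144, 800):
    print(pages, "->", choose_channel(pages))
print("balance question ->", choose_channel(0, needs_live_data=True))
```

Checking freshness before size is deliberate: a question about current numbers belongs with the live integration no matter how small the export would be.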

The trade-off: RAG vs full context — finance edition

For finance:

Breadth question on a large source ("which annual reports in our archive had a disclosure on goodwill impairment?") → RAG.
Depth question on a bounded document ("what are the liability clauses in this 80-page customer contract?") → full context.
Questions on current numbers ("what is the balance of account 4000 in April") → not via knowledge files, but via a live MCP integration with the books.
Mixed — a large source where you want to maintain structural coherence → full context on a filtered subset, possibly after a first RAG filter.

Concrete example: loading the 144 pages of the EU AI Act into ChatGPT to ask whether a specific finance use case is high-risk. ChatGPT chunks and retrieves — fast, but can miss clauses that need to be read in combination. The same document in Gemini 3 or Claude Opus 4.5 with full context: fully ingested, precise article-and-paragraph citations.

Three questions for every finance question you ask AI

Which context does the model need to answer this question correctly? Chart of accounts? KPI definition? Current balances? Prior commentary?
Is that context already in the model (general knowledge) or do I need to add it?
If I add it, do I pick flat context (paste), documents (upload), RAG (search a large source), or a live accounting integration?

Anyone who asks these questions explicitly before writing the prompt removes most of the hallucination risk; as a rough rule of thumb, around 80%. The remaining 20% is a matter of reading the output carefully and verifying the numbers.

The moment you add context, you're sending finance data to the model. Who receives that data and what happens depends on the platform and the tier.

Consumer ChatGPT and Claude.ai: conversations are kept by default, possibly used for product improvement. For finance data with customer names or numbers, almost never an acceptable route.
OpenAI API and Anthropic API: inputs are not used for training by default; zero data retention is available to eligible business customers by agreement, not automatically. Suitable for your own applications.
Claude Enterprise, ChatGPT Enterprise/Team, Microsoft Copilot for Business: contractual guarantees, data residency, audit logs. Suitable for most finance contexts — with a DPA and with attention to settings like Flex Routing in Copilot.
Saldus embedded: everything on customer-owned infrastructure. No data leaves the tenant. For the most sensitive finance context (M&A, salary-level, healthcare clients).

Concretely: uploading a client's ledger into a consumer ChatGPT account is a data leak under most client contracts. The same file in a Claude Enterprise environment is ordinary work. The rule isn't "no AI with client numbers," it's "AI with client numbers only in the right tier."

Audit-grade perspective

Context choices belong in audit evidence: for each AI answer it must be possible to reconstruct which context was supplied (which version of the chart of accounts, which KPI definition, which balance from which period). Build this in as a logging requirement, not an option. For finance work that matters, an AI answer without a context trace is the equivalent of an Excel formula without source references: possibly correct, but unusable in an audit.
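One way to meet that logging requirement is to write a small trace record per answer. The sketch below is an assumption about what such a record could contain, not a description of any product's log format; the field names, the model label, and the sample context items are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def context_trace(question: str, context_items: list[dict], model: str) -> dict:
    """Build one log record describing the context supplied for an answer."""
    entries = []
    for item in context_items:
        entries.append({
            "name": item["name"],
            "version": item["version"],
            # Hash the exact text so any later change to the source is detectable.
            "sha256": hashlib.sha256(item["text"].encode("utf-8")).hexdigest(),
        })
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "question": question,
        "context": entries,
    }


# Invented context items, for illustration only.
record = context_trace(
    "What drove the April gross margin?",
    [
        {"name": "chart_of_accounts", "version": "v12", "text": "4000;Revenue"},
        {"name": "kpi_definitions", "version": "v3", "text": "Gross margin = ..."},
    ],
    model="example-model",
)
print(json.dumps(record, indent=2))
```

Storing a hash next to the version label lets a reviewer confirm that the chart of accounts on file is byte-for-byte the one the model saw, without keeping a full copy in every log line.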

Saldus in practice

Saldus is built around context engineering specifically for finance: the Tenant Context Pack supplies chart of accounts, KPI definitions, period policy, and accounting semantics to every agent per tenant; the MCP layer pulls current numbers live from Exact; and per answer, the context source and version used are logged. This doesn't free the team from building the context (chart of accounts has to be clean, KPI definitions have to be right) — it makes sure the context discipline happens in a structured way instead of ad hoc.