What AI can — and can't — do for finance work today

AI for finance work rests on two distinct things: knowledge from training (static, broad, average) and capabilities plus integrations (specific, current). Ignoring that distinction means treating AI like Google and running into hallucinations precisely where it hurts: in numbers, in regulations, and in customer or vendor data.

An AI model is not an accounting system, and it is not a database. It is a neural network that, during training, absorbed patterns from billions of texts — general knowledge, not facts about your company. Once training stops, the model is frozen. What it "knows" about finance is what got baked into that training: how a general ledger works, how a set of financial statements is structured, what IFRS 15 mandates at a high level. What it does not know is your chart of accounts, your customer list, or last month's close. That has to be added every time — through capabilities and integrations.

That distinction, knowledge versus capabilities, is the single most important mental anchor for finance teams. Without it, hallucinations show up exactly where they cost the most: numbers, citations of regulation, and customer- or vendor-specific data.

What the model has in its head — and what that means for finance

During training a model reads hundreds of billions of words, including accounting handbooks, tax law, and annual reports. From these it learns patterns: how a journal entry is built, how a reconciliation works, which concepts belong together. Three concrete consequences for finance use:

The model knows nothing about your chart of accounts. Ask about "cost center 4150" and you'll get a plausible-sounding but invented answer — unless you supply the chart of accounts as context.
The model doesn't actually do math. Ask 4 × 4 and you'll get 16, because that pattern appears millions of times. Ask it to sum a 2,000-row Excel extract and you'll get a plausible number that is often wrong — unless the model runs code to actually compute it.
Tax rates, thresholds, and rules change. A model with a 2024 training cutoff doesn't know last quarter's VAT change, the updated investment-allowance rates, or the latest accounting bulletin. The problem is that it will often quote outdated figures confidently anyway.

Knowledge in an AI model is therefore broad, static, and average: strong on what is widely written about across the internet, weak on what is specific to your business or has recently changed.
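To make the point about math concrete, here is a minimal sketch of what "running code to actually compute it" could look like for a ledger extract. The column names (account, description, amount) and the sample rows are assumptions for illustration, not a real export format. Decimal is used instead of float so that cent amounts add up exactly.

```python
import csv
import io
from decimal import Decimal

# Hypothetical general-ledger extract; in practice this would be the uploaded file.
SAMPLE_EXTRACT = """account,description,amount
4150,Office supplies,1250.10
4150,Office supplies,349.95
8000,Revenue,-15000.00
4300,Travel,812.40
"""

def sum_by_account(csv_text: str) -> dict[str, Decimal]:
    """Sum the amount column per account using exact decimal arithmetic."""
    totals: dict[str, Decimal] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        account = row["account"]
        totals[account] = totals.get(account, Decimal("0")) + Decimal(row["amount"])
    return totals

totals = sum_by_account(SAMPLE_EXTRACT)
grand_total = sum(totals.values(), Decimal("0"))
print(totals["4150"])  # 1600.05
print(grand_total)     # -12587.55
```

The same function gives the same result on 4 rows or 2,000 rows, which is exactly what a model answering from pattern memory cannot promise.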

Capabilities — where the real finance value comes from

Capabilities are the tools that sit alongside the model. During training the model learns to reach for them when a question demands it. For finance work, a handful matter:

Code execution. The model runs Python in a sandbox to actually compute, filter an Excel extract, or analyze a time series. For anything numeric, this shifts the work from "guess by token" to "deterministic result." Always ask for the calculation, not just the answer — that way you can see whether code was used.
Vision. A scanned invoice, a handwritten receipt, a bank statement in PDF — vision capability pulls the numbers out. It folds work that used to live in a separate OCR tool into an AI workflow.
RAG and file uploads. A close manual, your chart-of-accounts memo, a client's VAT handbook — don't paste them into every prompt; attach them to an assistant as a knowledge base. The model retrieves on demand.
Connectors and MCP. Direct integration with Exact, Microsoft 365, banking portals — so the model fetches live data instead of whatever you happened to paste. For serious finance work, this is where it gets interesting.
Web search. For current tax rates, recent IFRS updates, or a VIES VAT-number check, web search is the difference between stale knowledge and a usable answer.
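The retrieval idea behind RAG can be sketched in a few lines. This is a deliberately simplified toy that assumes keyword overlap as the ranking method; production systems typically use embeddings and a vector index instead. The manual excerpts and the function names (tokenize, retrieve, build_prompt) are hypothetical.

```python
import re

# Hypothetical excerpts from a close manual; a real system would chunk the actual document.
CHUNKS = [
    "Accruals: book accrued expenses before working day 3 of the close.",
    "Intercompany: reconcile intercompany balances with the counterparty monthly.",
    "VAT: file the VAT return after the sales ledger is locked.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Rank chunks by the number of words shared with the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, chunks: list[str]) -> str:
    """Put only the retrieved excerpt in front of the question."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

best = retrieve("When do we file the VAT return?", CHUNKS)
print(best[0])
```

The design point is that only the relevant excerpt travels with the question, rather than the whole manual being pasted into every prompt.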

A "smarter" model usually solves less than a model with the right capability switched on. For a controller running a variance analysis, code execution matters more than extended thinking.
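As an illustration of why code execution matters for a variance analysis, the sketch below computes budget-versus-actual differences deterministically. The cost centers and figures are invented for the example, and variance_report is a hypothetical helper, not part of any product.

```python
from decimal import Decimal

# Hypothetical budget and actual figures per cost center.
budget = {"4150": Decimal("1500.00"), "4300": Decimal("1000.00")}
actual = {"4150": Decimal("1600.05"), "4300": Decimal("812.40")}

def variance_report(budget, actual):
    """Return {account: (absolute variance, variance as % of budget)}."""
    report = {}
    for account in sorted(budget.keys() | actual.keys()):
        b = budget.get(account, Decimal("0"))
        a = actual.get(account, Decimal("0"))
        diff = a - b
        # Percentage is undefined when there is no budget for the account.
        pct = (diff / b * 100).quantize(Decimal("0.1")) if b != 0 else None
        report[account] = (diff, pct)
    return report

report = variance_report(budget, actual)
for account, (diff, pct) in report.items():
    print(account, diff, pct)
```

With these sample figures, cost center 4150 comes out 100.05 over budget (6.7%) and 4300 comes out 187.60 under budget (-18.8%).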

Why this matters more in finance than elsewhere

There are three reasons the distinction matters more in a finance context than in most other domains.

Reliability on numbers. A model that answers a numeric question from knowledge alone is pattern-matching, not verifying facts. A capability (code execution, a file upload, an MCP connection) puts ground under the answer. For external reporting, that is the difference between an audit-proof figure and an audit finding.
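One way to put ground under a number is to recompute it before it goes anywhere near a report. The sketch below assumes you have the underlying amounts available as strings; check_claim and its tolerance parameter are hypothetical names chosen for this example.

```python
from decimal import Decimal

def check_claim(claimed: str, amounts: list[str], tolerance: str = "0.00") -> bool:
    """Return True when the quoted total matches the recomputed total within tolerance."""
    recomputed = sum((Decimal(a) for a in amounts), Decimal("0"))
    return abs(Decimal(claimed) - recomputed) <= Decimal(tolerance)

amounts = ["1250.10", "349.95", "812.40"]
print(check_claim("2412.45", amounts))  # True: matches the recomputed sum
print(check_claim("2421.45", amounts))  # False: two digits transposed
```

A transposed digit like the second case reads as entirely plausible in prose, which is why the check belongs in code rather than in a reviewer's eye.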

Model choice matters less than tier and capability set. Early-2026 frontier models — GPT-5, Claude Opus 4.5/Sonnet 4.5, Gemini 3 Pro — perform comparably on knowledge. The differences sit in which capabilities surround the model and which tier you use. Claude in Claude.ai with file uploads and MCP has a different capability set than the same Claude inside Microsoft Copilot. Same model, very different practical value for finance.

Data security per tier. Uploading a general-ledger extract into a consumer ChatGPT account is a different legal path than the same upload into Claude Enterprise with a DPA. For finance data, an enterprise or API path is almost always the only responsible route. See "AI governance for finance" for the tier classification.

A mental model that holds up

Picture the AI model as an experienced external controller who spent a year at your company, has heard nothing since, and now gets to advise again. He knows a lot about finance in general. He knows nothing about what's happened in your books since last year, nothing about the latest VAT-rate change, nothing specific about customer X.

Capabilities are his colleagues and tools: the Exact connection he uses to pull current balances, the calculator that means he doesn't have to guess, the folders with your close manual he is allowed to browse. The skill is not in making him smarter — it's in giving him the right tools and clear instructions with the right context.

Saldus in practice

Saldus is built on exactly this logic. The model is a commodity; value lives in the context layer around it: the direct integration with your accounting system through MCP, your tenant-specific context (chart of accounts, KPI definitions, period policy), and your own model choice per agent (Claude, GPT, Gemini, or local for the most sensitive data). That shifts the conversation from "which model is best" to "which capabilities does the model need for this finance question."