The AI capabilities that matter for controllers

AI capabilities are the specific skills of a large language model — text, code execution, vision (reading images), RAG (knowledge-base retrieval), file creation, and connector actions. For controllers and finance managers, deliberately choosing which capabilities to use — and which to exclude — determines whether the output is usable, or an audit finding in the making.

The model is one component. The capabilities around it (code execution, reasoning, vision, RAG, file creation, connectors) decide whether an AI tool works in finance practice. Not every capability matters equally: some (file creation, code execution, connectors via the Model Context Protocol, MCP) are structurally decisive, while others (image generation, voice) are usually edge cases. This piece looks at each one through the eyes of a controller, AP/AR clerk, or finance manager.

Text — the baseline, and the least differentiating

Text in, text out. Concept explanations, summaries of management reports, rewrites of a board memo, translations of an English-language annual report. All frontier models handle this well; the differences sit mostly in style. Claude is known for balanced, neutral writing, which suits formal finance work. GPT-5 is often described as sharper on precision, and Gemini as strong on very long documents.

For finance: use this for the first draft of a variance commentary, a summary of an audit report, or a rewrite of an internal memo. Not for numbers — for numbers you need a different capability.

Code execution — where finance work actually benefits

Two variants. Code generation: the model writes Python, SQL, or VBA — useful for controllers who occasionally need an Excel formula or a SQL query. Code execution: the model runs the code itself in a sandbox and uses the result. The second is the one that matters for finance.

Running a financial scenario, filtering an Excel export, analyzing margins across thousands of rows, calculating DSO per customer segment: these are all tasks where code execution turns unreliable guesswork into a deterministic result. A language model predicts text rather than calculating, so its arithmetic is not dependable at that scale. Python's is.
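
A DSO calculation per customer segment is the kind of task that belongs in code. Here is a minimal sketch, assuming the common definition of DSO as receivables divided by credit sales, times the days in the period. The segment names and amounts are invented for illustration.

```python
from decimal import Decimal

# Hypothetical figures per customer segment for a 30-day period (illustrative only).
SEGMENTS = {
    "retail": {"receivables": Decimal("150000"), "credit_sales": Decimal("300000")},
    "wholesale": {"receivables": Decimal("90000"), "credit_sales": Decimal("120000")},
}

def dso(receivables: Decimal, credit_sales: Decimal, days: int = 30) -> Decimal:
    """Days sales outstanding: receivables / credit sales * days in the period."""
    if credit_sales == 0:
        raise ValueError("credit sales must be non-zero")
    return (receivables / credit_sales * days).quantize(Decimal("0.1"))

dso_by_segment = {name: dso(**figures) for name, figures in SEGMENTS.items()}
for name, value in dso_by_segment.items():
    print(f"{name}: {value} days")
```

Decimal is used instead of float so that rounding behaves the way a finance reader expects.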

Practical rule: switch code execution on explicitly, and nudge it in the prompt ("use Python to calculate the margin") if the model doesn't reach for the capability on its own. And always ask for the calculation, not just the answer — not "the margin is 23%," but "the margin is 23% (€460,000 gross profit / €2,000,000 revenue)." That way you see in a second whether it holds.
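
The "calculation, not just the answer" rule can be sketched as a small helper that always returns the inputs alongside the result. The function name is hypothetical; the figures are the ones from the example above.

```python
def margin_statement(gross_profit: float, revenue: float) -> str:
    """Return the margin together with the inputs it was calculated from."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    margin = gross_profit / revenue
    return (
        f"The margin is {margin:.0%} "
        f"(€{gross_profit:,.0f} gross profit / €{revenue:,.0f} revenue)."
    )

print(margin_statement(460_000, 2_000_000))
# The margin is 23% (€460,000 gross profit / €2,000,000 revenue).
```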

Reasoning — for what's heavier than an email

Reasoning models — GPT-5 Thinking, Claude Opus 4.5 with extended thinking, Gemini 3 Deep Think — do multi-step reasoning under the hood. They generate an internal reasoning trace before producing an answer. That costs more compute and money (3-10× compared to an "instant" variant), but yields dramatically better results on complex finance work: due diligence, contract analysis, multi-step reconciliations between ledger and tax return, IFRS disclosures, annual-report review.

It is not a free upgrade. For a variance commentary or an email, reasoning is overkill — slower and more expensive with no visible win. For an investment memo, a review of a shareholders' agreement, or a multi-step data analysis, it is the default tool.
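
Writing the routing rule down makes the choice a policy rather than a habit. This is a minimal sketch: the task labels and the two mode names are hypothetical, and the 3-10x range is the rough multiplier mentioned above, not a price list.

```python
# Task labels are hypothetical; adapt them to your own workflow catalogue.
HEAVY_TASKS = {
    "due_diligence",
    "contract_review",
    "multi_step_reconciliation",
    "investment_memo",
}

def choose_mode(task: str) -> str:
    """Pick 'reasoning' for multi-step work, 'instant' for everything else."""
    return "reasoning" if task in HEAVY_TASKS else "instant"

def cost_range(instant_cost: float, mode: str) -> tuple[float, float]:
    """Estimate cost bounds using the rough 3-10x multiplier for reasoning runs."""
    if mode == "reasoning":
        return (instant_cost * 3, instant_cost * 10)
    return (instant_cost, instant_cost)

print(choose_mode("variance_commentary"))  # instant
print(choose_mode("contract_review"))      # reasoning
print(cost_range(1.0, choose_mode("contract_review")))
```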

Vision — invoices, receipts, bank statements

Interpreting images: scanned invoices, bank statements in PDF, screenshots of a dashboard, handwritten expense forms. The price per call is higher than text (roughly 3-5× per "image token"), but vision opens up whole classes of finance work: extracting invoice data from an inbound email, recognizing receipts for expense reports, reading bank statements from clients who still send PDFs, matching product photos to invoice lines for inventory work.

For PDFs with layout (tables from a ledger export, columns in an AP aging), Claude and Gemini tend to be stronger than GPT in extraction accuracy. For handwritten material no model delivers 100%; plan for sample checks.
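
A sample check can be as simple as drawing a reproducible random selection of extracted documents for manual review. The sketch below assumes a 10% review rate and a fixed seed, both of which are illustrative choices. The document IDs are invented.

```python
import random

def pick_review_sample(document_ids: list[str], rate: float = 0.1, seed: int = 42) -> list[str]:
    """Select a reproducible random sample of extracted documents for manual review."""
    if not document_ids:
        return []
    size = max(1, round(len(document_ids) * rate))
    rng = random.Random(seed)  # fixed seed so the same selection can be re-drawn later
    return sorted(rng.sample(document_ids, size))

# Hypothetical IDs of expense forms that went through vision extraction.
extracted = [f"EXP-{n:04d}" for n in range(1, 51)]
sample = pick_review_sample(extracted)
print(len(sample), "of", len(extracted), "documents go to manual review")
```

Whether a fixed seed and a flat rate are appropriate is a question for your own control framework. The point is only that the selection is explicit and repeatable.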

RAG — for the close manual, tax handbook, and historical reports

Retrieval-Augmented Generation. Instead of stuffing an entire document into the context window, the source is chopped into chunks and indexed. On a question, the system retrieves the few most relevant chunks first. For finance this is useful for: the full close manual, tax handbooks, historical month-end reports, prior years' annual reports.
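
The chunk-and-retrieve idea can be sketched in a few lines. This toy version scores chunks by word overlap with the question. Production systems typically use embeddings and a vector index instead, so treat it as an illustration of the mechanism only. The manual text is invented.

```python
import re

def split_into_chunks(text: str) -> list[str]:
    """Chop a source into chunks; here simply one chunk per paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def tokens(text: str) -> set[str]:
    """Lowercased set of words and numbers in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q = tokens(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:k]

# Invented excerpt standing in for a close manual.
CLOSE_MANUAL = """Accruals are booked on working day two and reviewed by the controller.

Intercompany balances are reconciled before the ledger is locked.

VAT returns are filed quarterly by the tax team."""

chunks = split_into_chunks(CLOSE_MANUAL)
print(retrieve("When are accruals booked?", chunks, k=1))
```

Only the retrieved chunks are passed to the model, which is why a rule that depends on a definition elsewhere in the source can go missing.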

The downside: RAG can lose structure. A tax rule that only makes sense in combination with a definition three chapters earlier may be retrieved without that definition. For exploratory Q&A on a large source: fine. For careful close reading of one important document (a shareholders' agreement, a transfer-pricing report): prefer full context. See Context engineering for finance AI for the trade-off.

File creation — board packs and annual-report templates

Since late 2025, models such as Claude generate correctly formatted .xlsx, .docx, and .pptx files directly, with formulas, formatting, and the option to use your own template as the base. Upload your board-pack template, let the model fill in the variance analyses, KPI pages, and commentary, and the output comes back in your house style instead of a generic AI default.

This shifts the threshold for reporting automation substantially. "The last mile" — from model output to a usable Excel or Word file — used to be more work than the generation itself. That mile is now largely gone, and for finance reporting in particular, that's exactly where most formatting hours used to disappear.

Memory — limited use in finance

Persistent facts about the user kept between sessions: preferences, recurring patterns, prior context. Useful for general productivity; treat carefully for finance. An audit trail requires explicitness and reproducibility, and memory that is stored "somewhere" and vulnerable to session drift is a poor fit for formal work. For most finance workflows, prefer an explicit Tenant Context Pack or knowledge base over the implicit memory mechanism of a chat tool.
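
One way to make context explicit and reproducible is to keep it as a versioned file and log a fingerprint of it with every run. This is a sketch under assumptions: the field names below are illustrative and not the schema of any actual context pack.

```python
import hashlib
import json

# Hypothetical context pack; the field names are illustrative, not a fixed schema.
context_pack = {
    "entity": "Example BV",
    "fiscal_year_end": "12-31",
    "reporting_currency": "EUR",
    "materiality_threshold": 5000,
}

def fingerprint(pack: dict) -> str:
    """Hash a canonical JSON form so the same content always gives the same ID."""
    canonical = json.dumps(pack, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Logging the fingerprint with each run shows which context version produced an output.
run_log = {"context_sha256": fingerprint(context_pack), "task": "variance_commentary"}
print(run_log)
```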

Connectors and MCP — where finance work scales

Model Context Protocol (MCP), released by Anthropic in late 2024, has since become an open standard with more than 10,000 public servers. The MCP integrations that matter for finance: Exact and other accounting packages, MS365 (mail, calendar, SharePoint), banking portals, AR/collection systems. The model fetches data on demand, without you copying and pasting.

For serious business finance use, this is where it gets interesting. A CFO assistant that looks directly into the ledger, an AP agent that recognizes inbound invoices and creates them in Exact — without MCP or comparable connectors, you're stuck with manual copy-paste. See MCP servers in finance practice for the servers that are useful in a finance context.
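
Under the hood, an MCP client asks a server to run a tool by sending a JSON-RPC 2.0 message with the method `tools/call`. The sketch below only builds such a message. The tool name `get_ledger_balance` and its arguments are hypothetical and do not describe any real accounting server.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id counter

def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message as used by MCP."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# `get_ledger_balance` and its arguments are hypothetical, not a real server's tool.
request = tool_call_request("get_ledger_balance", {"account": "1300", "period": "2025-09"})
print(request)
```

In practice an MCP client library handles transport, initialization, and tool discovery. The message shape is shown here only to make "the model fetches data on demand" concrete.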

Agents — capabilities combined

An agent is an AI that picks tools autonomously, makes decisions, and executes multi-step plans. From a capability angle, the hard part is that the model has to work out which tool to use when. Agents stand or fall on the underlying capabilities: an agent without good connectors, code execution, and file uploads is no more than a more expensive conversation. See From prompts to agents in finance for the maturity ladder.
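
The "which tool when" step can be sketched as a registry plus a selection function. In a real agent the model makes the selection. Here a keyword rule stands in for it, and both tools are stubs, so every name below is hypothetical.

```python
from typing import Callable

# Stand-in tools; real ones would wrap a connector or a code sandbox.
def query_ledger(request: str) -> str:
    return f"[ledger data for: {request}]"

def run_calculation(request: str) -> str:
    return f"[calculation result for: {request}]"

TOOLS: dict[str, Callable[[str], str]] = {
    "ledger": query_ledger,
    "calculate": run_calculation,
}

def select_tool(request: str) -> str:
    """Keyword rule standing in for the model's own tool choice."""
    return "calculate" if "margin" in request.lower() else "ledger"

def run_step(request: str) -> tuple[str, str]:
    """Pick a tool, run it, and return both so the choice can be logged."""
    name = select_tool(request)
    return name, TOOLS[name](request)

print(run_step("Margin per product group for Q3"))
print(run_step("Open items on account 1300"))
```

Returning the tool name next to the result keeps each decision visible, which matters once several steps are chained.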

Cost and privacy notes for finance

Capabilities vary in cost. A rough guide for finance:

Text, code execution, vision: a few cents to a dime per call.
Reasoning (extended thinking): dimes to several euros per heavy run.
Deep research, full-duplex voice: euros per run.

For finance, privacy is more critical than cost. Which capabilities are available and how data is handled depends on the tier. Consumer tiers retain conversations for product improvement by default (opt-out is often possible but rarely sufficient). The OpenAI API and Anthropic API offer zero data retention for business customers, typically as an arrangement you have to request rather than a default. Claude Enterprise, ChatGPT Enterprise/Team, and Microsoft Copilot for Business offer contractual guarantees on data residency and GDPR compliance. For finance data, an enterprise or API path is almost always the only responsible route.

Saldus in practice

In Saldus, the capabilities that matter for finance — code execution for calculations, MCP integration with Exact, file uploads for the close manual and tenant context, model choice per agent — are integrated as building blocks in one platform. You don't have to piece together the capability set per tool yourself or work out which tier you need. That doesn't mean capabilities disappear — they're just delivered in a form that works in a finance context, with the audit and governance layer already built in.