A knowledge base for a finance assistant is the structured collection of documents (KPI definitions, chart of accounts, finance policies, standard templates, and relevant tax interpretations) that the assistant uses as grounding alongside the prompt. For finance teams: nine times out of ten, poor AI output on finance questions traces back to a sloppy knowledge base, not the prompt. Format, structure, and version control make the difference.
An AI assistant in finance is only as good as its instructions and its knowledge. The instructions usually get all the attention; the knowledge gets stapled on at the last minute. It shows in the output: finance assistants that stay generic about your KPI definitions, reference things that aren't in the files, or miss exactly the relevant clause in a tax handbook. In most of those cases the cause is the knowledge base, not the prompt.
Setting up a knowledge base for finance is not a technical job. It's an editorial job: which documents go in, in which format, with which structure, so the model can draw from them quickly and correctly — and so the output holds up under external scrutiny.
What belongs in the knowledge base, and what doesn't
A useful dividing line: instructions go in the system prompt, reference material goes in the knowledge base. Put differently — the prompt says how the model works, the knowledge base says with what.
Belongs in the knowledge base, for finance:
- Examples of good output (prior variance commentary, board-pack sections, AR mails in the right tone).
- Style guides for finance communication (board tone, investor-update tone, customer-letter tone).
- Fixed reference data: chart of accounts with commentary, KPI definitions, cost-center structure, dimensions, tax settings.
- Policy and procedures: close manual, VAT handbook, authorization matrix, accounting manual.
- Templates the model has to fill (board-pack template, management-team (MT) report template).
- Historical material for comparison: last quarter's reports, prior annual-report disclosures.
Doesn't belong:
- Long instructions on how the assistant should think — those belong in the prompt.
- Dynamic data that changes daily: that should come in through a connector or MCP tool, not as a file in the knowledge base.
- Files with sensitive personal data the user group isn't allowed to see (knowledge is visible to anyone with access to the assistant).
- Raw dumps of the entire close archive "just in case." See below.
Number of documents — less is almost always better
The temptation is to throw in everything vaguely relevant. That works badly for two reasons.
First: every platform has limits. Custom GPTs allow 20 files, Claude Projects load up to 200K tokens (1M enterprise), Gemini Gems up to ten files. On full-context platforms you pay per run for everything you stuff in; on RAG platforms the retrieval mechanism kicks in at larger volumes and ranking starts to matter.
Second: more is not better for model performance. If five examples are good, fifty mediocre examples make it worse — the model learns the average instead of the excellent. A good rule of thumb: three to ten carefully picked references beat twenty collected documents.
For most finance assistants what works is: one chart-of-accounts memo, one KPI definitions document, three to five sample outputs (a board-pack section, a variance commentary, an AR mail), one close manual, one tax reference. That's it.
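To check a candidate document set against limits like the ones above, a rough budget estimate can help. The sketch below is illustrative only: it assumes about 4 characters per token (a heuristic, not a real tokenizer), and the file names, contents, and function names are hypothetical. The limits of 20 files and 200K tokens are borrowed from the figures quoted earlier.

```python
# Assumption: roughly 4 characters per token for English text. This is a
# heuristic, not the count a real tokenizer would give.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN

def budget_report(docs: dict, max_files: int, max_tokens: int) -> dict:
    """Compare a set of knowledge documents (name -> text) with platform limits."""
    per_file = {name: estimate_tokens(text) for name, text in docs.items()}
    total = sum(per_file.values())
    return {
        "files": len(docs),
        "tokens": total,
        "largest": max(per_file, key=per_file.get) if per_file else None,
        "within_file_limit": len(docs) <= max_files,
        "within_token_limit": total <= max_tokens,
    }

# Hypothetical contents; in practice, read each file's text from disk.
docs = {
    "kpi-definitions-2026.md": "Gross margin = (revenue - COGS) / revenue. " * 50,
    "close-manual-2026.md": "Day 1: lock subledgers. " * 400,
}
report = budget_report(docs, max_files=20, max_tokens=200_000)
print(report)
```

The "largest" entry is usually the first candidate for trimming when the total runs over budget.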
Format — readable for human and model
Models are best at plain text and markdown. They get by with PDFs and Word, but lose information as soon as layout gets complex: multi-column layouts, text boxes, footnotes, embedded images with text in them.
Practical guidelines for finance:
- Markdown (.md) or plain text (.txt) is optimal. Headers, lists, and tables survive; no formatting noise.
- Word: fine if text leads and the structure is linear. A close manual with text boxes and columns: flatten to markdown.
- PDFs: work, but scanned PDFs without OCR are pictures to the model — content unreadable. Switch OCR on before upload.
- Excel: works, but models read tables better when they're small and flat. For your chart-of-accounts Excel: export a core overview to markdown or CSV.
- Images and screenshots: only if the model is multimodal and you need the image content.
A common mistake in finance: uploading a chart-of-accounts memo as a PDF with stylish layout and multiple columns. The model picks up only part of it. Put the same content in markdown, roughly an hour's work, and the assistant uses it consistently.
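For the chart-of-accounts case, the conversion from a flat export to markdown can be scripted. This is a minimal sketch using only Python's standard library; the column names and account numbers in the sample are made up, and it assumes the export is already small and flat (one header row, no merged cells).

```python
import csv
import io

def csv_to_markdown(csv_text: str) -> str:
    """Convert a small, flat CSV export into a markdown table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]

    def fmt(cells):
        # Escape pipes so cell content cannot break the table layout.
        return "| " + " | ".join(c.strip().replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)

# Hypothetical sample export.
sample = (
    "account,name,category\n"
    "4000,Revenue products,Revenue\n"
    "4100,Revenue services,Revenue\n"
)
print(csv_to_markdown(sample))
```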
Structure — write for scanning
Inside a document, the same logic applies as for a good internal wiki. Headers, short paragraphs, lists, and a clear hierarchy help the model — for both full-context reading and RAG retrieval.
A workable structure for a finance knowledge document:
# Title (what this document covers)
## Short summary (2-3 sentences, for quick orientation)
## Main section 1 (e.g. categories in the chart of accounts)
…
## Main section 2 (e.g. specific agreements per category)
…
## Exceptions and edge cases
…
## Related documents
The summary at the top is not decoration. RAG retrieval often weights the start of a document more heavily; a good summary raises the chance this document gets pulled for the right questions. With full-context reading, it helps the model prioritize.
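A structure like this can be checked automatically before upload. The sketch below is one possible lint, not a standard tool: it only verifies that the first non-empty line is a title heading and that the first second-level section is a summary. The function name and the sample documents are hypothetical.

```python
def check_structure(markdown: str) -> list:
    """Return structural problems; an empty list means the document passes."""
    problems = []
    lines = [ln.rstrip() for ln in markdown.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first line is not a '# Title' heading")
    # Collect second-level section titles in document order.
    sections = [ln[3:].strip().lower() for ln in lines if ln.startswith("## ")]
    if not sections:
        problems.append("no '## ' sections found")
    elif "summary" not in sections[0]:
        problems.append("first section is not a summary")
    return problems

good = "# KPI definitions\n## Short summary\nDefines the KPIs.\n## Margin KPIs\n"
bad = "KPI definitions\n## Margin KPIs\n"
print(check_structure(good))
print(check_structure(bad))
```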
Name documents the way you'd hand them to a human
File names matter. Models use them — sometimes explicitly, sometimes as a ranking signal. Chart-of-accounts-memo-v2-FINAL (3).pdf helps nobody. Chart-of-accounts-memo-2026-Q2.md or KPI-definitions-handbook-2026.md says immediately what the document is about.
Consistent naming inside one assistant works in favor of both the maintaining human and the model. Prefixes like template-, example-, handbook-, kpi- make it obvious at a glance what a document is for.
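A naming convention is easy to enforce with a small check. The rules below are one hypothetical convention (no spaces, no copy or "final" markers, markdown or plain text only, a year or quarter in the name); adjust them to whatever your team agrees on.

```python
import re

ALLOWED_EXTENSIONS = (".md", ".txt")
# Markers that usually signal an uncontrolled copy, e.g. "(3)" or "FINAL".
COPY_MARKER = re.compile(r"\(\d+\)|final|copy", re.IGNORECASE)
# A four-digit year, optionally followed by a quarter such as -Q2.
PERIOD = re.compile(r"\d{4}(-Q[1-4])?")

def check_name(filename: str) -> list:
    """Return naming problems; an empty list means the name passes."""
    problems = []
    if " " in filename:
        problems.append("contains spaces")
    if COPY_MARKER.search(filename):
        problems.append("contains a copy or 'final' marker")
    if not filename.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("not markdown or plain text")
    if not PERIOD.search(filename):
        problems.append("no year or quarter in the name")
    return problems

print(check_name("Chart-of-accounts-memo-v2-FINAL (3).pdf"))
print(check_name("Chart-of-accounts-memo-2026-Q2.md"))
```

The marker check is deliberately crude: it would also flag a legitimate name containing "copy", so treat its output as a prompt for a human look rather than a hard gate.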
Reference from the system prompt
A knowledge base that gets ignored might as well not be there. Finance assistants benefit from an explicit instruction in the prompt about which documents are available and when to consult them:
"Always consult KPI-definitions-handbook-2026.md before calculating a KPI. Use Variance-commentary-examples.md as a reference for tone and structure. On chart-of-accounts questions: check Chart-of-accounts-memo-2026-Q2.md first before inventing anything; don't fabricate cost centers that aren't in this file."
This explicit link between prompt and knowledge is one of the most underrated improvements. Without the reference, the model often ignores the knowledge, especially in a RAG setup.
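To keep the prompt and the knowledge files in sync, the reference block can be generated from a single list instead of typed by hand. A minimal sketch; the function name and the wording of the generated instruction are illustrative choices, not a platform feature.

```python
def build_knowledge_instructions(documents: list) -> str:
    """Render one instruction line per knowledge file: which file, and when to use it."""
    lines = ["Knowledge files available to you:"]
    for filename, when_to_use in documents:
        lines.append(f"- {filename}: {when_to_use}")
    lines.append("If the answer is not in these files, say so instead of inventing it.")
    return "\n".join(lines)

documents = [
    ("KPI-definitions-handbook-2026.md", "consult before calculating any KPI."),
    ("Variance-commentary-examples.md", "use as a reference for tone and structure."),
    ("Chart-of-accounts-memo-2026-Q2.md", "check first on chart-of-accounts questions."),
]
print(build_knowledge_instructions(documents))
```

When a file is renamed or replaced, updating the list regenerates the prompt section, so the prompt never points at a document that no longer exists.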
Maintenance — plan replacement, not accumulation
Knowledge goes stale. A finance assistant built in January with December's KPI definitions is outdated by June. When the chart of accounts changes after a reorganization, an assistant that still holds the old version keeps answering from it.
A simple cadence for finance knowledge:
- Quarterly: walk the knowledge. Replace old examples with new ones, update reference data, remove what's no longer relevant.
- At every significant change (new fiscal year, new entity, changed chart of accounts, new tax policy): update immediately, don't wait.
- On a model update: quickly check whether the assistant still behaves the same way — sometimes models behave differently on identical knowledge.
- One owner per assistant, typically a senior controller or finance manager. Not a committee. Without an owner, no maintenance.
Date versions in the file name or in a header — both for the user and for the model.
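If the version date lives in a header, a quarterly review can start from a simple staleness check. This sketch assumes a hypothetical header line of the form "Version date: 2026-04-01" and treats anything older than about one quarter (92 days) as due for review; both the header format and the threshold are assumptions.

```python
import re
from datetime import date
from typing import Optional

# Assumption: each knowledge file carries a line such as "Version date: 2026-04-01".
VERSION_LINE = re.compile(r"^Version date:\s*(\d{4})-(\d{2})-(\d{2})\s*$", re.MULTILINE)

def version_date(text: str) -> Optional[date]:
    """Extract the version date from the header, or None if it is missing."""
    match = VERSION_LINE.search(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)

def needs_review(text: str, today: date, max_age_days: int = 92) -> bool:
    """True when the file has no version date or the date is older than the threshold."""
    stamped = version_date(text)
    return stamped is None or (today - stamped).days > max_age_days

doc = "# KPI definitions\nVersion date: 2026-01-15\n"
print(needs_review(doc, today=date(2026, 6, 1)))
print(needs_review(doc, today=date(2026, 2, 1)))
```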
Privacy and access in a finance context
What's in the knowledge base is visible to everyone allowed to use the assistant. That's the most important governance rule.
Practical finance guidelines:
- Customer names, IBANs, specific transactions in sample board packs or sample AR mails: anonymize. "Customer A" works fine as an example.
- Salary data, personnel files: not in a broadly shared finance assistant.
- M&A documentation under NDA: not in a regular assistant; possibly in a separate, customer-specific assistant with strict access limits.
- Historical annual reports: fine as reference, anonymized where needed (certainly on sensitive items).
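Part of the anonymization in the guidelines above can be scripted as a first pass. The sketch below masks IBAN-shaped strings with a deliberately loose pattern (it does not validate checksums) and replaces customer names from a list you supply; the sample names are fictional, and the labels run out after 26 customers. A human review afterwards remains necessary.

```python
import re

# Loose IBAN shape: two letters, two check digits, then groups of four
# alphanumerics (optionally space-separated) and a short tail. Not a validator.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b")

def anonymize(text: str, customer_names: list) -> str:
    """Mask IBAN-like strings and replace known customer names with neutral labels."""
    text = IBAN_RE.sub("[IBAN]", text)
    for index, name in enumerate(customer_names):
        label = f"Customer {chr(ord('A') + index)}"
        text = re.sub(re.escape(name), label, text, flags=re.IGNORECASE)
    return text

sample = "Acme BV paid invoice 1042 from NL91ABNA0417164300; Globex NV is overdue."
print(anonymize(sample, ["Acme BV", "Globex NV"]))
```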
See AI governance for finance for the broader tier classification.
Audit-grade perspective
A well-maintained knowledge base is a form of internal-control documentation: it proves that AI output on finance questions is based on controlled, current references. Keep the version history of knowledge files — an external auditor asking "which KPI definition was used to calculate this variance" wants to be able to reconstruct the version active at the moment of calculation.
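One way to make that reconstruction possible is to record a content fingerprint of every knowledge file alongside each answer. The following is a minimal sketch using SHA-256 from Python's standard library; the record layout, function names, and sample file contents are assumptions for illustration, not a prescribed audit format.

```python
import hashlib
import json
from datetime import datetime, timezone

def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of a file's content; changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()

def audit_entry(question: str, knowledge: dict) -> str:
    """Serialize which knowledge file versions were active when a question was answered."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "knowledge": {name: fingerprint(data) for name, data in sorted(knowledge.items())},
    }
    return json.dumps(record, sort_keys=True)

# Hypothetical file contents; in practice, read the bytes of each knowledge file.
knowledge = {
    "KPI-definitions-handbook-2026.md": b"Gross margin = (revenue - COGS) / revenue\n",
}
entry = audit_entry("Why did gross margin drop in Q2?", knowledge)
print(entry)
```

Stored next to the archived file versions, these digests let you match an answer to the exact definition text that was in place when it was produced.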
Saldus in practice
In Saldus the knowledge layer is built as a "Tenant Context Pack": a collection of YAML and markdown files per tenant containing chart of accounts, KPI definitions, period policy, GL mapping, and accounting semantics. The pack is fed to every agent as context, versioned (a SHA checksum per file), and automatically referenced in audit logs ("this answer is based on KPI handbook v2.3"). For finance teams building their own knowledge base outside Saldus, the same discipline remains critical: markdown, versioning, and explicit reference from the prompt.