Why financial services governs AI differently
A bank does not get to treat a large language model the way a marketing team treats a writing assistant. Financial institutions already operate under model risk management expectations, data protection law, and supervisory scrutiny that predate generative AI by a decade. Drop an LLM into that environment and the question regulators ask is the one they have always asked: can you explain what the model did, prove it followed policy, and show the audit trail? The difference with generative AI is that the model is probabilistic, the data flowing into it is unstructured, and usage is spreading faster than any approval process can keep up with. Governance for financial services therefore has to do two things at once: satisfy existing model risk and privacy obligations, and keep pace with AI adoption that is happening across the front office, operations, and engineering whether or not the risk function has signed off.
The obligations that actually bite
Three pressures shape the control set. First, model risk management: supervisors expect institutions to inventory models, document their purpose, validate them, and monitor them in production. An LLM used to summarize client communications or draft credit memos is a model under most reasonable readings, and ungoverned shadow use of it is an inventory gap. Second, data protection and confidentiality: client data, transaction data, and material non-public information cannot leak into an external model that may retain or train on it. Third, auditability: when an examiner or internal audit asks how a given AI-assisted decision was reached, you need a record, not a recollection. Each of these obligations points to the same architectural requirement, which is a control point that sees and governs every AI request at the moment it happens.
Map the AI footprint before you write a control
Most banks underestimate their AI footprint by a wide margin because the official inventory only captures the models the risk function knows about. The real footprint includes the coding assistant in engineering, the embedded AI features inside SaaS tools the business already pays for, and the personal accounts staff use when the sanctioned tool is too slow. Start by discovering what is actually in use across every channel, then attribute each use to a system, a data class, and an owner. This inventory is the foundation of model risk compliance for AI, and it is also the thing most likely to surprise the executive committee. You cannot validate, monitor, or restrict a model you did not know was running.
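The attribution step can be sketched as a simple record plus a gap check. This is a minimal illustration, not a reference schema: the `AIUsage` fields, the channel labels, and the sample entries are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AIUsage:
    """One discovered use of AI (hypothetical schema for illustration)."""
    tool: str                   # what was found, e.g. "coding-assistant"
    channel: str                # how it is reached: "sanctioned", "embedded-saas", "personal-account"
    system: Optional[str]       # system the use is attributed to
    data_class: Optional[str]   # e.g. "public", "client", "mnpi"
    owner: Optional[str]        # accountable person or team


def inventory_gaps(usages: List[AIUsage]) -> List[AIUsage]:
    """Return every usage that still lacks a system, a data class, or an owner."""
    return [u for u in usages if not (u.system and u.data_class and u.owner)]


# Made-up discovery results: one fully attributed use and two gaps.
discovered = [
    AIUsage("credit-memo-drafter", "sanctioned", "lending-platform", "client", "credit-risk"),
    AIUsage("coding-assistant", "sanctioned", "engineering-ide", None, None),
    AIUsage("chat-app", "personal-account", None, None, None),
]

gaps = inventory_gaps(discovered)
for usage in gaps:
    print(f"unattributed: {usage.tool} via {usage.channel}")
```

Anything the gap check returns is a use that cannot yet be validated, monitored, or restricted, which makes it a natural work queue for the inventory effort.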
Govern at runtime, not in review
Periodic review cannot govern a system that is used thousands of times a day. The control that holds in financial services is one that intercepts each request before it reaches the model: it redacts client identifiers and material non-public information from the prompt, blocks usage that breaches policy, attributes every call to a named user or service, and fails closed when a request is ambiguous. Because the same control point covers every provider, a rule written once protects requests to a vendor model, a cloud model, and an internally hosted open-source model alike. This is what turns a model risk policy from a document into an enforced control, and it is the difference between catching a confidentiality breach in next quarter's audit and preventing it in the moment.
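A minimal sketch of such a control point follows. It is an illustration under stated assumptions rather than a production design: the `ACC-` account format, the blocked phrases, the data-class labels, and the `govern` function are all hypothetical, and a real deployment would use the institution's own detectors and policy engine.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical detectors, standing in for real client-identifier and MNPI classifiers.
ACCOUNT_ID = re.compile(r"\bACC-\d{6,}\b")
BLOCKED_PHRASES = ("unreleased earnings", "pending merger")
KNOWN_DATA_CLASSES = {"public", "internal", "client"}


@dataclass
class Decision:
    action: str             # "allow" or "block"
    prompt: Optional[str]   # redacted prompt to forward, or None when blocked
    caller: Optional[str]   # named user or service the call is attributed to
    reason: str


def govern(prompt: str, caller: Optional[str], data_class: str) -> Decision:
    """Decide what happens to one request before it reaches any model."""
    # Fail closed: an unattributed or unclassified request is treated as ambiguous.
    if not caller:
        return Decision("block", None, caller, "unattributed caller")
    if data_class not in KNOWN_DATA_CLASSES:
        return Decision("block", None, caller, "unknown data class")
    # Block usage that breaches policy.
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return Decision("block", None, caller, "possible material non-public information")
    # Redact client identifiers, then allow the request through.
    redacted = ACCOUNT_ID.sub("[REDACTED_ACCOUNT]", prompt)
    return Decision("allow", redacted, caller, "redacted and forwarded")


decision = govern("Summarize complaints for ACC-12345678", "jdoe", "client")
print(decision.action, "|", decision.prompt)
```

Because `govern` runs before any provider is chosen, the same rules apply whether the redacted prompt is then sent to a vendor, cloud, or internally hosted model.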
Build the evidence trail auditors expect
In a regulated institution, a control that cannot be evidenced is treated as a control that does not exist. Every governed AI interaction should produce a durable, queryable record: who or what made the request, what policy applied, what was redacted or blocked, and what the model returned. That record is what lets internal audit, model validation, and the regulator reconstruct an AI-assisted decision after the fact. It also shortens examinations, because the answer to "how do you control this?" is a live system and a log rather than a binder of intentions. Treat the audit trail as a first-class output of the governance layer, not a report you assemble manually under deadline.
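One way to picture that record is a small table written on every governed interaction. The sketch below uses Python's standard `sqlite3` module with an in-memory database for brevity; the `ai_audit` table, its columns, and the helper names are assumptions made for illustration, and a real system would need durable, tamper-evident storage.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit store. Pass a file path for a durable log."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS ai_audit (
               ts         TEXT NOT NULL,   -- when the request was governed (UTC)
               caller     TEXT NOT NULL,   -- who or what made the request
               policy     TEXT NOT NULL,   -- which policy applied
               action     TEXT NOT NULL,   -- "allow" or "block"
               redactions TEXT NOT NULL,   -- JSON list of what was redacted
               response   TEXT             -- what the model returned, if anything
           )"""
    )
    return conn


def record(conn, caller, policy, action, redactions, response):
    """Write one audit row as a side effect of governing a request."""
    conn.execute(
        "INSERT INTO ai_audit VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), caller, policy, action,
         json.dumps(redactions), response),
    )
    conn.commit()


def interactions_for(conn, caller):
    """Reconstruct a caller's governed interactions in the order they occurred."""
    rows = conn.execute(
        "SELECT policy, action, redactions, response FROM ai_audit "
        "WHERE caller = ? ORDER BY rowid",
        (caller,),
    )
    return [(policy, action, json.loads(red), resp) for policy, action, red, resp in rows]


conn = open_audit_log()
record(conn, "jdoe", "client-data-v1", "allow", ["account_id"], "Summary: ...")
record(conn, "svc-batch", "mnpi-v1", "block", [], None)
history = interactions_for(conn, "jdoe")
print(history)
```

Writing the row inside the governance path, rather than in a separate reporting job, is what makes the trail a direct output of the control instead of something assembled later.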
Where agents change the calculus
The next pressure is agentic AI: systems that take actions, such as moving data, calling internal services, or initiating a workflow, rather than only producing text. In a financial institution an agent that can act is closer to a junior employee with system access than to a chatbot, and it needs comparable controls. Governance has to intercept the agent's tool calls and actions at runtime, enforce least privilege, and log every action for audit. Institutions that built a runtime control point for LLM traffic are positioned to extend it to agents; those that relied on policy and periodic review will find that an autonomous system does not read the policy. Plan the control layer so that adding a governed agent inherits enforcement by default rather than requiring a new control program.
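The interception described above can be sketched as a wrapper that every agent tool call must pass through. This is a minimal illustration, assuming a per-agent allowlist as the least-privilege mechanism; `GovernedAgentTools`, `ToolDenied`, and the two sample tools are hypothetical names, not part of any agent framework.

```python
from typing import Any, Callable, Dict, List, Set


class ToolDenied(Exception):
    """Raised when an agent attempts a tool call outside its allowlist."""


class GovernedAgentTools:
    def __init__(self, agent_id: str, allowed: Set[str], tools: Dict[str, Callable[..., Any]]):
        self.agent_id = agent_id
        self.allowed = set(allowed)   # least privilege: only these tools may run
        self.tools = dict(tools)
        self.action_log: List[dict] = []

    def call(self, name: str, **kwargs: Any) -> Any:
        permitted = name in self.allowed and name in self.tools
        # Log every attempt, permitted or not, before anything executes.
        self.action_log.append(
            {"agent": self.agent_id, "tool": name, "args": kwargs, "permitted": permitted}
        )
        if not permitted:
            raise ToolDenied(f"{self.agent_id} may not call {name}")
        return self.tools[name](**kwargs)


# Hypothetical tools standing in for internal services.
def lookup_balance(account: str) -> str:
    return f"balance for {account}"


def initiate_transfer(account: str, amount: int) -> str:
    return f"moved {amount} from {account}"


gate = GovernedAgentTools(
    "reconciliation-agent",
    allowed={"lookup_balance"},
    tools={"lookup_balance": lookup_balance, "initiate_transfer": initiate_transfer},
)

result = gate.call("lookup_balance", account="A1")
try:
    gate.call("initiate_transfer", account="A1", amount=500)
    transfer_blocked = False
except ToolDenied:
    transfer_blocked = True
```

A new agent registered through the same wrapper starts with an empty allowlist and full logging, which is one way to make enforcement the default rather than a separate control program.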
Frequently asked questions
Is a large language model a model under model risk management rules?
Under most reasonable readings, yes. If an LLM informs a decision, drafts regulated communications, or processes client data, it falls within the scope supervisors expect institutions to inventory, validate, and monitor. The practical consequence is that ungoverned shadow use of LLMs is a model inventory gap that examiners can challenge.
What is the biggest AI governance gap in banks and insurers?
An incomplete AI inventory. The official list captures the models the risk function already knows about and misses coding assistants, embedded AI in existing SaaS, and staff using personal accounts. You cannot validate or restrict a model you did not know was running, so discovery of the real AI footprint is the first control to build.
How do you stop client data from leaking into an external model?
Route AI traffic through a runtime control point that redacts client identifiers and material non-public information before the prompt reaches the model, and that blocks usage which breaches policy. Because one control point covers every provider, the same rule protects requests to vendor, cloud, and internally hosted models.
What evidence do auditors want for AI governance?
A durable, queryable record of every governed AI interaction: who or what made the request, which policy applied, what was redacted or blocked, and what the model returned. In a regulated institution a control that cannot be evidenced is treated as absent, so the audit trail should be a direct output of the governance layer.