Benchmarks, technical reports, and in-depth research on reliable AI deployment in financial services.
Benchmark · April 2026
David Ahn, Maximilian Eber, PhD, Sahith Jagarlamudi
The first public benchmark of AI-driven adverse media investigation. Evaluates detection accuracy, evidence quality, reliability across agent runs, and cost efficiency across eight frontier models and 31 configurations.
Benchmark · March 2026
Nico Klees, Maximilian Eber, PhD
The first public benchmark for agentic financial document processing. Evaluates extraction accuracy, cross-document reasoning, calculation correctness, and structured output quality across seven frontier models. Built on anonymized production data from financial institutions.
Paper · March 2026
Dustin Eaton, Maximilian Eber, PhD
Why AML teams must now apply model risk management (MRM) standards to AI systems. Published in ACAMS Today, the paper explores how regulators are extending MRM frameworks to AI deployed in compliance functions, and what institutions need to do to prepare.