Five Fatal Flaws: Milliman’s FICO 10T Mortgage “Analysis” Is Inaccurate and Should Be Withdrawn

VantageScore®

Published May 5, 2026

Milliman’s FICO 10T white paper, published on May 4th, is deeply inaccurate and suffers from five fatal analytical flaws that render its conclusions unreliable. The analysis should be withdrawn.

These five errors are not minor technical preferences. They go to the heart of how mortgage risk is evaluated in practice and how lenders, investors, and policymakers interpret comparative credit score performance and mortgage risk.

1. Single Credit Report Analysis Does Not Reflect How Mortgages Are Actually Underwritten Using Tri-Merge

The flawed analysis relies on a single credit report rather than a tri-merge (three credit reports) approach. This diverges from standard mortgage underwriting practice and renders the analytical results meritless. In mortgage lending, decisions are typically made using tri-merge data, not a single bureau. Model behavior, score distributions, and missing data patterns can vary materially by credit bureau.

This issue is particularly important because VantageScore 4.0 is designed to be consistent across bureaus, while FICO 10T scores are not standardized in the same way. Because FICO 10T scores vary meaningfully across bureaus, an analysis on only a single bureau (like Milliman’s FICO 10T study) is entirely unreliable. Milliman admits that restricting the analysis to one bureau materially limits the mortgage relevance of the findings because actual mortgage underwriting generally relies on tri-merge credit data and score aggregation rules.
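To make the tri-merge point concrete, here is a minimal sketch of how three bureau scores are commonly collapsed into one underwriting score. It assumes the widely used "middle of three, lower of two" convention; the source does not spell out the aggregation rules, and the function name, bureau labels, and score values are hypothetical.

```python
from typing import Optional, Sequence


def representative_score(bureau_scores: Sequence[Optional[int]]) -> Optional[int]:
    """Collapse up to three bureau scores into one underwriting score.

    Assumed convention: middle of three, lower of two, or the only score available.
    """
    available = sorted(s for s in bureau_scores if s is not None)
    if not available:
        return None          # no usable score from any bureau
    if len(available) == 3:
        return available[1]  # middle of three
    return available[0]      # lower of two, or the single available score


# Hypothetical borrower whose scores differ by bureau (illustrative values only).
scores_by_bureau = {"bureau_a": 745, "bureau_b": 700, "bureau_c": 720}

single_bureau_view = scores_by_bureau["bureau_a"]
tri_merge_view = representative_score(list(scores_by_bureau.values()))
print(single_bureau_view, tri_merge_view)
```

In this toy case a single-bureau study would see the borrower at 745 while a tri-merge decision would use 720, which is why results from one bureau do not automatically carry over to how loans are actually underwritten.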

2. Excluding the COVID Stress Period Cherry-Picks the Sample and Removes the Most Relevant Test of Model Performance

Milliman’s FICO 10T study cherry-picks the data sample by excluding the COVID-era stress period, precisely the kind of environment in which credit scores must prove their value. Credit losses are realized during stress, not during benign conditions, and models that appear similar in expansionary periods frequently diverge sharply when borrowers face real financial pressure.

The paper’s justification, that forbearance complicates performance measurement, is not credible. Forbearance is not a reason to exclude data. By excluding the stressed economic period, the study biases the analysis against the very period in which our own research, backed by independent analyses, demonstrates that VantageScore 4.0’s performance advantage is strongest.

3. Assumptions About LLPA Pricing Grids Misrepresent How Pricing Will Actually Be Implemented

The pricing analysis in Milliman’s FICO 10T white paper rests on a fundamentally false premise: that the same LLPA grids would be applied uniformly across different credit scoring models. That assumption directly contradicts FHFA policy and actual GSE practice.

Credit scoring models are calibrated differently by design. As a result, model-specific LLPA grids are developed to align pricing with risk and not with raw score values. Applying a FICO Classic–based LLPA framework to VantageScore 4.0 unfairly forces a pricing structure that was never intended for its risk distribution and distorts the entire analysis. Any conclusions about lender behavior, borrower sorting, or pricing consistency that rely on this assumption are therefore invalid.

4. Vintage Aggregation Misrepresents Predictive Performance and Reverses the Truth

The study commits a critical failure by reporting results primarily on an aggregated basis without showing vintage-level performance. Aggregation across vintages spanning 2011–2022 masks how models perform under vastly different economic, policy, and borrower behavior regimes. Best practice in credit model evaluation requires both aggregate and vintage-level disclosures, particularly across periods that include economic expansion, contraction, and unprecedented policy intervention. Milliman’s FICO 10T paper does not provide this transparency. VantageScore and multiple third-party analyses routinely provide vintage-level results to give stakeholders clear visibility into where performance is consistent and where it diverges.

Most damning, the paper itself concedes that results can “invert” at the vintage level. If the results reverse depending on the mortgage origination cohort analyzed, then the Milliman study’s headline claim is highly misleading.
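The mechanics of such an inversion can be sketched with a small, entirely synthetic example. The eight loans, the "benign" and "stress" vintage labels, and the two generic models below are hypothetical and say nothing about the performance of any real score; the point is only that a pooled statistic and a vintage-level statistic can rank two models differently.

```python
from collections import defaultdict

# Synthetic rows: (vintage, model_a_score, model_b_score, defaulted)
LOANS = [
    ("benign", 620, 700, True),
    ("benign", 700, 640, False),
    ("benign", 740, 720, False),
    ("benign", 660, 680, True),
    ("stress", 700, 640, True),
    ("stress", 680, 700, False),
    ("stress", 720, 740, False),
    ("stress", 650, 660, True),
]


def auc(scores, defaulted):
    """Share of (bad, good) pairs where the bad loan scores lower; ties count half."""
    bads = [s for s, d in zip(scores, defaulted) if d]
    goods = [s for s, d in zip(scores, defaulted) if not d]
    wins = sum((b < g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))


def auc_by_vintage(loans, score_index):
    """Compute AUC separately for each origination vintage."""
    groups = defaultdict(list)
    for row in loans:
        groups[row[0]].append(row)
    return {
        vintage: auc([r[score_index] for r in rows], [r[3] for r in rows])
        for vintage, rows in groups.items()
    }


MODELS = {"model_a": 1, "model_b": 2}  # column index of each model's score
pooled = {
    name: auc([r[i] for r in LOANS], [r[3] for r in LOANS])
    for name, i in MODELS.items()
}
by_vintage = {name: auc_by_vintage(LOANS, i) for name, i in MODELS.items()}

print("pooled:", pooled)
print("by vintage:", by_vintage)
```

On this toy data the pooled AUC favors model_a, while the stress vintage alone favors model_b. A report that shows only the pooled figure would hide that reversal, which is why vintage-level disclosure matters.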

5. Including Loan Modifications in the Default Definition Distorts Performance Results

Milliman’s FICO 10T white paper wrongly treats loan modifications as if they were defaults. This is a significant departure from established credit risk practice. A loan modification is not equivalent to a 90-day delinquency. It reflects servicing policy, investor rules, and loss mitigation programs rather than borrower risk.

By treating all modifications as evidence of credit failure, the analysis introduces substantial noise into the performance signal and artificially inflates default rates, particularly in earlier vintages. As a result, it contaminates every downstream statistic reported (KS, bad capture rates, and AUC). The stated results are highly biased and obscure the true risk differentiation of VantageScore 4.0.
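A minimal sketch of how the choice of "bad" definition feeds through to a reported statistic is given below. The loans, scores, and flags are synthetic and hypothetical, and the KS routine is a plain textbook implementation rather than anything taken from the Milliman paper. The example only shows that changing the label changes both the bad rate and the KS value computed from the same scores.

```python
# Synthetic rows: (score, reached_90_dpd, was_modified)
LOANS = [
    (600, True, False),
    (620, True, False),
    (660, False, False),
    (700, False, True),   # modified, never 90+ days past due
    (720, False, False),
    (760, False, True),   # modified, never 90+ days past due
    (780, False, False),
    (800, False, False),
]


def ks_statistic(scores, bad_flags):
    """Largest gap between the cumulative score distributions of bads and goods."""
    bads = [s for s, b in zip(scores, bad_flags) if b]
    goods = [s for s, b in zip(scores, bad_flags) if not b]
    best = 0.0
    for threshold in sorted(set(scores)):
        cdf_bad = sum(s <= threshold for s in bads) / len(bads)
        cdf_good = sum(s <= threshold for s in goods) / len(goods)
        best = max(best, abs(cdf_bad - cdf_good))
    return best


scores = [row[0] for row in LOANS]
narrow_bad = [row[1] for row in LOANS]            # 90+ DPD only
broad_bad = [row[1] or row[2] for row in LOANS]   # 90+ DPD or any modification

bad_rate_narrow = sum(narrow_bad) / len(LOANS)
bad_rate_broad = sum(broad_bad) / len(LOANS)
ks_narrow = ks_statistic(scores, narrow_bad)
ks_broad = ks_statistic(scores, broad_bad)

print(bad_rate_narrow, ks_narrow)
print(bad_rate_broad, ks_broad)
```

With these toy rows, adding modifications to the bad definition doubles the measured bad rate and halves the KS, even though no score and no borrower outcome changed. Any comparison of models therefore depends on stating, and justifying, the performance definition used.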

In conclusion, Milliman’s FICO 10T study relies on a series of methodological choices that systematically bias its results, exclude the most relevant evidence, and obscure true comparative performance. As a result, its conclusions about VantageScore 4.0 are significantly misleading and should not be relied upon for policy, pricing, or underwriting decisions.

By Dr. Andrada Pacheco, EVP and Chief Data Scientist


© 2026 VantageScore Solutions, LLC. All Rights Reserved.