Publications
Deep Future Analytics (DFA) is the result of 30 years of research and
experience in credit risk analytics, in all its many aspects.

Reinventing Retail Lending Analytics - Forecasting, Stress Testing, Capital and Scoring for a World of Crises
Available on Amazon
Perspectives on Credit Risk, Portfolio Management, and Capital: Readings from The RMA Journal
Available on Amazon
Articles
-
Normalizing Pandemic Data for Credit Scoring
The COVID-19 pandemic created abnormal credit risk conditions that did not align well with pre-2020 credit scores. Since the pandemic, most organizations have either excluded the period 2020-2021 from their modeling or included it without adjustment, leaving it as noise in the data. Model validators and examiners have been divided about requiring one of these approaches or defaulting to model developer judgment. None of this is ideal from a model development perspective. We have found that a technical solution is available. Our analysis uses lifecycle and environment outputs from an Age-Period-Cohort analysis as fixed offsets to the credit score development. Panel data is used, so the credit score is developed with a discrete time survival model approach. We tested logistic regression and stochastic gradient boosted regression trees as estimators with the panel data and APC inputs. For this research, we used Fannie Mae data. The APC model was estimated on the full available history, from 2005 through 2024. The origination scores were estimated on two-year periods from 2016 through 2024 and tested on all other periods, including a score that was developed on the full period. All models were also tested on comparably prepared data from Freddie Mac for cross-validation.
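The idea of using APC outputs as fixed offsets can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' implementation: the lifecycle and environment functions below are invented stand-ins for real APC estimates, and `fit_logistic_with_offset` is a hypothetical helper. The point it shows is that the offset enters the linear predictor with its coefficient fixed at 1, so the score coefficients are estimated net of lifecycle and environment effects.

```python
import numpy as np

def fit_logistic_with_offset(X, y, offset, n_iter=50, tol=1e-10):
    """Fit logit(p) = offset + b0 + X @ b by Newton-Raphson.

    The offset is added to the linear predictor as-is (coefficient fixed at 1).
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])      # design matrix with intercept
    beta = np.zeros(k + 1)
    for _ in range(n_iter):
        eta = offset + A @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = A.T @ (y - p)                   # score vector
        hess = A.T @ (A * w[:, None])          # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic panel: one row per account-month (all values are assumptions).
rng = np.random.default_rng(0)
n = 50_000
age = rng.integers(1, 61, size=n)             # months on book
period = rng.integers(0, 48, size=n)          # calendar month index

# Hypothetical stand-ins for APC outputs on the log-odds scale.
lifecycle = -4.0 + 0.5 * np.log(age)
environment = 0.3 * np.sin(2 * np.pi * period / 48)
offset = lifecycle + environment

# One standardized scoring attribute with a known effect.
x = rng.normal(size=n)
true_beta = np.array([0.2, -0.8])
p_true = 1.0 / (1.0 + np.exp(-(offset + true_beta[0] + true_beta[1] * x)))
y = (rng.random(n) < p_true).astype(float)

beta_hat = fit_logistic_with_offset(x[:, None], y, offset)
```

With real data, the offset column would come from a separately estimated APC model, and the same panel could be passed to a tree-based estimator that accepts an initial score.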
10/Jul/2025 07:00 PM -
Current expected credit loss procyclicality: it depends on the model
25/Jul/2025 07:00 PM -
When Big Data Isn’t Enough: Solving the long-range forecasting problem in supervised learning
In a world where big data is everywhere, no one has big data relative to the economic cycle. Data volume needs to be thought of along two dimensions. (1) How many accounts / transactions / data fields do we have? (2) How much time history do we have? Few, if any, big data sets include history covering one economic cycle (back to 2005) or two economic cycles (back to 1998). Therefore, unstructured learning algorithms will be unable to distinguish between long-term macroeconomic drivers and point-in-time variations across accounts or transactions. This is the collinearity problem that is well known in consumer lending.
25/Jul/2025 07:00 PM -
Consumer risk appetite, the credit cycle and the housing bubble
In this paper, we explore the role of consumer risk appetite in the initiation of credit cycles and as an early trigger of the US mortgage crisis. We analyze a panel data set of mortgages originated between 2000 and 2009 and follow their performance up to 2014. After controlling for all of the usual observable effects, we show that a strong residual vintage effect remains. This vintage effect correlates well with consumer mortgage demand, as measured by the Federal Reserve Board’s Senior Loan Officer Opinion Survey, and with changes in mortgage pricing at the time the loan was originated. Our findings are consistent with an economic environment in which the incentives of low-risk consumers to obtain a mortgage decrease when the cost of obtaining a loan rises. As a result, mortgage originators generate mortgages from a pool of consumers with changing risk profiles over the credit cycle. The unobservable component of the shift in credit risk, relative to the usual underwriting criteria, may be thought of as macroeconomic adverse selection.
25/Jul/2025 07:00 PM -
CECL ASSESSING THE ALTERNATIVES
This study evaluates various models for implementing the Current Expected Credit Loss (CECL) method, using a large mortgage data set from Fannie Mae and Freddie Mac.
24/Jul/2025 07:00 PM -
Instabilities Using Cox Proportional Hazards Models in Credit Risk
When the underlying system or process that is being observed is based upon observations versus age, vintage (origination time) and calendar time, Cox proportional hazards models can exhibit instabilities because of embedded assumptions.
25/Jul/2025 07:00 PM -
Macroeconomic Adverse Selection in Machine Learning Models of Credit Risk
25/Jul/2025 07:00 PM -
Scoring AI‐generated policy recommendations with Risk‐Adjusted Gain in Net Present Happiness
25/Jul/2025 07:00 PM -
Solving the Long-Range Forecasting Problem in Supervised Learning
In a world where big data is everywhere, no one has big data relative to the economic cycle. Data volume needs to be thought of along two dimensions.
10/Aug/2025 07:00 PM -
Creating Unbiased Machine Learning Models by Design
Unintended bias against protected groups has become a key obstacle to the widespread adoption of machine learning methods.
10/Aug/2025 07:00 PM -
A survey of machine learning in credit risk
Machine learning algorithms have come to dominate several industries. After decades of resistance from examiners and auditors, machine learning is now moving from the research desk to the application stack for credit scoring and a range of other applications in credit risk.
25/Jul/2025 07:00 PM -
Journal of the Operational Research Society
The new accounting standards of CECL for the US and IFRS 9 elsewhere require predictions of lifetime losses for loans. Roll rate, state transition and “vintage” models have been proposed and indeed are used by practitioners. The first two methods are relatively more accurate for predictions of up to one year, because they include lagged delinquency as a predictor, whereas “vintage” models are more accurate for predictions for longer periods, but not short periods because they omit delinquency as a predictor variable. In this paper we propose the use of survival models that include lagged delinquency as a covariate and show, using a large sample of 30-year mortgages, that the proposed method is more accurate than any of the other three methods for both short-term and long-term predictions of the probability of delinquency. We experiment extensively to find the appropriate lagging structure for the delinquency term. The results provide a new method to make lifetime loss predictions, as required by CECL and IFRS 9 Stage 2.
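As a rough illustration of what a lagged delinquency covariate looks like in panel data, the sketch below attaches to each account-month the delinquency status observed a fixed number of months earlier. The field names (`account`, `month`, `delinq`, `default`), the integer month index, and the helper `add_lagged_delinquency` are assumptions made for this example; the paper's actual lagging structure is not reproduced here.

```python
def add_lagged_delinquency(rows, lag):
    """Return account-month rows with 'delinq_lag': status `lag` months earlier.

    Rows without enough history for the requested lag are dropped.
    """
    # Index delinquency status by (account, month) for direct lookup.
    status = {(r["account"], r["month"]): r["delinq"] for r in rows}
    out = []
    for r in rows:
        key = (r["account"], r["month"] - lag)
        if key in status:
            out.append({**r, "delinq_lag": status[key]})
    return out

# Toy panel: months are integer indices, delinq is months past due.
panel = [
    {"account": "A", "month": 1, "delinq": 0, "default": 0},
    {"account": "A", "month": 2, "delinq": 1, "default": 0},
    {"account": "A", "month": 3, "delinq": 2, "default": 0},
    {"account": "A", "month": 4, "delinq": 3, "default": 1},
    {"account": "B", "month": 1, "delinq": 0, "default": 0},
    {"account": "B", "month": 2, "delinq": 0, "default": 0},
]

lagged = add_lagged_delinquency(panel, lag=2)
```

In a setup like this, the lag would typically be at least as long as the forecast horizon, so that only delinquency information available at the forecast date enters the model.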
03/Sep/2025 07:00 PM -
Effective Generative AI Model Risk Management
03/Sep/2025 07:00 PM -
Classical and quantum computing methods for estimating loan-level risk distributions
05/Sep/2025 07:00 PM -
Stabilizing machine learning models with Age-Period-Cohort inputs for scoring and stress testing
Machine learning models have been used extensively for credit scoring, but the architectures employed suffer from a significant loss in accuracy out-of-sample and out-of-time. Further, the most common architectures do not effectively integrate economic scenarios to enable stress testing, cash flow, or yield estimation. The present research demonstrates that providing lifecycle and environment functions from Age-Period-Cohort analysis can significantly improve out-of-sample and out-of-time performance as well as enabling the model’s use in both scoring and stress testing applications. This method is demonstrated for behavior scoring where account delinquency is one of the provided inputs, because behavior scoring has historically presented the most difficulties for combining credit scoring and stress testing. Our method works well in both origination and behavior scoring. The results are also compared to multihorizon survival models, which share the same architectural design with Age-Period-Cohort inputs and coefficients that vary with forecast horizon, but using a logistic regression estimation of the model. The analysis was performed on 30-year prime conforming US mortgage data. Nonlinear problems involving large amounts of alternate data are best at highlighting the advantages of machine learning. Data from Fannie Mae and Freddie Mac is not such a test case, but it serves the purpose of comparing these methods with and without Age-Period-Cohort inputs. In order to make a fair comparison, all models are given a panel structure where each account is observed monthly to determine default or non-default.
08/Sep/2025 07:00 PM -
An Age–Period–Cohort Framework for Profit and Profit Volatility Modeling
The greatest source of failure in portfolio analytics is not individual models that perform poorly, but rather an inability to integrate models quantitatively across management functions. The separable components of age–period–cohort models provide a framework for integrated credit risk modeling across an organization. Using a panel data structure, credit risk scores can be integrated with an APC framework using either logistic regression or machine learning. Such APC scores for default, payoff, and other key rates fit naturally into forward-looking cash flow estimates. Given an economic scenario, every applicant at the time of origination can be assigned profit and profit volatility estimates so that underwriting can truly be account-level. This process optimizes the most fallible part of underwriting, which is setting cutoff scores and assigning loan pricing and terms. This article provides a summary of applications of APC models across portfolio management roles, with a description of how to create the models to be directly integrated. As a consequence, cash flow calculations are available for each account, and cutoff scores can be set directly from portfolio financial targets.
08/Sep/2025 07:00 PM -
The evolution of goals in AI agents
Forced evolution has been proposed as a possible path to developing artificial general intelligence. For practical reasons, self-replicating robots are being proposed for missions where direct manufacture could be prohibitive or as a cost-effective means to maintain a stable working population of robots. If self-replication occurs in a harsh (i.e. selective) environment, the forces of evolution may distort the originally programmed objectives. Via millions of simulations of AI agents with nematode-level neural networks, this research explores the consequences of allowing replication in a hostile and competitive environment. As the selection pressures are tuned, the evolution of their neural networks and corresponding behavioral changes are tracked. As a consequence of these simulations, agents with multi-layer neural networks trained simply to retrieve resources, consume needed resources, and evade obstacles evolve behaviors that look like evasion of hostile overseers, the intended murder of enemies, and cannibalism of other agents. These simulations are intended to directly address safety concerns around creating self-replicating AI agents or robots. As designers, if we allow replication under selection pressure, regardless of initial designs, we risk allowing the emergence of unintended strategies. One solution to preventing evolution could be to enable AI agents with continuous backup: immortality.
08/Sep/2025 07:00 PM -
Podcast: The hidden math behind credit risk
The conversation covers why traditional machine learning models are missing critical components for accurate risk assessment and how adverse selection has dramatically impacted loan quality in recent vintages. Then Joe makes a bold prediction that software user interfaces are on the verge of a transformation that will render them unrecognizable from previous versions. Curious? All is revealed in this fascinating conversation.
10/Sep/2025 07:00 PM