Whether you know it or not, you're looking at a formula that set Wall Street, and global investors too, up for one of the largest financial asset bubbles and crashes in world history. But it's not quite the villain that some have painted it to be, as we'll see. Called the Gaussian copula function, a name only a statistician could love, this wonder of applied-statistics-meets-financial-engineering is largely credited with giving the CDO market its wings. And for those of you with mortgage market backgrounds, you know that much of the subprime product originated in this country during the boom years found its way into CDO products, because that's where investor demand was strongest. In fact, CDOs grew from a $275 billion market in 2000 to $4.7 trillion by 2006, thanks largely to the subprime mortgage boom that fed global demand for CDO issuance.

The formula's author, David Li, introduced it in 2000 and quickly became a star in the world of finance; at the time, his name was even muttered among those who might one day be considered for a Nobel Prize. Li's insight helped fuel modern securitization's boom, depending on whom you believe. Today? Li no longer resides in the U.S.; he lives in China, seemingly almost as an outcast, and has refused to speak publicly about the formula he introduced.

His story has been the subject of some well-written journalism, both by Felix Salmon in Wired and by Sam Jones at the Financial Times. It's an amazing story: a math whiz from the actuarial ranks, applying theorems used in assessing death rates and broken hearts to the bond market in a way that was at once elegant and easy to understand. I highly recommend reading both features; they're well worth your time. But I want to get beyond the story of how it happened and discuss what really went wrong with the Gaussian copula, because I believe that understanding the answers here is critical to re-establishing investor confidence in mortgage-backed bonds that go beyond the boundaries set by Fannie, Freddie and Ginnie.

Breaking down securitization and asset performance

Let's start with the securitization machine. As complex as the statistics may be, the bottom line in any mortgage-led structured financial derivative comes back to one very simple concept: prepayments. Whether voluntary (refis) or involuntary (borrower defaults), investors must get a handle on prepayment behavior to accurately assess the value and risk embedded in any potential investment. And to do that, two key factors must be understood: first, the probability of default/prepayment; and second, the correlations among individual assets.

From the perspective of a CDO, many of which largely took tranches from previous RMBS securitizations and repackaged them, the same statistical issues remain, if with a slightly different slant: understanding the probability of default, and also understanding the dependent probabilities of default that relate one bond to another. The first factor (probability of default/prepayment) is tough enough to estimate on its own, but the second (correlations between prepayment/default probabilities) is even more important and almost impossibly complex, especially in the case of CDOs. In fact, this very issue has lately been at the forefront of some heated debate among the mathematically literate in the financial world.
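To get a feel for the scale of that complexity, consider the combinatorics alone. A back-of-the-envelope sketch in Python (my own illustration, not anything from Li's work) shows how quickly the pairwise correlations pile up:

```python
# Back-of-the-envelope: the number of pairwise correlations alone grows
# quadratically with pool size, before you even consider higher-order
# dependence among the loans.
for n in (100, 1_000, 10_000):
    pairs = n * (n - 1) // 2
    print(f"{n:>6,} loans -> {pairs:>12,} pairwise correlations")
```

A modest 10,000-loan pool already implies nearly 50 million pairwise correlations, none of which can be observed directly.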
Mortgage servicers, however, already understand this so-called correlation risk intuitively, in decidedly non-mathematical terms. Jobless rates go up? So does a borrower's likelihood of default. A neighbor can't keep up with their payments? Odds are that someone else in the neighborhood faces a similar problem. These correlations are why default activity tends to cluster over time. And most servicing executives know that the variables affecting borrower defaults are nearly infinite, ranging from the incredibly micro to the broadly macroeconomic. Consider the following example of the importance of correlations, courtesy of Felix Salmon in the aforementioned Wired feature:
To understand the mathematics of correlation better, consider something simple, like a kid in an elementary school: Let’s call her Alice. The probability that her parents will get divorced this year is about 5 percent, the risk of her getting head lice is about 5 percent, the chance of her seeing a teacher slip on a banana peel is about 5 percent, and the likelihood of her winning the class spelling bee is about 5 percent. If investors were trading securities based on the chances of those things happening only to Alice, they would all trade at more or less the same price. But something important happens when we start looking at two kids rather than one—not just Alice but also the girl she sits next to, Britney. If Britney’s parents get divorced, what are the chances that Alice’s parents will get divorced, too? Still about 5 percent: The correlation there is close to zero. But if Britney gets head lice, the chance that Alice will get head lice is much higher, about 50 percent—which means the correlation is probably up in the 0.5 range. If Britney sees a teacher slip on a banana peel, what is the chance that Alice will see it, too? Very high indeed, since they sit next to each other: It could be as much as 95 percent, which means the correlation is close to 1. And if Britney wins the class spelling bee, the chance of Alice winning it is zero, which means the correlation is negative: -1.
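Salmon's playground example maps directly onto the copula machinery. As a minimal sketch (mine, not the article's), assume a Gaussian copula ties the two kids' outcomes together: each event keeps its 5 percent marginal probability, and a single correlation parameter, rho, determines how often both events happen at once.

```python
# A minimal sketch, assuming a bivariate Gaussian copula: each event has a
# 5% marginal probability, and the assumed correlation rho alone drives
# the probability that BOTH events occur. Requires scipy.
from scipy.stats import norm, multivariate_normal

p = 0.05                 # marginal probability of each event (5%)
z = norm.ppf(p)          # the 5% threshold on the latent normal scale

for rho in (0.0, 0.5, 0.95):
    mvn = multivariate_normal(mean=[0.0, 0.0],
                              cov=[[1.0, rho], [rho, 1.0]])
    joint = mvn.cdf([z, z])   # P(both events occur)
    print(f"rho={rho:.2f}: P(both) = {joint:.4f}")
```

At rho = 0 the joint probability is just 0.05 x 0.05 = 0.25 percent; as rho approaches 1 it climbs toward the full 5 percent. A single parameter doing all of that work is both the elegance of the approach and, as we'll see, the danger.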
To properly understand the value of more complex mortgage securities and the CDOs backed by them, Wall Street first needed to come to grips with this correlation risk: understanding how prepayments and defaults could cascade upon one another, or act in relative isolation. Li's ground-breaking work seduced Wall Street's money makers with its simplicity: there was no need to calculate a near-infinite web of pairwise correlations when the copula function let you tie individual default probabilities together into a single multivariate distribution, governed by one correlation parameter, and go forward from there.

A technical knockout, in just two punches

Like most statistical methods, the simplicity Li's model afforded came at a cost. First, to estimate something, you must have data; second, you must still make assumptions about the underlying thing being estimated. It's in these two areas that Wall Street's financial wizards went terribly, terribly wrong, and together the two mistakes delivered a one-two punch that killed not just the CDO market but private securitization along with it.

Let's start with the first punch, data, because this was the real breakthrough in Li's work. Rather than working with data on the actual performance of bonds, which in the case of subprime securities was very limited at the time, Li's research used the prices of related credit default swaps as a proxy for bond performance. But the CDS market was itself a relatively new invention of financial alchemy at the time; relying on its price data to predict forward default correlations introduced what statisticians would call a "recency effect." Others might more correctly call it a historical blind spot. The result was an exercise in circular logic: the data led the model to suggest that default correlations were low, so everyone assumed that default correlations were low. It's sort of like saying that home prices can only go up in the future because home prices have gone up for the past five years. When home prices did begin to decline, as they inevitably would, modeled default correlations soared, the market froze, and plenty of bonds started blowing up.

If the market had only misestimated the probabilities of default, the results would still have been bad, but not horrific. The real knockout punch for the securitization market came in the form of a fundamental misunderstanding of the correlations in default risk, not merely a failure to estimate the risk of default correctly. In practice, this meant that not only were at-risk bonds imploding, but bonds that were supposed to be safe began self-destructing as well. The reason lies in the rating agencies' decision to apply the normal (or Gaussian) distribution to the phenomena they were modeling. I've ranted privately about this for over two years, telling anyone who would listen that default correlations were not (and are not) normally distributed for any security backed by mortgage assets; a so-called "fatter tail" is needed. Martin Hutchinson does a great job of speaking to this issue in a recent column I ran across at the New York Times, even if he isn't speaking specifically about mortgages or CDOs:
The Gaussian model is too optimistic about market stability, because it uses an unrealistically high number for the key variable, the exponential rate of decay, known to its friends as alpha (not the alpha of performance measurement). Gauss is at 2. If markets worked with an alpha of zero — known as the Cauchy distribution for its founder, Augustin-Louis Cauchy — [market cataclysms] would come around every 2.5 months. That is unrealistic in the other direction. In 1962, the mathematician Benoit Mandelbrot demonstrated that an alpha of 1.7 provided the best fit with a 100-year series of cotton prices. More recent market history — the 1987 crash, the Long Term Capital Management debacle and the 2007-8 crisis — suggest big bad events occur about once a decade. That goes better with an alpha of 0.5, the Pareto-Levy distribution.
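To make the alpha talk concrete, here is a quick illustration of what tail thickness means in practice (my own sketch, not Hutchinson's; the Student-t with 3 degrees of freedom is simply an assumed stand-in for a fatter-tailed distribution, not a claim about the correct one):

```python
# How rare is a large move under the thin-tailed Gaussian versus a
# fat-tailed alternative? Student-t with 3 degrees of freedom stands in
# here for the heavier-tailed distributions discussed above.
from scipy.stats import norm, t

for k in (3, 5, 10):
    p_gauss = norm.sf(k)        # P(X > k) under the Gaussian
    p_fat = t.sf(k, df=3)       # same threshold under Student-t(3)
    print(f"{k}-unit move: Gaussian 1-in-{1/p_gauss:,.0f}, "
          f"t(3) 1-in-{1/p_fat:,.0f}")
```

Under the Gaussian, a 10-unit move is, for all practical purposes, impossible; under the fat-tailed alternative, it shows up roughly once every few years of daily observations. That gap is the entire argument.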
The point here isn’t that the Pareto-Levy is the inherently correct distribution to use in modeling default correlations among bonds (or even among mortgages in a pool, for that matter). The point is that distributional analysis is, and long has been, a fundamental practice in sound statistical analysis. That the rating agencies seemingly blindly input the Gaussian copula function as their rating tool of choice for CDOs back in 2004, as Sam Jones at the FT has suggested in his work, borders on the criminal to anyone that has ever studied statistics. Doubly so for anyone that had their professors mark them down for failing to consider distribution of data. (Not that such a thing has ever occurred in my academic career. Ahem.) The alpha value for a given distribution can be tested to ensure underlying assumptions about the distribution of data hold water in the face of data being collected. And while he hasn’t given an interview on the matter, I have to think this is precisely what Li meant in 2005 when he told the Wall Street Journal: “Very few people understand the essence of the model.” Li’s tool was a statistical insight, a Swiss Army knife that could have led to a veritable Renaissance in how Wall Street understands and prices risk. And, to be sure, there were plenty working at Wall Street’s investment banks calling attention to the limitations of Li’s model ahead of the current crisis, and even proposing well-thought-out alternatives (there are things known as “empirical copulas,” for example, which can be constructed when the underlying distribution of data is unknown). As I’ve studied the mortgage meltdown in our country more and more, the single largest shame behind the mess we’re now in ultimately lies here — because at least this one aspect was preventable: better models clearly could have helped investors price risk more appropriately, which may have kept a lid on at least some of the crisis. Greedy consumers looking for a cheap loan without documentation of any sort may have found those loans a little harder to come by, if investor demand were restrained by an appropriate understanding of the risks involved. And make no mistake about it: this restraint was supposed to have come from the rating agencies. (Investors would have used a battered bucket to hold whatever was cheap, as a my colleague Linda Lowell is fond of saying.) The amazing thing is that despite all of the complex modeling we’ve seen rating agencies employ, the models used to assess risk in mortgage-related securities didn’t account for what really should have been our basest of all instincts: that is, lending money to individuals without the capacity to repay will never end well. For some reason, the rating agencies weren’t interested in listening to what should have been common sense, or they were simply content to read whatever their “black box” spit out at them. I’m honestly not sure which is a worse fate. Paul Jackson is the publisher of HousingWire.com and HousingWire Magazine. Follow him on Twitter: @pjackson