All Papers

Asger Lau Andersen, Emil Toft Hansen, Niels Johannesen, Adam Sheridan
econ.GN q-fin.EC
This paper uses transaction data from a large bank in Scandinavia to estimate the effect of social distancing laws on consumer spending in the COVID-19 pandemic. The analysis exploits a natural experiment to disentangle the effects of the virus and the laws aiming to contain it: Denmark and Sweden were similarly exposed to the pandemic but only Denmark imposed significant restrictions on social and economic activities. We estimate that aggregate spending dropped by around 25 percent in Sweden and, as a result of the shutdown, by 4 additional percentage points in Denmark. This implies that most of the economic contraction is caused by the virus itself and occurs regardless of social distancing laws. The age gradient in the estimates suggests that social distancing reinforces the virus-induced drop in spending for low health-risk individuals but attenuates it for high-risk individuals by lowering the overall prevalence of the virus in society.
Regression to the Tail: Why the Olympics Blow Up
Bent Flyvbjerg, Alexander Budzier, Daniel Lunn
q-fin.GN q-fin.RM
The Olympic Games are the largest, highest-profile, and most expensive megaevent hosted by cities and nations. Average sports-related costs of hosting are $12.0 billion. Non-sports-related costs are typically several times that. Every Olympics since 1960 has run over budget, at an average of 172 percent in real terms, the highest overrun on record for any type of megaproject. The paper tests theoretical statistical distributions against empirical data for the costs of the Games, in order to explain the cost risks faced by host cities and nations. It is documented, for the first time, that cost and cost overrun for the Games follow a power-law distribution. Olympic costs are subject to infinite mean and variance, with dire consequences for predictability and planning. We name this phenomenon "regression to the tail": it is only a matter of time until a new extreme event occurs, with an overrun larger than the largest so far, and thus more disruptive and less plannable. The generative mechanism for the Olympic power law is identified as strong convexity prompted by six causal drivers: irreversibility, fixed deadlines, the Blank Check Syndrome, tight coupling, long planning horizons, and an Eternal Beginner Syndrome. The power law explains why the Games are so difficult to plan and manage successfully, and why cities and nations should think twice before bidding to host. Based on the power law, two heuristics are identified for better decision making on hosting. Finally, the paper develops measures for good practice in planning and managing the Games, including how to mitigate the extreme risks of the Olympic power law.
It's a Trap: Emperor Palpatine's Poison Pill
Zachary Feinstein
In this paper we study the financial repercussions of the destruction of two fully armed and operational moon-sized battle stations ("Death Stars") in a 4-year period and the dissolution of the galactic government in Star Wars. The emphasis of this work is to calibrate and simulate a model of the banking and financial systems within the galaxy. Along these lines, we measure the level of systemic risk that may have been generated by the death of Emperor Palpatine and the destruction of the second Death Star. We conclude by finding the economic resources the Rebel Alliance would need to have in reserve in order to prevent a financial crisis from gripping the galaxy through an optimally allocated banking bailout.
Racial Disparities in Voting Wait Times: Evidence from Smartphone Data
M. Keith Chen, Kareem Haggag, Devin G. Pope, Ryne Rohla
econ.GN q-fin.EC stat.AP
Equal access to voting is a core feature of democratic government. Using data from millions of smartphone users, we quantify a racial disparity in voting wait times across a nationwide sample of polling places during the 2016 U.S. presidential election. Relative to entirely-white neighborhoods, residents of entirely-black neighborhoods waited 29% longer to vote and were 74% more likely to spend more than 30 minutes at their polling place. This disparity holds when comparing predominantly white and black polling places within the same states and counties, and survives numerous robustness and placebo tests. We shed light on the mechanism for these results and discuss how geospatial data can be an effective tool to both measure and monitor these disparities going forward.
Turing's Children: Representation of Sexual Minorities in STEM
Dario Sansone, Christopher S. Carpenter
econ.GN q-fin.EC
We provide the first nationally representative estimates of sexual minority representation in STEM fields by studying 142,641 men and women in same-sex couples from the 2009-2018 American Community Surveys. These data indicate that men in same-sex couples are 12 percentage points less likely to have completed a bachelor's degree in a STEM field compared to men in different-sex couples; there is no gap observed for women in same-sex couples compared to women in different-sex couples. The STEM gap between men in same-sex and different-sex couples is larger than the STEM gap between white and black men but is smaller than the gender STEM gap. We also document a gap in STEM occupations between men in same-sex and different-sex couples, and we replicate this finding using independently drawn data from the 2013-2018 National Health Interview Surveys. These differences persist after controlling for demographic characteristics, location, and fertility. Our findings further the call for interventions designed at increasing representation of sexual minorities in STEM.
Election Predictions as Martingales: An Arbitrage Approach
Nassim Nicholas Taleb
q-fin.PR physics.soc-ph
We consider the estimation of binary election outcomes as martingales and propose an arbitrage pricing when one continuously updates estimates. We argue that the estimator needs to be priced as a binary option as the arbitrage valuation minimizes the conventionally used Brier score for tracking the accuracy of probability assessors. We create a dual martingale process $Y$, in $[L,H]$ from the standard arithmetic Brownian motion, $X$ in $(-\infty, \infty)$ and price elections accordingly. The dual process $Y$ can represent the numerical votes needed for success. We show the relationship between the volatility of the estimator in relation to that of the underlying variable. When there is a high uncertainty about the final outcome, 1) the arbitrage value of the binary gets closer to 50\%, 2) the estimate should not undergo large changes even if polls or other bases show significant variations. There are arbitrage relationships between 1) the binary value, 2) the estimation of $Y$, 3) the volatility of the estimation of $Y$ over the remaining time to expiration. We note that these arbitrage relationships were often violated by the various forecasting groups in the U.S. presidential elections of 2016, as well as the notion that all intermediate assessments of the success of a candidate need to be considered, not just the final one.
The Strength of Absent Ties: Social Integration via Online Dating
Josue Ortega, Philipp Hergovich
physics.soc-ph cs.SI q-fin.EC
We used to marry people to whom we were somehow connected. Since we were more connected to people similar to us, we were also likely to marry someone from our own race. However, online dating has changed this pattern; people who meet online tend to be complete strangers. We investigate the effects of those previously absent ties on the diversity of modern societies. We find that social integration occurs rapidly when a society benefits from new connections. Our analysis of state-level data on interracial marriage and broadband adoption (proxy for online dating) suggests that this integration process is significant and ongoing.
On Single Point Forecasts for Fat-Tailed Variables
Nassim Nicholas Taleb, Yaneer Bar-Yam, Pasquale Cirillo
physics.soc-ph econ.GN q-fin.EC stat.AP stat.ME
We discuss common errors and fallacies when using naive "evidence based" empiricism and point forecasts for fat-tailed variables, as well as the insufficiency of using naive first-order scientific methods for tail risk management. We use the COVID-19 pandemic as the background for the discussion and as an example of a phenomenon characterized by a multiplicative nature, and what mitigating policies must result from the statistical properties and associated risks. In doing so, we also respond to the points raised by Ioannidis et al. (2020).
How predictable is technological progress?
J. Doyne Farmer, Francois Lafond
q-fin.EC physics.soc-ph
Recently it has become clear that many technologies follow a generalized version of Moore's law, i.e. costs tend to drop exponentially, at different rates that depend on the technology. Here we formulate Moore's law as a correlated geometric random walk with drift, and apply it to historical data on 53 technologies. We derive a closed form expression approximating the distribution of forecast errors as a function of time. Based on hind-casting experiments we show that this works well, making it possible to collapse the forecast errors for many different technologies at different time horizons onto the same universal distribution. This is valuable because it allows us to make forecasts for any given technology with a clear understanding of the quality of the forecasts. As a practical demonstration we make distributional forecasts at different time horizons for solar photovoltaic modules, and show how our method can be used to estimate the probability that a given technology will outperform another technology at a given point in the future.
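The random-walk formulation in this abstract can be illustrated with a minimal sketch. It assumes independent Gaussian noise rather than the correlated noise the abstract mentions, and every name and parameter value (`simulate_log_cost`, `estimate_drift`, `point_forecast`, the drift of 0.10) is hypothetical and chosen only for illustration, not taken from the paper.

```python
import math
import random


def simulate_log_cost(y0, drift, sigma, steps, seed=0):
    """Simulate log-cost y_t = y_{t-1} - drift + sigma * noise_t.

    Noise is i.i.d. standard normal here; the abstract's model uses
    correlated noise, which this simplified sketch leaves out.
    """
    rng = random.Random(seed)
    path = [y0]
    for _ in range(steps):
        path.append(path[-1] - drift + sigma * rng.gauss(0.0, 1.0))
    return path


def estimate_drift(path):
    """Estimate the drift as the average one-step decline in log-cost."""
    return (path[0] - path[-1]) / (len(path) - 1)


def point_forecast(last_log_cost, drift_hat, horizon):
    """Forecast log-cost `horizon` steps ahead by extrapolating the drift."""
    return last_log_cost - drift_hat * horizon


# Hypothetical technology: cost starts at 100 and falls about 10% per step.
history = simulate_log_cost(y0=math.log(100.0), drift=0.10, sigma=0.05, steps=30)
drift_hat = estimate_drift(history)
forecast_cost = math.exp(point_forecast(history[-1], drift_hat, horizon=5))
```

Working in log-cost makes an exponential decline a straight line, so the drift estimate is just the average slope of the observed history.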
Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters in the world economy
Jaehyuk Park, Ian Wood, Elise Jing, Azadeh Nematzadeh, Souvik Ghosh, Michael Conover, Yong-Yeol Ahn
cs.SI physics.soc-ph q-fin.GN
Groups of firms often achieve a competitive advantage through the formation of geo-industrial clusters. Although many exemplary clusters, such as Hollywood or Silicon Valley, have been frequently studied, systematic approaches to identify and analyze the hierarchical structure of the geo-industrial clusters at the global scale are rare. In this work, we use LinkedIn's employment histories of more than 500 million users over 25 years to construct a labor flow network of over 4 million firms across the world and apply a recursive network community detection algorithm to reveal the hierarchical structure of geo-industrial clusters. We show that the resulting geo-industrial clusters exhibit a stronger association between the influx of educated-workers and financial performance, compared to existing aggregation units. Furthermore, our additional analysis of the skill sets of educated-workers supplements the relationship between the labor flow of educated-workers and productivity growth. We argue that geo-industrial clusters defined by labor flow provide better insights into the growth and the decline of the economy than other common economic units.
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Stephan Zheng, Alexander Trott, Sunil Srinivasa, Nikhil Naik, Melvin Gruesbeck, David C. Parkes, Richard Socher
econ.GN cs.LG q-fin.EC stat.ML
Tackling real-world socio-economic challenges requires designing and testing economic policies. However, this is hard in practice, due to a lack of appropriate (micro-level) economic data and limited opportunity to experiment. In this work, we train social planners that discover tax policies in dynamic economies that can effectively trade-off economic equality and productivity. We propose a two-level deep reinforcement learning approach to learn dynamic tax policies, based on economic simulations in which both agents and a government learn and adapt. Our data-driven approach does not make use of economic modeling assumptions, and learns from observational data alone. We make four main contributions. First, we present an economic simulation environment that features competitive pressures and market dynamics. We validate the simulation by showing that baseline tax systems perform in a way that is consistent with economic theory, including in regard to learned agent behaviors and specializations. Second, we show that AI-driven tax policies improve the trade-off between equality and productivity by 16% over baseline policies, including the prominent Saez tax framework. Third, we showcase several emergent features: AI-driven tax policies are qualitatively different from baselines, setting a higher top tax rate and higher net subsidies for low incomes. Moreover, AI-driven tax policies perform strongly in the face of emergent tax-gaming strategies learned by AI agents. Lastly, AI-driven tax policies are also effective when used in experiments with human participants. In experiments conducted on MTurk, an AI tax policy provides an equality-productivity trade-off that is similar to that provided by the Saez framework along with higher inverse-income weighted social welfare.
Uncovering Offshore Financial Centers: Conduits and Sinks in the Global Corporate Ownership Network
Javier Garcia-Bernardo, Jan Fichtner, Eelke M. Heemskerk, Frank W. Takes
physics.soc-ph q-fin.GN
Multinational corporations use highly complex structures of parents and subsidiaries to organize their operations and ownership. Offshore Financial Centers (OFCs) facilitate these structures through low taxation and lenient regulation, but are increasingly under scrutiny, for instance for enabling tax avoidance. Therefore, the identification of OFC jurisdictions has become a politicized and contested issue. We introduce a novel data-driven approach for identifying OFCs based on the global corporate ownership network, in which over 98 million firms (nodes) are connected through 71 million ownership relations. This granular firm-level network data uniquely allows identifying both sink-OFCs and conduit-OFCs. Sink-OFCs attract and retain foreign capital while conduit-OFCs are attractive intermediate destinations in the routing of international investments and enable the transfer of capital without taxation. We identify 24 sink-OFCs. In addition, a small set of five countries -- the Netherlands, the United Kingdom, Ireland, Singapore and Switzerland -- canalize the majority of corporate offshore investment as conduit-OFCs. Each conduit jurisdiction is specialized in a geographical area and there is significant specialization based on industrial sectors. Against the idea of OFCs as exotic small islands that cannot be regulated, we show that many sink and conduit-OFCs are highly developed countries.
Valid t-ratio Inference for IV
David S. Lee, Justin McCrary, Marcelo J. Moreira, Jack Porter
econ.EM econ.GN q-fin.EC
In the single IV model, current practice relies on the first-stage F exceeding some threshold (e.g., 10) as a criterion for trusting t-ratio inferences, even though this yields an anti-conservative test. We show that a true 5 percent test instead requires an F greater than 104.7. Maintaining 10 as a threshold requires replacing the critical value 1.96 with 3.43. We re-examine 57 AER papers and find that corrected inference causes half of the initially presumed statistically significant results to be insignificant. We introduce a more powerful test, the tF procedure, which provides F-dependent adjusted t-ratio critical values.
Curse of Democracy: Evidence from the 21st Century
Yusuke Narita, Ayumi Sudo
econ.GN q-fin.EC stat.AP
Democracy is widely believed to contribute to economic growth and public health in the 20th and earlier centuries. We find that this conventional wisdom is reversed in this century, i.e., democracy has persistent negative impacts on GDP growth during 2001-2020. This finding emerges from five different instrumental variable strategies. Our analysis suggests that democracies cause slower growth through less investment and trade. For 2020, democracy is also found to cause more deaths from Covid-19.
Does Infrastructure Investment Lead to Economic Growth or Economic Fragility? Evidence from China
Atif Ansar, Bent Flyvbjerg, Alexander Budzier, Daniel Lunn
q-fin.GN q-fin.EC
The prevalent view in the economics literature is that a high level of infrastructure investment is a precursor to economic growth. China is especially held up as a model to emulate. Based on the largest dataset of its kind, this paper punctures the twin myths that, first, infrastructure creates economic value, and, second, China has a distinct advantage in its delivery. Far from being an engine of economic growth, the typical infrastructure investment fails to deliver a positive risk adjusted return. Moreover, China's track record in delivering infrastructure is no better than that of rich democracies. Where investments are debt-financed, overinvesting in unproductive projects results in the buildup of debt, monetary expansion, instability in financial markets, and economic fragility, exactly as we see in China today. We conclude that poorly managed infrastructure investments are a main explanation of surfacing economic and financial problems in China. We predict that, unless China shifts to a lower level of higher-quality infrastructure investments, the country is headed for an infrastructure-led national financial and economic crisis, which is likely also to be a crisis for the international economy. China's infrastructure investment model is not one to follow for other countries but one to avoid.
Evaluating gambles using dynamics
Ole Peters, Murray Gell-Mann
q-fin.EC cond-mat.stat-mech q-fin.GN
Gambles are random variables that model possible changes in monetary wealth. Classic decision theory transforms money into utility through a utility function and defines the value of a gamble as the expectation value of utility changes. Utility functions aim to capture individual psychological characteristics, but their generality limits predictive power. Expectation value maximizers are defined as rational in economics, but expectation values are only meaningful in the presence of ensembles or in systems with ergodic properties, whereas decision-makers have no access to ensembles and the variables representing wealth in the usual growth models do not have the relevant ergodic properties. Simultaneously addressing the shortcomings of utility and those of expectations, we propose to evaluate gambles by averaging wealth growth over time. No utility function is needed, but a dynamic must be specified to compute time averages. Linear and logarithmic "utility functions" appear as transformations that generate ergodic observables for purely additive and purely multiplicative dynamics, respectively. We highlight inconsistencies throughout the development of decision theory, whose correction clarifies that our perspective is legitimate. These invalidate a commonly cited argument for bounded utility functions.
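The gap between expectation values and time averages under multiplicative dynamics can be seen in a small worked example. The gamble below (wealth multiplied by 1.5 or 0.6 with equal probability) is a hypothetical illustration, not an example taken from the abstract, and the function names are likewise assumptions.

```python
import math


def expected_growth_factor(outcomes):
    """Ensemble (expectation) average of the per-round wealth multiplier."""
    return sum(p * m for p, m in outcomes)


def time_average_growth_factor(outcomes):
    """Per-round growth factor experienced by one player over a long time.

    Under purely multiplicative dynamics this is the geometric mean,
    i.e. exp(E[log multiplier]).
    """
    return math.exp(sum(p * math.log(m) for p, m in outcomes))


# Hypothetical gamble: (probability, wealth multiplier) pairs.
coin_toss = [(0.5, 1.5), (0.5, 0.6)]

ensemble = expected_growth_factor(coin_toss)      # 1.05: looks favourable
time_avg = time_average_growth_factor(coin_toss)  # sqrt(0.9): wealth decays
```

The expectation exceeds 1 while the time-average factor is below 1, so the same gamble looks attractive across an ensemble yet shrinks the wealth of a single player who repeats it. Taking the logarithm of the multiplier is what turns the multiplicative process into an additive one whose average is meaningful over time.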
Statistical Basis for Predicting Technological Progress
Bela Nagy, J. Doyne Farmer, Quan M. Bui, Jessika E. Trancik
physics.soc-ph q-fin.GN stat.AP
Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly tied. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.
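The observation attributed to Sahal can be checked numerically with a short sketch. It assumes that cumulative production grows exactly exponentially, x(t) = a·exp(g·t), and all parameter values and function names are hypothetical.

```python
import math


def wright_cost(cumulative_production, b, w):
    """Wright's law: cost falls as a power law of cumulative production."""
    return b * cumulative_production ** (-w)


def moore_cost(t, c0, m):
    """Generalized Moore's law: cost falls exponentially with time."""
    return c0 * math.exp(-m * t)


# Hypothetical parameters, chosen only for illustration.
b, w = 50.0, 0.4   # Wright's law scale and exponent
a, g = 10.0, 0.25  # cumulative production x(t) = a * exp(g * t)

# Substituting x(t) into Wright's law gives Moore's law with these values.
c0 = b * a ** (-w)
m = w * g

# Largest difference between the two laws over 21 time steps.
max_gap = max(
    abs(wright_cost(a * math.exp(g * t), b, w) - moore_cost(t, c0, m))
    for t in range(21)
)
```

Under this assumption the two laws produce the same cost curve, with Moore rate m = w·g, which is why data alone cannot separate them when production grows exponentially.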
On the Statistical Differences between Binary Forecasts and Real World Payoffs
Nassim Nicholas Taleb
q-fin.GN physics.soc-ph q-fin.RM
What do binary (or probabilistic) forecasting abilities have to do with overall performance? We map the difference between (univariate) binary predictions, bets and "beliefs" (expressed as a specific "event" will happen/will not happen) and real-world continuous payoffs (numerical benefits or harm from an event) and show the effect of their conflation and mischaracterization in the decision-science literature. We also examine the differences under thin and fat tails. The effects are: A- Spuriousness of many psychological results particularly those documenting that humans overestimate tail probabilities and rare events, or that they overreact to fears of market crashes, ecological calamities, etc. Many perceived "biases" are just mischaracterizations by psychologists. There is also a misuse of Hayekian arguments in promoting prediction markets. We quantify such conflations with a metric for "pseudo-overestimation". B- Being a "good forecaster" in binary space doesn't lead to having a good actual performance, and vice versa, especially under nonlinearities. A binary forecasting record is likely to be a reverse indicator under some classes of distributions. Deeper uncertainty or more complicated and realistic probability distributions worsen the conflation. C- Machine Learning: Some nonlinear payoff functions, while not lending themselves to verbalistic expressions and "forecasts", are well captured by ML or expressed in option contracts. D- Fattailedness: The difference is exacerbated in the power law classes of probability distributions.
Bitcoin, Currencies, and Fragility
Nassim Nicholas Taleb
econ.GN physics.soc-ph q-fin.EC q-fin.GN
This discussion applies quantitative finance methods and economic arguments to cryptocurrencies in general and bitcoin in particular -- as there are about $10,000$ cryptocurrencies, we focus (unless otherwise specified) on the most discussed crypto of those that claim to hew to the original protocol (Nakamoto 2009) and the one with, by far, the largest market capitalization. In its current version, in spite of the hype, bitcoin failed to satisfy the notion of "currency without government" (it proved to not even be a currency at all), can be neither a short nor long term store of value (its expected value is no higher than $0$), cannot operate as a reliable inflation hedge, and, worst of all, does not constitute, not even remotely, a safe haven for one's investments, a shield against government tyranny, or a tail protection vehicle for catastrophic episodes. Furthermore, bitcoin promoters appear to conflate the success of a payment mechanism (as a decentralized mode of exchange), which so far has failed, with the speculative variations in the price of a zero-sum maximally fragile asset with massive negative externalities. Going through monetary history, we show how a true numeraire must be one of minimum variance with respect to an arbitrary basket of goods and services, how gold and silver lost their inflation hedge status during the Hunt brothers squeeze in the late 1970s and what would be required from a true inflation hedged store of value.
Quantum attacks on Bitcoin, and how to protect against them
Divesh Aggarwal, Gavin K. Brennen, Troy Lee, Miklos Santha, Marco Tomamichel
quant-ph q-fin.GN
The key cryptographic protocols used to secure the internet and financial transactions of today are all susceptible to attack by the development of a sufficiently large quantum computer. One particular area at risk is cryptocurrencies, a market currently worth over 150 billion USD. We investigate the risk of Bitcoin, and other cryptocurrencies, to attacks by quantum computers. We find that the proof-of-work used by Bitcoin is relatively resistant to substantial speedup by quantum computers in the next 10 years, mainly because specialized ASIC miners are extremely fast compared to the estimated clock speed of near-term quantum computers. On the other hand, the elliptic curve signature scheme used by Bitcoin is much more at risk, and could be completely broken by a quantum computer as early as 2027, by the most optimistic estimates. We analyze an alternative proof-of-work called Momentum, based on finding collisions in a hash function, that is even more resistant to speedup by a quantum computer. We also review the available post-quantum signature schemes to see which one would best meet the security and efficiency requirements of blockchain applications.
A Short Note on P-Value Hacking
Nassim Nicholas Taleb
stat.AP q-fin.ST
We present the expected values from p-value hacking as a choice of the minimum p-value among $m$ independent tests, which can be considerably lower than the "true" p-value, even with a single trial, owing to the extreme skewness of the meta-distribution. We first present an exact probability distribution (meta-distribution) for p-values across ensembles of statistically identical phenomena. We derive the distribution for small samples $2<n \leq n^*\approx 30$ as well as the limiting one as the sample size $n$ becomes large. We also look at the properties of the "power" of a test through the distribution of its inverse for a given p-value and parametrization. The formulas allow the investigation of the stability of the reproduction of results and "p-hacking" and other aspects of meta-analysis. P-values are shown to be extremely skewed and volatile, regardless of the sample size $n$, and vary greatly across repetitions of exactly the same protocols under identical stochastic copies of the phenomenon; such volatility makes the minimum $p$ value diverge significantly from the "true" one. Setting the power is shown to offer little remedy unless sample size is increased markedly or the p-value is lowered by at least one order of magnitude.
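The effect of reporting the minimum p-value among $m$ independent tests can be illustrated in a simpler setting than the one the abstract treats. The sketch below assumes a true null hypothesis, so that each p-value is uniform on [0, 1]; it does not implement the paper's meta-distribution, and the function names and the choice of $m = 10$ are hypothetical.

```python
import random


def prob_min_p_below(alpha, m):
    """P(min of m independent uniform p-values <= alpha) = 1 - (1 - alpha)^m."""
    return 1.0 - (1.0 - alpha) ** m


def simulate_min_p(m, trials, seed=0):
    """Draw the minimum of m uniform p-values, repeated `trials` times."""
    rng = random.Random(seed)
    return [min(rng.random() for _ in range(m)) for _ in range(trials)]


alpha, m = 0.05, 10
exact = prob_min_p_below(alpha, m)  # about 0.40
draws = simulate_min_p(m, trials=20000)
empirical = sum(p <= alpha for p in draws) / len(draws)
```

Even with no real effect, picking the best of ten tests crosses the 0.05 threshold about 40 percent of the time in this simplified setting, which gives a sense of how far a minimum p-value can sit from a single honest one.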
Managing COVID-19 Pandemic without Destructing the Economy
David Gershon, Alexander Lipton, Hagai Levine
q-bio.PE q-fin.MF
We analyze an approach to managing the COVID-19 pandemic without shutting down the economy while staying within the capacity of the healthcare system. We base our analysis on a detailed heterogeneous epidemiological model, which takes into account different population groups and phases of the disease, including incubation, infection period, hospitalization, and treatment in the intensive care unit (ICU). We model the healthcare capacity as the total number of hospital and ICU beds for the whole country. We calibrate the model parameters to data reported in several recent research papers. For high- and low-risk population groups, we calculate the number of total and intensive care hospitalizations, and deaths as functions of time. The main conclusion is that countries that enforce reasonable hygienic measures on time can avoid lockdowns throughout the pandemic provided that the number of spare ICU beds per million is above the threshold of about 100. In countries where the total number of ICU beds is below this threshold, a limited-period quarantine of specific high-risk groups of the population suffices. Furthermore, in the case of an inadequate capacity of the healthcare system, we incorporate a feedback loop and demonstrate the quantitative impact of the lack of ICU units on the death curve. In the case of inadequate ICU beds, the outcomes of full- and partial-quarantine scenarios are almost identical, making it unnecessary to shut down the whole economy. We conclude that only a limited-time quarantine of the high-risk group might be necessary, while the rest of the economy can remain operational.
The Oxford Olympics Study 2016: Cost and Cost Overrun at the Games
Bent Flyvbjerg, Allison Stewart, Alexander Budzier
Given that Olympic Games held over the past decade each have cost USD 8.9 billion on average, the size and financial risks of the Games warrant study. The objectives of the Oxford Olympics study are to (1) establish the actual outturn costs of previous Olympic Games in a manner where cost can consistently be compared across Games; (2) establish cost overruns for previous Games, i.e., the degree to which final outturn costs reflect projected budgets at the bid stage, again in a way that allows comparison across Games; (3) test whether the Olympic Games Knowledge Management Program has reduced cost risk for the Games, and, finally, (4) benchmark cost and cost overrun for the Rio 2016 Olympics against previous Games. The main contribution of the Oxford study is to establish a phenomenology of cost and cost overrun at the Olympics, which allows consistent and systematic comparison across Games. This has not been done before. The study concludes that for a city and nation to decide to stage the Olympic Games is to decide to take on one of the most costly and financially most risky types of megaproject that exist, something that many cities and nations have learned to their peril.
Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications
Nassim Nicholas Taleb
stat.OT q-fin.RM stat.AP stat.ME
The monograph investigates the misapplication of conventional statistical techniques to fat tailed distributions and looks for remedies, when possible. Switching from thin tailed to fat tailed distributions requires more than "changing the color of the dress". Traditional asymptotics deal mainly with either n=1 or $n=\infty$, and the real world is in between, under the "laws of the medium numbers" -- which vary widely across specific distributions. Both the law of large numbers and the generalized central limit mechanisms operate in highly idiosyncratic ways outside the standard Gaussian or Levy-Stable basins of convergence. A few examples: + The sample mean is rarely in line with the population mean, with effect on "naive empiricism", but can sometimes be estimated via parametric methods. + The "empirical distribution" is rarely empirical. + Parameter uncertainty has compounding effects on statistical metrics. + Dimension reduction (principal components) fails. + Inequality estimators (GINI or quantile contributions) are not additive and produce wrong results. + Many "biases" found in psychology become entirely rational under more sophisticated probability distributions. + Most of the failures of financial economics, econometrics, and behavioral economics can be attributed to using the wrong distributions. This book, the first volume of the Technical Incerto, weaves a narrative around published journal articles.

Monthly Topics

ML Papers
Crypto Papers
Monotonicity for AI ethics and society: An empirical study of the...
Algorithmic fairness in the application of artificial intelligence (AI) is essential for a better society. As the foundational axiom of social mechanisms, fairness consists of multiple facets. Although the machine learning (ML) community has focused on intersectionality as a matter of statistical parity, especially in discrimination issues, an emerging body of literature addresses another facet -- monotonicity. Based on domain expertise, monotonicity plays a vital role in numerous fairness-related areas, where violations could misguide human decisions and lead to disastrous consequences. In this paper, we first systematically evaluate the significance of applying monotonic neural additive models (MNAMs), which use a fairness-aware ML algorithm to enforce both individual and pairwise monotonicity principles, for fairness in AI ethics and society. We have found, through a hybrid method of theoretical reasoning, simulation, and extensive empirical analysis, that considering monotonicity axioms is essential in all areas of fairness, including criminology, education, health care, and finance. Our research contributes to the interdisciplinary research at the interface of AI ethics, explainable AI (XAI), and human-computer interaction (HCI). By evidencing the catastrophic consequences if monotonicity is not met, we address the significance of monotonicity requirements in AI applications. Furthermore, we demonstrate that MNAMs are an effective fairness-aware ML approach by imposing monotonicity restrictions integrating human intelligence.
Quant 4.0: Engineering Quantitative Investment with Automated,...
Quantitative investment ("quant") is an interdisciplinary field combining financial engineering, computer science, mathematics, statistics, etc. Quant has become one of the mainstream investment methodologies over the past decades, and has experienced three generations: Quant 1.0, trading by mathematical modeling to discover mis-priced assets in markets; Quant 2.0, shifting the quant research pipeline from small "strategy workshops" to large "alpha factories"; Quant 3.0, applying deep learning techniques to discover complex nonlinear pricing rules. Despite its advantage in prediction, deep learning relies on extremely large data volumes and labor-intensive tuning of "black-box" neural network models. To address these limitations, in this paper, we introduce Quant 4.0 and provide an engineering perspective for next-generation quant. Quant 4.0 has three key differentiating components. First, automated AI changes the quant pipeline from traditional hand-crafted modeling to state-of-the-art automated modeling, practicing the philosophy of "algorithm produces algorithm, model builds model, and eventually AI creates AI". Second, explainable AI develops new techniques to better understand and interpret investment decisions made by machine learning black-boxes, and explains complicated and hidden risk exposures. Third, knowledge-driven AI is a supplement to data-driven AI such as deep learning, and it incorporates prior knowledge into modeling to improve investment decisions, in particular for quantitative value investing. Moreover, we discuss how to build a system that practices the Quant 4.0 concept. Finally, we propose ten challenging research problems for quant technology, and discuss potential solutions, research directions, and future trends.
Robust machine learning pipelines for trading market-neutral stock...

Cover Papers