Author Archives: lstake

False discoveries everywhere

Since John Ioannidis published a 2005 paper provocatively titled “Why Most Published Research Findings Are False,” both researchers and the general public have become more aware of the unreliability of scientific discoveries based on “a single study assessed by formal …

Posted in Uncategorized

Is two still the magic number?

In data analysis, we have come to regard two as the threshold that a t-statistic must clear before a variable is declared statistically significant. As most readers will know, this critical value ensures a 5% level of significance given a …

Posted in statistics, stock market forecasting
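
The rule of thumb in the excerpt above can be checked numerically. The sketch below assumes a large sample, so that the t-statistic is approximately standard normal; that condition is an assumption here, since the excerpt is cut off before stating its own. The helper names `two_sided_p_value` and `two_sided_critical_value` are hypothetical and chosen only for this illustration.

```python
from statistics import NormalDist

# Assumption: large sample, so the t-statistic is treated as standard normal.
std_normal = NormalDist()


def two_sided_p_value(t_stat: float) -> float:
    """Probability of observing |Z| >= |t_stat| under a standard normal null."""
    return 2.0 * (1.0 - std_normal.cdf(abs(t_stat)))


def two_sided_critical_value(alpha: float) -> float:
    """Threshold that |Z| exceeds with probability alpha under the null."""
    return std_normal.inv_cdf(1.0 - alpha / 2.0)


p_at_two = two_sided_p_value(2.0)
crit_5pct = two_sided_critical_value(0.05)

print(f"two-sided p-value at |t| = 2: {p_at_two:.4f}")
print(f"two-sided 5% critical value:  {crit_5pct:.4f}")
```

Under this normal approximation, a threshold of two corresponds to a two-sided level slightly below 5%, and the exact 5% critical value is slightly below two, which is why two works as a convenient round number.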

Data snooping in a nutshell

Data snooping is pervasive in financial research, both in academia and in industry. In my experience, the level of awareness about data snooping varies widely among practitioners. All too often, however, huge amounts of time and effort are wasted by following a …

Posted in machine learning, stock market forecasting

Noise in asset returns

One of the goals of this blog is to discuss various approaches to forecasting asset returns taken from both the economics and machine learning fields. Before diving into specific models and techniques, however, I begin by discussing the issue of noise in …

Posted in stock market, stock market forecasting

The Hedgehog and the Fox Redux

Many fund managers will be aware of Philip Tetlock’s book “Expert Political Judgment,” published in 2005. In the book, Tetlock analyzes forecasts collected from 284 experts over twenty years. While he focuses primarily on the ability of political experts to …

Posted in Uncategorized

Is out-of-sample testing of forecasting models a myth?

When working with forecasting models, a well-known observation is that in-sample performance is usually better, often much better, than out-of-sample performance. That is, a model generally produces better forecasts on the data used to build it than on new data. …

Posted in machine learning, stock market

R/Finance 2014 Conference

Last week I attended the R/Finance conference held in Chicago. About 300 developers, academics, and practitioners gathered at the two-day conference to discuss the latest applications of the R open-source programming language to finance. I’ve mostly coded in MATLAB, but …

Posted in software

Big Data and Economics

Lorie and Fisher. Big Data circa 1960.

For much of its history as a discipline, economics has been trapped in a Small Data paradigm. Macroeconomists analyzed output and inflation using annual or quarterly data spanning several decades at best. Microeconomic …

Posted in Big Data, machine learning