AI News, Why is machine learning in finance so hard?

Why is machine learning in finance so hard?

Financial markets have been one of the earliest adopters of machine learning (ML).

Even though ML has had enormous successes in predicting the market outcomes in the past, the recent advances in deep learning haven’t helped financial market predictions much.

Even though there are a number of papers claiming the successful application of deep learning models, I view those results with skepticism.

The issue of data distribution is crucial - almost all research papers doing financial predictions miss this point.

In image classification, for example, we expect the distribution of pixel values in the training set for the dog class to be similar to the distribution in the test set for the dog class.

In addition to making sure the test and train sets have similar distributions, you also have to make sure the trained model is used in production only when the future data adheres to the train/validation distribution.
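One way to make that production check concrete is to compare a feature's training values with its recent live values before trusting the model. The sketch below uses the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions). The function names and the 0.2 threshold are illustrative assumptions, not something the original post prescribes.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    gap between the empirical CDFs of the two samples (0 = identical,
    1 = completely separated)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at
    # observed values, so checking every observed value is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def looks_like_training(train_values, live_values, threshold=0.2):
    """Hypothetical gate: True when live data has not drifted beyond an
    illustrative threshold (0.2 is an assumption, tune it per feature)."""
    return ks_statistic(train_values, live_values) < threshold
```

A gate like this only looks at one feature at a time, so it can miss shifts in the joint distribution; it is a first sanity check rather than a guarantee.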

While most researchers have been mindful not to incorporate look-ahead bias into their research, almost everyone fails to acknowledge the issue of evolving data distributions.
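A common way to respect time ordering, and to retrain as the data evolves, is a rolling (walk-forward) split in which every test window lies strictly after its training window. The helper below is a minimal sketch; its name and parameters are hypothetical, and the post itself does not describe a specific splitting scheme.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs over a time-ordered
    dataset. Each test window starts right after its training window,
    so no future observation leaks into training."""
    if train_size < 1 or test_size < 1:
        raise ValueError("train_size and test_size must be positive")
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Slide forward by one test window so the model is refit on
        # progressively more recent data.
        start += test_size


# Example: 10 observations, train on 4, test on the next 2.
splits = list(walk_forward_splits(10, train_size=4, test_size=2))
```

Comparing performance across successive windows also gives a rough view of how quickly a fitted relationship decays as the distribution moves.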

For example, even if we have a complete understanding of what happened during the great depression of the 1930s, it’s hard to convert it to a form that makes it usable for an automated learning process.

(Please note that mixture of experts is a very common technique to combine the models from the same scale - almost all quant asset management firms employ this technique.)

If there is one thing you take away from this post, let it be this: Financial time-series is a partial information game (POMDP) that’s really hard even for humans - we shouldn’t expect machines and algorithms to suddenly surpass human ability there.

What these algorithms are good at is the ability to unemotionally spot a hardcoded pattern and act on it - this unemotionality is a double-edged sword though - sometimes it helps and other times it doesn’t.

Financial modeling

Financial modeling is the task of building an abstract representation (a model) of a real world financial situation.[1] This is a mathematical model designed to represent (a simplified version of) the performance of a financial asset or portfolio of a business, project, or any other investment.

While there has been some debate in the industry as to the nature of financial modeling—whether it is a tradecraft, such as welding, or a science—the task of financial modeling has been gaining acceptance and rigor over the years.[2] Typically, financial modeling is understood to mean an exercise in either asset pricing or corporate finance, of a quantitative nature.

line items, often incorporate “unrealistic implicit assumptions” and “internal inconsistencies”.[8] (For example, a forecast for growth in revenue but without corresponding increases in working capital, fixed assets and the associated financing, may embed unrealistic assumptions about asset turnover, leverage and/or equity financing.) What is required, but often lacking, is that all key elements are explicitly and consistently forecasted.

Relatedly, modellers often 'fail to identify crucial assumptions' relating to inputs, 'and to explore what can go wrong'.[9] Here, in general, modellers 'use point values and simple arithmetic instead of probability distributions and statistical measures'[10] (i.e., as mentioned, the problems are treated as deterministic in nature) and thus calculate a single value for the asset or project, but without providing information on the range, variance and sensitivity of outcomes.[11] Other critiques discuss the lack of basic computer programming concepts.[12] More serious criticism, in fact, relates to the nature of budgeting itself, and its impact on the organization.[13][14]

The Financial Modeling World Championships, known as ModelOff, have been held since 2012.
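The critique of point values versus probability distributions can be illustrated with a small Monte Carlo simulation: instead of one net present value (NPV), the model reports a range and a spread. Everything numeric below (outlay, cash flow, growth assumptions, discount rate) is made up for illustration, and the function names are hypothetical.

```python
import random
import statistics


def npv(cash_flows, rate):
    """Net present value; cash_flows[0] occurs today (t = 0)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def simulate_npv(initial_outlay, base_cash_flow, years, rate,
                 growth_mean, growth_sd, n_trials=10_000, seed=0):
    """Draw yearly growth rates from a normal distribution (an
    illustrative assumption) and return one NPV per simulated path."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        flows = [-initial_outlay]
        cash = base_cash_flow
        for _ in range(years):
            cash *= 1 + rng.gauss(growth_mean, growth_sd)
            flows.append(cash)
        results.append(npv(flows, rate))
    return results


# Illustrative inputs only.
results = simulate_npv(1000.0, 250.0, years=5, rate=0.08,
                       growth_mean=0.03, growth_sd=0.10)
cut_points = statistics.quantiles(results, n=20)  # 5%, 10%, ..., 95%
summary = {
    "mean": statistics.mean(results),
    "stdev": statistics.stdev(results),
    "p5": cut_points[0],
    "p95": cut_points[-1],
}
```

The `summary` dictionary carries the information the critique says is usually missing: a central estimate together with the variance and a 5% to 95% range of outcomes.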

These problems are generally stochastic and continuous in nature, and models here thus require complex algorithms, entailing computer simulation, advanced numerical methods (such as numerical differential equations, numerical linear algebra, dynamic programming) and/or the development of optimization models.

This Model risk is the subject of ongoing research by finance academics, and is a topic of great, and growing, interest in the risk management arena.[19] Criticism of the discipline (often preceding the financial crisis of 2007–08 by several years) emphasizes the differences between the mathematical and physical sciences, and finance, and the resultant caution to be applied by modelers, and by traders and risk managers using their models.

Predictive analytics

Predictive analytics encompasses a variety of statistical techniques from predictive modelling, machine learning, and data mining that analyze current and historical facts to make predictions about future or otherwise unknown events.[1][2] The term ‘predictive analytics’ was first coined by Dr Barbara Lond on her LinkedIn profile.

The term was used to describe one aspect of her PhD work, in which she analyzed a range of variables, both numeric and qualitative (the latter coded as numeric for analysis purposes), to ‘predict’ men’s and women’s career progression in an organization.

The qualitative research used the results of the quantitative analysis to explore the ‘gender’ aspect of the research further; for it, she interviewed 8 top women bankers to understand their experiences within their organizations.

Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.[3] The defining functional effect of these technical approaches is that predictive analytics provides a predictive score (probability) for each individual (customer, employee, healthcare patient, product SKU, vehicle, component, machine, or other organizational unit) in order to determine, inform, or influence organizational processes that pertain across large numbers of individuals, such as in marketing, credit risk assessment, fraud detection, manufacturing, healthcare, and government operations including law enforcement.

Predictive analytics is used in actuarial science,[4] marketing,[5] financial services,[6] insurance, telecommunications,[7] retail,[8] travel,[9] mobility,[10] healthcare,[11] child protection,[12][13] pharmaceuticals,[14] capacity planning and other fields.

Scoring models process a customer's credit history, loan application, customer data, etc., in order to rank-order individuals by their likelihood of making future credit payments on time.

For example, identifying suspects after a crime has been committed, or credit card fraud as it occurs.[15] The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting them to predict the unknown outcome.

For example, 'Predictive analytics—Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.'[16] In future industrial systems, the value of predictive analytics will be to predict and prevent potential issues to achieve near-zero break-down and further be integrated into prescriptive analytics for decision optimization.[citation needed] Furthermore, the converted data can be used for closed-loop product life cycle improvement[17] which is the vision of the Industrial Internet Consortium.

Decision models describe the relationship between all the elements of a decision—the known data (including results of predictive models), the decision, and the forecast results of the decision—in order to predict the results of decisions involving many variables.

They must analyze and understand the products in demand or have the potential for high demand, predict customers' buying habits in order to promote relevant products at multiple touch points, and proactively identify and mitigate issues that have the potential to lose customers or reduce their ability to gain new ones.

Over the last 5 years, some child welfare agencies have started using predictive analytics to flag high risk cases.[18] The approach has been called 'innovative' by the Commission to Eliminate Child Abuse and Neglect Fatalities (CECANF),[19] and in Hillsborough County, Florida, where the lead child welfare agency uses a predictive modeling tool, there have been no abuse-related child deaths in the target population as of this writing.[20] Experts use predictive analysis in health care primarily to determine which patients are at risk of developing certain conditions, like diabetes, asthma, heart disease, and other lifetime illnesses.

Osheroff and colleagues:[21] Clinical decision support (CDS) provides clinicians, staff, patients, or other individuals with knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care.

It encompasses a variety of tools and interventions such as computerized alerts and reminders, clinical guidelines, order sets, patient data reports and dashboards, documentation templates, diagnostic support, and clinical workflow tools.

A 2016 study of neurodegenerative disorders provides a powerful example of a CDS platform to diagnose, track, predict and monitor the progression of Parkinson's disease.[22] Using large and multi-source imaging, genetics, clinical and demographic data, these investigators developed a decision support system that can predict the state of the disease with high accuracy, consistency and precision.

Predictive analytics can help optimize the allocation of collection resources by identifying the most effective collection agencies, contact strategies, legal actions and other strategies for each customer, thus significantly increasing recovery while reducing collection costs.

For an organization that offers multiple products, predictive analytics can help analyze customers' spending, usage and other behavior, leading to efficient cross sales, or selling additional products to current customers.[2] This directly leads to higher profitability per customer and stronger customer relationships.

One study concluded that a 5% increase in customer retention rates will increase profits by 25% to 95%.[23] Businesses tend to respond to customer attrition on a reactive basis, acting only after the customer has initiated the process to terminate service.

By a frequent examination of a customer's past service usage, service performance, spending and other behavior patterns, predictive models can determine the likelihood of a customer terminating service sometime soon.[7] An intervention with lucrative offers can increase the chance of retaining the customer.

Apart from identifying prospects, predictive analytics can also help to identify the most effective combination of product versions, marketing material, communication channels and timing that should be used to target a given consumer.

A reasonably complex model was used to identify fraudulent monthly reports submitted by divisional controllers.[25] The Internal Revenue Service (IRS) of the United States also uses predictive analytics to mine tax returns and identify tax fraud.[24] Recent advancements in technology have also introduced predictive behavior analysis for web fraud detection.

They can also be addressed via machine learning approaches which transform the original time series into a feature vector space, where the learning algorithm finds patterns that have predictive power.[26][27] When employing risk management techniques, the results are always to predict and benefit from a future scenario.
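The transformation of a time series into a feature vector space mentioned above is often done with lagged values: each training row holds the previous few observations, and the target is the next one. This is a minimal sketch of that idea; the function name is hypothetical, and real pipelines typically add further features such as rolling statistics.

```python
def make_lag_features(series, n_lags):
    """Turn a 1-D series into (features, targets). Row i of features
    holds the n_lags values that immediately precede targets[i]."""
    if n_lags < 1 or len(series) <= n_lags:
        raise ValueError("need n_lags >= 1 and len(series) > n_lags")
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return features, targets


# Example: predict each value from the two values before it.
X, y = make_lag_features([1, 2, 3, 4, 5], n_lags=2)
```

Any standard regression or classification learner can then be fitted on `X` and `y`, provided the train/test split keeps the rows in time order.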

For a health insurance provider, predictive analytics can analyze a few years of past medical claims data, as well as lab, pharmacy and other records where available, to predict how expensive an enrollee is likely to be in the future.

Predictive analytics can streamline the process of customer acquisition by predicting the future risk behavior of a customer using application level data.[4] Predictive analytics in the form of credit scores have reduced the amount of time it takes for loan approvals, especially in the mortgage market where lending decisions are now made in a matter of hours rather than days or even weeks.

Examples of big data sources include web logs, RFID, sensor data, social networks, Internet search indexing, call detail records, military surveillance, and complex data in astronomic, biogeochemical, genomics, and atmospheric sciences.

Big Data is the core of most predictive analytic services offered by IT organizations.[28] Thanks to technological advances in computer hardware—faster CPUs, cheaper memory, and MPP architectures—and new technologies such as Hadoop, MapReduce, and in-database and text analytics for processing big data, it is now feasible to collect, analyze, and mine massive amounts of structured and unstructured data for new insights.[24] It is also possible to run predictive algorithms on streaming data.[29] Today, exploring big data and using predictive analytics is within reach of more organizations than ever before and new methods that are capable for handling such datasets are proposed.[30][31] The approaches and techniques used to conduct predictive analytics can broadly be grouped into regression techniques and machine learning techniques.

While mathematically it is feasible to apply multiple regression to discrete ordered dependent variables, some of the assumptions behind the theory of multiple linear regression no longer hold, and there are other techniques such as discrete choice models which are better suited for this type of analysis.

In a classification setting, assigning outcome probabilities to observations can be achieved through the use of a logistic model, which is basically a method which transforms information about the binary dependent variable into an unbounded continuous variable and estimates a regular multivariate model (See Allison's Logistic Regression for more information on the theory of logistic regression).

A good way to understand the key difference between probit and logit models is to assume that the dependent variable is driven by a latent variable z, which is a sum of a linear combination of explanatory variables and a random noise term.
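Under that latent-variable view, the outcome is 1 when z exceeds zero, and the two models differ only in the distribution assumed for the noise term: logistic noise gives the logit model, standard normal noise gives the probit model. The sketch below computes both probabilities for a given linear index; the function names are illustrative.

```python
import math


def linear_index(coefficients, intercept, x):
    """Deterministic part of the latent variable z."""
    return intercept + sum(c * v for c, v in zip(coefficients, x))


def logit_probability(index):
    """P(y = 1) when the noise term follows a standard logistic
    distribution. Written in a numerically stable form."""
    if index >= 0:
        return 1.0 / (1.0 + math.exp(-index))
    e = math.exp(index)
    return e / (1.0 + e)


def probit_probability(index):
    """P(y = 1) when the noise term is standard normal, i.e. the
    normal CDF evaluated at the linear index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
```

Both functions return 0.5 at an index of zero. The logistic distribution has heavier tails (and a larger variance) than the standard normal, so fitted coefficients from the two models are on different scales even when their predicted probabilities are close.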

In recent years time series models have become more sophisticated and attempt to model conditional heteroskedasticity with models such as ARCH (autoregressive conditional heteroskedasticity) and GARCH (generalized autoregressive conditional heteroskedasticity) models frequently used for financial time series.
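As a concrete instance of conditional heteroskedasticity, a GARCH(1,1) model lets today's variance depend on yesterday's squared shock and yesterday's variance: variance_t = omega + alpha * shock_{t-1}^2 + beta * variance_{t-1}. The simulation below is a minimal sketch with illustrative parameter values and Gaussian innovations; it simulates the process and does not estimate parameters from data.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n returns from a GARCH(1,1) process with standard
    normal innovations. Returns (returns, conditional_variances)."""
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    # Start from the long-run (unconditional) variance.
    variance = omega / (1.0 - alpha - beta)
    returns, variances = [], []
    for _ in range(n):
        shock = math.sqrt(variance) * rng.gauss(0.0, 1.0)
        returns.append(shock)
        variances.append(variance)
        # Large shocks raise next period's variance: volatility clusters.
        variance = omega + alpha * shock ** 2 + beta * variance
    return returns, variances


# Illustrative parameters only.
rets, variances = simulate_garch11(500, omega=0.05, alpha=0.10, beta=0.85)
```

Plotting `rets` from such a run typically shows calm and turbulent stretches alternating, which is the pattern these models are meant to capture in financial series.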

A distribution whose hazard function slopes upward is said to have positive duration dependence; a decreasing hazard shows negative duration dependence, whereas a constant hazard describes a process with no memory, usually characterized by the exponential distribution.
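The Weibull distribution is a convenient way to see all three cases, because a single shape parameter controls the slope of the hazard: h(t) = (k / lam) * (t / lam)^(k - 1), with shape k and scale lam. The choice of the Weibull family here is an illustrative assumption, not something the text above specifies.

```python
def weibull_hazard(t, shape, scale=1.0):
    """Hazard rate of a Weibull distribution at time t > 0.
    shape > 1: increasing hazard (positive duration dependence)
    shape < 1: decreasing hazard (negative duration dependence)
    shape = 1: constant hazard 1/scale (the exponential case)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)
```

With `shape=1` the function returns `1/scale` for every `t`, which is the memoryless exponential case described above.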

Globally-optimal classification tree analysis (GO-CTA), also called hierarchical optimal discriminant analysis (HODA), is a generalization of optimal discriminant analysis that may be used to identify the statistical model that has maximum accuracy for predicting the value of a categorical dependent variable for a dataset consisting of categorical and continuous variables.

The output of HODA is a non-orthogonal tree that combines categorical variables and cut points for continuous variables that yields maximum predictive accuracy, an assessment of the exact Type I error rate, and an evaluation of potential cross-generalizability of the statistical model.

Today, since it includes a number of advanced statistical methods for regression and classification, it finds application in a wide variety of fields including medical diagnostics, credit card fraud detection, face and speech recognition and analysis of the stock market.

It can be proved that, unlike other methods, this method is universally asymptotically convergent, i.e.: as the size of the training set increases, if the observations are independent and identically distributed (i.i.d.), regardless of the distribution from which the sample is drawn, the predicted class will converge to the class assignment that minimizes misclassification error.

However, modern predictive analytics tools are no longer restricted to IT specialists. As more organizations adopt predictive analytics into decision-making processes and integrate it into their operations, they are creating a shift in the market toward business users as the primary consumers of the information.

Vendors are responding by creating new software that removes the mathematical complexity, provides user-friendly graphic interfaces and/or builds in shortcuts that can, for example, recognize the kind of data available and suggest an appropriate predictive model.[32] Predictive analytics tools have become sophisticated enough to adequately present and dissect data problems, so that any data-savvy information worker can utilize them to analyze data and retrieve meaningful, useful results.[2] For example, modern tools present findings using simple charts, graphs, and scores that indicate the likelihood of possible outcomes.[33] There are numerous tools available in the marketplace that help with the execution of predictive analytics.

This means that a statistical prediction is only valid in sterile laboratory conditions, which suddenly isn't as useful as it seemed before.'[41] In a study of 1072 papers published in Information Systems Research and MIS Quarterly between 1990 and 2006, only 52 empirical papers attempted predictive claims, of which only 7 carried out proper predictive modeling or testing.[42]

Neural Networks Face Unexpected Problems in Analyzing Financial Data

One area where machine learning and neural networks are set to make a huge impact is in financial markets.

This field is rich in the two key factors that make machine-learning techniques successful: the computing resources necessary to run powerful neural networks;

And yet their price does vary according to factors such as interest rates, potential changes in interest rates, company performance, and the likelihood the debt will be repaid, the time until the bond must be repaid, and so on.

But bond traders are much less well served, say Ganguli and Dunnmon, because “the analogous information on bonds is only available for a fee and even then only in relatively small subsets compared to the overall volume of bond trades.” And that leads to the curious situation.

Their approach uses a data set of bond prices and other information that was posted to the online predictive modeling and dataset host,, by Breakthrough Securities in 2014.

This data set consists of the last 10 trades in each of 750,000 different bonds, along with a wide range of other parameters for each bond, such as whether it can be called in early, whether a trade was a customer buy or sell or a deal between traders, a fair price estimate based on the hazard associated with the bond, and so on.

The techniques include principal component analysis, which removes redundant parameters and leaves those with true predictive power;

and neural networks, which can find patterns in highly non-linear data sets and are thought of as a deeper form of machine learning.

Perhaps unsurprisingly, the best predictions come from the neural networks, which forecast future prices with an error of around 70 cents.

That’s interesting work that throws some light onto the dilemma that financial institutions must be facing in applying machine-learning techniques to financial data.
