You won’t become an expert in machine learning and finance overnight. This course is highly exploratory, and my hope is to expose you to a broad range of pertinent topics. As a result, you are expected to do some self-study to perform well on the assigned course projects.
This class preparation focuses on Pandas and NumPy exercises, since these are crucial tools in finance.
1st Class Preparation Exercise
Many Forms of Machine Learning
Focus on two main areas:
There is at least one firm that doesn’t spend much money on data but has good performance. There are also firms that spend a lot on data but don’t perform well.
The above example looks at the relationship between how much funds spend on data and their performance.
In the first graph there is a weak but visible relationship between money spent and performance: for every $1 spent, an additional $1 is made in profit.
It generally seems to be true that firms that did not spend enough on data shut down.
There is no clear relationship, but the firms that spent a lot on data without additional performance closed down.
VI = Value Investment, HF = High Frequency, QF = Quantamental Fund, PE = Private Equity Fund.
Clustering (left): a clustering model successfully creates statistical clusters that almost perfectly recover the four types of funds (we eventually verify this with labels).
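As a rough illustration of the clustering idea above, the sketch below groups synthetic funds into four clusters with scikit-learn's KMeans and only afterwards compares the clusters with the known fund types. The two features, their values, and the per-type centers are invented for illustration; they are not the data behind the figure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Hypothetical feature centers per fund type, e.g. (data spend, holding period),
# in arbitrary standardized units. These numbers are made up for illustration.
fund_types = ["VI", "HF", "QF", "PE"]
centers = [(-2.0, 2.0), (2.0, -2.0), (2.0, 2.0), (-2.0, -2.0)]

n_per_type = 50
blocks = []
true_type_ids = []
for type_id, center in enumerate(centers):
    # Draw funds of this type as noisy points around the type's center.
    blocks.append(rng.normal(loc=center, scale=0.4, size=(n_per_type, 2)))
    true_type_ids.extend([type_id] * n_per_type)
X = np.vstack(blocks)

# Fit the clustering model without showing it the labels.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Only afterwards, check how well the clusters match the true fund types.
score = adjusted_rand_score(true_type_ids, cluster_ids)
print(f"Adjusted Rand index: {score:.2f}")
```

An adjusted Rand index close to 1 means the clusters line up with the fund types even though the cluster numbering itself is arbitrary.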
A. Introduction to Machine Learning
Finance is any activity that assesses and exchanges items of economic value.
Because machine learning is broadly used to (1) make predictions, (2) estimate values, and (3) uncover patterns, it can be used for many tasks within finance.
Every time you make a transaction, that data is recorded and stored; it can later be processed and used in a machine learning model.
Machine learning allows you to use the data you collected to:
Predict what will happen next for an item of interest (predicting the stock market)
Estimate the value or other characteristic for an item of interest (valuing a firm)
Map and uncover relationships and patterns between items (optimizing a portfolio)
In a traditional finance class, one comes across various models and types of analysis.
We can use machine learning to replace and support some of the above abstractions.
We will look at a valuation model below to see how machine learning can support our use of the discounted cash flow (DCF) model.
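For reference, a discounted cash flow valuation sums projected cash flows discounted back to today. The sketch below is a minimal version, assuming annual cash flows received at year end, a constant discount rate, and a Gordon-growth terminal value. The function name `dcf_value` and the input numbers are illustrative only.

```python
def dcf_value(cash_flows, discount_rate, terminal_growth=0.0):
    """Present value of projected cash flows plus a Gordon-growth terminal value.

    Hypothetical helper for illustration; assumes year-end annual cash flows.
    """
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")

    present_value = 0.0
    for year, cash_flow in enumerate(cash_flows, start=1):
        # Discount each projected cash flow back to today.
        present_value += cash_flow / (1 + discount_rate) ** year

    # Terminal value at the end of the final forecast year (Gordon growth).
    terminal = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    present_value += terminal / (1 + discount_rate) ** len(cash_flows)
    return present_value


# Made-up projections: three years of cash flows growing 10% per year.
projected = [100.0, 110.0, 121.0]
value = dcf_value(projected, discount_rate=0.10, terminal_growth=0.02)
print(round(value, 2))
```

One way machine learning could support this model is by supplying the cash flow forecasts that the formula takes as inputs; the discounting arithmetic itself stays the same.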
B. Machine Learning Applied to Finance
History of Machine Learning in Finance
How did we, for example, get to the point where we can use machine learning in asset management?
You might be surprised to find that machine learning as we know it today has been part of a collective effort for a couple of centuries.
The first two use cases in finance and economics were indeed supervised machine learning, but not long after classification, reinforcement learning and unsupervised learning also appeared in applied finance research.
Wall Street Journal, April 23, 1962, p. 4
Although economics, and to an extent finance, have been slow to acknowledge modern machine learning, the development of machine learning and of these domains is fundamentally intertwined, as this sub-module will reveal.
Future Supervised and Unsupervised Innovations
C. History of Machine Learning in Finance
The Linear Regression
Before investigating advanced machine learning methods, let's start with standard linear regression.
Machine learning researchers typically try to avoid making strong assumptions about f (the function that maps the inputs x to the target), such as assuming that f is linear and that x is low-dimensional.
Traditional linear regression models, such as OLS regressions, unfortunately make these assumptions and are therefore highly constrained machine learning models.
However, linear regression methods can provide a strong intuition for how more advanced methods work. Moreover, they have certain benefits that become useful when applied to finance.
We see many examples of linear models in finance and econometrics:
The Capital Asset Pricing Model (CAPM) as well as Arbitrage Pricing Theory (APT) come to mind.
In fact, the Chartered Financial Analyst (CFA) Institute did a study that confirmed that linear regression remains the primary workhorse for financial modeling.
Sklearn Python Example
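A minimal sketch of an ordinary least squares fit with scikit-learn, in the spirit of a CAPM-style regression of a stock's excess return on the market's excess return. The return series are simulated, and the "true" alpha and beta used to generate them are arbitrary assumptions, not estimates for any real stock.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# Simulated daily excess returns (hypothetical data, not real market data).
n_days = 1000
true_alpha, true_beta = 0.0002, 1.5
market_excess = rng.normal(loc=0.0004, scale=0.01, size=n_days)
noise = rng.normal(loc=0.0, scale=0.005, size=n_days)
stock_excess = true_alpha + true_beta * market_excess + noise

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features).
X = market_excess.reshape(-1, 1)
y = stock_excess

model = LinearRegression()
model.fit(X, y)

beta_hat = model.coef_[0]       # estimated slope (market sensitivity)
alpha_hat = model.intercept_    # estimated intercept
r_squared = model.score(X, y)   # in-sample R^2
print(f"alpha: {alpha_hat:.5f}, beta: {beta_hat:.3f}, R^2: {r_squared:.3f}")
```

With enough observations, the estimated slope should land near the beta used to simulate the data, which is a simple way to check that the fitting code behaves as expected before applying it to real returns.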
D. Finance and Linear Regressions
1. Linear Regression Research (30 points)
I had a question about the first question on the homework. The question asks for Tesla and S&P returns. Is this just the stock price at close, or is it something else? Thanks in advance.
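On the mechanics: a simple return is usually the percentage change in price from one period to the next, so it is derived from closing prices rather than being the closing price itself. A minimal pandas sketch follows. The prices and column names are made up, and whether the assignment expects raw or adjusted closing prices is not stated here.

```python
import pandas as pd

# Hypothetical closing prices (made-up numbers, not real quotes).
prices = pd.DataFrame(
    {"TSLA": [100.0, 110.0, 99.0], "SP500": [4000.0, 4040.0, 4040.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Simple return: (P_t - P_{t-1}) / P_{t-1}.
# The first row has no prior price, so it is dropped.
returns = prices.pct_change().dropna()
print(returns)
```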
F. Industry Factors (HW1)
Submit on Brightspace as PDF
A ten-slide presentation deck that describes:
(a) How machine learning can be used as part of the traditional discretionary investment process, i.e., quantamental investing;
(b) The advantages and disadvantages of using machine learning in the investment decision-making process;
(c) One unique idea for using machine learning in finance that cannot yet be found in the literature, on GitHub, or in a blog post.
This counts for 20% of your final grade, so give it the appropriate effort.
You can use any software, e.g., Google Slides or PowerPoint, as long as you can export the final result as a PDF.
This project does not yet require any coding or programming; the purpose is to familiarize yourself with the language.
You can have ten slides or fewer, but no more.
You will not be presenting the deck; it will only be read by the assessor.
You can decide how many slides to allocate to each question. For example, one approach might be 7 slides for question (a), 1 slide for (b), and 2 slides for (c). This distribution won’t affect your score.
E. Presentation Deck