## Machine Learning Methods

There are hundreds of use cases for machine learning within asset management; this holds true for supervised, unsupervised, and reinforcement learning methods.

| Some Examples | (A) Input Strategies | (B) Weight Optimization | (C) Risk Management | (E) Back-office Efficiency |
| --- | --- | --- | --- | --- |
| Supervised Learning | Direction Prediction, Ranking, Regression | Bet Sizing, Triple Barrier, Duration Model | Risk Factors, Market Regime | Conversion, Satisfaction, Turnover |
| Unsupervised Learning | Statistical-Factor Portfolio | Hierarchical Risk Parity; Network Graphs | GAN-VAR, Historical simulations | Customer Cluster, Product Cluster |
| Reinforcement Learning | Automatic Trading | Automatic Weight | Automatic Risk | Automatic Mgmt. |

## Portfolio Optimization

$\text { Portfolio Optimization = Risk-Managed(Weighted (Trading)) }$

Portfolio optimization is the process of turning input trading strategies into optimal portfolios using weight-optimization and risk-management methods.

•

This definition benefits from the belief that trading strategies are not static and can constantly be improved.

•

Expected risk-adjusted return is ordinarily used as a measure for optimality.

•

We can see input strategies, weight-optimization, and risk management as three different layers to the overall strategy.

For mean-variance (MV) portfolios, the assets are the input strategies, weight optimization is the drawing of the efficient frontier, and risk management is the choice of a position on that frontier (e.g., the minimum-volatility point) based on preferences.

### Trading Strategies

A trading strategy is a rolling process of trading instructions $f_t(S^{A}_{t-1})$ to increase (↑) or decrease (↓) exposure to assets $A$ using information $S$ known up to time $t-1$.

•

Instructions like: $\textbf{if}$ signal/s $f(S)$ is $\textbf{true}$ then $\textbf{do}$ $A_{i}$, $\textbf{else if}$ ... $\textbf{do}$ $A_j$:

◦

If top 10% Momentum stocks (High Prior 12 month return) Buy else if bottom 10% Sell.

◦

If price reversions Buy Pepsi Sell Coca Cola else Nothing.

◦

If predicted probability above 80%, Buy, else if below 15% Sell.

◦

If reinforcement learning model issues Buy then Buy; Sell then Sell (identity strategy).

•

It is often the case that the overall trading strategy is many layers deep:

◦

Starting with the four input strategies above, we can also use a weight optimization strategy like mean-variance or equal-weighting to allocate funds between them.

◦

The final trading strategy could be a compound strategy; with three input strategies, for example, $g(f_1(S), f_2(S), f_3(S))$ with 20%, 35%, and 45% allocated to each input strategy respectively.

•

The final strategy generally seeks to maximize the risk-adjusted portfolio return; that does not mean every individual input strategy has to be profitable.

•

Individual strategies might be part of a bigger plan and one might perhaps provide a simple hedge against price appreciation.
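
The rules above can be sketched as small functions that map signals to positions. This is only an illustration: the function names, the +1/-1/0 position encoding, and the example numbers are assumptions, not part of the lecture.

```python
import numpy as np

def probability_rule(prob_up, buy_above=0.80, sell_below=0.15):
    """Buy (+1) above 80% predicted probability, sell (-1) below 15%, else 0."""
    p = np.asarray(prob_up, dtype=float)
    return np.where(p > buy_above, 1, np.where(p < sell_below, -1, 0))

def momentum_rule(prior_12m_return, top=0.10, bottom=0.10):
    """Buy (+1) the top 10% by prior return, sell (-1) the bottom 10%."""
    r = np.asarray(prior_12m_return, dtype=float)
    hi = np.quantile(r, 1 - top)
    lo = np.quantile(r, bottom)
    return np.where(r >= hi, 1, np.where(r <= lo, -1, 0))

def compound(positions, weights):
    """g(f_1(S), f_2(S), f_3(S)): weighted sum of the input strategies' positions.

    positions has shape (n_strategies, n_assets); weights has shape (n_strategies,).
    """
    return np.asarray(weights, dtype=float) @ np.asarray(positions, dtype=float)

# Hypothetical positions of three input strategies in two assets.
positions = [[1, 0], [1, -1], [0, 1]]
net_exposure = compound(positions, [0.20, 0.35, 0.45])
```

Here `net_exposure` is the per-asset exposure of the compound strategy; an input strategy that is short an asset simply reduces the net position.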

#### Portfolio Example

•

You could have ten input strategies with names president_tweet, media_stock, profitable_stock, famous_hf, meme_stock, google_trends, warren_buffet, stock_twist, bankruptcy, momentum.

•

Given the last five years of returns for each of the 10 strategies, you can optimize the allocation between them to maximize your expected risk-adjusted portfolio return.

•

The methods you can use include equal weighting, performance weighting, mean-variance weighting, hierarchical risk parity weighting, etc.

•

A good weight optimization strategy might realize that meme_stock, google_trends, stock_twist, and momentum are highly correlated and as such equal weighting would be risky.

•

The benefit of most weight-optimization methods is that they learn the statistical properties between input trading strategies.

•

Risk-management strategies can then also be used to measure and improve the robustness of the strategy; perhaps in very volatile markets the strategy should be switched off.
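
To make the correlation point concrete, here is a minimal sketch comparing equal weighting with an unconstrained minimum-variance weighting on simulated strategy returns. The data are synthetic (three strategies sharing a common driver plus one independent strategy), and the choice of minimum variance as the optimizer is an assumption for illustration; it is only one of the methods listed above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 1250  # roughly five years of daily returns

# Synthetic returns: three strategies share a common driver, one is independent.
names = ["meme_stock", "google_trends", "stock_twist", "bankruptcy"]
common = rng.normal(0, 0.01, n_days)
returns = np.column_stack([
    common + rng.normal(0, 0.003, n_days),
    common + rng.normal(0, 0.003, n_days),
    common + rng.normal(0, 0.003, n_days),
    rng.normal(0, 0.01, n_days),
])
cov = np.cov(returns, rowvar=False)

def equal_weights(cov):
    n = cov.shape[0]
    return np.full(n, 1.0 / n)

def min_variance_weights(cov):
    # w = inv(cov) 1 / (1' inv(cov) 1); short positions are allowed here.
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()

def portfolio_variance(w, cov):
    return float(w @ cov @ w)

w_eq = equal_weights(cov)
w_mv = min_variance_weights(cov)
var_eq = portfolio_variance(w_eq, cov)
var_mv = portfolio_variance(w_mv, cov)
```

Equal weighting puts 75% of the capital into what is effectively one bet, whereas the covariance-aware weights shift capital toward the independent strategy and lower the portfolio variance.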

#### Russian Doll Portfolio (Matryoshka)

•

A fund-of-funds might invest in the new portfolio we built and in those of other firms by once more applying a weighting strategy $h(g_1(f_1(S)...)...)$ themselves.

•

Some funds of funds will even apply regime-shifting models that decide which weighting strategy to apply, leading to portfolios within portfolios, $k(h_1(g_1(f_1(S)...)...)...)$, as a way to maximize their risk-adjusted return.

•

It is generally true that the deeper the trading strategy is, the larger the transaction costs become; that cost is sometimes accepted in exchange for improved diversification.

Only the first-layer trading strategy (the initial doll) enters and exits positions in the open market, while the weighting strategies optimize the inputs collectively.
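
The nesting can be illustrated by looking through each layer down to the underlying asset exposures. The positions, weights, and the second fund below are hypothetical numbers chosen only to show the mechanics.

```python
import numpy as np

# First layer: positions of two input strategies f_1, f_2 in three assets.
f_positions = np.array([
    [1.0, 0.0, -1.0],
    [0.0, 1.0,  0.0],
])

# Second layer: portfolio g weights the input strategies.
g_weights = np.array([0.6, 0.4])
portfolio_g = g_weights @ f_positions            # roughly [0.6, 0.4, -0.6]

# Third layer: a fund-of-funds h weights portfolio g and another firm's portfolio.
other_fund = np.array([0.0, 0.5, 0.5])
h_weights = np.array([0.5, 0.5])
fund_of_funds = h_weights @ np.vstack([portfolio_g, other_fund])
# roughly [0.3, 0.45, -0.05]
```

Every outer layer only rescales and nets the exposures created by the first layer, which is the sense in which the outer dolls do not trade assets themselves.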

### Philosophy

•

Everything that has the potential to deliver a future return is a trading strategy, including the ownership of an asset, e.g., an identity trading strategy: if AAPL exists Buy else None.

•

According to this definition, we can only perform weight optimization and risk-management once we have a list of trading strategies.

•

Example strategies would be (1) AAPL, (2) Momentum Trading, (3) Equal Weight Nasdaq, (4) RL Strategy, (5) Mean Variance Portfolio.

•

Clearly, some trading strategies behave very differently from others.

Perhaps we can come up with a good classification system for all of the different strategies.

### Strategy Classification

Trading strategies can be classified along many dimensions; some would divide them according to:

1.

Asset classes: Equities, Cryptocurrencies, Volatility, Derivatives, Real Estate, Fixed Income, Commodities.

2.

Function task: Statistical Arbitrage, Event-driven, Pairs Trading, Dollar Neutral, Market Making

3.

Quantity of signals: Single Factor, Multi Factor

4.

Type of data: Fundamental, Technical, Macro

5.

Risk Anomaly: Momentum, Mean-Reversion, Value, Profitability

### Functional Classification

The classification that decomposes trading strategies into the largest number of categories is probably the functional classification (although some overlap between the categories remains).

•

Technical trading is the use of market data and its transformations to predict the future price of an asset.

•

Trend trading covers strategies that take a position in an asset only after predicting a change in trend.

•

Statistical arbitrage seeks mispricing by detecting asset relationships and/or potential anomalies, believing the anomaly will return to normal.

•

Risk parity strategies diversify across assets according to the volatility they exhibit; when one asset class's volatility exceeds another's, rebalancing can occur by selecting individual units within each asset class or simply by using leverage.

•

Event trading involves the prediction of hard or soft financial events like corporate defaults, mergers and acquisitions, and earnings surprises.

•

Factor investing attempts to buy assets that exhibit a trait historically associated with promising investment returns.

•

Systematic global macro relies on macroeconomic principles to trade across asset classes and countries.

•

Fundamental trading relies on accounting, management, and sentiment data to predict whether a stock is over- or undervalued.
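
As one concrete instance from the list above, the risk parity idea can be sketched in its simplest form, inverse-volatility weighting. This naive variant ignores correlations and is an assumption for illustration, not a full equal-risk-contribution solver; the volatility figures are hypothetical.

```python
import numpy as np

def inverse_volatility_weights(vols):
    """Weight each asset class in proportion to 1 / volatility; weights sum to 1."""
    inv = 1.0 / np.asarray(vols, dtype=float)
    return inv / inv.sum()

# Hypothetical annualized volatilities: equities, bonds, commodities.
vols = np.array([0.15, 0.05, 0.20])
weights = inverse_volatility_weights(vols)

# Each asset class then carries the same stand-alone risk, weight * volatility.
standalone_risk = weights * vols
```

The low-volatility asset class receives the largest weight, which is why such portfolios often use leverage to reach a target return.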

### Factor Investing

#### A. Procedure

1.

Construct a signal for each stock, which we will call a "characteristic".

a.

It can be based on accounting data, return data, or whatever you please; what is important is that you have a value for this characteristic for each stock-date combination.

b.

We could also create machine learning factors, such as a patent factor using NLP and deep learning (DL) or a bankruptcy factor using gradient-boosted trees (GBTs).

2.

On a given date sort stocks by this characteristic.

a.

It is key that this characteristic is in fact known at the sorting date! You should be careful not to introduce "look-ahead" bias.

3.

Construct portfolios by dividing stocks into deciles/quantiles (groups), so that each portfolio holds stocks with similar characteristic values.

a.

You can either equal-weight or value-weight within each portfolio; what is important is that the portfolios differ from one another in the characteristic values of their stocks.

4.

Construct the long-short portfolio, where you go long portfolio 10 and short portfolio 1 (or vice versa).

a.

The point is to take a bet on the characteristic spread because you believe the longs will increase in value and the shorts will decrease in value.
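
The four steps can be sketched for a single sorting date with pandas. The column names `characteristic` and `next_return`, the equal weighting within groups, and the toy data are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def long_short_return(df, n_groups=10):
    """Sort stocks into groups on one date and return the long-short spread.

    df has one row per stock with columns:
      'characteristic' - signal value known at the sorting date
      'next_return'    - return realized after the sorting date
    """
    # Rank first so that ties cannot produce duplicate bin edges.
    ranks = df["characteristic"].rank(method="first")
    groups = pd.qcut(ranks, n_groups, labels=False) + 1   # group ids 1..n_groups
    by_group = df["next_return"].groupby(groups).mean()   # equal-weighted portfolios
    spread = by_group.loc[n_groups] - by_group.loc[1]     # long top, short bottom
    return spread, by_group

# Toy data in which the return rises with the characteristic.
toy = pd.DataFrame({
    "characteristic": np.arange(100.0),
    "next_return": 0.001 * np.arange(100.0),
})
spread, by_group = long_short_return(toy)
```

Because `next_return` is measured strictly after the sort, the characteristic cannot leak future information into the portfolio assignment.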

#### B. Dynamic

Although this strategy is cross-sectional, the characteristics are still dynamic.

•

Let's take MSFT (Microsoft) as an example:

•

MSFT went from being small in the 1980s to gigantic in the 1990s; as a result, it moved up from the small portfolio to the big portfolio.

•

During the tech boom, when MSFT had a huge valuation relative to its book value, it went to the low book-to-market (BM) portfolio.

•

But then MSFT transitioned back to the high BM portfolio once its market valuation collapsed in the aftermath of the tech bubble.

•

The key is that firms' characteristics change over time; by constructing portfolios, we hope to estimate some stable relationship between risk and return.

### Momentum Factor

Momentum is the tendency of stocks that have performed relatively well in the past to continue to perform relatively well in the future, and of stocks that have performed relatively poorly to continue to perform relatively poorly.

•

A Momentum investment or "relative strength" strategy buys stocks which have performed relatively well in the past and sells (shorts) stocks which have performed relatively poorly.

•

Over the 1963-1990 period, Jegadeesh and Titman (1993) found that a strategy that ranks stocks on their past 6-12 month returns, buys the top $10 \%$, and shorts the bottom $10 \%$ produces abnormal returns of $12 \%$ per year.

•

Momentum portfolios bet on cross-sectional continuation; that is, if a stock did well relative to others in the recent past, the bet is that it will continue to do better than average going forward.

•

In some ways momentum is the opposite of value, since value stocks are stocks that have had very low returns (however, momentum captures shorter-term effects, whereas value reflects the entire return history).

•

To be frank, we don’t know what the best look-back period should be to calculate the optimal portfolio.

•

What we can do is extract multiple momentum-related features and use deep learning models to estimate the expected returns, i.e., construct a more universal factor.
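
Since the best look-back is unknown, one option is to compute prior returns over several windows and pass them all to a model. The sketch below covers only the feature step; the look-back set, the one-month skip before the sort date, and the function name are assumptions, and the deep learning model itself is not shown.

```python
import pandas as pd

def momentum_features(prices, lookbacks=(3, 6, 9, 12), skip=1):
    """Prior returns over several look-back windows.

    prices: DataFrame of month-end prices (rows are months, columns are tickers).
    Row t of each output uses prices only up to month t - skip, so with
    skip >= 1 nothing from month t itself is used.
    """
    features = {}
    for k in lookbacks:
        # Return earned between month t - skip - k and month t - skip.
        features[k] = prices.shift(skip) / prices.shift(skip + k) - 1.0
    return features

# Toy series growing 1% per month.
toy_prices = pd.DataFrame({"A": [100 * 1.01 ** i for i in range(15)]})
feats = momentum_features(toy_prices)
```

Rows without enough history come out as `NaN`, which keeps the features honest about what was knowable at each date.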

### Multiple Factor

There could be fascinating ways to create long-short portfolios based on four factors, for example by combining size, momentum, profitability, and investment factors.

•

Size - Low market cap stocks earn high abnormal returns.

•

Momentum - High Prior 6/12 month returns earn high abnormal returns.

•

Profitability - High Profitability firms earn high abnormal returns.

•

Investment - Firms with low capital investment requirements earn high abnormal returns.
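
One simple way to combine the four factors is an equal-weighted average of signed z-scores, sketched below. The column names, the equal weighting across factors, and the toy numbers are assumptions for illustration; the signs follow the four statements above.

```python
import pandas as pd

# Assumed sign of each characteristic's relation to future returns.
SIGNS = {"market_cap": -1, "prior_return": 1, "profitability": 1, "investment": -1}

def composite_score(df):
    """Average of cross-sectional z-scores, flipped so that higher is better."""
    cols = list(SIGNS)
    z = (df[cols] - df[cols].mean()) / df[cols].std()
    return (z * pd.Series(SIGNS)).mean(axis=1)

# Toy cross-section of three stocks on one date.
stocks = pd.DataFrame(
    {
        "market_cap": [1.0, 2.0, 3.0],
        "prior_return": [0.3, 0.2, 0.1],
        "profitability": [0.3, 0.2, 0.1],
        "investment": [0.1, 0.2, 0.3],
    },
    index=["X", "Y", "Z"],
)
score = composite_score(stocks)
```

Stocks can then be sorted on `score` and grouped into long and short portfolios exactly as with a single characteristic.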

#### Unsupervised Asset Management

•

Let’s explore one specific area and technique of machine learning asset management that can combine various factors in a statistical fashion.

•

We will go back to a section that we left for the last lecture: (D) Portfolio Management.

•

Here we will work specifically with an unsupervised method called Hierarchical Risk Parity (HRP).

### Strategy Robustness

•

It is very important that the trading strategy only uses information that is known at the time of the trade; otherwise it is not a valid trading strategy, since you obviously cannot trade on information that you do not have.

•

It is very important that you are clear about where the dollars that you invest come from, so that the weights can add up to 1 (if you have some capital) or zero (if the strategy is self-financed and demands zero capital).

•

In practice all strategies demand capital, as no bank allows us to borrow without putting up some capital, but these self-financed strategies are quite convenient to work with because they are all in the "Excess Return Space": they have zero cost.

•

You can't pull the characteristic out of thin air. For example, if you just sort stocks alphabetically by their first letter, go long all the ABCs, and short all the XYZs, you should not expect amazing performance over time.

•

In all likelihood the portfolio will just be a more volatile version (fewer stocks) of the market portfolio, because the sorting doesn't have much economic rationale (garbage in, garbage out).

•

If you use the first letter of the stock ticker to construct 26 portfolios, you are unlikely to get a spread in average returns, and most likely each portfolio will resemble the market portfolio but with much more volatility.

•

And even if you do find something, it is likely garbage: it is very hard to think of an economic model that would deliver that pattern!
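
The capital accounting above can be checked directly on the weights. In this sketch the function names and the equal weighting within each leg are assumptions: a funded long-only portfolio has weights that sum to 1, while a self-financed long-short portfolio has weights that sum to zero.

```python
import numpy as np

def long_only_weights(signal):
    """Equal-weight every stock with a positive signal; weights sum to 1."""
    s = np.asarray(signal)
    w = (s > 0).astype(float)
    return w / w.sum()  # assumes at least one positive signal

def self_financed_weights(signal):
    """Long positive signals and short negative ones, one dollar per leg.

    The short leg finances the long leg, so the weights sum to zero.
    Assumes at least one positive and one negative signal.
    """
    s = np.asarray(signal)
    longs = (s > 0).astype(float)
    shorts = (s < 0).astype(float)
    return longs / longs.sum() - shorts / shorts.sum()

signal = [1, 1, 0, -1, -1, -1]
w_funded = long_only_weights(signal)
w_zero_cost = self_financed_weights(signal)
```

The return of `w_zero_cost` is the long leg's return minus the short leg's return, which is why such portfolios are naturally expressed in excess-return terms.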