We use deep distributional reinforcement learning (RL) to develop hedging
strategies for a trader responsible for derivatives dependent on a particular
underlying asset. The transaction costs associated with trading the underlying
asset are usually quite small. Traders therefore tend to carry out delta
hedging daily, or even more frequently, to ensure that the portfolio is almost
completely insensitive to small movements in the asset's price. Hedging the
portfolio's exposure to large asset price movements and to volatility changes
(gamma and vega hedging, respectively) is more expensive because it requires
trades in derivatives, for which transaction costs are relatively large. Our
analysis accounts for these transaction cost differences. It shows how RL can be used to
develop a strategy for using options to manage gamma and vega risk with three
different objective functions. These objective functions involve a
mean-variance trade-off, value at risk, and conditional value at risk. We
illustrate how the optimal hedging strategy depends on the asset price process,
the trader's objective function, the level of transaction costs when options
are traded, and the maturity of the options used for hedging.
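
As a rough illustration of the three objective functions named above, the following sketch evaluates each one on a sample of simulated hedging losses. The function names, the risk-aversion weight, the confidence level, and the use of variance (rather than standard deviation) in the mean-variance trade-off are all assumptions made for this example; they are not taken from the paper's formulation.

```python
import numpy as np

def mean_variance(losses, risk_aversion=1.0):
    """Mean loss plus a penalty on loss variance (assumed form of the trade-off)."""
    losses = np.asarray(losses, dtype=float)
    return float(losses.mean() + risk_aversion * losses.var())

def value_at_risk(losses, alpha=0.95):
    """Loss level that is not exceeded with probability alpha (empirical quantile)."""
    losses = np.asarray(losses, dtype=float)
    return float(np.quantile(losses, alpha))

def conditional_value_at_risk(losses, alpha=0.95):
    """Average of the losses at or beyond the alpha-quantile (expected shortfall)."""
    losses = np.asarray(losses, dtype=float)
    threshold = np.quantile(losses, alpha)
    return float(losses[losses >= threshold].mean())

if __name__ == "__main__":
    # Hypothetical loss sample; in practice these would come from simulated hedging episodes.
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=0.0, scale=1.0, size=10_000)
    print("mean-variance:", mean_variance(sample, risk_aversion=0.5))
    print("VaR (95%):   ", value_at_risk(sample, alpha=0.95))
    print("CVaR (95%):  ", conditional_value_at_risk(sample, alpha=0.95))
```

By construction, the CVaR estimate is never below the VaR estimate at the same confidence level, since it averages only the losses at or beyond that quantile.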