Has separate validation and test data, which is good; a lot of papers only have a test set.
Actions are buy/sell/hold
Selects one of A2C, DDPG, or PPO based on a 3-month validation period, then uses that agent for the next 3 months
In the period where the DJIA returned 7.78%, PPO returned 9.7%, DDPG 11.33%, and A2C 9.81%
Initially uses the turbulence index to select the model, but later shows that this did no better than random choice
In the end they change everything to use only DDPG and add VIX to the observation space, since this simplified the model and DDPG was consistently the best performer
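
Rough sketch of the rolling selection described in these notes (validate on 3 months, trade the next 3). This is not the paper's code. It assumes each agent's daily returns are already available as a list, that a quarter is about 63 trading days, and that a Sharpe-style score is the selection metric. The function names and the toy return series are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev


def sharpe(returns):
    """Mean divided by standard deviation of the returns (not annualized)."""
    sd = pstdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


def rolling_selection(agent_returns, window=63, score=sharpe):
    """Pick, for each trading window, the agent that scored best on the
    window immediately before it.

    agent_returns: dict of agent name -> list of daily returns (equal lengths).
    Returns a list of (trading_window_start_index, chosen_agent_name).
    """
    names = list(agent_returns)
    n = len(agent_returns[names[0]])
    picks = []
    # Each step: validate on [start - window, start), trade on [start, start + window).
    for start in range(window, n - window + 1, window):
        validation = slice(start - window, start)
        best = max(names, key=lambda name: score(agent_returns[name][validation]))
        picks.append((start, best))
    return picks


# Toy data only (hypothetical): two quarters of daily returns per agent.
toy_returns = {
    "A2C": [0.01, -0.01] * 63,
    "DDPG": [0.02, 0.01] * 63,
    "PPO": [-0.01, -0.02] * 63,
}
picks = rolling_selection(toy_returns)
print(picks)  # [(63, 'DDPG')]
```

Swapping `score` for another function (for example, total return over the window) changes the selection rule without touching the loop.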