2019. 07. 01. 10:00 - 2019. 07. 01. 11:30
MTA Rényi Institute, kutyás terem (third floor)
Event type: seminar
Organized by: the Institute
Probability Theory Seminar

Description

We approach the continuous-time mean-variance (MV) portfolio selection problem with reinforcement learning (RL). The problem is to achieve the best trade-off between exploration and exploitation, and is formulated as an entropy-regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time-decaying variance. We then establish connections between the entropy-regularized MV and the classical MV, including the solvability equivalence and the convergence as the exploration weighting parameter decays to zero. Finally, we prove a policy improvement theorem, based on which we devise an implementable and data-driven RL algorithm. We find that our algorithm outperforms both an adaptive control-based method and a deep neural network-based algorithm by a large margin in our simulations. Joint work with Haoran Wang (Columbia).
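As a rough illustration of the formulation mentioned above (a generic sketch, not the precise statement from the talk), an entropy-regularized relaxed control problem replaces a deterministic control by a distribution \(\pi_t\) over actions and adds a differential-entropy bonus weighted by a temperature \(\lambda > 0\); the running reward \(r\), terminal payoff \(h\) and state \(X\) below are placeholder symbols:

\[
\sup_{\pi}\; \mathbb{E}\!\left[\int_0^T \left(\int_{\mathbb{R}} r(t, X_t, u)\,\pi_t(u)\,\mathrm{d}u
\;-\; \lambda \int_{\mathbb{R}} \pi_t(u)\,\ln \pi_t(u)\,\mathrm{d}u\right)\mathrm{d}t \;+\; h(X_T)\right].
\]

In the MV setting the criterion acts through the terminal wealth rather than a running reward, but the role of \(\lambda\) is the same: the talk's result is that the maximizing \(\pi_t\) is Gaussian with a variance that decays over time, and the classical MV solution is recovered as \(\lambda \to 0\).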