Balancing Profit, Risk, and Sustainability for Portfolio Management

Paper Title

Balancing Profit, Risk, and Sustainability for Portfolio Management

Authors

Charl Maree, Christian W. Omlin

Abstract

Stock portfolio optimization is the process of continuously reallocating funds across a selection of stocks. This problem is particularly well suited to reinforcement learning, as daily rewards are compounding and objective functions may include more than just profit, e.g., risk and sustainability. We developed a novel utility function with the Sharpe ratio representing risk and the environmental, social, and governance (ESG) score representing sustainability. We show that a state-of-the-art policy gradient method, multi-agent deep deterministic policy gradients (MADDPG), fails to find the optimal policy due to flat policy gradients; we therefore replaced gradient descent with a genetic algorithm for parameter optimization. We show that our system outperforms MADDPG while improving on deep Q-learning approaches by allowing for continuous action spaces. Crucially, by incorporating risk and sustainability criteria in the utility function, we improve on the state of the art in reinforcement learning for portfolio optimization. Risk and sustainability are essential in any modern trading strategy, and we propose a system that does not merely report these metrics but actively optimizes the portfolio to improve on them.
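
The abstract does not give the exact form of the utility function, so the following is only a minimal sketch under stated assumptions: profit, the Sharpe ratio, and a weighted-average ESG score are combined as a simple weighted sum, and a small genetic algorithm evolves portfolio weights directly (the paper instead uses the genetic algorithm to optimize policy parameters). All names here (`sharpe_ratio`, `portfolio_utility`, `evolve_weights`, `w_profit`, `w_risk`, `w_esg`) are hypothetical and chosen for illustration.

```python
import math
import random
import statistics


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; needs at least two daily returns."""
    excess = [r - risk_free_rate / periods_per_year for r in daily_returns]
    spread = statistics.stdev(excess)
    if spread == 0.0:
        return 0.0  # no variability: treat the ratio as zero instead of dividing by zero
    return math.sqrt(periods_per_year) * statistics.mean(excess) / spread


def portfolio_utility(weights, asset_returns, esg_scores,
                      w_profit=1.0, w_risk=1.0, w_esg=1.0):
    """Assumed utility: weighted sum of profit, Sharpe ratio, and ESG score.

    asset_returns is a list of days, each a list of per-asset returns.
    """
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize to a full allocation
    daily = [sum(w * r for w, r in zip(weights, day)) for day in asset_returns]
    profit = math.prod(1.0 + r for r in daily) - 1.0  # compounded return
    sustainability = sum(w * s for w, s in zip(weights, esg_scores))
    return w_profit * profit + w_risk * sharpe_ratio(daily) + w_esg * sustainability


def evolve_weights(asset_returns, esg_scores, pop_size=30, generations=50,
                   mutation_scale=0.1, seed=0, **utility_weights):
    """Gradient-free search: keep the fitter half, refill by crossover and mutation."""
    rng = random.Random(seed)
    n_assets = len(esg_scores)

    def normalize(raw):
        positive = [abs(x) + 1e-12 for x in raw]  # long-only, never all zero
        total = sum(positive)
        return [x / total for x in positive]

    def fitness(candidate):
        return portfolio_utility(candidate, asset_returns, esg_scores, **utility_weights)

    population = [normalize([rng.random() for _ in range(n_assets)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # elitism: the best candidates survive
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            mix = rng.random()  # arithmetic crossover coefficient
            child = [mix * m + (1.0 - mix) * f + rng.gauss(0.0, mutation_scale)
                     for m, f in zip(mother, father)]
            children.append(normalize(child))
        population = elite + children
    return max(population, key=fitness)
```

Because the search only ranks candidates by their utility, it needs no gradient information, which is the property that motivates swapping out gradient descent when policy gradients are flat. The relative sizes of `w_profit`, `w_risk`, and `w_esg` are free choices in this sketch and would need to be tuned, or replaced by the paper's actual formulation.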
