Paper Title

How informative is the Order Book Beyond the Best Levels? Machine Learning Perspective

Authors

Dat Thanh Tran, Juho Kanniainen, Alexandros Iosifidis

Abstract

Research on limit order book markets has been rapidly growing and nowadays high-frequency full order book data is widely available for researchers and practitioners. However, it is common that research papers use the best level data only, which motivates us to ask whether the exclusion of the quotes deeper in the book over multiple price levels causes performance degradation. In this paper, we address this question by using modern Machine Learning (ML) techniques to predict mid-price movements without assuming that limit order book markets represent a linear system. We provide a number of results that are robust across ML prediction models, feature selection algorithms, data sets, and prediction horizons. We find that the best bid and ask levels are systematically identified not only as the most informative levels in the order books, but also to carry most of the information needed for good prediction performance. On the other hand, even if the top-of-the-book levels contain most of the relevant information, to maximize models' performance one should use all data across all the levels. Additionally, the informativeness of the order book levels clearly decreases from the first to the fourth level while the rest of the levels are approximately equally important.
