A simple learning agent interacting with an agent-based market model

Cited by: 4
Authors
Dicks, Matthew [1 ]
Paskaramoorthy, Andrew [1 ]
Gebbie, Tim [1 ]
Affiliations
[1] Univ Cape Town, Dept Stat Sci, ZA-7700 Cape Town, South Africa
Keywords
Strategic order-splitting; Reinforcement learning; Market simulation; Agent-based model; Price impact
DOI
10.1016/j.physa.2023.129363
Chinese Library Classification
O4 [Physics]
Discipline Code
0702
Abstract
We consider the learning dynamics of a single reinforcement learning optimal execution trading agent when it interacts with an event-driven agent-based financial market model. Trading takes place asynchronously through a matching engine in event time. The optimal execution agent is considered at different levels of initial order size and with differently sized state spaces. The resulting impact on the agent-based model and market is assessed using a calibration approach that explores changes in the empirical stylised facts and price impact curves. Convergence, volume trajectory and action trace plots are used to visualise the learning dynamics. For the smaller state-space agents, the number of states visited converged much faster than for the larger state-space agents, and these agents began to learn intuitive trading behaviour using the spread and volume states. We find that the moments of the model are robust to the impact of the learning agents, except for the Hurst exponent, which was lowered by the introduction of strategic order-splitting. The introduction of the learning agent preserves the shape of the price impact curves but can reduce the trade-sign auto-correlations and increase the micro-price volatility when trading volumes increase.
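The abstract describes a reinforcement learning agent that learns to split a parent order into child orders using a discretised state space (e.g. spread and volume states). As a minimal illustrative sketch only — not the paper's implementation — the general mechanism can be shown with tabular Q-learning against a hypothetical toy execution environment; the state buckets, action set, reward shape, and `ToyExecutionEnv` class below are all illustrative assumptions:

```python
import random
from collections import defaultdict

# Illustrative sketch (not the paper's model): tabular Q-learning for an
# optimal-execution agent that splits a parent order into child orders.
# State: (spread bucket, remaining-inventory bucket); action: fraction of
# the remaining inventory to submit at the current event.

ACTIONS = (0.0, 0.25, 0.5, 1.0)

class ToyExecutionEnv:
    """Stylised stand-in for an event-driven matching engine."""

    def __init__(self, parent_volume=100.0, horizon=10):
        self.parent_volume = parent_volume
        self.horizon = horizon

    def reset(self):
        self.remaining = self.parent_volume
        self.t = 0
        return self._state()

    def _state(self):
        spread_bucket = random.randint(0, 2)  # proxy for an observed spread
        inv_bucket = min(3, int(4 * self.remaining / self.parent_volume))
        return (spread_bucket, inv_bucket)

    def step(self, fraction):
        child = fraction * self.remaining
        # Reward executed volume, with a quadratic penalty standing in
        # for the price impact of large child orders.
        reward = child - 0.01 * child ** 2
        self.remaining -= child
        self.t += 1
        done = self.t >= self.horizon or self.remaining <= 0
        if done:
            reward -= self.remaining  # penalise unexecuted inventory
        return self._state(), reward, done

def train(episodes=200, alpha=0.1, gamma=1.0, eps=0.1):
    """Epsilon-greedy tabular Q-learning over the toy environment."""
    env = ToyExecutionEnv()
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            a = (random.randrange(len(ACTIONS)) if random.random() < eps
                 else max(range(len(ACTIONS)), key=lambda i: q[state][i]))
            next_state, reward, done = env.step(ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q
```

The abstract's observation that smaller state spaces converge faster corresponds here to the Q-table visiting fewer distinct `(spread, inventory)` buckets, so each entry is updated more often per episode.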
Pages: 18