A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Cited by: 11

Authors:

Wen, Chao [1,3]

Xu, Miao [3]

Zhang, Zhilin [3]

Zheng, Zhenzhe [2]

Wang, Yuhui [1]

Liu, Xiangyu [3]

Rong, Yu [3]

Xie, Dong [3]

Tan, Xiaoyang [1]

Yu, Chuan [3]

Xu, Jian [3]

Wu, Fan [2]

Chen, Guihai [2]

Zhu, Xiaoqiang [3]

Zheng, Bo [3]

Affiliations:

[1] MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, People's Republic of China

[2] Shanghai Jiao Tong University, Shanghai, People's Republic of China

[3] Alibaba Group, Hangzhou, People's Republic of China

Source:

WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING | 2022

Keywords:

Auto-bidding; Bid Optimization; Multi-Agent Reinforcement Learning; E-commerce Advertising; Auction

DOI:

10.1145/3488560.3498373

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works designed auto-bidding tools from the view of single-agent, without modeling the mutual influence between agents. In this paper, we instead consider this problem from a distributed multi-agent perspective, and propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. First, we investigate the competition and cooperation relation among auto-bidding agents, and propose a temperature-regularized credit assignment to establish a mixed cooperative-competitive paradigm. By carefully making a competition and cooperation trade-off among agents, we can reach an equilibrium state that guarantees not only individual advertiser's utility but also the system performance (i.e., social welfare). Second, to avoid the potential collusion behaviors of bidding low prices underlying the cooperation, we further propose bar agents to set a personalized bidding bar for each agent, and then alleviate the revenue degradation due to the cooperation. Third, to deploy MAAB in the large-scale advertising system with millions of advertisers, we propose a mean-field approach. By grouping advertisers with the same objective as a mean auto-bidding agent, the interactions among the large-scale advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on the offline industrial dataset and Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and revenue.
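The abstract names a temperature-regularized credit assignment but does not state its formula. The sketch below is only an illustration of how a temperature could trade off competition against cooperation; it is not the paper's method. It assumes, hypothetically, that a shared reward is split among agents with softmax weights over per-agent scores divided by a temperature. The function name `temperature_weighted_credit` and the parameters `scores`, `total_reward`, and `temperature` are made up for this example.

```python
import math


def temperature_weighted_credit(scores, total_reward, temperature):
    """Split total_reward across agents with softmax(scores / temperature).

    Illustrative assumption only: the weighting rule is not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    shift = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    norm = sum(exps)
    # Each agent receives a share of the shared reward proportional to its weight.
    return [total_reward * e / norm for e in exps]


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]
    # High temperature: shares approach an even split (cooperative end).
    print(temperature_weighted_credit(scores, 6.0, 1e6))
    # Low temperature: the top-scoring agent takes almost everything (competitive end).
    print(temperature_weighted_credit(scores, 6.0, 0.01))
```

Under this assumed rule, a large temperature spreads the shared reward almost evenly, while a small one concentrates it on the highest-scoring agent, so a single scalar moves the system between the two regimes.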


Pages: 1129-1139

Page count: 11

