DHPA: Dynamic Human Preference Analytics Framework: A Case Study on Taxi Drivers' Learning Curve Analysis

Cited by: 27
Authors
Pan, Menghai [1 ]
Huang, Weixiao [1 ]
Li, Yanhua [1 ]
Zhou, Xun [2 ]
Liu, Zhenming [3 ]
Song, Rui [4 ]
Lu, Hui [5 ]
Tian, Zhihong [5 ]
Luo, Jun [6 ]
Affiliations
[1] Worcester Polytech Inst, Worcester, MA 01609 USA
[2] Univ Iowa, Iowa City, IA 52242 USA
[3] Coll William & Mary, Williamsburg, VA 23187 USA
[4] North Carolina State Univ, Raleigh, NC USA
[5] Guangzhou Univ, Guangzhou, Peoples R China
[6] Lenovo Grp Ltd, Beijing, Peoples R China
Keywords
Urban computing; inverse reinforcement learning; preference dynamics
DOI
10.1145/3360312
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Many real-world human behaviors can be modeled and characterized as sequential decision-making processes, such as a taxi driver's choices of working regions and times. Each driver possesses unique preferences over these sequential choices and improves their working efficiency over time. Understanding the dynamics of such preferences helps accelerate the learning process of taxi drivers. Prior works on taxi operation management mostly focus on finding optimal driving strategies or routes, lacking in-depth analysis of what drivers learn during the process and how it affects their performance. In this work, we make the first attempt to establish Dynamic Human Preference Analytics. We inversely learn taxi drivers' preferences from data and characterize the dynamics of such preferences over time. We extract two types of features (i.e., profile features and habit features) to model the decision space of drivers. Then, through inverse reinforcement learning, we learn drivers' preferences with respect to these features. The results illustrate that self-improving drivers tend to keep adjusting their preferences for habit features to increase their earning efficiency, while keeping their preferences for profile features invariant. In contrast, experienced drivers have stable preferences over time, and exploring drivers tend to adjust their preferences randomly over time.
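As an illustration of the abstract's core step, learning a driver's preference weights over extracted features via inverse reinforcement learning, the sketch below shows a minimal maximum-entropy IRL loop. It assumes a small tabular MDP with a linear reward r(s) = phi(s) . w; the transition tensor P, feature matrix phi, demonstration trajectories, and all function names are hypothetical placeholders, not the paper's actual implementation.

# Hedged sketch: maximum-entropy IRL with a linear reward r(s) = phi(s) . w on a
# small tabular MDP. All inputs (transition tensor P, feature matrix phi, demo
# trajectories) are hypothetical placeholders, not the paper's pipeline.
import numpy as np
from scipy.special import logsumexp


def soft_value_iteration(P, r, gamma=0.95, iters=200):
    """Soft value iteration; returns a stochastic (Boltzmann) policy pi[s, a]."""
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = r[:, None] + gamma * P @ V      # Q[s, a] under the current reward
        V = logsumexp(Q, axis=1)            # soft maximum over actions
    pi = np.exp(Q - V[:, None])
    return pi / pi.sum(axis=1, keepdims=True)


def expected_feature_counts(P, pi, phi, start, horizon=50):
    """Feature expectation of the policy, weighted by its state visitation."""
    d = start.copy()                        # initial state distribution
    mu = np.zeros(phi.shape[1])
    for _ in range(horizon):
        mu += d @ phi
        d = np.einsum('s,sa,sat->t', d, pi, P)   # one-step visitation update
    return mu


def maxent_irl(P, phi, demo_trajs, start, lr=0.05, epochs=100):
    """Fit preference weights w so the induced policy matches the demonstrations."""
    # Empirical feature expectation from observed state sequences (the demos).
    mu_demo = np.mean([phi[traj].sum(axis=0) for traj in demo_trajs], axis=0)
    w = np.zeros(phi.shape[1])
    for _ in range(epochs):
        pi = soft_value_iteration(P, phi @ w)
        mu_pi = expected_feature_counts(P, pi, phi, start)
        w += lr * (mu_demo - mu_pi)         # max-ent IRL gradient: match features
    return w

Under these assumptions, fitting w separately on consecutive time windows of a driver's trajectories and tracking how the habit-feature and profile-feature components drift is one way to read the preference dynamics described above.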
Pages: 19