Using Decision Tree Classification Model to Predict Payment Type in NYC Yellow Taxi

Cited by: 0
Authors
Ismaeil, Hadeer [1]
Kholeif, Sherif [1]
Abdel-Fattah, Manal A. [1]
Affiliations
[1] Helwan Univ, Informat Syst Dept, Fac Comp & Artificial Intelligence, Cairo, Egypt
Keywords
Big data analytics; Apache Spark; decision tree classification; taxi trips; machine learning
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Taxi services are growing rapidly as a reliable mode of transport, and both demand and competition among service providers are high. A billion trip records need to be analyzed to sharpen competition, understand service users, and improve the business. Although decision tree classification is a common algorithm that generates easy-to-understand rules, there is no existing implementation of it for classification on the taxi dataset. This research applies a decision tree classification model to the taxi dataset to classify instances correctly, build a decision tree, and calculate accuracy. The experiment combines the decision tree algorithm with the Spark framework and shows good performance and high accuracy when predicting payment type. Applying the decision tree algorithm from different aspects to the NYC taxi dataset results in high accuracy.
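The three steps named in the abstract (build a decision tree, classify instances, calculate accuracy) can be illustrated with a minimal sketch. It is plain Python using Gini impurity, not the authors' Spark pipeline. The toy rows, the two feature columns (trip distance and tip amount), and all function names are assumptions made for illustration only, not values or code from the paper or the NYC taxi dataset.

```python
from collections import Counter

# Hypothetical toy rows: (trip_distance_miles, tip_amount_usd, payment_type).
# These values are invented for illustration, not taken from the NYC dataset.
ROWS = [
    (1.2, 0.0, "cash"),
    (0.8, 0.0, "cash"),
    (3.5, 2.0, "card"),
    (7.1, 4.5, "card"),
    (2.0, 0.0, "cash"),
    (5.4, 3.0, "card"),
    (1.0, 1.0, "card"),
    (9.3, 0.0, "cash"),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best = None
    best_score = gini([r[-1] for r in rows])
    for f in range(len(rows[0]) - 1):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r[-1] for r in rows if r[f] <= t]
            right = [r[-1] for r in rows if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best_score, best = score, (f, t)
    return best


def build_tree(rows, depth=0, max_depth=3):
    """Recursively build a tree; leaves are class labels, nodes are tuples."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        # Leaf: predict the majority class of the rows that reached it.
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (f, t,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))


def predict(tree, features):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if features[f] <= t else right
    return tree


def accuracy(tree, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(tree, r[:-1]) == r[-1] for r in rows) / len(rows)


TREE = build_tree(ROWS)
TRAIN_ACCURACY = accuracy(TREE, ROWS)
```

On this toy data a single split on the tip column separates the two classes, which also shows why tree rules are easy to read. The accuracy here is measured on the training rows only; a real evaluation would hold out a test set.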
Pages: 238-244
Page count: 7