Improving Information Systems Sustainability by Applying Machine Learning to Detect and Reduce Data Waste

Times cited: 1
Authors
Savarimuthu, Bastin Tony Roy [1]
Corbett, Jacqueline [2]
Yasir, Muhammad [1]
Lakshmi, Vijaya [3]
Affiliations
[1] Univ Otago, Informat Sci, Dunedin, New Zealand
[2] Univ Laval, Management Informat Syst, Fac Business Adm, Quebec City, PQ, Canada
[3] Univ Laval, Management Informat Syst, Quebec City, PQ, Canada
Source
COMMUNICATIONS OF THE ASSOCIATION FOR INFORMATION SYSTEMS | 2023 / Vol. 53
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Data Waste; Information Systems; Information Management; Sustainability; Machine Learning; Deep Learning; Reviews; Design Science Research; Online Reviews; Methodology; Management; Energy
DOI
10.17705/1CAIS.05308
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Big data are key building blocks for creating information value. However, information systems are increasingly plagued with useless, waste data that can impede their effective use and threaten sustainability objectives. Using a constructive design science approach, this work first defines digital data waste. Then, it develops an ensemble artifact comprising two components. The first component comprises 13 machine learning models for detecting data waste. Applying these to 35,576 online reviews in two domains reveals data waste of 1.9% for restaurant reviews compared to 35.8% for app reviews. Machine learning can accurately identify 83% to 99.8% of data waste; deep learning models are particularly promising, with accuracy ranging from 96.4% to 99.8%. The second component comprises a sustainability cost calculator to quantify the social, economic, and environmental benefits of reducing data waste. Eliminating 5,948 useless reviews in the sample would result in saving 6.9 person hours, $2.93 in server, middleware, and client costs, and 9.52 kg of carbon emissions. Extrapolating these results to reviews on the internet shows substantially greater savings. This work contributes to design knowledge relating to sustainable information systems by highlighting the new class of problem of data waste and by designing approaches for addressing this problem.
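The arithmetic behind the sample-level figures in the abstract can be illustrated with a short sketch. It is not the paper's sustainability cost calculator: the function names are hypothetical, and the linear scaling used in `extrapolate` is an assumption made here for illustration only. The input constants are the figures reported in the abstract.

```python
# Figures reported in the abstract for the sampled reviews.
TOTAL_REVIEWS = 35_576
WASTE_REVIEWS = 5_948
PERSON_HOURS_SAVED = 6.9
DOLLARS_SAVED = 2.93
CO2_KG_SAVED = 9.52


def waste_share(waste: int, total: int) -> float:
    """Fraction of reviews in the sample classed as data waste."""
    return waste / total


def per_review_savings() -> dict:
    """Average saving per eliminated review, derived from the sample totals."""
    return {
        "seconds": PERSON_HOURS_SAVED * 3600 / WASTE_REVIEWS,
        "dollars": DOLLARS_SAVED / WASTE_REVIEWS,
        "co2_grams": CO2_KG_SAVED * 1000 / WASTE_REVIEWS,
    }


def extrapolate(n_waste_reviews: int) -> dict:
    """Scale sample savings to n_waste_reviews.

    ASSUMPTION (not from the paper): savings grow linearly with the
    number of eliminated reviews.
    """
    unit = per_review_savings()
    return {
        "person_hours": unit["seconds"] * n_waste_reviews / 3600,
        "dollars": unit["dollars"] * n_waste_reviews,
        "co2_kg": unit["co2_grams"] * n_waste_reviews / 1000,
    }


if __name__ == "__main__":
    print(f"Overall waste share: {waste_share(WASTE_REVIEWS, TOTAL_REVIEWS):.1%}")
    print(per_review_savings())
    print(extrapolate(1_000_000))
```

Under this linear assumption, each eliminated review corresponds to roughly four seconds of reading time and under two grams of carbon emissions, which is why the savings only become substantial at internet scale.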
Pages: 189-213
Number of pages: 27