Automated machine learning tool: The first stop for data science and statistical model building

Cited by: 0
Authors
Gopagoni D. [1]
Lakshmi P.V. [1]
Affiliation
[1] Department of Computer Science and Engineering, GIT GITAM (Deemed to be University), Vishakhapatnam, Andhra Pradesh
Source
International Journal of Advanced Computer Science and Applications | 2020, Vol. 11, Issue 02
Keywords
Artificial neural networks; Automated machine learning; Drug design; K-means clustering; Market analysis; Naive Bayes classification; QSAR; QSPR; R program; Regression models; Shiny web app; Supervised learning; Support vector machines
DOI
10.14569/ijacsa.2020.0110253
Abstract
Machine learning techniques are designed to derive knowledge from existing data. Increased computational power and the use of natural language processing and image processing methods have made it easy to create rich data. Good domain knowledge is required to build useful models. Uncertainty remains around choosing the right sample data, reducing variables, and selecting a statistical algorithm. A suitable statistical method coupled with explanatory variables is critical for model building and analysis. There are multiple choices for each parameter. An automated system that helps scientists select an appropriate data set and learning algorithm would be very useful. A freely available web-based platform, named automated machine learning tool (AMLT), is developed in this study. AMLT automates the entire model building process. AMLT is equipped with the most commonly used variable selection methods and statistical methods for both supervised and unsupervised learning. AMLT can also perform clustering. AMLT uses statistical measures such as R² to rank the models and performs automatic test set validation. The tool is validated for connectivity and capability by reproducing two published works. © Science and Information Organization.
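The abstract's idea of ranking candidate models by R² on a held-out test set can be illustrated with a minimal sketch. This is not AMLT's implementation (the tool itself is described as an R/Shiny application). The function names `fit_simple_ols`, `r_squared`, and `rank_models`, the toy data, and the choice of one single-predictor linear model per candidate variable are all assumptions made for illustration.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rank_models(train, test, target):
    """Fit one model per candidate variable on the training data,
    score each on the test data, and sort by test-set R² (best first)."""
    results = []
    for name in train:
        if name == target:
            continue
        slope, intercept = fit_simple_ols(train[name], train[target])
        preds = [intercept + slope * v for v in test[name]]
        results.append((name, r_squared(test[target], preds)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical toy data: y depends roughly linearly on x1, not on x2.
train = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [3, 1, 4, 1, 5, 9],
    "y": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
}
test = {
    "x1": [7, 8, 9],
    "x2": [2, 6, 5],
    "y": [14.1, 15.9, 18.2],
}

ranking = rank_models(train, test, target="y")
for name, score in ranking:
    print(f"{name}: test R^2 = {score:.3f}")
```

Scoring on data withheld from fitting is what makes the ranking a validation step instead of a measure of training fit. A test-set R² can be negative when a model predicts worse than the test-set mean.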
Pages: 410-418
Page count: 8