Machine learning as a service for enabling Internet of Things and People

Cited by: 0
Authors
Haytham Assem
Lei Xu
Teodora Sandra Buda
Declan O’Sullivan
Affiliations
[1] IBM Ireland, Cognitive Computing Group, Innovation Exchange
[2] Trinity College Dublin, ADAPT Center, School of Computer Science and Statistics
Source
Personal and Ubiquitous Computing | 2016 / Vol. 20
Keywords
Machine learning; Predictive modelling; Supervised learning; Regression models; Classification models
DOI
Not available
Abstract
The future Internet is expected to connect billions of people, things and services, with the potential to deliver a new set of applications by deriving new insights from the data these diverse sources generate. This highly interconnected global network brings new challenges in analysing and making sense of data. Machine learning is therefore expected to be a crucial technology for making sense of data and improving business and decision making, with the potential to solve a wide range of problems in health care, telecommunications, urban computing, and other domains. Machine learning algorithms learn how to perform a task by generalizing from a sample of examples. This is a different paradigm from traditional programming, in which a program is written explicitly to process data and produce an output. However, choosing a suitable machine learning algorithm for a particular application requires a substantial amount of time and effort, which is hard to undertake even with excellent research papers and textbooks. To reduce this time and effort, this paper introduces the TCDC (train, compare, decide, and change) approach, which can be thought of as a 'Machine Learning as a Service' approach. It aids machine learning researchers and practitioners in choosing the optimum machine learning model for achieving the best trade-off between accuracy and interpretability, computational complexity, and ease of implementation. The paper includes the results of testing and evaluating the recommenders based on the TCDC approach (in comparison with the traditional default approach) on 12 open-source datasets drawn from diverse domains including health care, agriculture, aerodynamics and others. Our results indicate that the proposed approach selects the best model in terms of predictive accuracy in 62.5% of the regression tests performed and 75% of the classification tests.
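The abstract does not spell out the individual steps of TCDC, so the following is only a generic sketch of the "train, compare, decide" idea, not the authors' implementation. It trains two hypothetical candidate regressors with k-fold cross-validation on synthetic data, compares them by root-mean-square error (RMSE), and picks the one with the lowest error. The function names (`fit_mean`, `fit_linear`, `cv_rmse`, `select_model`), the choice of RMSE, and the data are all assumptions made for illustration.

```python
import random
import statistics


def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training targets."""
    mu = statistics.fmean(ys)
    return lambda x: mu


def fit_linear(xs, ys):
    """Candidate: one-variable ordinary least squares."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cv_rmse(fit, xs, ys, k=5):
    """Train on k-1 folds, predict the held-out fold, return RMSE over all held-out points."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point forms one fold
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        squared_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in test_idx)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


def select_model(candidates, xs, ys, k=5):
    """Compare all candidates by cross-validated RMSE and decide on the lowest."""
    scores = {name: cv_rmse(fit, xs, ys, k) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (an assumption for this sketch): y = 2x + 1 plus small Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

candidates = {"mean_baseline": fit_mean, "linear": fit_linear}
best, scores = select_model(candidates, xs, ys)
print(best, scores)
```

A real selection procedure would also weigh the other criteria the abstract names (interpretability, computational complexity, ease of implementation); this sketch ranks on predictive error alone.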
Pages: 899-914
Page count: 15