Introducing Data Science Techniques by Connecting Database Concepts and dplyr

Cited by: 14
Authors
Broatch, Jennifer E. [1]
Dietrich, Suzanne [1]
Goelman, Don [2]
Affiliations
[1] Arizona State Univ, Sch Math & Nat Sci, POB 37100,Mail Code 2352, Phoenix, AZ 85069 USA
[2] Villanova Univ, Dept Comp Sci, Villanova, PA 19085 USA
Source
JOURNAL OF STATISTICS EDUCATION | 2019 / Vol. 27 / No. 03
Funding
U.S. National Science Foundation;
Keywords
Data science; Databases; Education; Teaching tool;
DOI
10.1080/10691898.2019.1647768
Chinese Library Classification number
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
Early exposure to data science skills, such as relational databases, is essential for students in statistics as well as many other disciplines in an increasingly data-driven society. The goal of the presented pedagogy is to introduce undergraduate students to fundamental database concepts and to illuminate the connection between these database concepts and the functionality provided by the dplyr package for R. Specifically, students are introduced to relational database concepts using visualizations that are designed for students with no data science or computing background. These educational tools, which are freely available on the Web, engage students in the learning process through a dynamic presentation that gently introduces relational databases and how to ask questions of data stored in a relational database. The visualizations are designed for self-study by students and include a formative self-assessment feature. Students are then assigned a corresponding statistics lesson in which they use R within the dplyr framework, which emphasizes the need for these database skills. This article describes a pilot experience of introducing this pedagogy into a calculus-based introductory statistics course for mathematics and statistics majors, and provides a brief evaluation of the student perspective of the experience. Supplementary materials for this article are available online.
Pages: 147-153
Page count: 7