Teaching Natural Language Processing through Big Data Text Summarization with Problem-Based Learning

Cited by: 2
Authors
Li L. [1]
Geissinger J. [2]
Ingram W.A. [3]
Fox E.A. [1]
Affiliations
[1] Department of Computer Science, Virginia Tech, 24061, VA
[2] Department of Electrical and Computer Engineering, Virginia Tech, 24061, VA
[3] University Libraries, Virginia Tech, 24061, VA
Funding
National Science Foundation (United States)
Keywords
big data text analytics; computer science education; deep learning; information system education; machine learning; natural language processing; NLP; problem-based learning
DOI
10.2478/dim-2020-0003
Abstract
Natural language processing (NLP) covers a large number of topics and tasks related to data and information management, which makes it complex and challenging to teach. Problem-based learning, meanwhile, is a teaching technique specifically designed to motivate students to learn efficiently, work collaboratively, and communicate effectively. With these aims in mind, we developed a problem-based learning course that teaches NLP to both undergraduate and graduate students. We provided student teams with big data sets, basic guidelines, cloud computing resources, and other aids to help them summarize two types of big collections: Web pages related to events, and electronic theses and dissertations (ETDs). Student teams then deployed different libraries, tools, methods, and algorithms to solve the task of big data text summarization. Summarization is an ideal problem for learning NLP because it involves all levels of linguistics, as well as many of the tools and techniques used by NLP practitioners. The evaluation results showed that all teams generated coherent and readable summaries. Many summaries were of high quality and accurately described their corresponding events or ETD chapters, and the teams produced them, along with NLP pipelines, in a single semester. Further, both undergraduate and graduate students gave statistically significant positive feedback, relative to other courses in the Department of Computer Science. Accordingly, we encourage educators in the data and information management field to use our approach or similar methods in their teaching, and we hope that other researchers will also use our data sets and synergistic solutions to approach the new and challenging tasks we addressed. © 2020 Liuqing Li et al., published by Sciendo
Pages: 18-43
Number of pages: 25