Big data management challenges in health research-a literature review

Cited by: 39
Authors
Wang, Xiaoming [1]
Williams, Carolyn [1]
Liu, Zhen Hua [2]
Croghan, Joe [3]
Affiliations
[1] NIAID, NIH, 5601 Fishers Lane, Rockville, MD 20852 USA
[2] Oracle Corp, Redwood City, CA USA
[3] NIAID, Software Engn, Rockville, MD USA
Keywords
big data management; system performance; data quality; machine learning; SQL and NoSQL; GENETIC ARCHITECTURE; CLINICAL-RESEARCH; BLOOD-PRESSURE; BECKMAN REPORT; DATA SCIENCE; GENOMIC DATA; ENTITY; INFORMATION; ATTRIBUTE; BIOLOGY;
DOI
10.1093/bib/bbx086
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Big data management for information centralization (i.e. making data of interest findable) and integration (i.e. making related data connectable) in health research is a defining challenge in biomedical informatics. While essential to create a foundation for knowledge discovery, optimized solutions to deliver high-quality and easy-to-use information resources are not thoroughly explored. In this review, we identify the gaps between current data management approaches and the need for new capacity to manage big data generated in advanced health research. Focusing on these unmet needs and well-recognized problems, we introduce state-of-the-art concepts, approaches and technologies for data management from computing academia and industry to explore improvement solutions. We explain the potential and significance of these advances for biomedical informatics. In addition, we discuss specific issues that have a great impact on technical solutions for developing the next generation of digital products (tools and data) to facilitate the raw-data-to-knowledge process in health research.
Pages: 156-167
Number of pages: 12