Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

Cited by: 45
Authors
Greenbaum, Dov [1, 2, 3, 4, 5]
Sboner, Andrea [1, 2]
Mu, Xinmeng Jasmine [1]
Gerstein, Mark [1, 2, 6]
Affiliations
[1] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT USA
[3] Sanford T Colb & Co Intellectual Property Law, Rehovot, Israel
[4] Kiryat Ono Coll, Ctr Hlth Law Bioeth & Hlth Policy, Tel Aviv, Israel
[5] Stanford Univ, Stanford Law Sch, Ctr Law & Biosci, Stanford, CA 94305 USA
[6] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
Keywords
RNA-SEQ; ARCHIVE
DOI
10.1371/journal.pcbi.1002278
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches: for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums.
Pages: 6