Extensive genomic and transcriptional diversity identified through massively parallel DNA and RNA sequencing of eighteen Korean individuals

Cited by: 109
Authors
Ju, Young Seok [1, 2]
Kim, Jong-Il [1, 3, 4, 5]
Kim, Sheehyun [1, 2]
Hong, Dongwan [1]
Park, Hansoo [1, 6, 7]
Shin, Jong-Yeon [1, 5]
Lee, Seungbok [1, 4]
Lee, Won-Chul [1, 4]
Kim, Sujung [5]
Yu, Saet-Byeol [5]
Park, Sung-Soo [5]
Seo, Seung-Hyun [5]
Yun, Ji-Young [5]
Kim, Hyun-Jin [1, 4]
Lee, Dong-Sung [1, 4]
Yavartanoo, Maryam [1, 4]
Kang, Hyunseok Peter [1]
Gokcumen, Omer [6, 7]
Govindaraju, Diddahally R. [6, 7]
Jung, Jung Hee [2]
Chong, Hyonyong [2, 8]
Yang, Kap-Seok [2]
Kim, Hyungtae [2]
Lee, Charles [6, 7]
Seo, Jeong-Sun [1, 2, 3, 4, 5, 8]
Affiliations
[1] Seoul Natl Univ, Med Res Ctr, GMI, Seoul, South Korea
[2] Macrogen Inc, Seoul, South Korea
[3] Seoul Natl Univ, Coll Med, Dept Biochem, Seoul, South Korea
[4] Seoul Natl Univ, Grad Sch, Dept Biomed Sci, Seoul, South Korea
[5] Psoma Therapeut Inc, Seoul, South Korea
[6] Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[7] Harvard Univ, Sch Med, Boston, MA USA
[8] Axeq Technol, Rockville, MD USA
Funding
National Institutes of Health (USA);
Keywords
GENE-EXPRESSION; STRUCTURAL VARIANTS; EDITING SITES; FAMILY; COMMON; SNP;
DOI
10.1038/ng.872
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
Massively parallel sequencing technologies have identified a broad spectrum of human genome diversity. Here we deep sequenced and correlated 18 genomes and 17 transcriptomes of unrelated Korean individuals. This has allowed us to construct a genome-wide map of common and rare variants and also identify variants formed during DNA-RNA transcription. We identified 9.56 million genomic variants, 23.2% of which appear to be previously unidentified. From transcriptome sequencing, we discovered 4,414 transcripts not previously annotated. Finally, we revealed 1,809 sites of transcriptional base modification, where the transcriptional landscape is different from the corresponding genomic sequences, and 580 sites of allele-specific expression. Our findings suggest that a considerable number of unexplored genomic variants still remain to be identified in the human genome, and that the integrated analysis of genome and transcriptome sequencing is powerful for understanding the diversity and functional aspects of human genomic variants.
Pages: 745 / U47
Number of pages: 10