Effect of sample size on prognostic genes analysis in non-small cell lung cancer

Cited by: 1
Authors
Li, Pingdong [1]
Li, Haiyang [2]
Wan, Zhiyi [3]
Lu, Yanan [3]
Affiliations
[1] Capital Med Univ, Beijing Tongren Hosp, Dept Otolaryngol Head & Neck Surg, Beijing, Peoples R China
[2] Peoples Hosp Beijing Daxing Dist, Dept Otolaryngol, Beijing, Peoples R China
[3] Beijing City Univ, Sch Biomed, Beijing, Peoples R China
Keywords
Non-small cell lung cancer; Sample size; Prognostic genes; Power law; Events number; BREAST-CANCER;
DOI
10.1007/s00438-023-01999-2
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The identification of prognostic genes can help in the clinical management of non-small cell lung cancer (NSCLC). However, there is little overlap among the prognostic genes identified in different NSCLC studies. One possible reason is inadequate sample size. Here, the effect of sample size on prognostic gene analysis was investigated in 515 stage II/III NSCLC cases from two cohorts profiled by whole-exome sequencing. For each sample size level, prognostic gene analysis was repeated 100 times using random resampling. In stage II lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cases from the TCGA Pan-Lung Cancer cohort, the number of statistically significant prognostic genes first increased with sample size following a power law, then plateaued with fluctuations, and finally decreased slightly. Power law growth curves were also observed in stage III LUAD and LUSC cases from the TCGA Pan-Lung Cancer cohort and in stage III Chinese LUAD cases from the OncoSG cohort. The R² values of the fitted power law growth curves were all greater than 0.99. In addition, at the sample size level where the number of prognostic genes peaked, the mean proportions of true prognostic genes in patients with stage II LUAD and LUSC were 28.32% and 23.12%, respectively, which could partly explain the limited overlap in prognostic genes between reports. In conclusion, the number of prognostic genes grows with sample size according to a power law in NSCLC, independent of histopathological subtype, race, and stage. These results also show how sample size affects the reliability of prognostic genes and will aid trial design for genomic mutation-based prognostic studies in NSCLC.
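The power law relationship described in the abstract can be illustrated with a minimal sketch. It assumes the form count ≈ a · n^b and fits it by ordinary least squares on log-transformed values; the record does not state how the authors fitted their curves or computed R², so the function name `fit_power_law`, the fitting method, and the synthetic data below are all hypothetical and are not taken from the paper.

```python
import math

def fit_power_law(sizes, counts):
    """Fit counts ~ a * sizes**b by least squares on log-log values.

    Returns (a, b, r_squared). Note that r_squared is computed in log
    space, which is an assumption of this sketch, not the paper's method.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in counts]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log space = exponent
    log_a = mean_y - b * mean_x        # intercept in log space = log(a)
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r_squared

# Synthetic, made-up data generated from an exact power law (a=2, b=1.5).
# These numbers are illustrative only and are NOT results from the study.
sizes = [20, 40, 60, 80, 100]
counts = [2.0 * n ** 1.5 for n in sizes]

a, b, r2 = fit_power_law(sizes, counts)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.4f}")
```

In practice, each entry of `counts` would be the mean number of significant prognostic genes over the repeated random subsamples at that sample size, and the fit would be restricted to the growth phase before the count plateaus.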
Pages: 549-554
Number of pages: 6