Fatigue is one of the most commonly reported and most distressing side effects of radiotherapy (RT) [1]. Because RT-related fatigue does not occur in every patient, the ability to predict fatigue risk could help patients and providers individualize care and guide treatment choices. Many factors influence the fatigue experienced by patients. However, it is becoming increasingly clear that RT-related fatigue is strongly related to a series of underlying, genetically controlled biological events. We hypothesize that the expression of genes related to oxidative stress produced by the toxic therapeutic regimen can be predictive of RT-related fatigue. A cohort of 15 non-metastatic prostate cancer patients receiving external beam RT (EBRT) was enrolled in an NIH IRB-approved study (NCT00852111). Fatigue scores, as measured by the Functional Assessment of Cancer Therapy-Fatigue (FACT-F), and blood samples were obtained at baseline, prior to EBRT (T1), and at one month post-EBRT (T2). A lower FACT-F score indicates greater fatigue symptoms. Participants were categorized into fatigue groups based on whether their FACT-F score decreased by >= 3 points from T1 to T2 (worsening fatigue). The expression profile of 84 genes related to oxidative stress was measured from the collected blood samples using the RT² Profiler Human Oxidative Stress Plus PCR Array (Qiagen, Inc., Valencia, CA). Of the 15 men enrolled in this study, 7 were classified as high fatigue (HF) and 8 as low fatigue (LF) at T2 [2]. We present a two-phase scheme: a regularized linear regression method known as the elastic net first selects a limited subset of the genes deemed most predictive, and a widely used classifier, the regularized random forest (RRF), then discriminates patients with HF from those with LF. The elastic net and RRF were trained on combined T1 and T2 data as well as on T1-only data.
We compared the results of these two workflows to identify the significant genes and to assess the predictive ability of the T1-only data. In addition to PCR data, clinical information and demographic data were collected at T1 and T2. The feature vectors for all models consist of demographic and clinical characteristics, such as age, depressive symptom scores, blood hemoglobin level, and body-mass index, together with normalized gene-expression values from PCR. There are a total of 154 features from T1 and T2 combined (76 features from T1 alone). To evaluate the suitability of the proposed scheme, we first tested the efficacy of the tandem of elastic net (for feature selection) followed by RRF (for prediction) against elastic net alone and RRF alone (each performing both feature selection and prediction). For elastic net + RRF, features were selected by an elastic net trained on all data; the performance numbers were obtained from leave-one-out cross-validation of the subsequent RRF trained on only the selected features. Results show that the tandem approach outperforms either single-model approach. Interestingly, using the elastic net for feature selection boosts the AUC from 0.75 to 0.80. The same trend was observed when only T1 data are used to train the models: the AUC of the T1 tandem rises from 0.62 (RRF alone) to 0.66. Removing the T2 information from training makes the problem more challenging, introducing false positives into the prediction. In summary, we recommend a two-phase scheme in which an elastic net first selects a limited subset of the most predictive genes and an RRF then discriminates patients with HF from those with LF. The scheme improves cross-validation performance compared with RRF alone or elastic net alone. Several genes are consistently selected, including PRDX5, FHL2 and GPX4. These genes may provide clues regarding the cause of RT-related fatigue.
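
The two-phase scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation. scikit-learn does not provide a regularized random forest, so a plain `RandomForestClassifier` is swapped in for the RRF. The data are synthetic stand-ins with the same dimensions as the cohort (15 patients, 154 features, 7 HF and 8 LF). The function names `select_features` and `loocv_auc` and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import ElasticNet
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut
from sklearn.preprocessing import StandardScaler


def select_features(X, y, alpha=0.05, l1_ratio=0.5):
    """Phase 1: fit an elastic net on all samples and return the
    indices of features with nonzero coefficients.
    alpha and l1_ratio are illustrative values, not tuned ones."""
    X_scaled = StandardScaler().fit_transform(X)
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
    model.fit(X_scaled, y)
    selected = np.flatnonzero(model.coef_)
    if selected.size == 0:
        raise ValueError("elastic net removed every feature; lower alpha")
    return selected


def loocv_auc(X, y, n_trees=200, seed=0):
    """Phase 2: leave-one-out cross-validation of a random forest.
    Each patient is scored by a forest trained on the other patients,
    and the pooled held-out scores are summarized by the AUC."""
    scores = np.zeros(len(y), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # Plain random forest used here as a stand-in for the RRF.
        forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        forest.fit(X[train_idx], y[train_idx])
        scores[test_idx] = forest.predict_proba(X[test_idx])[:, 1]
    return roc_auc_score(y, scores)


# Synthetic stand-in data (assumption): 15 patients, 154 features.
rng = np.random.default_rng(0)
n_patients, n_features = 15, 154
y = np.array([1] * 7 + [0] * 8)      # 1 = HF, 0 = LF
X = rng.normal(size=(n_patients, n_features))
X[:, :3] += 1.5 * y[:, None]         # plant signal in three hypothetical features

selected = select_features(X, y)     # feature selection on all data
auc_all = loocv_auc(X, y)            # random forest alone, all features
auc_tandem = loocv_auc(X[:, selected], y)  # forest on selected features only

print(f"features kept by elastic net: {selected.size} of {n_features}")
print(f"LOOCV AUC, forest alone:      {auc_all:.2f}")
print(f"LOOCV AUC, elastic net + forest: {auc_tandem:.2f}")
```

The sketch follows the order stated in the text: features are selected once from an elastic net fitted on all samples, and only the subsequent forest is cross-validated. A stricter variant would repeat the feature selection inside each leave-one-out fold so that the held-out patient never influences which features are kept.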