In this paper, an achievable rate analysis is presented for the downlink and uplink (UL) of reconfigurable intelligent surface (RIS)-aided massive multiple-input multiple-output (MIMO) non-orthogonal multiple-access (NOMA) systems. A linear minimum mean square error estimation technique is used to estimate the UL composite channels at the massive MIMO base-station via the user pilots for a given RIS phase-shift matrix. A statistical channel state information (CSI)-based RIS phase-shift optimization technique is adopted by leveraging the effective covariance matrices of the composite channels pertaining to NOMA clusters with distinct spatial signatures. Since channel statistics vary at a much slower rate compared to the channel coherence interval, the proposed RIS-aided NOMA system design is benefited by a much lower pilot overhead and computational complexity than the instantaneous CSI-based joint phase-shift and precoder optimization techniques. The achievable sum rates are derived in closed-form for the maximum ratio transmission/combining based linear precoder/detector in the presence of practical impediments, including spatially correlated fading, erroneously estimated CSI, intra-cluster pilot contamination, imperfect successive interference cancellation, and hardware impairments by leveraging the statistical CSI-based optimal RIS phase-shift matrix. The pilot overhead, computational complexity, and convergence of the proposed phase-shift optimization are investigated and compared against the baseline techniques that use instantaneous CSI-aided optimization with semi-definite relaxation techniques. Monte-Carlo simulations/numerical results are used to validate our achievable sum rate analysis and to investigate the rate gains of the proposed system design over the orthogonal multiple access based counterparts.