End-to-End (E2E) learning has become an increasingly prominent method for enhancing the performance of communication systems. In this paper, an E2E Radio over Fiber (RoF) transmission system based on self-supervised learning (SSL) is proposed. The E2E approach enables automatic joint optimization of the transmitter modulator and the receiver demodulator according to the specific channel characteristics. The SSL architecture comprises four deep neural networks: TransNN for symbol mapping, SamplNN for upsampling, ChannelNN for channel modeling, and ReceivNN for demodulation, which together replace the traditional components of the RoF link. The E2E system adapts the geometric shape of the constellation and the upsampling rules to the properties of the RoF channel, enabling transmission with improved receiver sensitivity. Perturbation noise is injected during the training phase to improve the generalization ability of the SSL model. Notably, traditional demodulation methods cannot demodulate the RF signals transmitted in the E2E system, which adds a layer of confidentiality to the transmission. Numerical simulations are conducted for 10 GHz, 2 Gsym/s RoF transmission systems. The results indicate that, compared with traditional approaches, the receiver sensitivity of the E2E system improves by 3.5 dB at a bit error rate (BER) limit of 2.4e-2. Compared with a system optimized only at the receiver side, the E2E system achieves a sensitivity improvement of at least 3 dB. The numerical experiments also validate the importance of SamplNN and perturbation training for the E2E system: both improve the system's transmission performance and generalization ability, and both enhance its security.
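To make the signal chain concrete, the following is a minimal NumPy sketch of one forward pass through the four stages named above, not the implementation used in this work. It makes several assumptions: each network is reduced to a single untrained layer, the channel is a placeholder (soft saturation plus Gaussian noise) rather than a learned RoF model, and the perturbation noise is added to the received waveform. The constellation size `M`, the samples per symbol `SPS`, and the helpers `init_params`, `perturb`, `forward`, and `cross_entropy` are hypothetical names chosen for illustration.

```python
import numpy as np

M, SPS = 16, 4  # assumed constellation size and samples per symbol


def init_params(rng):
    """Random, untrained parameters standing in for the learned weights."""
    return {
        "table": rng.standard_normal((M, 2)),       # TransNN: I/Q point per symbol
        "pulse": rng.standard_normal(SPS),           # SamplNN: upsampling pulse
        "W": 0.1 * rng.standard_normal((SPS * 2, M)),  # ReceivNN: linear layer
        "b": np.zeros(M),
    }


def trans_nn(idx, table):
    """Map symbol indices to I/Q points, normalized to unit average power."""
    table = table / np.sqrt(np.mean(np.sum(table**2, axis=1)))
    return table[idx]


def sampl_nn(points, pulse):
    """Expand each symbol into SPS samples shaped by the pulse vector."""
    wave = points[:, None, :] * pulse[None, :, None]  # shape (N, SPS, 2)
    return wave.reshape(-1, 2)                         # shape (N * SPS, 2)


def channel_nn(wave, rng, noise_std=0.05):
    """Placeholder channel (assumption): soft saturation plus Gaussian noise."""
    return np.tanh(wave) + noise_std * rng.standard_normal(wave.shape)


def perturb(wave, rng, sigma):
    """Training-time perturbation noise (assumed Gaussian, at the receiver input)."""
    return wave + sigma * rng.standard_normal(wave.shape)


def receiv_nn(wave, W, b):
    """Demodulate: one linear layer and a softmax over the M symbols."""
    feats = wave.reshape(-1, SPS * 2)                  # one row per symbol
    logits = feats @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def forward(idx, params, rng, sigma=0.0):
    """Chain the four stages; sigma > 0 enables perturbation training."""
    x = trans_nn(idx, params["table"])
    x = sampl_nn(x, params["pulse"])
    x = channel_nn(x, rng)
    if sigma > 0:
        x = perturb(x, rng, sigma)
    return receiv_nn(x, params["W"], params["b"])


def cross_entropy(probs, idx):
    """Self-supervised loss: the transmitted symbols serve as their own labels."""
    return -np.mean(np.log(probs[np.arange(len(idx)), idx] + 1e-12))
```

In such a setup, calling `forward(idx, params, rng, sigma=0.1)` on a batch of random symbol indices and minimizing `cross_entropy` with respect to `table`, `pulse`, `W`, and `b` would shape the constellation and the upsampling rule jointly with the demodulator. The gradient step is omitted here and would normally be handled by an automatic differentiation framework.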