This paper studies hybrid spectrum access for cellular vehicle-to-vehicle (V2V) heterogeneous networks (HetNets) using deep reinforcement learning (DRL). In the system, macro cell users (M-UEs), V2V clusters (V-Clusters), and V2V nodes are distributed in a time-division duplex (TDD) cellular network, where all users share the cellular uplink resources for spectrum access. M-UEs and V-Clusters adopt orthogonal access patterns. The V2V nodes seek an optimal strategy for reusing other users' resources without causing them severe interference, which is difficult because the nodes have only limited information about the other users' access patterns. DRL is therefore used to train the V2V nodes in an unsupervised manner, so that spectrum access can be optimized without any prior information. Specifically, a double deep Q-network (DDQN) based hybrid spectrum access (D2HSA) algorithm is designed to maximize the sum throughput of the HetNet by letting the V2V nodes intelligently select suitable frames for spectrum access. Furthermore, to improve the stability of the individual throughput of the V2V nodes, the proposed algorithm is adjusted by incorporating stability into the objective function. Finally, the experimental results show that the proposed scheme attains the optimal throughput, as benchmarked against the theoretical limit, in different cases, and that it substantially outperforms a scheme that relies on base station cooperation.
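As a minimal illustration of the double deep Q-network idea named above, the sketch below computes the standard DDQN learning target: the online network selects the next action and a separate target network evaluates it. The abstract does not specify the state, action, or reward design of D2HSA, so the two-action setting (0 = stay silent, 1 = transmit in the next frame), the function name `ddqn_target`, and all numeric values are assumptions made purely for illustration and are not the authors' implementation.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.5,
    done: bool = False,
) -> float:
    """Return the double-DQN target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    Both lists are indexed by action (hypothetical: 0 = silent, 1 = transmit).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


def dqn_target(reward: float, next_q_target: Sequence[float], gamma: float = 0.5) -> float:
    """Vanilla DQN target for comparison: select and evaluate with one network."""
    return reward + gamma * max(next_q_target)


# Illustrative values only (assumed, not taken from the paper).
y_double = ddqn_target(1.0, next_q_online=[0.2, 1.0], next_q_target=[0.5, 0.25])
y_vanilla = dqn_target(1.0, next_q_target=[0.5, 0.25])
print(y_double, y_vanilla)
```

With these assumed numbers the double-DQN target is smaller than the vanilla one, because the target network does not evaluate its own maximizing action. This decoupling is the usual motivation for DDQN: it reduces the overestimation of action values.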