In this study, we investigated whether information in eye-hand and hand-eye cross-modal memory is maintained by the input modality used for encoding, the output modality used for testing, or both. Two categories of effect were examined: facilitation, produced by rehearsing with the eyes or hands a movement corresponding to that of the to-be-memorized stimulus after its presentation, and interference, produced by performing a noncorresponding movement. Both eye and hand rehearsal facilitated performance in the eye-hand cross-modal memory task (Experiment 1A), confirming that both serve a rehearsal function. We then tested for interference effects (Experiment 1B) using the same memory task as in Experiment 1A and found that neither modality produced interference. This result indicates that information was preserved via output-modality-specific representations when the eye-interference task disrupted retention in input-modality-specific representations, and via input-modality-specific representations when the hand-interference task disrupted retention in output-modality-specific representations. We observed the same pattern of facilitation and absence of interference in the hand-eye cross-modal memory task (Experiments 2A and 2B). In both the eye-hand and hand-eye tasks, the effects of eye and hand rehearsal were comparable, indicating that the two types of representation function together during encoding and testing. The absence of interference effects raises the possibility that modality-specific representations have functional aspects.