Generating content for large-scale reading comprehension assessments, particularly stories, demands substantial money, time, and effort. To serve as effective assessment tools, comprehension stories must adhere to specific criteria defined in early grade reading assessment standards, which dictate narrative structure, character types, readability levels, and other elements. Recently, Natural Language Processing (NLP) techniques, mainly Large Language Models (LLMs), have been used to automate the story generation process. One key challenge is ensuring diversity across the many generated stories. For example, early reading standards such as the Early Grade Reading Assessment (EGRA) require comprehension stories to be similar in difficulty level but narratively different across multiple implementations to ensure fairness. This paper proposes a solution that increases diversity across stories by employing GPT-4 to generate children's reading comprehension stories mediated by a database of classic tales. By leveraging existing narratives, this method drastically reduces the resources required for content creation while ensuring alignment with the EGRA criteria. We present a systematic framework for selecting, adapting, and evaluating the stories, aiming to streamline the content generation process. The generated stories are further evaluated for reliability using well-defined text metrics and by a human evaluator. Our findings underscore the potential of integrating GPT-4 with classic tales to optimize resources and enhance scalability in literacy assessment practices, offering practical implications for educators and policymakers in early grade learning.
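As a rough illustration of the kind of pipeline the abstract describes, the sketch below adapts a classic tale with GPT-4 and screens the result with a standard readability metric. The prompt wording, the target grade band, and the choice of metric (Flesch-Kincaid grade via the `textstat` package) are our assumptions for illustration, not the paper's actual framework or evaluation criteria.

```python
# Minimal sketch: adapt a classic tale into an early-grade story, then
# screen it with a readability metric. Prompt text, grade band, and
# metric choice are illustrative assumptions, not the paper's method.
from openai import OpenAI
import textstat

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def adapt_tale(tale_text: str, max_words: int = 60) -> str:
    """Ask GPT-4 to rewrite a classic tale as an EGRA-style story."""
    prompt = (
        "Rewrite the following classic tale as a short reading "
        f"comprehension story for early grade readers (about {max_words} "
        "words), keeping a simple narrative structure and familiar "
        f"character types:\n\n{tale_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def within_grade_band(story: str, lo: float = 1.0, hi: float = 3.0) -> bool:
    """Check that a story's readability falls in a (hypothetical) grade band."""
    grade = textstat.flesch_kincaid_grade(story)
    return lo <= grade <= hi


tale = "Once upon a time, a tortoise challenged a hare to a race..."
story = adapt_tale(tale)
print(story)
print("Within target band:", within_grade_band(story))
```

Grounding generation in an existing tale, as above, is one way to keep difficulty comparable across outputs while the source narratives supply the diversity; automated metrics then act only as a first filter before human review.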