Organizations across the world, including NATO, the OECD, the WHO, and the United Nations, as well as many governments, have adopted guidelines for safe, secure, and trustworthy Artificial Intelligence (AI). While technology policies are still being formulated, many AI applications catered toward children have already been developed or are under development. When designing any child-centered AI, it is of utmost importance to keep children's privacy at the forefront. One modality for child-centered AI is speech/language communication, which has found applications in various educational technologies, tutoring services, and interactive learning and social robots. Short of full de-identification of speech segments, longer-duration sentences and audio content can reveal partial identifying information (e.g., the gender of a child); when such content is combined with sequenced longitudinal data (e.g., audio recordings over full days at home or in classrooms, linked over time), privacy concerns grow and become critical. Motivated by privacy-preserving design, this study explores the use of discrete speech units as a form of anonymous encoding to develop Automatic Speech Recognition (ASR) systems for children that better ensure privacy protection. The primary goal is to ascertain that discrete speech units retain the key linguistic information required for the ASR task of producing output text, while simultaneously lacking identifying speaker-specific information and the ability to regenerate the original speech waveform from the available sequence of discrete speech units. Here, a Discrete ASR model trained on the My Science Tutor Children's Conversational Speech Corpus (MyST) achieves a word error rate (WER) of 15.7%.
Our Discrete ASR model achieves similar WER to state-of-the-art End-to-End (E2E) ASR models trained on features extracted from large-scale self-supervised pre-trained speech processing models (such as WavLM), although the E2E ASR models are almost 10 times larger in model checkpoint memory size and number of parameters and require 3x the training time. In addition, open-domain testing on other popular child speech corpora confirms that the proposed Discrete ASR models perform on par with E2E ASR models for corpora containing child speech in the same age range as MyST (e.g., the CMU corpus), with slightly lower performance for a corpus spanning a wider age range of children (e.g., the OGI corpus). Finally, this study shows that child ASR using the proposed discrete speech units achieves promising performance in recognizing WH-words, nouns, verbs, and pronouns in an early childhood case study of teacher-child interactions in a childcare facility, involving preschool children with and without speech/language delays, an extremely vulnerable and challenging population for speech/language assessment.