In the multidisciplinary context of computational creativity and affective human-machine interaction, accurately understanding and detecting creative processes is advantageous. This paper introduces a novel computational framework for creative state detection that employs a multimodal approach integrating discrete emotions with the affective dimensions of arousal and valence. The framework uses multimodal inputs to capture creative states, with emotion detection forming a foundational element; by fusing recognized emotions with the arousal and valence dimensions, it derives a richer characterization of the user's creative state. This paper outlines the theoretical foundations, key components, and integration principles of the proposed framework, paving the way for future advancements in computational creativity and affective computing.
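To make the fusion idea concrete, the following is a minimal sketch, not the paper's implementation, of late fusion over emotion-category probabilities and continuous arousal/valence scores to predict a creative state. The module name, dimensions, and creative-state label set are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CreativeStateFusion(nn.Module):
    """Hypothetical late-fusion head combining categorical emotions
    with dimensional affect (arousal, valence)."""

    def __init__(self, n_emotions=7, n_states=3, hidden=32):
        super().__init__()
        # Emotion branch: per-sample emotion probabilities (e.g. from a face or speech model).
        self.emotion_proj = nn.Linear(n_emotions, hidden)
        # Dimensional branch: arousal and valence scores, assumed in [-1, 1].
        self.av_proj = nn.Linear(2, hidden)
        # Fusion head maps the concatenated representation to creative-state logits.
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(2 * hidden, n_states),
        )

    def forward(self, emotion_probs, arousal_valence):
        e = self.emotion_proj(emotion_probs)
        d = self.av_proj(arousal_valence)
        return self.head(torch.cat([e, d], dim=-1))

# Example: one sample with uniform emotion probabilities,
# moderately high arousal and positive valence.
model = CreativeStateFusion()
emotions = torch.full((1, 7), 1.0 / 7)
av = torch.tensor([[0.6, 0.4]])
print(model(emotions, av).softmax(dim=-1))  # hypothetical creative-state distribution
```

The choice of late fusion here is only one plausible reading of the framework; earlier (feature-level) fusion of the modalities would follow the same general structure.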