From AI to AGI - The Evolution of Real-Time Systems with GPT Integration

Cited by: 0
Authors
Pande, Aarush Kaunteya [1]
Brantley, Preston [1]
Tanveer, Muhammad Hassan [1]
Voicu, Razvan Cristian [1]
Affiliations
[1] Kennesaw State Univ, Dept Robot & Mechatron Engn, Marietta, GA 30060 USA
Source
SOUTHEASTCON 2024 | 2024
Keywords
Real-Time Systems; Voice Control; Artificial Intelligence; Computer Vision; GPT; Drones; AGI;
DOI
10.1109/SOUTHEASTCON52093.2024.10500172
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generative artificial intelligence (AI), particularly ChatGPT, is revolutionizing various sectors, from exercise applications to accounting software, politics, and pharmaceuticals. As versatile aerial vehicles, drones have broad applications in videography, military operations, and surveying. However, their programming and optimal utilization often require extensive training. This research tackles these challenges by utilizing ChatGPT's sophisticated logic and prompt training features to enable drones to operate autonomously in various settings, ranging from everyday tasks to emergencies like search and rescue missions. Enhancing Microsoft Research's PromptCraft robotics, the project integrates innovative algorithms and GPT4-Vision, improving command efficiency, speed, and accuracy. This integration also leverages additional sensor data feedback, allowing the drones to process user prompts with enhanced contextual understanding. Initial results show a significant improvement in command response times and accuracy, enabling the drones to interpret and execute complex voice commands in various environments. This paper presents a multimodal framework that enriches the capabilities of voice-controlled robotic systems and broadens the scope of AI applications in real-time systems, laying the groundwork for customized AI-driven systems, including robots tailored for diverse applications and the shift towards AGI.
Pages: 699-707
Page count: 9