On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper)

Cited by: 17

Authors:

Mai, Gengchen [1]

Huang, Weiming [2]

Sun, Jin [3]

Song, Suhang [4]

Mishra, Deepak [5]

Liu, Ninghao [3]

Gao, Song [6]

Liu, Tianming [3]

Cong, Gao [2]

Hu, Yingjie [7]

Cundy, Chris [8]

Li, Ziyuan [9]

Zhu, Rui [10]

Lao, Ni [11]

Affiliations:

[1] Univ Georgia, Dept Geog, 210 Field St, Athens, GA USA

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Block N4,50 Nanyang Ave, Singapore, Singapore

[3] Univ Georgia Athens, Sch Comp, 415 Boyd Res & Educ Ctr, Athens, GA 30602 USA

[4] Univ Georgia, Coll Publ Hlth, Rhodes Hall,105 Spear Rd, Athens, GA 30602 USA

[5] Univ Georgia, Dept Geog, 210 Field St, Athens, GA 30602 USA

[6] Univ Wisconsin Madison, Dept Geog, Geospatial Data Sci Lab, Sci Hall,550 N Pk St, Madison, WI 53715 USA

[7] Univ Buffalo, Dept Geog, GeoAI Lab, Ste 105, Buffalo, NY 14261 USA

[8] Stanford Univ, Dept Comp Sci, 353 Jane Stanford Way, Stanford, CA 94305 USA

[9] Univ Connecticut, Sch Business, 2100 Hillside Rd, Storrs, CT 06269 USA

[10] Sch Geog Sci, Univ Rd, Bristol BS81SS, Avon, England

[11] Google, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 USA

Source:

ACM TRANSACTIONS ON SPATIAL ALGORITHMS AND SYSTEMS | 2024, Vol. 10, Issue 02

Funding:

U.S. National Science Foundation

Keywords:

Foundation models; geospatial artificial intelligence; multimodal learning; GEOGRAPHICALLY WEIGHTED REGRESSION; URBAN LAND-USE; GEOSPATIAL SEMANTICS; HEALTH GEOGRAPHY; KNOWLEDGE GRAPH; TRAJECTORIES; LOCATION; CONTEXT; IMPACT; PLACE;

DOI:

10.1145/3653070

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology]

Discipline classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have not yet seen an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performance on seven tasks across multiple geospatial domains, including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that only involve the text modality, such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, task-agnostic large language models (LLMs) can outperform task-specific fully supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing FMs still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal FM that can reason over various types of geospatial data through geospatial alignments. We conclude this article by discussing the unique risks and challenges of developing such a model for GeoAI.

Pages: 46
