ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users
Cited by: 13
Authors:
Jain, Dhruv [1,2]
Nguyen, Khoa Huynh Anh [1]
Goodman, Steven [1]
Grossman-Kahn, Rachel [1]
Ngo, Hung [1]
Kusupati, Aditya [1]
Du, Ruofei [3]
Olwal, Alex [4]
Findlater, Leah [1]
Froehlich, Jon E. [1]
Affiliations:
[1] Univ Washington, Seattle, WA 98195 USA
[2] Google, Mountain View, CA 94043 USA
[3] Google Res, San Francisco, CA USA
[4] Google Res, Mountain View, CA USA
Source:
PROCEEDINGS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI '22)
Year: 2022
Keywords:
Accessibility; Deaf; hard of hearing; sound awareness; sound recognition; CLASSIFICATION; EVENTS
DOI: 10.1145/3491102.3502020
Chinese Library Classification: TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract:
Recent advances have enabled automatic sound recognition systems for deaf and hard of hearing (DHH) users on mobile devices. However, these tools use pre-trained, generic sound recognition models, which do not meet the diverse needs of DHH users. We introduce ProtoSound, an interactive system for customizing sound recognition models by recording a few examples, thereby enabling personalized and fine-grained categories. ProtoSound is motivated by prior work examining sound awareness needs of DHH people and by a survey we conducted with 472 DHH participants. To evaluate ProtoSound, we characterized performance on two real-world sound datasets, showing significant improvement over state-of-the-art (e.g., +9.7% accuracy on the first dataset). We then deployed ProtoSound's end-user training and real-time recognition through a mobile application and recruited 19 hearing participants who listened to the real-world sounds and rated the accuracy across 56 locations (e.g., homes, restaurants, parks). Results show that ProtoSound personalized the model on-device in real-time and accurately learned sounds across diverse acoustic contexts. We close by discussing open challenges in personalizable sound recognition, including the need for better recording interfaces and algorithmic improvements.
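The abstract describes personalizing a recognizer from only a few user-recorded examples per sound class. The system's name and workflow suggest a prototype-based (nearest-centroid) few-shot classifier; the sketch below is a minimal, hypothetical illustration of that general technique, not the authors' implementation. The embed() function is a stand-in for a pre-trained audio embedding model, and all names and shapes here are assumptions for illustration only.

```python
# Hypothetical sketch: few-shot, prototype-based sound classification in the
# spirit of "record a few examples, then recognize". Not ProtoSound's code.
import numpy as np

rng = np.random.default_rng(0)
PROJECTION = rng.normal(size=(16000, 128))  # placeholder "embedding" weights


def embed(clip: np.ndarray) -> np.ndarray:
    """Map a ~1 s, 16 kHz audio clip to a fixed-length, L2-normalized embedding (stub)."""
    clip = clip[:16000]
    vec = clip @ PROJECTION[: len(clip)]
    return vec / (np.linalg.norm(vec) + 1e-9)


def build_prototypes(support: dict[str, list[np.ndarray]]) -> dict[str, np.ndarray]:
    """Average the embeddings of the user's few recorded examples for each class."""
    return {
        label: np.mean([embed(c) for c in clips], axis=0)
        for label, clips in support.items()
    }


def classify(clip: np.ndarray, prototypes: dict[str, np.ndarray]) -> str:
    """Return the class whose prototype is nearest (Euclidean) to the clip's embedding."""
    emb = embed(clip)
    return min(prototypes, key=lambda label: np.linalg.norm(emb - prototypes[label]))


# Example: a user records ~3 clips for each of two personalized sound classes.
support_set = {
    "door-knock": [rng.normal(size=16000) for _ in range(3)],
    "microwave-beep": [rng.normal(size=16000) for _ in range(3)],
}
prototypes = build_prototypes(support_set)
print(classify(rng.normal(size=16000), prototypes))
```

Because classification reduces to a nearest-prototype lookup over a handful of class centroids, adding or refining a personalized category only requires embedding a few new clips and recomputing one mean, which is why this style of approach is attractive for on-device, real-time customization.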