Knocker: Vibroacoustic-based object recognition with smartphones

Cited by: 27
Authors
Gong, Taesik [1 ]
Cho, Hyunsung [1 ]
Lee, Bowon [2 ]
Lee, Sung-Ju [1 ]
Affiliations
[1] School of Computing, KAIST
[2] Department of Electronic Engineering, Inha University
Keywords
Machine learning; Multimodal sensing; Object interaction; Object recognition; Smartphone sensing
DOI
10.1145/3351240
Abstract
While smartphones have enriched our lives with diverse applications and functionalities, the user experience still often involves cumbersome manual input. To purchase a bottle of water, for instance, a user must locate an e-commerce app, type a search keyword, select the right item from the list, and finally place an order. This process could be greatly simplified if the smartphone identified the object of interest and automatically executed the user's preferred actions for that object. We present Knocker, which identifies an object when a user simply knocks on it with a smartphone. The basic principle of Knocker is to leverage the unique set of responses generated by the knock. Knocker takes a multimodal sensing approach, using the microphone, accelerometer, and gyroscope to capture the knock responses, and exploits machine learning to accurately identify objects. We also present 15 applications enabled by Knocker that showcase this novel interaction method between users and objects. Knocker uses only built-in smartphone sensors and is thus fully deployable without specialized hardware or tags on either the objects or the smartphone. Our experiments with 23 objects show that Knocker achieves an accuracy of 98% in a controlled lab setting and 83% in the wild. © 2019 Copyright held by the owner/author(s).
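The pipeline the abstract describes, capturing a knock response on the microphone, accelerometer, and gyroscope, extracting features per modality, and classifying with a machine-learning model, can be sketched as follows. This is only an illustrative reconstruction under stated assumptions: the log-magnitude spectrum audio features, the per-axis IMU statistics, the scikit-learn SVM classifier, and the extract_features helper are all choices made for demonstration, not the authors' published implementation, and the synthetic arrays merely stand in for recorded sensor windows.

# Illustrative sketch of a multimodal knock classifier in the spirit of
# Knocker. Feature choices and the SVM model are assumptions for
# demonstration; they are not the paper's exact pipeline.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_features(audio, accel, gyro):
    """Build one feature vector per knock from all three sensors."""
    # Audio: low-frequency log-magnitude spectrum of the knock response.
    audio_feat = np.log1p(np.abs(np.fft.rfft(audio))[:256])
    # Motion: per-axis summary statistics over the (N, 3) IMU windows.
    def stats(x):
        return np.concatenate([x.mean(0), x.std(0), x.min(0), x.max(0)])
    return np.concatenate([audio_feat, stats(accel), stats(gyro)])

# Synthetic stand-in data: 20 knocks on each of 3 objects. Real use
# would record short sensor windows around each detected knock.
rng = np.random.default_rng(0)
X, y = [], []
for obj in range(3):
    for _ in range(20):
        audio = rng.normal(obj, 1.0, 2048)      # placeholder mic samples
        accel = rng.normal(obj, 1.0, (50, 3))   # placeholder IMU samples
        gyro = rng.normal(obj, 1.0, (50, 3))
        X.append(extract_features(audio, accel, gyro))
        y.append(obj)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(np.stack(X), y)

# At run time a new knock is featurized and classified the same way.
sample = extract_features(rng.normal(1, 1.0, 2048),
                          rng.normal(1, 1.0, (50, 3)),
                          rng.normal(1, 1.0, (50, 3)))
print(clf.predict([sample]))

In the paper the recognition runs with only built-in smartphone sensors; the sketch simply shows how concatenating per-modality features lets a single classifier exploit all three sensing channels at once.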