Knocker

Vibroacoustic-based Object Recognition with Smartphones



Abstract


While smartphones have enriched our lives with diverse applications and functionalities, the user experience still often involves cumbersome manual input. To purchase a bottle of water, for instance, a user must locate an e-commerce app, type a search keyword, select the right item from the list, and finally place an order. This process could be greatly simplified if the smartphone identified the object of interest and automatically executed the user's preferred actions for that object. We present Knocker, which identifies an object when a user simply knocks on it with a smartphone. The basic principle of Knocker is to leverage the unique set of responses generated by the knock. Knocker takes a multimodal sensing approach that utilizes the microphone, accelerometer, and gyroscope to capture the knock responses, and exploits machine learning to accurately identify objects. We also present 15 applications enabled by Knocker that showcase this novel interaction method between users and objects. Knocker uses only the built-in smartphone sensors and is thus fully deployable without specialized hardware or tags on either the objects or the smartphone. Our experiments with 23 objects show that Knocker achieves an accuracy of 98% in a controlled lab setting and 83% in the wild.


Video




Publications


Knocker: Vibroacoustic-based Object Recognition with Smartphones
Taesik Gong, Hyunsung Cho, Bowon Lee, and Sung-Ju Lee
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2019 (UbiComp '19).
PDF Video Slides

Real-Time Object Identification with a Smartphone Knock
Taesik Gong, Hyunsung Cho, Bowon Lee, and Sung-Ju Lee
Proceedings of ACM MobiSys 2019 (Video).
PDF Video Best Video Award

Identifying Everyday Objects with a Smartphone Knock
Taesik Gong, Hyunsung Cho, Bowon Lee, and Sung-Ju Lee
Proceedings of ACM CHI 2018 Extended Abstracts.
PDF Video



People


Taesik Gong

KAIST

Hyunsung Cho

KAIST

Bowon Lee

Inha University

Sung-Ju Lee

KAIST

Awards


Best Video Award | ACM MobiSys 2019

Best Demo/Poster Award | ACM SIGCHI Local Chapter 2018

Best Demo/Poster Award | HCI@KAIST Workshop 2018



Dataset


We provide the Knocker dataset to foster further studies. Please refer to the README file before using the dataset.

Dataset Link



FAQ


Q: I understand sound is an important feature to distinguish objects for Knocker. But why accelerometer and gyroscope?
A: In addition to sound, a knock also exerts a force on the smartphone in the form of acceleration and angular velocity. Each object exhibits a distinct force pattern, which is captured as rapid changes in the smartphone's built-in accelerometer and gyroscope values. Because sound alone is susceptible to noise, we leverage the accelerometer and gyroscope values in addition to the knocking sound; they are both distinctive per object and noise-tolerant.
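
As a concrete illustration of how the three modalities can be combined, the sketch below turns one knock into a single feature vector in Python. The specific features (a log-scaled spectrum of the knock sound plus simple statistics of the accelerometer and gyroscope traces) and the helper name knock_features are illustrative assumptions, not the exact feature set used by Knocker; the paper describes the actual features.

    import numpy as np

    def knock_features(audio, accel, gyro):
        """Illustrative multimodal feature vector for one knock.

        audio: 1-D numpy array of microphone samples around the knock
        accel, gyro: (N, 3) numpy arrays of accelerometer / gyroscope samples
        """
        # Log-scaled spectral magnitude of the knock sound (low-frequency bins)
        spectrum = np.abs(np.fft.rfft(audio * np.hanning(len(audio))))
        audio_feat = np.log1p(spectrum[:256])

        # Simple time-domain statistics of the knock-induced motion
        def stats(x):
            return np.concatenate([x.mean(axis=0), x.std(axis=0),
                                   x.max(axis=0) - x.min(axis=0)])

        imu_feat = np.concatenate([stats(accel), stats(gyro)])

        # Concatenate modalities into one vector for the classifier
        return np.concatenate([audio_feat, imu_feat])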

Q: Can Knocker identify similar objects that seem to generate very similar responses?
A: It depends on the objects. Our carefully selected multimodal feature set, combined with the machine learning classifier, makes similar objects distinguishable to a certain extent. We tested Knocker with 12 similar objects (different books and bottles) and achieved 93% accuracy. However, Knocker cannot distinguish nearly identical objects with the same material, shape, content type, and size (e.g., a Pepsi can vs. a Coca-Cola can). Please refer to Section 5.4 of our paper for further details.
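
As a rough sketch of the classification step, the snippet below trains a multi-class classifier on per-knock feature vectors (e.g., produced by a helper like knock_features above). The choice of an RBF-kernel SVM, the hyperparameters, and the 5-fold cross-validation are assumptions for illustration, not necessarily the configuration used by Knocker.

    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def train_knock_classifier(X, y):
        """X: (n_knocks, n_features) feature matrix; y: object label per knock."""
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
        scores = cross_val_score(clf, X, y, cv=5)   # rough accuracy estimate
        clf.fit(X, y)
        return clf, scores.mean()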

Q: Does Knocker work with different people and environments with various noises?
A: Yes. We found that knock responses are unique per object across different people. We also tested Knocker in the wild, where diverse noises that never appeared in the training data are present, and found that Knocker achieved decent accuracy thanks to its noise-tolerant features.

Q: How much training data does Knocker need?
A: With 23 everyday objects, Knocker achieves 92% accuracy with only 20 training knocks per object. Please refer to Section 5.2 of our paper for further details.
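
One way to reproduce this kind of measurement with the released dataset is to train on a fixed number of knocks per object and test on the remaining knocks, as in the sketch below. The split sizes, random seed, and classifier here are assumptions for illustration.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def accuracy_vs_training_size(X, y, sizes=(5, 10, 20, 40), seed=0):
        """X, y: numpy arrays; trains on n knocks per object, tests on the rest."""
        rng = np.random.default_rng(seed)
        results = {}
        for n in sizes:
            train_idx, test_idx = [], []
            for label in np.unique(y):
                idx = rng.permutation(np.where(y == label)[0])
                train_idx.extend(idx[:n])
                test_idx.extend(idx[n:])
            clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
            clf.fit(X[train_idx], y[train_idx])
            results[n] = clf.score(X[test_idx], y[test_idx])
        return results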