
2024

EasyAsk: An In-App Contextual Tutorial Search Assistant for Older Adults with Voice and Touch Inputs
(IMWUT '24) Weiwei Gao, Kexin Du, Yujia Luo, Weinan Shi, Chun Yu, and Yuanchun Shi
Abstract
An easily accessible tutorial is crucial for older adults to use mobile applications (apps) on smartphones. However, older adults often struggle to search for tutorials independently and efficiently. Through a formative study, we investigated the demands of older adults in seeking assistance and identified patterns of older adults’ behaviors and verbal questions when seeking help for smartphone-related issues. Informed by the findings from the formative study, we designed EasyAsk, an app-independent method to make tutorial search accessible for older adults. This method was implemented as an Android app. Using EasyAsk, older adults can obtain interactive tutorials through voice and touch whenever they encounter problems using smartphones. To power the method, EasyAsk uses a large language model to process the voice text and contextual information provided by older adults, and another large language model to search for the tutorial. Our user experiment, involving 18 older participants, demonstrated that EasyAsk helped users obtain tutorials correctly in 98.94% of cases, making tutorial search accessible and natural.
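The search pipeline described above chains two language-model calls: one turns the spoken question plus on-screen context into a precise query, and another retrieves a tutorial for it. The sketch below illustrates that two-stage idea in Python; the ScreenContext fields, prompt wording, and call_llm stub are assumptions for illustration, not EasyAsk's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ScreenContext:
    """Contextual information captured when the user asks for help (hypothetical fields)."""
    app_name: str
    current_page: str
    visible_elements: list[str]


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (e.g. an HTTP request to a hosted model)."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"


def build_search_query(voice_text: str, context: ScreenContext) -> str:
    """Stage 1: rewrite the spoken question plus screen context into a self-contained query."""
    prompt = (
        "Rewrite the user's question into a self-contained tutorial search query.\n"
        f"App: {context.app_name}\nPage: {context.current_page}\n"
        f"Visible elements: {', '.join(context.visible_elements)}\n"
        f"User said: {voice_text}"
    )
    return call_llm(prompt)


def search_tutorial(query: str) -> str:
    """Stage 2: ask a second model to retrieve or compose a step-by-step tutorial."""
    return call_llm(f"Find a step-by-step tutorial for: {query}")


if __name__ == "__main__":
    ctx = ScreenContext("WeChat", "Chats", ["Search", "Contacts", "Discover", "Me"])
    query = build_search_query("How do I make the text bigger?", ctx)
    print(search_tutorial(query))
```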
Evaluating the Privacy Valuation of Personal Data on Smartphones
(IMWUT' 24) Lihua Fan, Shuning Zhang, Yan Kong, Xin Yi, Yang Wang, Xuhai "Orson" Xu, Chun Yu, Hewu Li, and Yuanchun Shi.
Abstract
Smartphones hold a great variety of personal data during usage, which at the same time poses privacy risks. In this paper, we used the selling price to reflect users’ privacy valuation of their personal data on smartphones. In a 7-day auction, participants sold their data as commodities and earned money. We first designed a total of 49 commodities with 8 attributes, covering 14 common types of personal data on smartphones. Then, through a large-scale reverse second-price auction (N=181), we examined students’ valuation of 15 representative commodities. The average bid price was 62.8 CNY (8.68 USD), and a regression model with 14 independent variables found the most influential factors for bid price to be privacy risk, ethnicity, and gender. When validating our results on non-students (N=34), we found that although they gave significantly higher prices (M=109.8 CNY, 15.17 USD), “privacy risk” was still one of the most influential factors among the 17 independent variables in the regression model. We recommend that stakeholders provide these 8 attributes of data when selling or managing it.
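As context for the auction design: in a reverse second-price auction, participants bid the price at which they are willing to sell, the lowest ask wins, and the winner is paid the second-lowest ask, which makes truthful valuation the dominant strategy. The minimal sketch below illustrates that payout rule only; it is not the study's experimental code.

```python
def reverse_second_price(asks: dict[str, float]) -> tuple[str, float]:
    """Return (winning seller, payment) under a reverse second-price rule.

    Sellers submit asking prices; the lowest ask wins, and the winner is paid
    the second-lowest ask, so stating one's true valuation is incentive-compatible.
    """
    if len(asks) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(asks.items(), key=lambda kv: kv[1])
    winner, _ = ranked[0]
    payment = ranked[1][1]  # second-lowest asking price
    return winner, payment


# Example: three participants asking 50, 70, and 90 CNY for the same data commodity.
print(reverse_second_price({"P1": 50.0, "P2": 70.0, "P3": 90.0}))  # ('P1', 70.0)
```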
Leveraging Large Language Models for Generating Mobile Sensing Strategies in Human Behavior Modeling
(IMWUT '24) Nan Gao, Zhuolei Yu, Yue Xu, Chun Yu, Yuntao Wang, Flora D. Salim, and Yuanchun Shi.
Abstract
Mobile sensing plays a crucial role in generating digital traces to understand human daily lives. However, studying behaviours like mood or sleep quality in smartphone users requires carefully designed mobile sensing strategies such as sensor selection and feature construction. This process is time-consuming, burdensome, and requires expertise in multiple domains. Furthermore, the resulting sensing framework lacks generalizability, making it difficult to apply to different scenarios. In this research, we propose an automated method for generating mobile sensing strategies for human behaviour understanding. First, we establish a knowledge base and consolidate rules for data collection and effective feature construction. Then, we introduce a multi-granular human behaviour representation and design procedures for leveraging large language models to generate strategies. Our approach is validated through blind comparative studies and usability evaluation. Ultimately, our approach holds the potential to revolutionise the field of mobile sensing and its applications.
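To make the strategy-generation step concrete, the sketch below shows one plausible way a knowledge base of sensing rules could be folded into a prompt asking a large language model to propose sensors and features for a target behaviour. The rule contents and prompt wording are illustrative assumptions, not the paper's actual knowledge base or procedure.

```python
# Illustrative sketch only: folding a small rule knowledge base into an LLM prompt
# that requests a sensing strategy for a target behaviour.
KNOWLEDGE_BASE = {
    "accelerometer": "sample at 20-50 Hz; derive activity counts and stillness periods",
    "screen_events": "log on/off timestamps; derive usage sessions and night-time use",
    "ambient_light": "sample periodically; derive exposure to darkness before sleep",
}


def build_strategy_prompt(target_behaviour: str) -> str:
    """Compose a prompt listing available sensors and feature-construction rules."""
    rules = "\n".join(f"- {sensor}: {rule}" for sensor, rule in KNOWLEDGE_BASE.items())
    return (
        f"Target behaviour to model: {target_behaviour}\n"
        "Available sensors and feature-construction rules:\n"
        f"{rules}\n"
        "Select the relevant sensors and list concrete features at daily and weekly granularity."
    )


print(build_strategy_prompt("sleep quality"))
```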
AngleSizer: Enhancing Spatial Scale Perception for the Visually Impaired with an Interactive Smartphone Assistant
(IMWUT'24) Xiaoqing Jing, Chun Yu, Kun Yue, Liangyou Lu, Nan Gao, Weinan Shi, Mingshan Zhang, Ruolin Wang, and Yuanchun Shi.
Abstract
Spatial perception, particularly at small and medium scales, is an essential human sense but poses a significant challenge for the blind and visually impaired (BVI). Traditional learning methods for BVI individuals are often constrained by the limited availability of suitable learning environments and high associated costs. To tackle these barriers, we conducted comprehensive studies to delve into the real-world challenges faced by the BVI community. We identified several key factors hindering their spatial perception, including the high social cost of seeking assistance, inefficient methods of information intake, cognitive and behavioral disconnects, and a lack of opportunities for hands-on exploration. As a result, we developed AngleSizer, an innovative teaching assistant that leverages smartphone technology. AngleSizer is designed to enable BVI individuals to use natural interaction gestures to try, feel, understand, and learn about sizes and angles effectively. This tool incorporates dual vibration-audio feedback, carefully crafted teaching processes, and specialized learning modules to enhance the learning experience. Extensive user experiments validated its efficacy and applicability for users with diverse abilities and visual conditions. Ultimately, our research not only expands the understanding of BVI behavioral patterns but also greatly improves their spatial perception capabilities, in a way that is both cost-effective and allows for independent learning.
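As an illustration of how dual vibration-audio feedback might be driven, the sketch below maps the gap between the angle a user is currently indicating and a target angle to vibration duration, tone pitch, and a spoken hint. The mapping values and thresholds are hypothetical and do not reproduce AngleSizer's actual feedback design.

```python
def feedback_for_angle(current_deg: float, target_deg: float, tolerance_deg: float = 3.0) -> dict:
    """Map the gap between the indicated angle and the target angle to vibration/audio cues."""
    error = current_deg - target_deg
    if abs(error) <= tolerance_deg:
        # Within tolerance: long confirmation pulse, high tone, spoken confirmation.
        return {"vibration_ms": 400, "tone_hz": 880, "speech": "Correct angle reached"}
    # Larger errors produce shorter pulses and lower tones, nudging the user to keep adjusting.
    closeness = max(0.0, 1.0 - abs(error) / 90.0)
    return {
        "vibration_ms": int(50 + 150 * closeness),
        "tone_hz": int(220 + 440 * closeness),
        "speech": "Open wider" if error < 0 else "Close slightly",
    }


print(feedback_for_angle(current_deg=72.0, target_deg=90.0))
```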
G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios
(IMWUT '24) Zeyu Wang, Yuanchun Shi, Yuntao Wang, Yuchen Yao, Kun Yan, Yuhan Wang, Lei Ji, Xuhai Xu, and Chun Yu
Abstract
Modern information querying systems are progressively incorporating multimodal inputs like vision and audio. However, the integration of gaze, a modality deeply linked to user intent and increasingly accessible via gaze-tracking wearables, remains underexplored. This paper introduces a novel gaze-facilitated information querying paradigm, named G-VOILA, which synergizes users' gaze, visual field, and voice-based natural language queries to facilitate a more intuitive querying process. In a user-enactment study involving 21 participants in 3 daily scenarios (p=21, scene=3), we revealed the ambiguity in users' query language and a gaze-voice coordination pattern in users' natural query behaviors with G-VOILA. Based on the quantitative and qualitative findings, we developed a design framework for the G-VOILA paradigm, which effectively integrates the gaze data with the in-situ querying context. Then we implemented a G-VOILA proof-of-concept using cutting-edge deep learning techniques. A follow-up user study (p=16, scene=2) demonstrates its effectiveness by achieving both higher objective and subjective scores, compared to a baseline without gaze data. We further conducted interviews and provide insights for future gaze-facilitated information querying systems.
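One way to integrate gaze with an in-situ query, in the spirit of the framework above, is to crop the region around the gaze point from the egocentric camera frame and pair it with the spoken question for a downstream vision-language model. The sketch below shows only that pairing step; the function names and patch size are assumptions, not G-VOILA's implementation.

```python
import numpy as np


def crop_gaze_region(frame: np.ndarray, gaze_xy: tuple[int, int], half_size: int = 112) -> np.ndarray:
    """Crop a square patch around the gaze point, clamped to the frame bounds."""
    h, w = frame.shape[:2]
    x, y = gaze_xy
    x0, x1 = max(0, x - half_size), min(w, x + half_size)
    y0, y1 = max(0, y - half_size), min(h, y + half_size)
    return frame[y0:y1, x0:x1]


def build_query(voice_text: str, gaze_patch: np.ndarray) -> dict:
    """Package the spoken query with the gazed region for a downstream vision-language model."""
    return {"question": voice_text, "image_patch_shape": gaze_patch.shape}


frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in egocentric camera frame
patch = crop_gaze_region(frame, gaze_xy=(900, 300))
print(build_query("What plant is this?", patch))
```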
The EarSAVAS Dataset: Enabling Subject-Aware Vocal Activity Sensing on Earables
(IMWUT '24) Xiyuxing Zhang, Yuntao Wang, Yuxuan Han, Chen Liang, Ishan Chatterjee, Jiankai Tang, Xin Yi, Shwetak Patel, and Yuanchun Shi
Abstract
Subject-aware vocal activity sensing on wearables, which specifically recognizes and monitors the wearer's distinct vocal activities, is essential in advancing personal health monitoring and enabling context-aware applications. While recent advancements in earables present new opportunities, the absence of relevant datasets and effective methods remains a significant challenge. In this paper, we introduce EarSAVAS, the first publicly available dataset constructed specifically for subject-aware human vocal activity sensing on earables. EarSAVAS encompasses eight distinct vocal activities from both the earphone wearer and bystanders, including synchronous two-channel audio and motion data collected from 42 participants totaling 44.5 hours. Further, we propose EarVAS, a lightweight multi-modal deep learning architecture that enables efficient subject-aware vocal activity recognition on earables. To validate the reliability of EarSAVAS and the efficiency of EarVAS, we implemented two advanced benchmark models. Evaluation results on EarSAVAS reveal EarVAS's effectiveness with an accuracy of 90.84% and a Macro-AUC of 89.03%. Comprehensive ablation experiments were conducted on benchmark models and demonstrated the effectiveness of feedback microphone audio and highlighted the potential value of sensor fusion in subject-aware vocal activity sensing on earables. We hope that the proposed EarSAVAS and benchmark models can inspire other researchers to further explore efficient subject-aware human vocal activity sensing on earables.
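For readers who want a feel for what a lightweight multi-modal model on earable data looks like, the sketch below fuses two-channel audio spectrograms with accelerometer/gyroscope windows via two small convolutional branches and a linear classifier. It is a generic late-fusion example in PyTorch, not the EarVAS architecture; all input shapes, layer sizes, and the class count are assumptions.

```python
import torch
import torch.nn as nn


class TwoBranchVocalActivityNet(nn.Module):
    """Generic late-fusion sketch for earable data: two-channel audio + IMU motion."""

    def __init__(self, num_classes: int = 8):  # e.g. one class per vocal activity (assumed)
        super().__init__()
        # Audio branch: (B, 2, 64, 101) log-mel spectrograms from the two microphones.
        self.audio_branch = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 32)
        )
        # Motion branch: (B, 6, 100) windows of 3-axis accelerometer + 3-axis gyroscope.
        self.motion_branch = nn.Sequential(
            nn.Conv1d(6, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),  # -> (B, 16)
        )
        self.classifier = nn.Linear(32 + 16, num_classes)

    def forward(self, audio: torch.Tensor, motion: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.audio_branch(audio), self.motion_branch(motion)], dim=1)
        return self.classifier(fused)


model = TwoBranchVocalActivityNet()
logits = model(torch.randn(4, 2, 64, 101), torch.randn(4, 6, 100))
print(logits.shape)  # torch.Size([4, 8])
```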
Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention
(CHI ’24) Adiba Orzikulova, Han Xiao, Zhipeng Li, Yukang Yan, Yuntao Wang, Yuanchun Shi, Marzyeh Ghassemi, Sung-Ju Lee, Anind K Dey, Xuhai Xu
Abstract
Despite a rich history of investigating smartphone overuse intervention techniques, AI-based just-in-time adaptive intervention (JITAI) methods for overuse reduction are lacking. We develop Time2Stop, an intelligent, adaptive, and explainable JITAI system that leverages machine learning to identify optimal intervention timings, introduces interventions with transparent AI explanations, and collects user feedback to establish a human-AI loop and adapt the intervention model over time. We conducted an 8-week field experiment (N=71) to evaluate the effectiveness of both the adaptation and explanation aspects of Time2Stop. Our results indicate that our adaptive models significantly outperform the baseline methods on intervention accuracy (>32.8% relatively) and receptivity (>8.0%). In addition, incorporating explanations further enhances the effectiveness by 53.8% and 11.4% on accuracy and receptivity, respectively. Moreover, Time2Stop significantly reduces overuse, decreasing app visit frequency by 7.0∼8.9%. Our subjective data also echoed these quantitative measures. Participants preferred the adaptive interventions and rated the system highly on intervention time accuracy, effectiveness, and level of trust. We envision our work can inspire future research on JITAI systems with a human-AI loop to evolve with users.
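The human-AI loop can be pictured as an online classifier that scores each moment of app use for intervention receptivity and is updated with the user's reaction to each delivered intervention. The sketch below uses scikit-learn's SGDClassifier with partial_fit to illustrate that loop; the feature set, threshold, and warm-start data are assumptions, not Time2Stop's model.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Warm-start an online logistic model on synthetic data standing in for historical sessions.
rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss", random_state=0)
model.partial_fit(rng.normal(size=(32, 4)), rng.integers(0, 2, size=32), classes=[0, 1])


def maybe_intervene(features: np.ndarray, threshold: float = 0.6) -> bool:
    """Score the current moment; features could encode session length, time of day, etc. (assumed)."""
    p_receptive = model.predict_proba(features.reshape(1, -1))[0, 1]
    return p_receptive >= threshold


def record_feedback(features: np.ndarray, was_helpful: bool) -> None:
    """Close the loop: update the timing model with the user's reaction to the intervention."""
    model.partial_fit(features.reshape(1, -1), [int(was_helpful)])


moment = rng.normal(size=4)
if maybe_intervene(moment):
    record_feedback(moment, was_helpful=True)
```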
CardboardHRV: Bridging Virtual Reality and Biofeedback with a Cost-Effective Heart Rate Variability System
(CHI EA '24) Fengzhen Cui, Yuntao Wang*, Shenshen Lei, Yuanchun Shi
Abstract
We introduce CardboardHRV, an affordable and effective heart rate variability (HRV) biofeedback system leveraging Cardboard VR. To provide easy access to HRV biofeedback without sacrificing therapeutic value, we adapted the Google Cardboard VR headset with an optical fiber modification. This enables the camera of the inserted phone to capture the photoplethysmography (PPG) signal from the user’s lateral forehead, enabling CardboardHRV to accurately calculate heart rate variability as a basis for biofeedback. Furthermore, we’ve integrated an engaging biofeedback game to assist users throughout their sessions, enhancing user engagement and the overall experience. In a preliminary user evaluation, CardboardHRV demonstrated comparable therapeutic outcomes to traditional HRV biofeedback systems that require an additional electrocardiogram (ECG) device, proving itself as a more cost-effective and immersive alternative.
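Once beat timestamps have been extracted from the camera PPG signal, HRV itself is simple arithmetic over the inter-beat intervals. The sketch below computes two standard measures, SDNN and RMSSD; it illustrates the general calculation, not CardboardHRV's specific signal-processing pipeline.

```python
import numpy as np


def hrv_metrics(peak_times_s: np.ndarray) -> dict:
    """Compute standard HRV measures from beat timestamps (in seconds).

    SDNN is the standard deviation of inter-beat intervals; RMSSD is the root mean
    square of their successive differences. Both are reported in milliseconds.
    """
    ibi_ms = np.diff(peak_times_s) * 1000.0  # inter-beat intervals in ms
    sdnn = float(np.std(ibi_ms, ddof=1))
    rmssd = float(np.sqrt(np.mean(np.diff(ibi_ms) ** 2)))
    return {"mean_hr_bpm": float(60000.0 / ibi_ms.mean()), "sdnn_ms": sdnn, "rmssd_ms": rmssd}


# Example: beats detected roughly every 0.8-0.9 s in the camera PPG signal.
peaks = np.cumsum([0.0, 0.82, 0.86, 0.80, 0.88, 0.84, 0.81])
print(hrv_metrics(peaks))
```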
PepperPose: Full-Body Pose Estimation with a Companion Robot
(CHI '24) Chongyang Wang, Siqi Zheng, Lingxiao Zhong, Chun Yu, Chen Liang, Yuntao Wang, Yuan Gao*, Tin Lun Lam, Yuanchun Shi
Abstract
Accurate full-body pose estimation across diverse actions in a user-friendly and location-agnostic manner paves the way for interactive applications in realms like sports, fitness, and healthcare. This task becomes challenging in real-world scenarios due to factors like the user’s dynamic positioning, the diversity of actions, and the varying acceptability of the pose-capturing system. In this context, we present PepperPose, a novel companion robot system tailored for optimized pose estimation. Unlike traditional methods, PepperPose actively tracks the user and refines its viewpoint, facilitating enhanced pose accuracy across different locations and actions. This allows users to enjoy a seamless action-sensing experience. Our evaluation, involving 30 participants undertaking daily functioning and exercise actions in a home-like space, underscores the robot’s promising capabilities. Moreover, we demonstrate the opportunities that PepperPose presents for human-robot interaction, its current limitations, and future developments.
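The active-viewpoint idea can be sketched as a simple control loop: after each pose estimate, the robot re-orients if the person drifts off-centre and steps closer if keypoint confidence drops. The example below is a hypothetical planner with made-up thresholds and interfaces, not PepperPose's controller.

```python
from dataclasses import dataclass


@dataclass
class PoseEstimate:
    center_x: float          # person centre in normalised image coordinates, 0..1
    mean_confidence: float   # mean keypoint confidence, 0..1


def plan_motion(est: PoseEstimate) -> dict:
    """Decide how the robot should move before the next pose estimate (illustrative rules)."""
    command = {"rotate_deg": 0.0, "advance_m": 0.0}
    if abs(est.center_x - 0.5) > 0.15:        # user drifting out of frame: rotate toward them
        command["rotate_deg"] = 20.0 if est.center_x > 0.5 else -20.0
    if est.mean_confidence < 0.6:             # occlusion or a poor angle: step closer
        command["advance_m"] = 0.3
    return command


print(plan_motion(PoseEstimate(center_x=0.78, mean_confidence=0.55)))
# {'rotate_deg': 20.0, 'advance_m': 0.3}
```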