Tuesday, June 10, 2025, 12:00 PM
About this Event
As AI systems are increasingly deployed in assistive technologies and autonomous environments, it is important for them to recognize when perceptual input is insufficient and to support users in acquiring missing information. This research investigates self-aware perception by focusing on two core capabilities: recognizing when input is incomplete or ambiguous, and identifying possible ways to obtain the missing information.
First, we explore assistive visual interfaces for blind and low-vision (BLV) users. We introduce the Directional Guidance task, which enables Vision-Language Models (VLMs) to detect when a question about an image cannot be answered due to framing issues and to suggest spatial camera adjustments. To address the lack of labeled training data, we design an automated perturbation-based data augmentation pipeline. Empirical results show that fine-tuned models outperform zero-shot baselines on a carefully constructed benchmark.
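To make the perturbation idea concrete, the sketch below shows one plausible augmentation step; the function name, the fixed rightward crop, and the guidance labels are illustrative assumptions rather than the pipeline presented in the talk. An image is cropped so that a known answer region is pushed out of frame, and the sample is labeled with the camera adjustment that would recover it.

# Illustrative sketch (assumed details, not the talk's actual pipeline):
# crop an image so a known answer region is cut off, then record the
# corrective camera guidance as the training label.
from PIL import Image

def perturb_and_label(image: Image.Image, answer_box, shift_frac: float = 0.4):
    """answer_box = (x0, y0, x1, y1): pixel region needed to answer the question."""
    w, h = image.size
    dx = int(w * shift_frac)
    cropped = image.crop((dx, 0, w, h))       # keep only the right portion of the frame
    x0, _, x1, _ = answer_box
    if x1 <= dx:                               # answer region entirely lost on the left
        guidance = "move the camera to the left"
    elif x0 < dx:                              # answer region partially lost
        guidance = "move the camera slightly to the left"
    else:
        guidance = "the question is answerable; no adjustment needed"
    return cropped, guidance

In a full pipeline, the same idea would presumably be applied in all four directions and at varying crop sizes to produce a balanced set of guidance labels.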
Second, we study structured representations for traffic scenes in autonomous driving. Using NuScenes data, we develop a neuro-symbolic pipeline based on Frame Theory to convert sensor data into interpretable summaries of agent motion and scene dynamics. These representations are designed to support introspection and may serve as inputs to symbolic reasoning in future work.
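As a rough illustration of what such an interpretable per-agent summary could look like, the sketch below pairs extracted positions with a derived natural-language description; the class name, fields, and thresholds are assumptions for illustration, not the representation described in the talk.

# A minimal, hypothetical frame-style summary of agent motion, assuming
# positions have already been extracted from nuScenes annotations.
from dataclasses import dataclass
from typing import List, Tuple
import math

@dataclass
class MotionFrame:
    agent_id: str
    category: str
    positions: List[Tuple[float, float]]   # (x, y) per keyframe, in meters
    dt: float = 0.5                         # nuScenes annotated keyframes are at 2 Hz

    def speed(self) -> float:
        """Average speed across the observed keyframes."""
        if len(self.positions) < 2:
            return 0.0
        dist = sum(math.dist(a, b) for a, b in zip(self.positions, self.positions[1:]))
        return dist / (self.dt * (len(self.positions) - 1))

    def summary(self) -> str:
        state = "stationary" if self.speed() < 0.5 else f"moving at about {self.speed():.1f} m/s"
        return f"{self.category} {self.agent_id} is {state}"

# Example: a car observed at three consecutive keyframes.
print(MotionFrame("veh_01", "car", [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0)]).summary())

Summaries of this kind can be read directly by a person or passed downstream as symbolic facts, which is what makes them candidates for the introspection and reasoning use described above.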
Together, these studies aim to contribute to the development of intelligent systems that better handle uncertainty and support interaction by recognizing the limits of what they currently perceive.
Event Host: Li Liu, Ph.D. Student, Computer Science and Engineering
Advisor: Leilani Gilpin
Zoom link: https://ucsc.zoom.us/j/96914407455?pwd=lcUjUaakFoIJClw5Uyba3wfqCWU8Id.1
Zoom Passcode: 780187