SignBridge Duo — Case Study
Accessibility · AR Wearables · 2025

SignBridge Duo:
Bridging the
Auditory Gap

SignBridge Duo AR Glasses

An intelligent AR captioning system for DHH (Deaf and hard-of-hearing) and neurodiverse users — real-time speech-to-text with context-aware modes and Important Information Detection.

Accessibility · AR Wearables · HCI Research · Inclusive Design
Type: Course Project
Role: Product Designer
Timeline: Sep – Dec 2025
School: Cornell Tech
Team: Mingyuan Pang, Jiawen Chen, Yuhan Zhang
01 / Overview

Restoring autonomy to the in-between user

DHH individuals who aren't fluent in sign language live between two worlds — excluded from the Deaf community due to language barriers, isolated from the hearing world due to communication fatigue. Existing tools don't solve this. They either route through human interpreters, removing agency, or offer static, one-size-fits-all captions that fail neurodiverse users entirely.

SignBridge Duo replaces human intermediaries with a context-aware AR system that adapts to the user's environment automatically.

4 · Interaction Modes
IID · Memory Retention Layer
3 mo. · Sep – Dec 2025
02 / My Role

What I owned

Design was split evenly across the team — we each owned a full iteration of the prototype, and co-directed the final video. My specific ownership was the environmental mode and surrounding-context design — how the system reads and responds to ambient space — plus one complete design iteration from scratch.

August Wang
Iteration 1
Full prototype iteration, lead designer
Environmental
Spatial context UI — PA mode, ambient alerts, directional cues
Consultation
Sourced & ran Jazmin Cano sessions
Video
Scenario script, shoot storyboard, co-director
ML research
Robustness study → IID fallback constraints
Team
Mingyuan Pang
Iteration 2 · competitor analysis · video co-director
Jiawen Chen
Iteration 3 · user scenarios · video co-editor
Yuhan Zhang
Technical feasibility · IID logic · video co-editor
All four
Storyboard ideation · design system consolidation
03 / Challenge & Research

Finding the user
no one designed for

We started with Jessica Kellgren-Fozard, a late-deafened YouTuber who grew up hearing. She never acquired sign language as a first language — and even if she had, it wouldn't fully solve the problem. BSL, ASL, Auslan, and other sign systems aren't mutually intelligible. They're distinct linguistic cultures. A BSL user and an ASL user can't simply communicate, which means even within the DHH community, there's no universal fallback.

For someone who loses hearing later in life, neither world fits: not the hearing world, which assumes you can still follow along, and not the Deaf community, whose language and identity you never shared. We called this the "in-between" user — and no existing tool was built for this space.

🔍
Competitor Analysis
We analyzed Lingvano, SignVideo, and Captify Pro — and found the same two failures across every one of them.
🔗
Dependency on Intermediaries
All three tools route through third-party human interpreters, removing autonomy rather than restoring it.
🧠
Rigid, Context-Blind Design
Static one-way captions with zero environmental adaptation. No cognitive load support, no priority filtering.
⚠️
The "In-Between" Gap
Late-deafened users don't share a first language with the Deaf community, and sign languages like ASL and BSL aren't mutually intelligible. Every existing tool assumes one world or the other.
04 / Design Process

Three iterations. One scenario.

With the user gap defined, we built a storyboard around Alex — a DHH professional navigating a chaotic airport. High background noise makes lip-reading impossible. He misses a PA announcement about a gate change and only realizes when the crowd moves. This scenario sharpened our design brief: the system needed to adapt to the environment, not ask the user to compensate for it.

Storyboard — Alex at the airport

Storyboard: Alex's invisible barrier — the moment that defined our design direction

From that storyboard we ran three prototype iterations, each led by a different team member from scratch. We explored different approaches to information hierarchy, caption placement, and environmental awareness before converging on a shared system.

🗣
1-on-1 Mode
Focused single-speaker dialogue. Captions appear close and stable, minimizing visual noise.
👥
Group Mode
Multi-speaker tracking with positional cues and speaker labels for social settings.
🏫
Presentation Mode
Stabilized captioning with sentence-level buffering for academic lectures.
📢
Environmental Mode
PA announcements and ambient alerts — synthetic spatial cues with directional arrows. This was the area I owned.

Each iteration pushed a different design question. Mine focused on the environmental and surrounding-context layer — how the glasses read ambient space and surface non-speech signals spatially. By iteration 3 we had enough signal to consolidate into a single horizontal structure.

SignBridge Duo — Design Iterations
3 iterations · pre-Jazmin

Three prototype iterations, each led by a different team member

SignBridge Duo — Horizontal Structure
Pre-Jazmin consolidated system

Consolidated system — all four modes side by side, before Jazmin consultation

05 / Expert Validation

From good design
to inclusive design

We brought our consolidated system to Jazmin Cano — Senior UX Research Specialist, Accessibility at Owlchemy Labs (now Google). Her feedback was specific: the design handled DHH users well, but it was missing an entire population — neurodiverse users whose DHH experience overlaps with ADHD and APD (auditory processing disorder). That one observation reshaped what the product was.

01
Expand scope: neurodiverse users

Jazmin pointed out that DHH needs frequently co-occur with ADHD and APD. This is where IID was born — a memory retention layer to catch critical verbal info before it fades, addressing the cognitive load overlap directly.

02
Safety redesign: remove seizure risks

Our original alert used a red flash, a known photosensitive-seizure trigger. We replaced it with shape-based cues — triangle warnings, border pulses — urgent without the photosensitivity risk.

03
Accessibility controls as core UX, not settings

Contrast and text size moved to the primary UI layer. If a user needs a control, it shouldn't be three taps deep.

"An impressive example of inclusive design that prioritizes user autonomy. SignBridge Duo bridges the critical gap between raw information and meaningful, accessible communication."

Jazmin Cano — Senior UX Research Specialist, Accessibility · Owlchemy Labs (now Google)
06 / Final Product

Finalized screens. Then made it move.

With the post-consultation design system locked, we built out the finalized static screens — every mode, the IID layer, the safety redesign. Then we took it further: a real-world scenario video that put Alex's story in motion, co-directed and co-produced by the whole team.

SignBridge Duo — Finalized Screens

Finalized design system — post-consultation · all four modes and IID layer

SignBridge Duo final design system overview

Full system overview — all modes, IID layer, and safety redesign

To demonstrate the system in a believable real-world context, we produced a scenario-based mini mockup video — scripted around Alex's airport journey, filmed and composited to show the AR overlay in a synthetic environment. The team co-directed and co-edited; I wrote the scenario script and storyboarded the shoot.

Through SignBridge Duo, I learned that accessibility is not a feature checklist — it's a fundamental architectural framework. By designing for the intersectional needs of DHH and neurodiverse users rather than treating them as a monolith, we built a system that adapts to the human, not the other way around.

The expert consultation was the most formative moment of the project. It taught me that the best design decisions come from admitting what you don't know — and that designing for the margins almost always improves the experience for everyone.

Innovation serves no purpose unless it restores autonomy to those who need it most.

07 / Research & Engineering

The feature we cut —
and what I did after

We scoped out gesture translation early — the model accuracy needed to reliably translate between sign systems in real time wasn't achievable within this project. But the question didn't go away. After SignBridge wrapped, I ran an applied ML study specifically to understand why gesture recognition is so fragile under real-world conditions, and what that means for any system that tries to use it.

The Study
Distribution-Shift Robustness Across Visual Modalities

When we cut gesture translation from SignBridge's scope, I wanted to understand why it was so hard — not just accept it as a technical limit. So I ran a separate study: how do gesture-recognition models actually fail under real-world conditions like occlusion, lighting changes, and rotation?

With co-researcher Mingyuan Pang, I tested classical ML models and a CNN across three gesture and expression datasets under four corruption types at three severity levels.

I led the video gesture experiments: preprocessing, CNN implementation in PyTorch, robustness evaluation, and the Results and Discussion sections. The findings answered my original question — and changed how I designed the IID fallback system.
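For reference, here is a minimal sketch of the kind of 3-layer CNN classifier and Adam training setup described above. The input resolution, class count, and augmentation choices are illustrative assumptions, not the study's exact configuration.

import torch
import torch.nn as nn
from torchvision import transforms

# Illustrative training-time augmentation (not the exact recipe used in the study)
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
])

class GestureCNN(nn.Module):
    """Small 3-conv-layer classifier; assumes 64x64 RGB frames and 10 classes."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 16 -> 8
        )
        self.classifier = nn.Linear(128 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))

model = GestureCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step per batch (dataloader omitted), monitoring loss for convergence:
# for frames, labels in train_loader:
#     optimizer.zero_grad()
#     loss = criterion(model(frames), labels)
#     loss.backward()
#     optimizer.step()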

🧹
Data Cleaning
Identified and replaced a near-duplicate dataset mid-project. Extracted video frames, normalized inputs, applied PCA (sketched below).
🧠
Model Implementation
Built a 3-layer CNN in PyTorch with Adam optimizer, data augmentation, and convergence monitoring.
📊
Robustness Evaluation
Systematic testing across 4 corruption types × 3 severity levels. Wrote the Results and Discussion sections.
🔗
Cross-functional Translation
Took ML findings and translated them into concrete design decisions — not just a research report.
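A rough sketch of the frame-extraction and PCA preprocessing referenced in the Data Cleaning card above; the sampling rate, frame size, component count, and file path are placeholders chosen for illustration.

import cv2
import numpy as np
from sklearn.decomposition import PCA

def extract_frames(video_path: str, every_n: int = 5, size=(64, 64)) -> np.ndarray:
    """Sample every Nth frame, convert to grayscale, resize, and scale to [0, 1]."""
    cap = cv2.VideoCapture(video_path)
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            frames.append(cv2.resize(gray, size).astype(np.float32) / 255.0)
        i += 1
    cap.release()
    return np.stack(frames)

# Flatten the normalized frames and reduce dimensionality before the classical models
frames = extract_frames("example_clip.mp4")   # placeholder path
X = frames.reshape(len(frames), -1)
X_reduced = PCA(n_components=50).fit_transform(X)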
Model accuracy under worst-case corruption (Severity 3)
CNN = video gesture model · KNN = static hand gesture
Corruption Type · CNN (video) · KNN (static) · Verdict
Blur · 90.9% · 96.0% · Resilient
Brightness · 63.6% · 30.0% · Fragile
Occlusion · 39.1% · 17.3% · Critical failure
Rotation · 26.2% · 45.0% · Fragile
Key finding: the static gesture model's 99.67% clean accuracy collapsed to 17.3% under occlusion at max severity. High benchmark performance does not predict real-world robustness.
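To make a grid like the one above concrete, here is a minimal sketch of how the corruption sweep can be driven in PyTorch, using torchvision's functional transforms as stand-ins for the corruption implementations. The severity parameters are illustrative, not the study's exact levels.

import torch
import torchvision.transforms.functional as TF

# Severity 1-3 parameters for each corruption type (illustrative values)
SEVERITY = {
    "blur":       [3, 7, 11],        # Gaussian blur kernel size
    "brightness": [1.5, 2.0, 3.0],   # brightness multiplier
    "rotation":   [15, 45, 90],      # degrees
    "occlusion":  [0.1, 0.25, 0.4],  # fraction of the frame blanked out
}

def corrupt(batch: torch.Tensor, kind: str, severity: int) -> torch.Tensor:
    """Apply one corruption type at severity 1-3 to a batch of images (N, C, H, W)."""
    level = SEVERITY[kind][severity - 1]
    if kind == "blur":
        return TF.gaussian_blur(batch, kernel_size=level)
    if kind == "brightness":
        return TF.adjust_brightness(batch, level)
    if kind == "rotation":
        return TF.rotate(batch, angle=float(level))
    if kind == "occlusion":
        out = batch.clone()
        _, _, h, w = out.shape
        ph, pw = int(h * level), int(w * level)
        out[..., (h - ph) // 2:(h + ph) // 2, (w - pw) // 2:(w + pw) // 2] = 0.0
        return out
    raise ValueError(f"unknown corruption: {kind}")

@torch.no_grad()
def accuracy_under(model, loader, kind: str, severity: int) -> float:
    """Top-1 accuracy of a trained model on a corrupted copy of the test set."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(corrupt(images, kind, severity)).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total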

These weren't just interesting data points — they became design requirements.

ML Finding
Occlusion is catastrophic for both CNN and classical models. A partially blocked hand or face collapses recognition accuracy to near-random.
Design Response
IID uses a conservative detection threshold. Better to surface a false positive than miss a safety-critical signal. Added a "low confidence" fallback visual state (sketched below).
ML Finding
Rotation destroys all models — the CNN drops from 91.8% clean to 26.2% at max rotation. AR glasses see content at any angle.
Design Response
Peripheral AR placement never occludes the user's primary task view. If the model fails on a rotated input, the user retains their own visual context to compensate.
ML Finding
CNNs handle blur well but are fragile to brightness shifts — a lighting change alone drops accuracy to 63.6%.
Design Response
We elevated contrast controls to the primary UI layer (not buried in settings) so users can manually adjust for their environment when the model degrades.
ML Finding
A lightweight CNN outperforms classical models on video data, but no model is universally robust. Architecture alone doesn't solve the problem.
Design Response
Chose a lightweight classifier for IID over an LLM specifically because latency in real-time captioning is itself a failure mode. Speed over accuracy in the detection layer.
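To illustrate the conservative-threshold idea from the responses above, here is a hypothetical sketch of the IID decision logic. The thresholds, state names, and importance-score interface are assumptions made for this example, not SignBridge's actual implementation.

from dataclasses import dataclass
from enum import Enum

class IIDState(Enum):
    NONE = "none"                      # nothing surfaced
    LOW_CONFIDENCE = "low_confidence"  # muted visual treatment, the fallback state
    ALERT = "alert"                    # full shape-based alert (triangle, border pulse)

@dataclass
class IIDDecision:
    state: IIDState
    text: str

# Deliberately conservative thresholds: better to surface a false positive than
# miss a safety-critical announcement. Values are illustrative assumptions.
LOW_THRESHOLD = 0.35
HIGH_THRESHOLD = 0.65

def classify_utterance(text: str, importance_score: float) -> IIDDecision:
    """Map a lightweight classifier's importance score to a caption state."""
    if importance_score >= HIGH_THRESHOLD:
        return IIDDecision(IIDState.ALERT, text)
    if importance_score >= LOW_THRESHOLD:
        return IIDDecision(IIDState.LOW_CONFIDENCE, text)
    return IIDDecision(IIDState.NONE, text)

# A degraded input that scores 0.4 is still surfaced, just in the low-confidence state.
print(classify_utterance("Gate change: flight 214 now boards at B12", 0.4).state)

Keeping the rule this simple is deliberate: a fixed-threshold decision on top of a lightweight classifier keeps detection latency low, which the team treated as a hard requirement for real-time captioning.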
🤝
On working with engineers
This research changed how I think about design-engineering collaboration. I didn't bring these findings to the team as "here's why the model isn't good enough." I brought them as constraints: given these failure modes, here's what the design needs to handle on its own. Designing around known technical limits — rather than wishing them away — is how things actually ship.