SignBridge
Sign-language platform combining personalized learning, recognition, searchable dictionary flows, 3D avatar contribution, and community interaction.
Project Overview
Building a sign-language platform that goes past recognition demos into guided learning, moderated contributions, and production ML delivery.
SignBridge started as a final-year project, but the implemented scope grew well beyond a single recognition flow. The product covers 11 core use cases: account management, dashboards, real-time recognition, searchable gesture dictionaries, community contributions, 3D avatar generation, learning materials, proficiency testing, progress tracking, forum/chat, and admin moderation. On the architecture side, the web app is built on Next.js and Supabase, while Malaysian Sign Language (MSL) inference is isolated in a Dockerized FastAPI service deployed to AWS EC2 through GitHub Actions.
Key Highlights
- Primary users: Deaf and hearing learners, contributors, and admins
- Core flows: 11 use cases across learning, recognition, dictionary, avatars, and community
- MSL model: 44-class combined classifier with PSO hyperparameter tuning
- Deployment: Next.js + Supabase + Dockerized FastAPI on AWS EC2
The Challenges
- Most sign-language tools solve one narrow job: a recognition demo, a static glossary, or a learning page. Learners and contributors still have to jump between disconnected experiences.
- Supporting MSL required more than a notebook model. The classifier needed a deployable API, confidence handling, and a clean boundary between the web product and inference service.
- The platform had to serve very different actors, including deaf users, non-deaf users, contributors, and admins, while keeping moderation, accessibility, and learning progression understandable.
Project Goals
- Unify recognition, dictionary search, personalized learning, contributions, avatar capture, and community interaction into one platform instead of separate prototypes.
- Support both ASL and MSL, with MSL backed by a dedicated model-serving pipeline rather than a local research-only workflow.
- Use proficiency tests, role-aware recommendations, and progress tracking to turn the product into a guided learning system instead of a content dump.
- Keep the ML deployment reproducible with Docker, GitHub Actions, and AWS EC2 so model updates could ship without hand-rebuilding the server each time.
System Architecture
Platform architecture overview
This diagram summarizes the full SignBridge system boundary: browser clients, the Next.js 15 application, Supabase for auth/data/realtime/storage, and the external Malaysian Sign Language inference service deployed on AWS EC2.
Web product and learning surface
The Next.js 15 App Router application carries the public landing pages and authenticated product flows: dashboard, tutorials, quizzes, proficiency tests, gesture search, contribution forms, forum/chat, profile management, and admin screens. Server routes also act as a proxy layer so the browser can call ML services without mixed-content or CORS problems.
Identity, data, and realtime collaboration
Supabase centralizes PostgreSQL data, authentication, storage, realtime chat, and RLS-backed access control. This kept the product cohesive while supporting user roles, learning progress, contributions, moderation state, and uploaded resources without a separate custom backend.
Recognition service boundary
Language-specific recognition requests are routed through a Next.js API endpoint that forwards images to external ML services. For MSL, the service is a FastAPI app that loads a PSO-optimized combined classifier on startup, exposes `/predict-image/` and `/ws/recognize`, and returns label, confidence, and sign type for uploaded frames or real-time streams.
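To make the response shape concrete, here is a minimal Python sketch of how a service might turn class probabilities into the label, confidence, and sign type described above. It is an illustration under stated assumptions, not the project's code: the `SIGN_TYPES` lookup, the `build_prediction` helper, the `"unknown"` fallback, and the 0.6 threshold are all hypothetical.

```python
from dataclasses import dataclass, asdict

# Hypothetical lookup: the source does not list the 44 classes,
# so these three entries are placeholders for illustration only.
SIGN_TYPES = {"A": "alphabet", "7": "number", "hello": "word"}


@dataclass
class Prediction:
    label: str
    confidence: float
    sign_type: str


def build_prediction(probabilities: dict[str, float],
                     min_confidence: float = 0.6) -> dict:
    """Pick the most likely class and flag low-confidence results.

    min_confidence is an assumed threshold, not a value from the project.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence < min_confidence:
        # Report the score but withhold the label rather than guess.
        return asdict(Prediction("unknown", confidence, "unknown"))
    return asdict(Prediction(label, confidence, SIGN_TYPES.get(label, "unknown")))
```

Returning an explicit "unknown" result below the threshold lets the web client decide how to prompt the learner, instead of displaying a weak guess as if it were a confident recognition.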
Model deployment pipeline
The MSL repo packages the combined PSO-tuned classifier service as a Docker image and pushes it to Amazon ECR via GitHub Actions. The workflow then connects to the EC2 host over SSH, pulls the new image, recreates the containers, and runs health checks. That turns the model server into a repeatable deployment instead of a manual lab setup.
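The final health-check step can be sketched as a small retry loop. This is a hedged Python illustration using only the standard library; the function names, retry counts, and the `/health` URL in the usage comment are assumptions, since the source does not say how the project implements its checks.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_healthy(probe: Callable[[], bool],
                       attempts: int = 10,
                       delay_seconds: float = 3.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe up to `attempts` times, pausing between failed tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # give the new container time to load the model
    return False


# Example (hypothetical endpoint; the source does not name a health route):
# ok = wait_until_healthy(lambda: http_probe("http://localhost:8000/health"))
```

Retrying matters here because the service loads its classifier on startup, so a freshly recreated container may need a few seconds before it can answer requests.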
3D avatar contribution pipeline
A browser-side MediaPipe hand-tracking flow captures landmark data, renders it with React Three Fiber and Three.js, and stores recordings as privacy-preserving avatar data instead of raw video. After admin approval, avatars are automatically published into the gesture dictionary as searchable contributions.
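The idea of storing landmarks rather than video can be sketched as a small data structure. The real capture runs in the browser, so this Python version only illustrates the shape of such a recording. The class names, field names, and rounding precision are hypothetical; the one grounded detail is that MediaPipe's hand model reports 21 landmarks per detected hand.

```python
import json
from dataclasses import dataclass, field, asdict

LANDMARKS_PER_HAND = 21  # MediaPipe's hand model reports 21 landmarks per hand


@dataclass
class Frame:
    timestamp_ms: int
    # One list of (x, y, z) points per detected hand.
    hands: list[list[tuple[float, float, float]]]


@dataclass
class AvatarRecording:
    gesture_label: str
    frames: list[Frame] = field(default_factory=list)

    def add_frame(self, timestamp_ms: int,
                  hands: list[list[tuple[float, float, float]]]) -> None:
        """Validate and append one frame of hand landmarks."""
        for hand in hands:
            if len(hand) != LANDMARKS_PER_HAND:
                raise ValueError(
                    f"expected {LANDMARKS_PER_HAND} landmarks, got {len(hand)}")
        # Rounding (an assumed precision) keeps the stored payload small.
        rounded = [[tuple(round(c, 4) for c in point) for point in hand]
                   for hand in hands]
        self.frames.append(Frame(timestamp_ms, rounded))

    def to_json(self) -> str:
        """Serialize the recording compactly; no pixels are ever stored."""
        return json.dumps(asdict(self), separators=(",", ":"))
```

A recording in this shape holds only coordinates and timestamps, which is why it is lighter than video and contains no image of the contributor.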
Tradeoffs & Constraints
Product breadth vs. polish
Supporting 11 use cases created a more credible platform, but it also meant each module had to be scoped carefully to avoid turning the capstone into a shallow collection of demos.
Web app simplicity vs. external ML service
Splitting the system into a web product and an EC2 inference API added moving parts, but it avoided running heavy model inference inside the main frontend deployment.
Privacy-preserving avatars vs. media richness
Landmark recordings are much lighter and safer than storing raw videos, but they also limit how much contextual visual information is available for later review.
Accuracy vs. training cost
PSO tuning improved the MSL classifier, but it pushed training time into multi-hour CPU jobs and increased the operational weight of the model pipeline.
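For readers unfamiliar with particle swarm optimization (PSO), the following is a minimal, generic Python sketch of the technique applied to a toy objective. It is not the project's training code: the swarm size, coefficients, search bounds, and the `toy_validation_loss` stand-in are all assumptions. In real tuning, each objective call would train and validate a model, which is where the multi-hour cost comes from.

```python
import random


def pso_minimize(objective, bounds, particles=12, iterations=60,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize objective(position) inside per-dimension (low, high) bounds."""
    rng = random.Random(seed)
    dims = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(particles)]
    velocities = [[0.0] * dims for _ in range(particles)]
    best_positions = [p[:] for p in positions]       # per-particle bests
    best_scores = [objective(p) for p in positions]
    leader = min(range(particles), key=best_scores.__getitem__)
    global_best, global_score = best_positions[leader][:], best_scores[leader]

    for _ in range(iterations):
        for i in range(particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull toward own best, pull toward swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d]))
                # Clamp so particles stay inside the search space.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def toy_validation_loss(params):
    """Stand-in objective over (log10 learning rate, dropout); purely illustrative."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


best, score = pso_minimize(toy_validation_loss, [(-5.0, -1.0), (0.0, 0.6)])
```

With the defaults above, the loop makes 12 initial evaluations plus 12 per iteration, so the cost of a tuning run scales directly with the swarm size and iteration count.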
Lessons Learned
- Accessibility products become much harder once they serve multiple actors and learning states; role design and moderation boundaries matter as much as UI polish.
- A separate inference boundary is worth the extra ops work when the frontend and ML runtime have very different hosting requirements.
- Storing structured hand landmarks instead of raw media is a practical privacy pattern when building contribution workflows for sensitive user-generated content.
- Recommendation systems feel more trustworthy when they are tied to explicit proficiency tests, role-aware content tagging, and clear reasons for each suggestion.
Outcomes & Next Steps
SignBridge became an end-to-end platform rather than a single-feature capstone: a Next.js 15 application backed by Supabase, plus a separate FastAPI MSL service that serves a PSO-optimized 44-class combined model for alphabets, numbers, and common words. The result is a much stronger case study because it demonstrates product design, real-time browser integration, ML serving, moderation workflows, and cloud deployment in one system.
- Expand the MSL dataset beyond the current 44-class combined model and measure confidence thresholds against real user capture conditions.