AI AgentHR Tech

AI Voice Daily Report System for Foreign Workers

A SaaS that automatically transcribes, translates to Japanese, analyzes sentiment, and assesses risk from voice recordings by foreign workers in their native language, allowing managers to provide feedback in Japanese. Supports 16 languages.

Challenge

Traditional interview-based care had high time constraints and psychological barriers, making adequate care difficult while employee mental health issues were increasing.

Solution

Developed a mental health care app leveraging AI translation and sentiment analysis. Provided an environment where employees can easily seek support through emotion alert features.

Result

Supported early detection of mental health issues, reducing leave-of-absence and turnover rates

Team

1 member, 2 months

PM & full engineering

Role

Responsible for everything from requirements to design, implementation, and operations.

Partnered with the business side, leading PM and engineering from the validation stage.

Tech Stack

FrontendTypeScript / Next.js (App Router)
StateJotai / React Query
UITailwind CSS / Radix UI / shadcn/ui
DatabaseSupabase (PostgreSQL / Auth / Storage / Edge Functions / RLS)
AIOpenAI (Whisper, GPT-4o) / Google (Gemini, NLP API) / Vercel AI SDK
InfrastructureVercel / Terraform / GitHub Actions
AnalyticsPostHog
PaymentStripe / Lemon Squeezy

Key Features

01

6-stage AI pipeline: Voice recording → Whisper transcription → GPT/Gemini translation → Sentiment analysis → Risk assessment → Summary & follow-up question generation

02

16 language support: Japanese, English, Vietnamese, Indonesian, Myanmar, Nepali, Filipino, Thai, etc.

03

Multi-tenant RBAC: Tenant isolation per workspace (RLS), 4-level role management

04

Customizable question templates: Flexible daily report formats adaptable to industry and workplace

05

AI cost optimization: Model tiering (GPT-4o-mini/Gemini Flash → Gemini Pro) and fallback chains for optimal cost performance

06

Monthly partitioning: Partition high-frequency tables to prevent performance degradation at scale

Technical Highlights

Security Architecture Improvement

Changed from public API endpoint-based analysis to internal trigger approach using Supabase Edge Function + pg_cron, eliminating the DDoS attack surface.

AI API Cost Optimization

Used GPT-4o-mini/Gemini Flash (cheap) for translation, Gemini Pro only for complex reasoning. Built fallback chain from Google NLP API → OpenAI for unsupported languages in sentiment analysis.

Multilingual Voice Processing Challenges

Improved accuracy for 16-language transcription using Whisper language parameters and domain-specific prompts. Addressed Safari voice recording compatibility with custom WebAudio API hooks.