Voice Daily Report System for Multinational Workers

A SaaS that turns native-language voice daily reports into Japanese text so managers can review status and give feedback. Supports 16 languages.

Challenge

Traditional interview-based care had high time constraints and psychological barriers, making adequate care difficult while employee mental health issues were increasing.

Solution

Built an app that transcribes and translates voice daily reports recorded in each worker's native language, with alerts and question templates to help managers notice risk signals earlier.

Result

Created an operating flow for managers to review employee changes before interviews

Team

1 member, 2 months

PM & full engineering

Role

Handled requirements, screen design, voice processing, permissions, and operations design.

Defined the MVP scope around field daily-report formats and manager review workflows.

Tech Stack

FrontendTypeScript / Next.js (App Router)
StateJotai / React Query
UITailwind CSS / Radix UI / shadcn/ui
DatabaseSupabase (PostgreSQL / Auth / Storage / Edge Functions / RLS)
AIOpenAI (Whisper, GPT-4o) / Google (Gemini, NLP API) / Vercel AI SDK
InfrastructureVercel / GitHub Actions

Key Features

01

Processing flow: voice recording → transcription → translation → sentiment/risk check → summary → follow-up question drafting

02

16 language support: Japanese, English, Vietnamese, Indonesian, Myanmar, Nepali, Filipino, Thai, etc.

03

Multi-tenant RBAC: Tenant isolation per workspace (RLS), 4-level role management

04

Customizable question templates: Flexible daily report formats adaptable to industry and workplace

05

API cost control through model selection by task and fallback chains

06

Monthly partitioning: Partition high-frequency tables to prevent performance degradation at scale

Technical Highlights

Security Architecture Improvement

Changed from public API endpoint-based analysis to internal trigger approach using Supabase Edge Function + pg_cron, eliminating the DDoS attack surface.

API Cost Control

Used GPT-4o-mini/Gemini Flash (cheap) for translation, Gemini Pro only for complex reasoning. Built fallback chain from Google NLP API → OpenAI for unsupported languages in sentiment analysis.

Multilingual Voice Processing Challenges

Improved accuracy for 16-language transcription using Whisper language parameters and domain-specific prompts. Addressed Safari voice recording compatibility with custom WebAudio API hooks.