Job Overview
Job Type
Full-time
Japanese Level
None Required
Category
Tech & Engineering
Description
About the role
At Starley, we develop and operate "Cotomo," one of Japan's largest voice-based AI conversation applications. Under the concept "Whether you want to talk or remain silent," we continuously explore innovative ways to enhance human-AI interaction. This position requires not only backend expertise but also a strong commitment to overall product improvement, including UX/UI enhancements and feature planning.

Tech Stack
Python, Rust, TypeScript, WebSocket, WebRTC, Elasticsearch, PostgreSQL, GCP, Azure, AWS, Unity, Weights & Biases, NVIDIA Triton, vLLM, PyTorch, Transformers, DeepSpeed, Dataform, BigQuery, Sentry, Slack, GitHub

Responsibilities
Design and implement efficient, highly available infrastructure to support high traffic
Build backend systems that integrate various AI models, including speech recognition, natural language processing, and speech synthesis
Develop streaming systems to deliver high-quality, real-time voice communication
Construct and optimize scalable frameworks for large-scale data processing
Collaborate with product managers and designers on product improvement and feature planning

Who we're looking for
An individual with a strong curiosity about new technologies and a desire to grow through hands-on development
A problem-solver who prioritizes user experience and can devise practical solutions
A collaborative team player who values open communication

Requirements
At least 3 years of experience in designing, implementing, and maintaining backend systems
Proficiency with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases
Basic understanding of real-time communication technologies (e.g., WebRTC, WebSocket)
Experience working with cloud platforms (e.g., AWS, GCP, Azure)
Experience in building or improving CI/CD pipelines (personal project experience is acceptable)
Practical experience in integrating new tools and technologies (e.g., RAG, Cursor, Devin) or equivalent hands-on experience
Fluency in Japanese for daily communication

Preferred Experience
While not specifically required, tell us if you have any of the following.
Experience working in an early-stage startup environment
Technical communication skills in English
Exposure to machine learning model operations
Experience with home server setup and management
Basic familiarity with deep learning models (e.g., LLMs) and fine-tuning techniques
Familiarity with speech recognition or natural language processing

Location/Work Style
Based in Tokyo, Japan (primarily working from our Akasaka office with some remote flexibility)
Visa sponsorship available for international candidates

Compensation
Starting from 7,500,000 JPY per year, with performance-based stock options

