Job Overview
Salary
¥8,000,000 - 12,000,000/year
Job Type
Full-time
Japanese Level
None Required
Category
Tech & Engineering
Description
In this position you’ll be joining a team that functions as an independent startup within CADDi. You’ll be creating a new application built on top of our core platform, CADDi Drawer, which manages large archives of engineering drawings, documents, and supply-chain data.

We have developed an initial version of the new product and are now preparing for its official release. As we move toward a full-scale launch, ensuring system reliability and scalability has become critical. To support this phase, we are looking for an SRE who can take ownership of the platform’s reliability: designing and improving monitoring, incident response, automation, and infrastructure practices.

Responsibilities
- As the SRE, design reliability, scalability, and operability from the ground up to support a rapidly growing product.
- Partner with Engineering teams to embed reliability and performance into product design.
- Build automation-first systems for infrastructure, deployments, scaling, and incident prevention to ensure sustainable operations.
- Design and operate internal platforms and DevOps practices (CI/CD, development and testing environments) to maximize developer productivity.
- Define, implement, and operate SLIs/SLOs aligned with product strategy, enabling data-driven reliability decisions.
- Establish incident response processes with a strong focus on learning, prevention, and continuous improvement.
- Design and operate cloud infrastructure (GCP) with security and compliance in mind.
- Act as a leader shaping reliability culture and laying the foundation for the team.

What You Will Gain
- Experience playing a central role in improving the reliability, scalability, and operational excellence of a new product at CADDi, from early-stage launch through growth
- Experience providing significant value to society through the development of products that transform industrial structures
- SRE experience in an organization with global business operations
- Experience tackling challenging problems alongside highly motivated team members

Requirements
- 7+ years of hands-on experience in software development
- 5+ years of experience in an SRE team or a closely related role (e.g., platform engineering, reliability engineering)
- Experience designing, building, and operating architectures using cloud services
- Experience applying Infrastructure as Code (IaC) to manage scalable and repeatable infrastructure
- Hands-on operational experience with container orchestration technologies such as Kubernetes
- Experience designing, building, and operating CI/CD pipelines, with a focus on reliability and delivery safety
- Experience developing and operating web applications, including production troubleshooting and performance considerations
- Fluent business-level Japanese communication skills. Examples include:
  - Japanese Language Proficiency Test (JLPT) N2 level or equivalent, or
  - Approximately 3 years of professional work experience in a Japanese-speaking environment

Nice to have
- Experience designing and operating distributed systems
- Experience designing, developing, and operating backend systems for high-traffic web applications
- Experience designing, building, and operating systems on Google Cloud Platform (GCP)
- Experience designing and operating monitoring and observability platforms, such as Datadog
- Experience promoting and embedding SRE culture within an organization (e.g., team formation, enabling other teams, education, and advocacy)
- Hands-on SRE experience in an engineering organization with 50+ engineers
- Solid foundational knowledge of networking concepts

Compensation
- ¥8,000,000 ~ ¥12,000,000 annually
- Stock options

Hiring Process
- Application Review: We’ll take a look at your application and contact you within X business days if we are interested.
- Coding Assignment/Technical Writing Sample
- 1st Round Interview (Hiring Manager)
- 2nd Round Interview (Engineering Manager)
- Final Round Interview (CTO/VPoE)

