Senior DevOps / Site Reliability Engineer

DevOps, Terraform, Google Cloud Platform

Company: Bestarion
Salary: Up to $3,500
Location: Ho Chi Minh
Vacancies: 3

Benefits

13th-month salary
Other benefits
● Fitness & sports activities: football, tennis, table tennis, badminton…
● Commitment to community development: charity every quarter, blood donation, public seminars, career orientation talks…
● Support for personal loans such as home loans, vehicle loans, tuition fees…
Annual salary review
● Performance appraisal twice a year
Company trip
Performance bonus
Additional health insurance

Job Overview

Working Time:
+ Monday - Friday, 8:00 AM - 5:30 PM (flexible depending on each project)
+ 1-hour daily standup Tuesday - Friday, likely from 9 PM to 10 PM VNT
+ Travel to the USA: 1-4 trips per year are expected, each lasting 1-2 weeks

About the project:
We're looking for a skilled and motivated DevOps/Site Reliability Engineer (SRE) to join our growing team. In this role, you will be responsible for building and maintaining our cloud infrastructure, automating our CI/CD pipelines, and ensuring the reliability, performance, and scalability of our services. The ideal candidate will have a strong background in both software development and systems engineering, with a focus on GCP and automation tools, and a strong sense of ownership.

JOB DESCRIPTIONS:
- Design and manage infrastructure on Google Cloud Platform (GCP) using Terraform for Infrastructure as Code (IaC).
- Build, configure, and maintain CI/CD pipelines using Jenkins and Groovy scripts to automate software delivery from code commit to production deployment.
- Manage Jenkins plugins, master/agent nodes, and pipeline libraries to ensure the stability and scalability of our CI/CD platform.
- Troubleshoot and debug automation code and interconnected systems to quickly identify and resolve issues, ensuring minimal disruption to services.
- Manage core GCP services including Compute Engine, Managed Instance Groups (MIG), Disk Snapshots, Storage, and Artifact Registry to support our application ecosystem.
- Containerize applications using Docker to ensure consistency across development, testing, and production environments.
- Implement and manage infrastructure as code, monitoring, and logging solutions to ensure high availability and performance of our systems.
- Collaborate with development teams to improve the entire software development lifecycle, from code to production.
- Develop and maintain workflows in Airflow to orchestrate complex data and application tasks.
- Troubleshoot and resolve production incidents, participate in the on-call rotation, perform root cause analysis, and carry out key maintenance activities quarterly.
- Communicate complex technical concepts effectively to both technical and non-technical stakeholders through clear written and verbal communication.
- Strong expertise in managing and repaving Windows and Linux machines, ensuring security compliance through automated processes.
- Skilled in implementing security compliance measures, including repaving infrastructure, key rotation, and periodic updates to meet industry standards.
- Strong knowledge of monitoring and alerting systems, including Prometheus, Cloud Monitoring, and PagerDuty, to ensure system reliability and proactive incident response.

Required Skills and Experience

- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience as a DevOps Engineer, SRE, or in a similar role.
- Excellent verbal and written English communication skills are essential. You must be able to clearly document processes, write concise reports, and articulate technical issues to various audiences.
- Strong proficiency with Terraform for managing cloud resources.
- Hands-on experience with Jenkins, including managing Jenkins masters and agents, and writing Groovy scripts for pipeline automation.
- Proven ability to troubleshoot and resolve issues in complex, interconnected systems quickly and efficiently.
- Expertise in GCP services, including Compute Engine, MIG, Disk Snapshots, Storage, and Artifact Registry.
- Solid experience with Docker and containerization principles.
- Familiarity with Airflow for workflow management and orchestration.
- Strong understanding of Linux/Unix systems, networking, and security principles.
- Excellent problem-solving skills and a collaborative, team-oriented mindset.
- Maintenance Work Hours: You will need to work USA hours for three days every three months to perform maintenance on key production systems.

Why Apply for This Job

- The company will fully cover all travel and relocation expenses related to the U.S. assignment.
- Attractive salary and benefits (13th-month salary, distinguished employee of the quarter and year, seniority award…)
- Performance appraisal twice a year
- Healthcare and accident insurance
- Various training on best practices and soft skills
- Team-building activities every summer, a company trip, a big annual year-end party, etc.
- Fitness & sports activities: football, tennis, table tennis, badminton…
- Commitment to community development: charity every quarter, blood donation, public seminars, career orientation talks…
- Support for personal loans such as home loans, vehicle loans, tuition fees…
