Lead Data Engineer

Company: GFT Group
Salary: Negotiable
Location: Hanoi
Vacancies: 1

Job Overview and Responsibilities

As a Lead/Senior Data Engineer at GFT, you will be responsible for managing, designing, and enhancing the data systems and workflows that drive key business decisions. The role is roughly 75% data engineering, building and optimizing data pipelines and architectures, and 25% data science support, collaborating with data science teams on machine learning workflows and advanced analytics. You will leverage technologies such as Python, Airflow, Kubernetes, and AWS to deliver high-quality data solutions.

Key Activities

- Architect, develop, and maintain scalable data infrastructure, including data lakes, pipelines, and metadata repositories, ensuring the timely and accurate delivery of data to stakeholders
- Work closely with data scientists to build and maintain data models, integrate data sources, and support machine learning workflows and experimentation environments
- Develop and optimize large-scale batch and real-time data processing systems to improve operational efficiency and meet business objectives
- Use Python, Apache Airflow, and AWS services to automate data workflows and processes, ensuring efficient scheduling and monitoring
- Utilize AWS services such as S3, Glue, EC2, and Lambda to manage data storage and compute resources for high performance, scalability, and cost-efficiency
- Implement robust testing and validation procedures to ensure the reliability, accuracy, and security of data processing workflows
- Stay informed of industry best practices and emerging technologies in both data engineering and data science, and propose optimizations and innovative solutions

Required Skills and Experience

- Core Expertise: Proficiency in Python for data processing and scripting (pandas, pyspark), workflow automation with Apache Airflow, and experience with AWS services (Glue, S3, EC2, Lambda)
- Containerization & Orchestration: Experience with Kubernetes and Docker for managing containerized environments in the cloud
- Data Engineering Tools: Hands-on experience with columnar and big data databases (Athena, Redshift, Vertica, Hive/Hadoop), along with version control systems such as Git
- Cloud Services: Strong familiarity with AWS services for cloud-based data processing and management
- CI/CD Pipeline: Experience with CI/CD tools such as Jenkins, CircleCI, or AWS CodePipeline for continuous integration and deployment
- Data Engineering Focus (75%): Expertise in building and managing robust data architectures and pipelines for large-scale data operations
- Data Science Support (25%): Ability to support data science workflows, including collaboration on data preparation, feature engineering, and enabling experimentation environments

Why you should apply for this position

- Competitive salary
- Guaranteed 13th-month salary
- Performance bonus
- Professional English courses for employees
- Premium health insurance
- Extensive annual leave
