Problem Solving Systems Software C++ Automation Tools
• Develop efficient infrastructure and tools for automating complex software processes
• Drive performance optimization through advanced test harnesses, benchmarking frameworks, and analytical tools
• Apply deep knowledge of OS internals, kernel, device drivers, memory, storage, networking, and interconnects to build and troubleshoot performant systems
• Work with engineering teams to understand needs, define requirements, and deliver efficient solutions
• Set performance goals, monitor feedback, analyze data, and drive continuous improvements for system reliability
• Influence technical strategy and contribute to roadmaps for platform automation initiatives
• Bachelor’s degree or equivalent in Computer Science, Computer Engineering, or a related field (Master’s preferred)
• 6+ years of experience in software development, focused on infrastructure, distributed systems, automation, or performance engineering
• Proven system-level programming expertise in C++, Python, or Go
• Deep understanding of system software, including OS internals, device drivers, and memory management, and experience debugging complex compute performance issues
• Experience designing, building, and operating large-scale distributed systems, with strong networking and cluster management knowledge
• Proficiency in building and maintaining automation, CI/CD, and performance benchmarking pipelines
• Excellent analytical, problem-solving, and debugging skills
• Strong interpersonal and communication skills to work across teams and explain complex technical concepts
• Benefits will be shared in detail with successful candidates
• Experience optimizing AI/ML workloads, particularly inference applications, across diverse hardware platforms
• Background in building or contributing to compute infrastructure in cloud or on-premise environments
• Familiarity with Docker, Kubernetes, and container orchestration technologies
• Experience using profiling tools and methodologies for hardware and software systems
• Proven ability to deliver significant efficiency or architectural improvements in large-scale systems