Site Reliability Engineer Remote

Site Reliability Engineer Description

Join our dynamic team as a Site Reliability Engineer and lead the way in optimizing and automating our Linux-based infrastructure.

With 3 to 5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure, you will play a crucial role in elevating our capabilities and ensuring high-impact, internet-facing production services run smoothly.

If you are passionate about driving innovation and efficiency in a tech-forward environment, we want you on our team.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Responsibilities

  • Optimize Linux operating systems for high-impact, internet-facing production services and distributed systems
  • Implement and coordinate advanced telemetry using tools like Splunk, Grafana, and Prometheus
  • Address complex issues in Kubernetes, setting high standards and best practices for the team
  • Create and maintain cutting-edge automation scripts using Bash and Python to enhance operational efficiency
  • Build and operate advanced systems such as Kubernetes or EKS, sharing knowledge and experience with the team
  • Design and maintain robust, high-performance cloud infrastructure with AWS, ensuring top levels of availability and reliability
  • Champion innovative automation solutions to minimize manual work and drive efficiency
  • Lead deployment automation using tools like Terraform or CloudFormation to boost productivity and reliability

Requirements

  • 3 – 5 years of experience in SRE roles
  • Advanced experience with Linux system administration & IIS
  • Advanced experience with Kubernetes, Docker, and containerization
  • Proficient in managing Infrastructure as Code via tools such as Terraform or CloudFormation
  • Proficient in Python & Bash
  • Proficient in Splunk, Prometheus & Grafana

We Offer

  • Career plan and real growth opportunities
  • Unlimited access to LinkedIn Learning solutions
  • International Mobility Plan within 25 countries
  • Constant training, mentoring, online corporate courses, eLearning and more
  • English classes with a certified teacher
  • Support for employee initiatives (algorithms club, Toastmasters, agile club and more)
  • Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
  • Flexible work schedule and dress code
  • Collaborate in a multicultural environment and share best practices from around the globe
  • Direct hire by EPAM, 100% on payroll
  • Statutory benefits (IMSS, INFONAVIT, 25% vacation bonus)
  • Insurance: life and major medical expenses, with dental and vision coverage (for the employee and direct family members)
  • 13% employee savings fund, capped at the legal limit
  • Grocery coupons
  • 30-day December bonus
  • Employee Stock Purchase Plan
  • 12 vacation days plus 4 floating days
  • Official Mexican holidays, plus 5 extra holidays (Maundy Thursday, Good Friday, November 2nd, December 24th and 31st)
  • Relocation bonus: transportation, 2 weeks of accommodation for you and your family, and more
  • Monthly non-taxable amount for the electricity and internet bills

Conditions

  • By applying to our role, you agree that your personal data may be used as set out in EPAM's Privacy Notice and Policy
