Site Reliability Engineer

Sofia, Bulgaria or Remote

Site Reliability Engineer Description

EPAM is committed to providing our global team of more than 41,150 EPAMers with inspiring careers from day one. EPAMers think creatively and lead with passion and honesty. Our people are the source of our success. We value collaboration, work in partnership with our customers, and strive for the highest standards of excellence. In today’s market conditions, we’re supporting operations for hundreds of clients around the world remotely. No matter where you are located, you’ll join a dedicated, diverse community that will help you discover your fullest potential.

You are curious, persistent and logical, with a growth mindset: a true techie at heart. You enjoy living by the code of your craft and developing elegant solutions to complex problems. If this sounds like you, this could be the perfect opportunity to join EPAM as a Site Reliability Engineer or at any level above.

What You’ll Do

  • Building and maintaining an observability ecosystem that includes the following components: Prometheus, Cortex/GEM, Grafana, Loki, Jaeger, Fluentd/Fluent Bit, etc.
  • Coding (Python and Go) on projects such as building APIs, self-service portals, and observability platform automation and integration
  • Helping to support a large-scale observability platform, including on-call shifts
  • Working with other team members to define the architectures and practices to adopt in order to meet the observability platform's operational goals
  • Contributing to research and tooling for monitoring and performance improvement to provide solid SLAs for our customers
  • Assisting internal customers in getting the most from the company's observability platforms, both guiding them during onboarding and encouraging best practices

What You Have

  • 2+ years of experience with application and infrastructure monitoring tools
  • Understanding of foundational monitoring concepts: avoiding noise, defining SLIs/SLOs, etc.
  • Experience designing and implementing a centralized logging solution using tools such as Fluent Bit, Fluentd, Promtail or Logstash
  • Experience with containerization (e.g. Docker, Kubernetes, Amazon EKS or Azure AKS)
  • Familiarity with Terraform and the Ansible automation platform
  • 2+ years of experience working with cloud infrastructure based on the AWS and Azure platforms
  • 2+ years of programming/scripting experience with Python, PowerShell, shell scripting or Go
  • 2+ years of experience with CI/CD, Git, Terraform or Jenkins
  • Experience providing operational support for a production service
  • Understanding of Linux and networking fundamentals
  • Strong problem-solving skills
  • Clear, concise communication skills and good command of written and spoken English

Nice to have

  • Experience with Prometheus/PromQL/Cortex/Loki/Grafana platform
  • Experience in custom ETL design, implementation, and maintenance. SQL and relational database knowledge (MS SQL, MySQL, PostgreSQL)

We offer

  • Opportunity to engineer your future
  • Personal development program that will allow you to be valued for your strengths
  • Wide range of professional trainings and workshops
  • A broad variety of projects and the possibility of moving between projects over time
  • Collaboration in a multicultural environment and exchange of best practices with colleagues around the world
  • Varied social benefits: sports, transportation and health programs
  • Employee Stock Purchase Plan
  • Work-life balance and a flexible schedule, team-building events and sports opportunities
  • Modern office in the new Infinity Tower business center
  • Remote By Design™: we provide you with a virtual working environment that lets you be productive and work from any location, whether that is your home, a sunny terrace in summer, or any of the EPAM offices around the globe

Great! What's Next?

  • Send us your up-to-date CV or LinkedIn profile, highlighting your skills and your achievements on past projects
  • We’ll get in touch with you and help you identify the most suitable role we have for you, based on your strengths and passion
  • We’ll invite you to follow-up technical conversations with your potential future colleagues and/or project team, which can be held face to face in a calm meeting room at our office or via video call from your cozy place
  • Finally, we will agree together on how best to proceed
  • P.S. Want to prepare better and grow? Check out our training and development resources and tailor your own PDP – free for any passionate IT professional at