Big Data | Data Software Engineer Remote
Big Data | Data Software Engineer Description
Job #: 82858
You are curious, persistent, logical and clever – a true techie at heart. You enjoy living by the code of your craft and developing elegant solutions for complex problems. If this sounds like you, this could be the perfect opportunity to join EPAM as a Data Software Engineer. Scroll down to learn more about the position’s responsibilities and requirements.
#REF_DR22_MX
Requirements
- 3+ years of experience in software development with one (preferably 2) of the following programming languages: Python/Java/Scala
- Experience with Linux OS: configure services and write basic shell scripts, understanding of network fundamentals
- Good knowledge of SQL (DML, DDL, advanced functions) and NoSQL (MongoDB, Cassandra, Elasticsearch)
- Object-oriented or functional programming design
- Advanced experience in software development, e.g. configuration management, monitoring, debugging, performance tuning, documentation, and Agile practices
- Experience with testing (unit, integration, end-to-end)
- Engineering experience and practice in Data Management, Data Storage, Data Visualization, Integration, Operation, Security
- Experience building data ingestion pipelines and orchestrating data pipelines
- Experience with data modeling; hands-on development experience with modern Big Data components
- Our tech stack includes Hadoop, Apache Hive, Spark, Kafka, Apache Beam, Storm, Jenkins, Streaming, Databricks, and Airflow
- Cloud experience in at least one of the following clouds: Azure, GCP, AWS
- Good understanding of CI/CD principles and best practices (Jenkins, Travis CI, Spinnaker)
- Container experience (Docker, Kubernetes)
- Analytical approach to problem solving; excellent interpersonal, mentoring and communication skills
- English proficiency
- Advanced understanding of distributed computing principles
Technologies
- Programming Languages: Python/Java/Scala and SQL
- Cloud-Native stack: Databricks, Azure DataFactory, AWS Glue, AWS EMR, Athena, GCP DataProc, GCP DataFlow
- Big Data stack: Spark Core, Spark SQL, Spark ML, Kafka, Kafka Connect, Airflow, NiFi, StreamSets
- NoSQL: CosmosDB, DynamoDB, Cassandra, HBase, MongoDB
- Queues and stream processing: Kafka Streams, Flink, Spark Streaming
We offer
- Career plan and real growth opportunities
- Unlimited access to LinkedIn learning solutions
- International Mobility Plan within 25 countries
- Constant training, mentoring, online corporate courses, eLearning and more
- English classes with a certified teacher
- Support for employee initiatives (Algorithms club, Toastmasters, Agile club and more)
- Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
- Flexible work schedule and dress code
- Collaborate in a multicultural environment and share best practices from around the globe
- Hired directly by EPAM & 100% under payroll
- Statutory benefits (IMSS, INFONAVIT, 25% vacation bonus)
- Life and major medical expenses insurance with dental & vision coverage (for the employee and direct family members)
- 13% employee savings fund, capped at the legal limit
- Grocery coupons
- 10 vacation days plus 2 floating days
- Official Mexican holidays, plus five extra holidays (Holy Thursday and Friday, November 2nd, December 24th & 31st)
- Relocation bonus: transportation, 2 weeks of accommodation for you and your family and more
- End-of-year bonus of 30 days of salary
- Monthly non-taxable allowance for electricity and internet bills
- Employee Stock Purchase Program
Conditions
- By applying to our role, you agree that your personal data may be used as set out in EPAM's Privacy Notice (https://www.epam.com/applicant-privacy-notice) and Policy (https://www.epam.com/privacy-policy)
About the Project
The term "Big Data" is becoming obsolete: most Big Data technologies are now de facto standards for data management, so the "Big Data" primary skill has been rebranded as "Data Software Engineering". A Data Software Engineer enables data-driven decision making by collecting, transforming, and publishing data. A Data Software Engineer should be able to design, build, operationalize, secure, and monitor data processing systems with a particular emphasis on security and compliance; scalability and efficiency; reliability and fidelity; and flexibility and portability. A Data Software Engineer should also be able to leverage, deploy, and continuously train pre-existing machine learning models.
A Data Software Engineer must have solid knowledge of data processing languages, such as JVM-based languages (Scala, Java,...) or Python, as well as SQL, Linux, and shell scripting. They also need to understand parallel data processing and data architecture patterns.
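As a minimal illustration (not part of the role description), the collect–transform–publish workflow described above can be sketched in plain Python with the standard library's sqlite3 module; the event schema, table name, and values here are hypothetical:

```python
import sqlite3

# Collect: raw event rows (hypothetical schema: day, event kind, count).
raw_events = [
    ("2024-01-01", "click", 3),
    ("2024-01-01", "view", 10),
    ("2024-01-02", "click", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, kind TEXT, count INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", raw_events)

# Transform: aggregate with SQL; in a real pipeline this step would run on
# an engine such as Spark SQL rather than an in-memory SQLite database.
rows = conn.execute(
    "SELECT day, SUM(count) FROM events GROUP BY day ORDER BY day"
).fetchall()

# Publish: expose the curated result for downstream consumers.
daily_totals = {day: total for day, total in rows}
print(daily_totals)  # {'2024-01-01': 13, '2024-01-02': 5}
```

The same collect/transform/publish shape carries over to the distributed tools listed above (Spark, Kafka, Airflow), with each stage scaled out and orchestrated rather than run in a single process.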