Job description
YOUR RESPONSIBILITIES
You will contribute to existing ETL Pipelines:
1. Make data accessible to data analysts by providing them with efficient data marts.
2. Process big data with PySpark while keeping memory usage as low as possible.
3. Launch your jobs in pre-production and production environments, following established guidelines to ensure your code works properly.
4. Understand complex software and data architectures and find efficient ways to meet a need.
You will work in a complete data engineering environment:
5. Practice peer programming and version control with GitHub.
6. Use the Python programming language and OOP.
7. Use PySpark as a big data framework.
8. Use Databricks and Kubernetes to run your jobs.
9. Use Apache Airflow as a scheduler.
10. Use the AWS services your work requires (S3, ECR, EKS, MWAA, CA).
In collaboration with our senior Data Engineer, you will:
11. Implement new pipelines:
12. Build project pipelines validated by a united team.
13. Write high-quality code using templates, with unit and integration testing.
14. Follow best practices for storing personal and sensitive data.
15. Monitor our existing data:
16. Fix pipeline bugs to make the pipeline available again as quickly as possible.
17. Proactively monitor and check data quality daily.
You are a member of Decathlon's international Data Engineering community:
18. Be an integral part of the united digital community, while being attached to the Decathlon Belgium team.
19. Communicate with counterparts in other countries to stay aligned and up to date with our processes.
Job profile
WHO ARE YOU?
20. You are pursuing a bachelor's or master's degree program (engineering, informatics, or statistics) with a specialization in data engineering, big data, or data science.
21. You can use the most common programming languages for data engineering (SQL, Python, Scala, Java, C).
22. You have already worked with PySpark or Spark with Scala.
23. You know basic GitHub usage and follow good version control practices.
24. You know basic terminal usage and UNIX environments (macOS/Linux).
25. You have already worked with a dashboarding tool (e.g., Google Data Studio / Looker Studio, Tableau, Power BI).
26. You are fluent in French and have a B2 level of English, which is important given the international context of this role. A basic knowledge of Dutch would be a plus.
27. Experience with at least one of the major cloud environments (GCP, AWS, Azure) would be a plus.
28. Experience with Databricks would be a plus.
29. Experience with Docker/Kubernetes would be a plus.
WHAT WE OFFER
30. Flexible work organization, adapted to your school schedule
31. Your choice of laptop (Mac or Windows)
32. Skills development (a diversity of projects, languages, and technologies)
33. Internal training