Data Pipeline Architecture: Design and implement scalable and efficient data pipelines to handle large volumes of real-time data.
Data Ingestion and Processing: Develop and maintain robust data ingestion and processing frameworks using Kafka Connect, Kafka, and ClickHouse.
Data Modeling and Storage: Design and implement appropriate data models and storage solutions to support real-time analytics and AI.
Performance Optimization: Continuously monitor and optimize data pipelines for performance, scalability, and reliability.
Collaboration: Work closely with Machine Learning Engineers and Data Analysts to ensure seamless integration of data pipelines with AI/ML models and visualization tools.
Required Skills and Experience:
Technologies:
Data Streaming: Kafka Connect, Kafka, Pulsar
Data Warehousing: ClickHouse, Snowflake, BigQuery, Bigtable, Cassandra
AI & Analytics: Vertex AI, bicycle.ai, Cosmos.ai or similar tools
Data Visualization: Looker, Tableau, Power BI
Cloud Computing: Google Cloud Platform, AWS, Azure
Programming Languages & Frameworks: Python, Spark, Flink, Scala, SQL
Data Modeling: Dimensional modeling, data warehousing methodologies
Professional and Soft Skills:
Experience: Minimum of 5 years in data engineering roles, with a strong focus on real-time data analytics.
Communication Skills: High proficiency in English (both written and oral).
Agile Development: Familiarity with agile development methodologies, such as Scrum or Kanban, is helpful for working in a fast-paced, iterative environment.
Education: Bachelor's degree in Computer Science, Engineering, or a related field. A Master's degree is highly desirable.
Problem-Solving Skills: Strong capability in identifying issues and formulating effective solutions.
Certifications: Relevant certifications in data engineering and cloud technologies are highly desirable.