Data Pipeline Engineer – GCP & Distributed Data Processing
Budget: $100.0 (fixed price)
⭐ 5.00 (204)
United Kingdom
database-design, database-architecture, etl-pipelines, python, google-cloud-platform, data-science, apache-kafka
We are looking for a Data Pipeline Engineer who can help us design and build reliable, scalable data systems on Google Cloud Platform (GCP).
In this role, you will work with large volumes of data and make sure it flows smoothly from different sources into clean, usable formats for analytics and business use. This includes working with JSON files, APIs, databases, and document-based data like PDFs.
You’ll be responsible for building and maintaining ETL pipelines, improving data processing performance, and making sure our systems are stable, fast, and production-ready. We care a lot about clean architecture, proper monitoring, and handling failures gracefully.
You will also work on distributed data processing systems (such as Spark or similar tools) to handle large-scale workloads efficiently in the cloud.
What you’ll do:
Design and build scalable ETL/ELT pipelines on GCP
Process and transform large datasets (structured + unstructured)
Work with APIs, JSON data, and document-based inputs (PDFs, etc.)
Improve performance and reliability of existing data workflows
Collaborate with engineering and analytics teams
Ensure data quality, consistency, and observability
If you enjoy working with large-scale data systems and building clean, efficient pipelines that power real products, this role will be a great fit.
Start your proposal with the words "Data Engineer"; otherwise you will not be considered.