Data Pipeline Engineer – GCP & Distributed Data Processing

Budget: $100.0 FIXED / ⭐ 5.00 (204) United Kingdom

database-design, database-architecture, etl-pipelines, python, google-cloud-platform, data-science, apache-kafka

We are looking for a Data Pipeline Engineer who can help us design and build reliable, scalable data systems on Google Cloud Platform (GCP). In this role, you will work with large volumes of data and make sure it flows smoothly from different sources into clean, usable formats for analytics and business use. This includes working with JSON files, APIs, databases, and document-based data such as PDFs.

You’ll be responsible for building and maintaining ETL pipelines, improving data processing performance, and making sure our systems are stable, fast, and production-ready. We care a lot about clean architecture, proper monitoring, and handling failures gracefully. You will also work on distributed data processing systems (such as Spark or similar tools) to handle large-scale workloads efficiently in the cloud.

What you’ll do:
- Design and build scalable ETL/ELT pipelines on GCP
- Process and transform large datasets (structured and unstructured)
- Work with APIs, JSON data, and document-based inputs (PDFs, etc.)
- Improve the performance and reliability of existing data workflows
- Collaborate with engineering and analytics teams
- Ensure data quality, consistency, and observability

If you enjoy working with large-scale data systems and building clean, efficient pipelines that power real products, this role will be a great fit.

Start your proposal with the words "Data Engineer"; otherwise you will not be considered.
