AI engineer in Jaipur
Budget: $1,800 (fixed price)
⭐ 5.00 (2)
India
databricks-platform, artificial-intelligence, machine-learning
Role Overview
We are looking for a highly skilled Data Scientist to join our advanced analytics team. You will own the end-to-end development of predictive churn models that help us retain customers proactively. You will use the Databricks Data Intelligence Platform for large-scale data processing, feature engineering, and MLOps, turning raw customer data into actionable retention strategies.
Key Responsibilities
Model Development: Design, train, and validate machine learning models (e.g., Random Forest, XGBoost, Neural Networks) to predict customer churn probability.
Data Engineering: Utilize PySpark and SQL within Databricks to build robust, scalable data pipelines for ingestion, cleaning, and feature engineering.
MLOps & Deployment: Manage the full machine learning lifecycle using MLflow for experiment tracking, model versioning, and deploying models to production as serving endpoints.
Insight Generation: Perform exploratory data analysis (EDA) to identify key churn drivers and collaborate with marketing/CRM teams to design personalized retention campaigns.
Governance: Ensure all data models and assets are governed under Unity Catalog, adhering to security and compliance best practices.
Innovation: Stay current with LLM/GenAI advancements and explore how generative AI can automate personalized outreach and retention content.
Required Technical Skills
Languages: Expert-level Python (Pandas, NumPy, Scikit-Learn).
Platform: Deep experience with Databricks (Notebooks, Workflows, Delta Lake, Cluster Management).
Big Data: Proficiency in Apache Spark (PySpark) for distributed computing.
Machine Learning: Strong understanding of classification algorithms, handling imbalanced datasets (e.g., SMOTE, class weighting), and evaluation metrics (Precision, Recall, F1-Score, AUC-ROC).
Tools: Experience with MLflow for experiment tracking and model registry.
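For candidates who want a concrete reference point for the imbalanced-data and metrics items above, here is a minimal sketch in plain Python. It assumes a binary churn label (1 = churned) and uses the common "balanced" weighting heuristic, n_samples / (n_classes * class_count). The function names and the toy labels are illustrative only and are not taken from our codebase.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency:
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up labels: 8 retained customers (0) and 2 churners (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

weights = balanced_class_weights(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(weights, precision, recall, f1)
```

On data this skewed, a model that always predicts "no churn" would score 80% accuracy while catching no churners, which is why the role emphasizes precision, recall, and F1 over raw accuracy.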
Preferred Qualifications
Experience with SQL for advanced data querying.
Familiarity with cloud platforms (AWS, Azure, or GCP) within a Databricks environment.
Knowledge of GenAI or LLM integration for automated customer engagement.
Prior experience with churn analysis in telecom, retail, or subscription-based businesses.
Understanding the Pipeline
To help you better grasp how these technologies work together in a production environment, here is the flow in four stages:
Data Ingestion: Raw logs and CRM data are pulled into the Lakehouse.
Preparation: PySpark is used for cleaning and feature engineering.
Modeling: Algorithms are trained and tracked via MLflow.
Action: The model outputs predictions that feed into BI dashboards or automated marketing triggers.
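The four stages above can be sketched as a chain of small functions. This is a toy illustration in plain Python, not our production code: in practice the preparation step would run in PySpark and the model would be trained and tracked with MLflow. Every function name, field name, the scoring rule, and the 0.5 threshold are hypothetical placeholders.

```python
def ingest():
    """Stage 1: stand-in for pulling raw CRM rows into the Lakehouse."""
    return [
        {"customer_id": "c1", "logins_30d": 2, "tickets_30d": 4},
        {"customer_id": "c2", "logins_30d": 25, "tickets_30d": 0},
        {"customer_id": "c3", "logins_30d": None, "tickets_30d": 1},
    ]


def prepare(rows):
    """Stage 2: drop incomplete rows and derive one simple feature."""
    clean = [r for r in rows if r["logins_30d"] is not None]
    return [
        {**r, "tickets_per_login": r["tickets_30d"] / max(r["logins_30d"], 1)}
        for r in clean
    ]


def score(rows):
    """Stage 3: placeholder rule standing in for a trained churn model."""
    return [{**r, "churn_score": min(1.0, r["tickets_per_login"])} for r in rows]


def act(rows, threshold=0.5):
    """Stage 4: pick the customers who should receive a retention trigger."""
    return [r["customer_id"] for r in rows if r["churn_score"] >= threshold]


flagged = act(score(prepare(ingest())))
print(flagged)
```

Keeping each stage a separate function mirrors how the real pipeline is split into separately scheduled and monitored steps.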