AI engineer in Jaipur

Budget: $1800.0 FIXED / ⭐ 5.00 (2) India

databricks-platform, artificial-intelligence, machine-learning

Role Overview

We are looking for a highly skilled Data Scientist to join our advanced analytics team. You will be responsible for end-to-end development of predictive churn models to help us proactively retain customers. You will use the Databricks Data Intelligence Platform for large-scale data processing, feature engineering, and MLOps, turning raw customer data into actionable retention strategies.

Key Responsibilities

Model Development: Design, train, and validate machine learning models (e.g., Random Forest, XGBoost, Neural Networks) to predict customer churn probability.
Data Engineering: Use PySpark and SQL within Databricks to build robust, scalable data pipelines for ingestion, cleaning, and feature engineering.
MLOps & Deployment: Manage the full machine learning lifecycle using MLflow for experiment tracking, model versioning, and deploying models to production as serving endpoints.
Insight Generation: Perform exploratory data analysis (EDA) to identify key churn drivers and collaborate with marketing/CRM teams to design personalized retention campaigns.
Governance: Ensure all data, models, and assets are governed under Unity Catalog, adhering to security and compliance best practices.
Innovation: Stay current with LLM/GenAI advancements to explore how generative AI can automate personalized outreach and retention content.

Required Technical Skills

Languages: Expert-level Python (Pandas, NumPy, Scikit-Learn).
Platform: Deep experience with Databricks (Notebooks, Workflows, Delta Lake, Cluster Management).
Big Data: Proficiency in Apache Spark (PySpark) for distributed computing.
Machine Learning: Strong understanding of classification algorithms, handling imbalanced datasets (e.g., SMOTE, class weighting), and evaluation metrics (Precision, Recall, F1-Score, AUC-ROC).
Tools: Experience with MLflow for experiment tracking and model registry.

Preferred Qualifications

Experience with SQL for advanced data querying.
Familiarity with cloud platforms (AWS, Azure, or GCP) within a Databricks environment.
Knowledge of GenAI or LLM integration for automated customer engagement.
Prior experience in churn analysis for telecom, retail, or subscription-based industries.

Understanding the Pipeline

To help you better grasp how these technologies work together in a production environment, the flow runs in four stages:

Data Ingestion: Raw logs and CRM data are pulled into the Lakehouse.
Preparation: PySpark is used for cleaning and feature engineering.
Modeling: Algorithms are trained and tracked via MLflow.
Action: The model outputs predictions that feed into BI dashboards or automated marketing triggers.
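
For the Modeling stage, here is a minimal sketch of why this posting asks for Precision, Recall, and F1-Score rather than accuracy alone when the churn data is imbalanced. It is plain Python with no Databricks, PySpark, or MLflow dependency. The function name churn_metrics and all labels and predictions are hypothetical, made-up values for illustration, not data or code from the client.

```python
def churn_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # loyal customers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # loyal customers correctly passed

    # Guard against division by zero when a model never predicts churn.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, imbalanced sample: 10 customers, only 2 of whom churned.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Baseline that predicts "no churn" for everyone.
y_always_stay = [0] * 10

# Illustrative model output: one churner caught, one missed, one false alarm.
y_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

baseline = churn_metrics(y_true, y_always_stay)
model = churn_metrics(y_true, y_model)
print("baseline:", baseline)
print("model:   ", model)
```

On this toy sample both sets of predictions reach the same accuracy, yet the baseline never identifies a single churner, so its recall and F1 are zero. That gap is what imbalance-aware metrics are meant to expose. In practice these values would more likely come from a library such as Scikit-Learn and be logged to an experiment tracker rather than computed by hand.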
