Mining Process Optimization Platform
An end-to-end ML platform for optimizing mineral processing (semi-autogenous grinding (SAG) milling, flotation, and thickening), deployed across multiple mining divisions. Delivered a +100 TPH (tonnes per hour) throughput uplift and measurable copper recovery improvements.
Business Context
In large-scale copper mining, the economics are stark: a 1% improvement in copper recovery or a 100 TPH throughput gain in SAG milling translates to tens of millions of dollars in additional annual revenue. Yet operational decisions — mill speed, flotation reagent dosing, thickener settings — were traditionally made by operators based on personal experience and shift-to-shift knowledge transfer, with no systematic mechanism to capture, scale, or optimize this expertise across sites.
Strategic Value
The platform delivered results at industrial scale: a throughput uplift of more than 100 TPH in SAG milling and measurable copper recovery improvements across flotation circuits. Built on Kedro pipelines deployed on Azure Databricks, it embeds data-driven recommendations into the 4-hourly production cadence across multiple mining divisions. The ensemble approach (XGBoost, gradient boosting, neural networks) with MLflow experiment tracking enables reproducible model selection, while the recommendation engine generates setpoint recommendations with confidence intervals, so operators see not just what to change but how confident the model is. Multi-division deployment required balancing standardized methodology with division-specific calibration, a dual requirement that shaped the entire platform architecture.
The Challenge
Large-scale mining operations involve complex, interconnected processes. SAG mills, flotation banks, and thickeners each have dozens of controllable variables and hundreds of sensor readings, creating a high-dimensional optimization problem that shifts as ore characteristics change. With a 1% recovery improvement worth tens of millions of USD annually, even small optimization gains matter.
Our Approach
Modular Kedro pipeline in five stages:
(1) Data ingestion from SCADA via Azure Data Factory.
(2) Domain-informed feature engineering: rolling statistics, lag variables, regime detection.
(3) Ensemble model training: XGBoost, gradient boosting, and neural networks on historical windows.
(4) Scenario simulation generating setpoint recommendations with confidence intervals.
(5) Operational dashboard with KPI tracking and expert feedback loops.
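The five stages above can be pictured as functions wired together through named datasets, which is the core idea behind a Kedro pipeline. What follows is a minimal pure-Python sketch, not the platform's actual code: `run_pipeline`, the stage functions, the dataset names, and the `throughput_tph` field are all hypothetical, and the "model" is a deliberately trivial stand-in.

```python
from typing import Callable, Dict, List, Tuple

# A node is (function, input dataset names, output dataset name).
# This loosely mirrors how a pipeline framework wires functions to named data.
Node = Tuple[Callable, List[str], str]


def run_pipeline(nodes: List[Node], catalog: Dict[str, object]) -> Dict[str, object]:
    """Run nodes in order, reading inputs from and writing outputs to a catalog."""
    data = dict(catalog)
    for func, inputs, output in nodes:
        data[output] = func(*(data[name] for name in inputs))
    return data


# Hypothetical stage functions (assumptions for illustration only).
def ingest(raw_rows):
    # Drop readings where the sensor value is missing.
    return [r for r in raw_rows if r["throughput_tph"] is not None]


def engineer_features(rows):
    # Attach a one-step lag of throughput to every row after the first.
    return [
        {**cur, "throughput_lag1": prev["throughput_tph"]}
        for prev, cur in zip(rows, rows[1:])
    ]


def train(features):
    # Toy "model": the average step-to-step change in throughput.
    deltas = [f["throughput_tph"] - f["throughput_lag1"] for f in features]
    return {"mean_delta": sum(deltas) / len(deltas)}


def recommend(model, features):
    # Project the latest reading forward by the learned average change.
    last = features[-1]
    return {"expected_next_tph": last["throughput_tph"] + model["mean_delta"]}


NODES: List[Node] = [
    (ingest, ["raw_scada"], "clean"),
    (engineer_features, ["clean"], "features"),
    (train, ["features"], "model"),
    (recommend, ["model", "features"], "recommendation"),
]
```

Because each stage only talks to the catalog, a stage can be swapped or re-run in isolation, which is the property that makes the pipeline modular.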
Key Performance Indicators
| KPI | Baseline | Result | Impact |
|---|---|---|---|
| Decision Basis | Operator experience, shift variability | Data-driven recommendations every 4h | Consistent, auditable decisions |
| Value Realization | Unknown improvement potential | +100 TPH throughput, recovery gains | Quantifiable annual production value |
| Multi-division Scalability | Site-specific solutions | Configurable shared platform | Reduced per-site implementation cost |
Proprietary — source code not publicly available
The Scale of Impact
In large-scale copper mining, the economics are unforgiving. A 1% improvement in copper recovery or a 100 TPH throughput gain in SAG milling translates to tens of millions of dollars annually. These aren’t theoretical numbers — they’re the operating reality that justified building this platform and deploying it across multiple mining divisions.
The platform delivered results at scale: a throughput uplift of more than 100 TPH in SAG milling and measurable copper recovery improvements across flotation circuits. Optimization recommendations run on a 4-hourly production cadence, embedded directly into the daily operational workflow.
The Problem
SAG mills, flotation banks, and thickeners each have dozens of controllable variables and hundreds of sensor readings. Ore characteristics change continuously — different pits, different benches, different geological zones feed material with varying hardness, mineralogy, and grade. What worked yesterday may not work today.
Operators traditionally adjusted setpoints based on experience and shift-to-shift knowledge transfer. The result: inconsistent decisions, missed optimization opportunities, and no systematic way to capture or scale operational expertise across sites.
Architecture
The platform is built on Kedro pipelines deployed on Azure Databricks, designed for reproducibility and MLOps rigor:
Data ingestion pulls from SCADA systems, laboratory analyses, and operational databases through Azure Data Factory — both streaming and batch. Hundreds of sensor readings per processing plant feed the system continuously.
Feature engineering transforms raw signals into domain-informed features: rolling statistics over configurable windows (mean, variance, percentiles), lag variables that capture process inertia, ore property indicators derived from assay data, and operational regime detection using hidden Markov models and change-point algorithms. This is where domain expertise becomes computational — each feature encodes something a process engineer knows matters.
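To make the rolling-statistic and lag features concrete, here is a small sketch using only the Python standard library. The function name `rolling_lag_features`, its parameters, and the output keys are assumptions for illustration; the regime-detection step (hidden Markov models, change-point algorithms) is not covered here.

```python
from collections import deque
from statistics import mean, pvariance
from typing import Dict, List, Optional


def rolling_lag_features(
    values: List[float], window: int, lag: int
) -> List[Dict[str, Optional[float]]]:
    """Compute a rolling mean, rolling variance, and a lagged value per reading.

    Rolling statistics are None until the window is full, and the lagged
    value is None until `lag` earlier readings exist.
    """
    buf: deque = deque(maxlen=window)  # holds the most recent `window` readings
    rows = []
    for i, value in enumerate(values):
        buf.append(value)
        window_vals = list(buf)
        full = len(window_vals) == window
        rows.append(
            {
                "value": value,
                "roll_mean": mean(window_vals) if full else None,
                "roll_var": pvariance(window_vals) if full else None,
                # The lag captures process inertia: what the signal was `lag` steps ago.
                "lag": values[i - lag] if i >= lag else None,
            }
        )
    return rows
```

Leaving early rows as `None` rather than padding them avoids feeding the model statistics computed from incomplete windows.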
Model training uses an ensemble approach — XGBoost, gradient-boosted trees, and neural networks trained on historical operational windows. MLflow tracks experiments across divisions, enabling reproducible model selection and comparison.
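As a rough illustration of comparing ensemble members against their combination, the sketch below averages the predictions of several already-fitted models and scores each on held-out data. It assumes simple prediction averaging and mean absolute error as the metric; the source does not state how the ensemble is combined or scored, and `mae`, `make_ensemble`, and `compare` are hypothetical names. In practice these scores would be the kind of values logged to an experiment tracker.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# A fitted model is treated as a function from a feature vector to a prediction.
Model = Callable[[Sequence[float]], float]


def mae(model: Model, X: List[Sequence[float]], y: List[float]) -> float:
    """Mean absolute error of a model on held-out data."""
    return mean(abs(model(x) - target) for x, target in zip(X, y))


def make_ensemble(models: List[Model]) -> Model:
    """Combine models by averaging their predictions (an assumed scheme)."""

    def predict(x: Sequence[float]) -> float:
        return mean(m(x) for m in models)

    return predict


def compare(models: Dict[str, Model], X, y) -> Dict[str, float]:
    """Score every member and the averaged ensemble on the same held-out set."""
    scores = {name: mae(m, X, y) for name, m in models.items()}
    scores["ensemble"] = mae(make_ensemble(list(models.values())), X, y)
    return scores
```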
The recommendation engine doesn’t just predict — it simulates scenarios. It generates actionable setpoint recommendations with confidence intervals, so operators see not just what to change, but how confident the model is and what range of outcomes to expect.
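One way to picture scenario simulation is to score a set of candidate setpoints with every ensemble member and report the spread alongside the central estimate. The sketch below does this under a loud assumption: the interval is taken as mean ± z × the standard deviation across members, which is a rough proxy for uncertainty and not necessarily how the platform computes its confidence intervals. The names `simulate`, `recommend`, and `mill_speed_rpm` are hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List

Setpoints = Dict[str, float]
Model = Callable[[Setpoints], float]


def simulate(
    models: List[Model], candidates: List[Setpoints], z: float = 1.96
) -> List[Dict[str, object]]:
    """Predict the outcome of each candidate scenario with every ensemble member.

    Requires at least two models, since the spread is a sample standard deviation.
    """
    results = []
    for setpoints in candidates:
        preds = [model(setpoints) for model in models]
        center = mean(preds)
        half_width = z * stdev(preds)  # assumed proxy for model uncertainty
        results.append(
            {
                "setpoints": setpoints,
                "expected": center,
                "low": center - half_width,
                "high": center + half_width,
            }
        )
    return results


def recommend(results: List[Dict[str, object]]) -> Dict[str, object]:
    """Pick the scenario with the best lower bound, penalising uncertain options."""
    return max(results, key=lambda r: r["low"])
```

Ranking by the lower bound rather than the expected value is a conservative design choice: a scenario with a slightly lower expected outcome but tight agreement between models can beat one the models disagree on.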
Operational dashboards close the feedback loop: Power BI for management visibility, Streamlit prototypes for engineering deep-dives, and adherence tracking that measures whether recommendations are being followed and whether they deliver value.
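Adherence tracking can be sketched as a comparison between what was recommended and what operators actually applied. The example below is an assumption-laden illustration: the `Record` fields, the tolerance-based definition of "followed", and the `adherence_report` function are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Record:
    recommended: float  # setpoint the platform suggested
    applied: float      # setpoint the operator actually used
    outcome_tph: float  # throughput observed afterwards


def adherence_report(records: List[Record], tolerance: float) -> Dict[str, Optional[float]]:
    """Summarise how often recommendations were followed and what happened next.

    A recommendation counts as followed when the applied setpoint is within
    `tolerance` of the recommended one.
    """
    followed = [r for r in records if abs(r.applied - r.recommended) <= tolerance]
    ignored = [r for r in records if abs(r.applied - r.recommended) > tolerance]
    return {
        "adherence_rate": len(followed) / len(records),
        "mean_outcome_followed": mean(r.outcome_tph for r in followed) if followed else None,
        "mean_outcome_ignored": mean(r.outcome_tph for r in ignored) if ignored else None,
    }
```

A raw difference between the two outcome means is only a starting point; ore conditions may differ between the periods, so it should not be read as a causal effect on its own.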
Multi-Division Deployment
The hardest part wasn’t the ML — it was the organizational challenge. Each mining division has different ore, different equipment, different operators, and different operational culture. The platform architecture had to balance standardized methodology (same pipeline framework, same model types, same recommendation logic) with division-specific calibration (local thresholds, site-specific features, equipment-specific constraints). This dual requirement shaped every architectural decision.
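The balance between a shared methodology and division-specific calibration can be expressed as a shared default configuration with per-division overrides layered on top. This is a minimal sketch of that idea, not the platform's actual configuration scheme: `merge_config` and every key and value in the usage example are hypothetical, apart from the 4-hour cadence mentioned in the text.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(shared: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return shared defaults with division-specific values layered on top.

    Nested dictionaries are merged key by key; any other value in `override`
    replaces the shared one. Neither input is modified.
    """
    merged = deepcopy(shared)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical shared defaults and one division's local calibration.
SHARED = {
    "cadence_hours": 4,
    "constraints": {"mill_speed_rpm": {"min": 8.0, "max": 11.0}},
}
DIVISION_A = {"constraints": {"mill_speed_rpm": {"max": 10.5}}}

division_a_config = merge_config(SHARED, DIVISION_A)
```

Keeping overrides small and explicit makes it easy to see exactly where a division departs from the standard setup.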
Technology Stack
Kedro, Azure Databricks, Azure Data Factory, MLflow, XGBoost, Power BI, Streamlit
Visual assets for this project are not publicly available.