Alberta Affordability Data Platform — End-to-End ETL Pipeline
ETL pipeline built on Databricks that processes public datasets. Implements the Medallion Architecture (Bronze, Silver, Gold layers) to transform raw data into analytics-ready datasets on housing affordability, employment trends, and cost-of-living metrics across Alberta.
Databricks
Apache Spark
Delta Lake
Python
SQL
Azure Data Factory
Medallion Architecture
- Built end-to-end data pipelines that process public datasets into cleansed, analytics-ready layers
- Implemented the Medallion Architecture (Bronze, Silver, Gold) to keep data reliable and consistent across layers
- Automated data extraction and integration workflows with Python and SQL
- Documented data models, metrics, and business rules for reusable reporting
- Performed comprehensive data cleansing and quality validation
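
The Bronze, Silver, and Gold responsibilities described above can be illustrated with a minimal sketch. This is not the project's actual code: the real pipeline runs on Spark and Delta Lake, whereas this version uses plain Python lists so it runs anywhere. The sample rows, column names (`city`, `year`, `median_rent`), the `_source` metadata field, and the function names are all hypothetical, and the numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw rows with made-up values, shaped like a public CSV extract.
RAW_ROWS = [
    {"city": " Calgary ", "year": "2023", "median_rent": "1850"},
    {"city": "Edmonton", "year": "2023", "median_rent": "1420"},
    {"city": "Calgary", "year": "2023", "median_rent": "1950"},
    {"city": "Edmonton", "year": "2023", "median_rent": ""},     # missing value
    {"city": "Red Deer", "year": "n/a", "median_rent": "1200"},  # invalid year
]


def to_bronze(rows, source):
    """Bronze: keep every raw row unchanged and attach lineage metadata."""
    return [{**row, "_source": source} for row in rows]


def to_silver(bronze_rows):
    """Silver: trim text, cast types, and drop rows that fail validation."""
    silver = []
    for row in bronze_rows:
        try:
            year = int(row["year"])
            rent = float(row["median_rent"])
        except ValueError:
            # Row cannot be cast to the expected types, so it is excluded.
            continue
        silver.append(
            {"city": row["city"].strip(), "year": year, "median_rent": rent}
        )
    return silver


def to_gold(silver_rows):
    """Gold: aggregate to one average rent per (city, year) for reporting."""
    groups = defaultdict(list)
    for row in silver_rows:
        groups[(row["city"], row["year"])].append(row["median_rent"])
    return {key: mean(values) for key, values in groups.items()}


bronze = to_bronze(RAW_ROWS, source="sample.csv")
silver = to_silver(bronze)
gold = to_gold(silver)

for (city, year), avg_rent in sorted(gold.items()):
    print(city, year, avg_rent)
```

Keeping the Bronze step free of any cleaning means the raw input can always be reprocessed if a Silver rule changes, which is the usual motivation for separating the layers.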