My Projects

Showcasing data science and engineering solutions

Alberta Affordability Data Platform — End-to-End ETL Pipeline

An end-to-end ETL pipeline built on Databricks that processes public datasets. It implements the Medallion Architecture (Bronze, Silver, and Gold layers) to transform raw data into analytics-ready datasets that support insights on housing affordability, employment trends, and cost-of-living metrics across Alberta.

Databricks · Apache Spark · Delta Lake · Python · SQL · Azure Data Factory · Medallion Architecture
  • Built end-to-end data pipelines that process public datasets into cleansed, analytics-ready layers
  • Implemented the Medallion Architecture to ensure data reliability and consistency
  • Automated data extraction and integration workflows with Python and SQL
  • Documented data models, metrics, and business rules for reusable reporting
  • Performed comprehensive data cleansing and quality validation
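
To illustrate the Bronze, Silver, and Gold flow described above, here is a minimal sketch in plain Python. It is not the project's Spark or Delta Lake code. The records, the column names (city, year, avg_rent), and the helper functions to_silver and to_gold are all hypothetical and chosen only to show the idea of each layer.

```python
from statistics import mean

# Bronze: raw records as ingested (hypothetical sample values, not real data).
bronze = [
    {"city": "Calgary", "year": "2023", "avg_rent": "1650"},
    {"city": "calgary ", "year": "2023", "avg_rent": "1750"},   # inconsistent casing and spacing
    {"city": "Edmonton", "year": "2023", "avg_rent": "1300"},
    {"city": "Edmonton", "year": "2023", "avg_rent": ""},       # missing value
    {"city": "Edmonton", "year": "2023", "avg_rent": "1300"},   # exact duplicate
]


def to_silver(rows):
    """Silver: normalize names, cast types, drop invalid rows and duplicates."""
    seen = set()
    silver = []
    for row in rows:
        rent = row["avg_rent"].strip()
        if not rent.isdigit():
            continue  # quality rule: rent must be a non-negative integer
        record = (row["city"].strip().title(), int(row["year"]), int(rent))
        if record in seen:
            continue  # skip rows already seen after normalization
        seen.add(record)
        silver.append({"city": record[0], "year": record[1], "avg_rent": record[2]})
    return silver


def to_gold(rows):
    """Gold: aggregate to one reporting metric per (city, year)."""
    groups = {}
    for row in rows:
        groups.setdefault((row["city"], row["year"]), []).append(row["avg_rent"])
    return {key: mean(values) for key, values in groups.items()}


silver = to_silver(bronze)
gold = to_gold(silver)

for (city, year), rent in sorted(gold.items()):
    print(city, year, rent)
```

In a Databricks setting, each layer would typically be persisted as its own table rather than held in memory, so that downstream reports read only from the Gold layer.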