Databricks Lakehouse Pipeline

Personal Project

TL;DR - Quick Summary

This project builds a medallion architecture (Bronze → Silver → Gold) on Databricks Community Edition using synthetic SaaS data. Raw CSV/JSON files are ingested into Bronze, cleaned and validated in Silver, and aggregated into business metrics in Gold. Every layer is stored in Delta Lake format to demonstrate the lakehouse patterns used in DACH enterprise data engineering.
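The layer responsibilities described above can be sketched in plain Python. This is only an illustration of the Bronze → Silver → Gold flow, not the project's code: the real pipeline is meant to run on PySpark with Delta tables, whereas this sketch uses only the standard library so it runs anywhere. The field names (customer_id, plan, mrr), the sample records, and the function names are all hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical synthetic SaaS records; field names are illustrative only.
RAW_CSV = """customer_id,plan,mrr
c1,pro,49.0
c2,basic,9.0
,pro,49.0
c2,basic,9.0
"""
RAW_JSON = (
    '[{"customer_id": "c3", "plan": "pro", "mrr": "49.0"},'
    ' {"customer_id": "c4", "plan": "basic", "mrr": "n/a"}]'
)


def to_bronze(csv_text, json_text):
    """Bronze: keep every raw record unchanged, only tagging its source."""
    rows = [dict(r, _source="csv") for r in csv.DictReader(io.StringIO(csv_text))]
    rows += [dict(r, _source="json") for r in json.loads(json_text)]
    return rows


def to_silver(bronze_rows):
    """Silver: validate, cast types, and drop duplicate customers."""
    seen, clean = set(), []
    for row in bronze_rows:
        customer_id = (row.get("customer_id") or "").strip()
        try:
            mrr = float(row.get("mrr"))
        except (TypeError, ValueError):
            continue  # reject rows whose MRR is not numeric
        if not customer_id or customer_id in seen:
            continue  # reject missing keys and repeated customers
        seen.add(customer_id)
        clean.append({"customer_id": customer_id, "plan": row["plan"], "mrr": mrr})
    return clean


def to_gold(silver_rows):
    """Gold: aggregate a business metric, here total MRR per plan."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["plan"]] += row["mrr"]
    return dict(totals)


bronze = to_bronze(RAW_CSV, RAW_JSON)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze deliberately keeps the invalid and duplicate rows so the raw input stays auditable; only Silver rejects them, and Gold reads exclusively from Silver. In a Databricks version, each function would presumably read from and write to a Delta table instead of passing Python lists.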

Databricks Community Edition · PySpark · Apache Spark 3.5 · Delta Lake · Python · SQL (Databricks notebooks)
Team: Solo project
Role: Data Engineer
Status: In Development
Check back soon to see the results! 🚀

Project In Development

This project is currently being built. The TL;DR above provides the key summary. Full details, technical documentation, and code samples will be available once completed.