IoT Analytics Lakehouse on Databricks

Personal Project

TL;DR - Quick Summary

Built a production-grade data lakehouse on Databricks Community Edition using Delta Lake and medallion architecture (bronze/silver/gold layers). Processes 10M+ simulated IoT sensor events from smart manufacturing equipment using PySpark batch transformations. Demonstrates modern lakehouse patterns: ACID transactions, schema evolution, time travel, data quality checks, and optimized analytics queries.
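The bronze/silver/gold flow described above can be illustrated with a minimal sketch. This is not the project's code: it uses plain Python lists as a stand-in for PySpark DataFrames and Delta tables so that it runs anywhere, and the field names (`event_id`, `device_id`, `temperature_c`), the device naming, the 10% malformed rate, and the valid temperature range are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def generate_events(n, seed=42):
    """Simulate raw sensor events; about 1 in 10 has a missing reading.

    Hypothetical schema, chosen only for this sketch.
    """
    rng = random.Random(seed)
    events = []
    for i in range(n):
        event = {
            "event_id": i,
            "device_id": f"machine-{rng.randint(1, 3)}",
            "temperature_c": round(rng.uniform(20.0, 90.0), 2),
        }
        if rng.random() < 0.1:
            event["temperature_c"] = None  # simulate a dropped sensor reading
        events.append(event)
    return events


def to_silver(bronze):
    """Apply data quality checks: drop nulls, out-of-range values, duplicates."""
    seen_ids = set()
    silver = []
    for event in bronze:
        temp = event["temperature_c"]
        if temp is None or not (-40.0 <= temp <= 150.0):  # assumed valid range
            continue
        if event["event_id"] in seen_ids:
            continue
        seen_ids.add(event["event_id"])
        silver.append(event)
    return silver


def to_gold(silver):
    """Aggregate cleaned events into per-device metrics for analytics."""
    by_device = defaultdict(list)
    for event in silver:
        by_device[event["device_id"]].append(event["temperature_c"])
    return {
        device: {"events": len(temps), "avg_temperature_c": round(mean(temps), 2)}
        for device, temps in sorted(by_device.items())
    }


# Bronze keeps the raw data as it arrived, including a duplicated event.
bronze = generate_events(1000)
bronze.append(dict(bronze[0]))

silver = to_silver(bronze)
gold = to_gold(silver)
```

In a Spark version of this sketch, each layer would typically be a DataFrame written to its own Delta table, with the same three responsibilities: bronze stores events unmodified, silver enforces quality rules, and gold holds query-ready aggregates.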

Databricks Community Edition, Apache Spark 3.5, Python, PySpark, Delta Lake Tables, DBFS (Databricks File System), Batch Jobs, Incremental Processing, Databricks Notebooks, Version control integration, Python Faker, Custom IoT event generator
Team: Solo project
Role: Data Engineer
Status: In Development
Check back later to see the results! 🚀

Project In Development

This project is currently being built. The TL;DR above provides the key summary. Full details, technical documentation, and code samples will be available once completed.