Introduction to Databricks - Theory 60% / Practice 40% - 4h
Creating Databricks Service
Databricks UI Overview
Databricks Architecture Overview
Databricks Notebooks
Databricks Clusters and Jobs - Theory 60% / Practice 40% - 4h
Cluster types and configuration
Databricks cluster pools
Databricks Jobs
Notebook workflows
DBFS - Theory 60% / Practice 40% - 4h
Databricks and Spark - Theory 60% / Practice 40% - 4h
Data Formats
Transformation
Joins and Aggregations
SQL
Delta Lake - Theory 60% / Practice 40% - 4h
Pitfalls of Data Lakes
Data Lakehouse Architecture
Read & Write to Delta Lake
Updates and Deletes on Delta Lake
Merge/Upsert to Delta Lake
History, Time Travel, Vacuum
Delta Lake Transaction Log
Convert from Parquet to Delta
Data Ingestion
Data Transformation - PySpark and Notebooks
Visualizations in Databricks - Theory 60% / Practice 40% - 2h
Collaboration in Databricks - Theory 60% / Practice 40% - 2h
Deploying Databricks on Azure - Theory 60% / Practice 40% - 2h
Deploying Databricks on the AWS Marketplace - Theory 60% / Practice 40% - 2h
Data Protection Use Cases - 4h