Microsoft Fabric
Microsoft Fabric is an all-in-one analytics platform that unifies data engineering, data warehousing, real-time analytics, and business intelligence inside a single product. It removes the need to stitch together ADF + Databricks + Synapse + Power BI as separate services.
What is Microsoft Fabric?
Microsoft Fabric is a unified SaaS analytics platform launched in 2023. Instead of creating separate Azure resources for each part of your pipeline — ADF for orchestration, Databricks for processing, Synapse for warehousing, Power BI for reporting — Fabric brings all of them into one workspace with one license and one storage layer called OneLake.
Think of it this way: the traditional Azure data stack requires 4-6 separate services that you wire together. Fabric is Microsoft's answer to that complexity — one platform where a data engineer, analyst, and business user all work in the same environment.
Fabric Components
- Lakehouse — Combines a Delta Lake file store with a SQL analytics endpoint. Store raw and processed data as Delta tables; query with SQL or Spark.
- Warehouse — A full T-SQL data warehouse on OneLake. Same SQL syntax as Synapse. Ideal for business analysts and Power BI semantic layers.
- Data Pipelines — Built-in orchestration with the same visual pipeline builder as ADF. No separate ADF resource needed; triggers, activities, and connections all live inside Fabric.
- Notebooks — PySpark and SQL notebooks running on managed Spark compute. Same code as Databricks notebooks, with no cluster configuration needed.
- Power BI — Native inside the Fabric workspace. Reports connect directly to a Lakehouse or Warehouse: no data export, no gateway, near-real-time refresh.
- Eventstream — Real-time event processing. Ingest from Event Hubs, transform in flight, and land the results in a Lakehouse with a no-code streaming pipeline builder.
- OneLake — The unified storage layer under everything. All Fabric workloads read and write to OneLake: one copy of data, no movement between services.
- Data Activator — Real-time alerting on streaming data. Define conditions on event streams and trigger actions (email, Teams message, Power Automate) when they are met.
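Because every component above reads and writes the same OneLake storage, any ABFS-aware client can address it with one path format. A minimal sketch of building such paths — the workspace and lakehouse names here are placeholders, and the URI shape follows OneLake's documented ADLS Gen2-compatible endpoint:

```python
# Sketch: constructing OneLake ABFS paths (workspace/lakehouse names are placeholders).
# OneLake exposes an ADLS Gen2-compatible endpoint, so the same URI works from
# Spark, Fabric notebooks, or any ABFS-aware client — one copy of the data.

def onelake_path(workspace: str, lakehouse: str, relative_path: str) -> str:
    """Return an abfss:// URI for a file or table inside a Fabric Lakehouse."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/{relative_path}"
    )

# Raw file in the Files section (bronze layer)
bronze = onelake_path("SalesWorkspace", "SalesLakehouse", "Files/bronze/sales_raw.csv")

# Managed Delta table in the Tables section
silver = onelake_path("SalesWorkspace", "SalesLakehouse", "Tables/silver_sales")

print(bronze)
print(silver)
```

The same two-section split (Files for raw storage, Tables for managed Delta) shows up again in the Lakehouse walkthrough below.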
Lakehouse — Data Engineering in Fabric
The Lakehouse is where data engineers work in Fabric. It has two sections: Files (raw storage — like ADLS Gen2) and Tables (managed Delta Lake tables). Notebooks transform data from Files into Tables. Once data is in Tables, it is automatically available via the SQL analytics endpoint — no Synapse setup needed.
```python
# Microsoft Fabric Lakehouse — PySpark notebook inside Fabric
# The notebook runs on the Fabric Spark runtime (same Spark APIs as Databricks)
# No cluster configuration needed — Fabric manages compute automatically
from datetime import datetime, timezone

import pyspark.sql.functions as F
from pyspark.sql.types import DoubleType

# In Fabric, you reference your Lakehouse files directly:
#   Files/  = raw storage (like an ADLS Gen2 bronze layer)
#   Tables/ = managed Delta Lake tables

# Read raw CSV from the Lakehouse Files section (Bronze)
df_raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("Files/bronze/sales/2025/03/15/sales_raw.csv")
)
print(f"Raw rows: {df_raw.count()}")

# Clean and transform (Silver logic)
df_silver = (
    df_raw
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
    .withColumn("revenue", F.col("revenue").cast(DoubleType()))
    .withColumn("order_date", F.to_date(F.col("order_date"), "yyyy-MM-dd"))
    .withColumn("load_ts", F.lit(datetime.now(timezone.utc).isoformat()))
)

# Write to the Lakehouse Tables section as Delta (Silver)
# The table is immediately queryable through the SQL analytics endpoint
(
    df_silver.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("silver_sales")  # appears in the Fabric SQL endpoint automatically
)
print("Written to silver_sales Delta table")
```

Warehouse — SQL Analytics in Fabric
The Fabric Warehouse is a full T-SQL engine. The key difference from Synapse: it sits on OneLake, so it can query Lakehouse Delta tables directly without moving data. Write your Gold aggregations in SQL, connect Power BI, done.
```sql
-- Microsoft Fabric Warehouse — T-SQL analytics
-- Fabric Warehouse is a full SQL engine on top of OneLake
-- Query Delta tables from the Lakehouse directly — no data movement

-- Create a Gold aggregation table in the Warehouse (CTAS)
CREATE TABLE gold_daily_revenue AS
SELECT
    order_date,
    region,
    product_category,
    COUNT(DISTINCT order_id)    AS total_orders,
    SUM(revenue)                AS total_revenue,
    AVG(revenue)                AS avg_order_value,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM silver_sales  -- references the Lakehouse Delta table via OneLake shortcut
GROUP BY order_date, region, product_category;

-- Cross-database query — the Warehouse reads the Lakehouse table directly:
-- no COPY INTO, no ETL, no data movement; both sit on the same OneLake storage

-- Create a view for Power BI (semantic layer)
CREATE VIEW vw_revenue_summary AS
SELECT
    order_date,
    region,
    SUM(total_revenue) AS revenue,
    SUM(total_orders)  AS orders
FROM gold_daily_revenue
WHERE order_date >= DATEADD(month, -3, GETDATE())
GROUP BY order_date, region;
```

Data Pipeline — Orchestration in Fabric
Fabric includes a pipeline builder identical to ADF — same visual interface, same Copy Activity, same connectors. No separate ADF resource needed. Pipelines live in your Fabric workspace alongside notebooks and warehouses.
Example: a Copy activity ingests a CSV from an HTTP source into the Lakehouse, then a notebook activity runs the transform once the copy succeeds.

```json
{
  "name": "PL_Ingest_Sales_Daily",
  "activities": [
    {
      "name": "Copy_Sales_CSV",
      "type": "Copy",
      "source": {
        "type": "HttpSource",
        "requestMethod": "GET"
      },
      "sink": {
        "type": "LakehouseTableSink",
        "tableActionOption": "Append",
        "lakehouseTableName": "bronze_sales_raw"
      }
    },
    {
      "name": "Run_Notebook_Transform",
      "type": "TridentNotebook",
      "dependsOn": [
        { "activity": "Copy_Sales_CSV", "dependencyConditions": ["Succeeded"] }
      ],
      "typeProperties": {
        "notebookId": "your-notebook-id",
        "workspaceId": "your-workspace-id"
      }
    }
  ]
}
```

Fabric vs Traditional Azure Stack
| Task | Traditional Azure | Microsoft Fabric |
|---|---|---|
| Ingest raw data | ADF Copy Activity → ADLS Gen2 | Data Pipeline Copy Activity → Lakehouse Files |
| Transform data | Databricks notebook on ADLS | Fabric Notebook on Lakehouse Tables |
| Serve analytics | Synapse Analytics dedicated pool | Fabric Warehouse or Lakehouse SQL endpoint |
| Build reports | Power BI desktop → publish → refresh gateway | Power BI inside Fabric workspace — live connection |
| Store secrets | Azure Key Vault separate resource | Key Vault still used (Fabric workspace secrets coming) |
| Real-time ingestion | Event Hubs → Stream Analytics → ADLS | Eventstream → Lakehouse (no-code) |
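Everything in the Fabric column of the table above also has a programmatic surface. As a hedged sketch of triggering a Data Pipeline run through the Fabric REST job scheduler — the workspace ID, pipeline item ID, and bearer token are placeholders, and acquiring a real token (e.g. via Azure AD) is out of scope here:

```python
# Sketch: building the request that starts an on-demand Fabric Data Pipeline run
# via the Fabric REST API job scheduler. IDs and token are placeholders.

FABRIC_API = "https://api.fabric.microsoft.com/v1"

def pipeline_run_request(workspace_id: str, pipeline_item_id: str, token: str):
    """Return the (url, headers) pair for POSTing an on-demand pipeline job."""
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/items/{pipeline_item_id}/jobs/instances?jobType=Pipeline"
    )
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers

url, headers = pipeline_run_request("your-workspace-id", "your-pipeline-id", "<token>")
print(url)
# To actually submit: requests.post(url, headers=headers)
# A successful submission is acknowledged asynchronously (202 Accepted)
```

This is the same on-demand run you would otherwise trigger from the pipeline's Run button in the Fabric workspace.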
Fabric Licensing — F SKUs
Fabric is licensed through capacity-based F SKUs, where the SKU number corresponds to the capacity units you buy (F2 up through F2048). All workloads in a workspace draw from the attached capacity, and at F64 and above, report viewers no longer need individual Power BI Pro licenses.
🎯 Key Takeaways
- ✓ Fabric unifies ADF + Databricks + Synapse + Power BI into one platform with one license
- ✓ OneLake is the shared storage under everything — no data movement between services
- ✓ Lakehouse = Files (raw) + Tables (Delta) + SQL endpoint, all in one
- ✓ Fabric notebooks run PySpark — same code as Databricks, no cluster config needed
- ✓ Data Pipelines in Fabric are identical to ADF — same visual builder, same connectors
- ✓ Learn the traditional Azure stack first — Fabric makes more sense once you understand what it is replacing