Project 02 — Copy Multiple CSV Files Using ForEach Loop
Stop creating one Copy activity per file. Use the ForEach activity to loop through a list of files and copy all of them in a single pipeline run. Add a new store tomorrow — just update the array, no pipeline changes needed.
A pipeline that loops through 10 store CSV files and copies all of them to ADLS in a single run — using one ForEach activity instead of 10 Copy activities.
Real World Problem
In Project 01 we solved the first problem — we moved a single CSV file from a laptop to Azure automatically. But here is where the real world gets complicated.
FreshMart does not have 1 store. They have 10 stores. Every night, each store manager exports their own sales file:
| File | Store | City |
|---|---|---|
| store_ST001_sales.csv | ST001 | New Delhi |
| store_ST002_sales.csv | ST002 | Mumbai |
| store_ST003_sales.csv | ST003 | Bangalore |
| store_ST004_sales.csv | ST004 | Chennai |
| store_ST005_sales.csv | ST005 | Hyderabad |
| store_ST006_sales.csv | ST006 | Pune |
| store_ST007_sales.csv | ST007 | Kolkata |
| store_ST008_sales.csv | ST008 | Ahmedabad |
| store_ST009_sales.csv | ST009 | Jaipur |
| store_ST010_sales.csv | ST010 | Chandigarh |
Imagine solving this the wrong way — creating 10 separate Copy activities, one per store file:
Copy Activity 1 → store_ST001
Copy Activity 2 → store_ST002
Copy Activity 3 → store_ST003
... 7 more activities ...
Copy Activity 10 → store_ST010
New store opens → manually add an activity
File renamed → manually update each activity
10 separate failure points
ForEach Activity
└── For each file in list
Copy Activity
(runs once per file)
New store opens → just add to the array
Tomorrow: 50 stores → same pipeline
1 activity, 1 failure point
Concepts You Must Understand First
What is a ForEach Activity?
A ForEach activity is a loop inside ADF. It takes a list of items and runs one or more activities for each item in that list.
Real analogy: You have 10 packages to deliver. Instead of making 10 separate trips, you load all packages into one truck and deliver them one by one on a single route. The ForEach activity is the truck route.
ForEach Activity
├── Items: ["ST001", "ST002", ... "ST010"]
│ ↑ this is the list it loops through
└── Activities inside the loop:
Copy Activity (runs once per item)
Sequential mode — copies file 1, then file 2, then file 3. Safer, but slower.
Parallel mode — copies multiple files at the same time. Faster. We use this with a batch count of 4.
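The loop logic can be sketched in plain Python — a conceptual analogy of what ForEach does, not how ADF executes internally:

```python
# Conceptual analogy of the ForEach activity (not actual ADF execution code).
store_files = [f"store_ST{n:03d}_sales.csv" for n in range(1, 11)]

def copy_file(file_name):
    """Stand-in for the Copy activity — runs once per item."""
    return f"landing/store_sales/{file_name} -> raw/sales/{file_name}"

# Sequential mode: one file at a time, in order.
results = [copy_file(f) for f in store_files]
print(results[0])
```

The list is the "Items" input; the function body is whatever activities sit inside the loop.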
What is a Pipeline Parameter?
A parameter is a value you pass into a pipeline from outside — before it starts running. In Project 02 we use an Array parameter called store_files to pass the list of file names.
- Set from OUTSIDE the pipeline
- Set before the pipeline starts — cannot change during a run
- Example: store_files = ["ST001.csv", "ST002.csv", ...] passed when triggering
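However you trigger the pipeline (portal, scheduled trigger, or REST API), the parameter travels as a JSON object whose keys match the parameter names — a hedged sketch of the payload shape, not a specific API call:

```python
import json

# Hypothetical trigger payload: keys must match the pipeline's parameter names.
trigger_body = {
    "store_files": [
        "store_ST001_sales.csv",
        "store_ST002_sales.csv",
        "store_ST003_sales.csv",
    ]
}
payload = json.dumps(trigger_body)
print(payload)
```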
What is a Dynamic Expression?
In Project 01, our dataset had a hardcoded file name: daily_sales.csv — fixed, never changes. In Project 02, the file name changes on every loop iteration. This is where dynamic expressions come in.
The @{ } syntax tells ADF: this is not a static value — evaluate this expression right now.
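It works roughly like string interpolation in a programming language — the placeholder is replaced with a concrete value at runtime. An analogy in Python, not ADF's actual evaluator:

```python
# Analogy: ADF's @{...} is resolved at runtime, like an f-string in Python.
file_name = "store_ST001_sales.csv"

static_path = "landing/store_sales/daily_sales.csv"   # Project 01: hardcoded
dynamic_path = f"landing/store_sales/{file_name}"     # Project 02: resolved per iteration

print(dynamic_path)
```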
One pipeline. One ForEach. Ten files copied.
Step 1 — Create 10 Store CSV Files
Each file represents one store's daily sales. Open Notepad (Windows) or TextEdit (Mac) and create each file below. Save them all in a folder on your Desktop called freshmart_store_files.
store_ST001_sales.csv:

    order_id,store_id,product_name,category,quantity,unit_price,order_date
    ORD1001,ST001,Basmati Rice 5kg,Grocery,12,299.00,2024-01-15
    ORD1002,ST001,Samsung TV 43inch,Electronics,2,32000.00,2024-01-15
    ORD1003,ST001,Amul Butter 500g,Dairy,25,240.00,2024-01-15
    ORD1004,ST001,Colgate Toothpaste,Personal Care,30,89.00,2024-01-15
    ORD1005,ST001,Nike Running Shoes,Apparel,5,4500.00,2024-01-15

store_ST002_sales.csv:

    order_id,store_id,product_name,category,quantity,unit_price,order_date
    ORD2001,ST002,Sunflower Oil 1L,Grocery,18,145.00,2024-01-15
    ORD2002,ST002,iPhone 14,Electronics,1,75000.00,2024-01-15
    ORD2003,ST002,Amul Milk 1L,Dairy,40,62.00,2024-01-15
    ORD2004,ST002,Dove Soap 100g,Personal Care,50,65.00,2024-01-15
    ORD2005,ST002,Levis Jeans,Apparel,8,2999.00,2024-01-15
Create files 3–10 with the same structure, using store IDs ST003 through ST010 and sample product rows for each city. Name each file after its store ID: store_ST003_sales.csv through store_ST010_sales.csv.
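If you would rather not type 10 files by hand, a small script can generate them all. The product rows below are placeholders (an assumption of mine, not the tutorial's sample data) — swap in realistic products per city if you like:

```python
import csv
import os

HEADER = ["order_id", "store_id", "product_name", "category",
          "quantity", "unit_price", "order_date"]

out_dir = "freshmart_store_files"
os.makedirs(out_dir, exist_ok=True)

for n in range(1, 11):
    store_id = f"ST{n:03d}"
    path = os.path.join(out_dir, f"store_{store_id}_sales.csv")
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(HEADER)
        # Placeholder rows — 5 orders per store, same shape as the samples above.
        for i in range(1, 6):
            w.writerow([f"ORD{n}00{i}", store_id, f"Sample Product {i}",
                        "Grocery", 10, 99.00, "2024-01-15"])

print(sorted(os.listdir(out_dir))[0])
```

Run it from your Desktop and it creates the freshmart_store_files folder with all 10 CSVs.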
Desktop folder 'freshmart_store_files' — showing all 10 CSV files listed
Step 2 — Upload All 10 Files to Landing Container
Go to Azure Portal → Storage accounts → stfreshmartdev → Containers → landing.
Click "+ Add Directory" → name it store_sales.
Add Directory dialog — 'store_sales' entered as directory name
Click into the store_sales directory → click "Upload" → hold Ctrl and select all 10 CSV files at once → click "Upload".
Upload dialog — all 10 files selected, showing their names in the file list before uploading
landing/store_sales/ directory after upload — all 10 CSV files visible with file sizes
In Project 01, our datasets had a hardcoded file path: File: daily_sales.csv. That worked for one file. Now the file name needs to change on each loop iteration. We do this by adding parameters to the dataset — a placeholder instead of a fixed value.
Step 3 — Create Parameterized Source Dataset
In ADF Studio → Author → Datasets → "+" → "New dataset" → "Azure Blob Storage" → "Continue" → "DelimitedText" → "Continue". Name it ds_src_blob_store_sales, select your Blob Storage linked service, and leave the file path fields empty for now.
Dataset form — name filled, linked service selected, file path fields left empty
Click "OK". You are now in the dataset editor. Click the "Parameters" tab at the bottom.
Dataset editor — Parameters tab highlighted at the bottom
Click "+ New" and add a parameter named file_name, type String:
Parameters tab — new parameter 'file_name' of type String added
Now click the "Connection" tab → click inside the "File" field → click "Add dynamic content" (blue link below the field).
Connection tab — 'Add dynamic content' blue link visible below the File field
The dynamic content editor opens. Under "Parameters" on the right → click file_name. The expression becomes @dataset().file_name.
Dynamic content editor — @dataset().file_name expression in the box, file_name parameter visible in the right panel
Click "OK". Set the full path: container landing, directory store_sales, file @dataset().file_name.
Connection tab fully configured — container 'landing', directory 'store_sales', file showing the dynamic expression
Click 💾 Save.
Step 4 — Create Parameterized Sink Dataset
Click "+" next to Datasets → "New dataset" → "Azure Data Lake Storage Gen2" → "Continue" → "DelimitedText" → "Continue". Name it ds_sink_adls_store_sales and select your ADLS linked service.
Click "OK" → Parameters tab → "+ New" → file_name, type String.
Sink dataset Parameters tab — file_name parameter added
Click Connection tab → File field → "Add dynamic content"→ click file_name → expression: @dataset().file_name → "OK".
Sink dataset Connection tab — raw/sales/@dataset().file_name
Click 💾 Save.
Step 5 — Create New Pipeline
In ADF Studio → Author → "+" next to Pipelines → "New pipeline".
New blank pipeline canvas — name 'pl_copy_all_store_sales' in the Properties panel on the right
Step 6 — Add Pipeline Parameter
Click on the empty canvas background (not on any activity) → at the bottom → click the "Parameters" tab.
Pipeline canvas — empty background clicked, Parameters tab visible at bottom
Click "+ New" and add a parameter named store_files, type Array. Paste the full list of all 10 file names as the default value:

    ["store_ST001_sales.csv","store_ST002_sales.csv","store_ST003_sales.csv","store_ST004_sales.csv","store_ST005_sales.csv","store_ST006_sales.csv","store_ST007_sales.csv","store_ST008_sales.csv","store_ST009_sales.csv","store_ST010_sales.csv"]
Pipeline Parameters tab — store_files parameter with Array type and the full JSON array as default value
Step 7 — Add ForEach Activity
In the left activities panel → expand "Iteration & conditionals" → drag "ForEach" onto the canvas.
Left activities panel — 'Iteration & conditionals' section expanded, ForEach being dragged to canvas
ForEach activity placed on the canvas — a larger box with 'ForEach' label and a '+' icon in the centre
Click on the ForEach activity → configure the bottom panel:
General Tab
ForEach General tab — name and description filled in
Settings Tab
For the Items field → click "Add dynamic content" → under "Parameters" → click store_files. The expression becomes @pipeline().parameters.store_files.
Dynamic content editor — @pipeline().parameters.store_files expression, store_files parameter highlighted on the right
ForEach Settings tab — Sequential unchecked, Batch count 4, Items showing @pipeline().parameters.store_files
Sequential OFF — run multiple iterations at the same time (parallel). Faster but uses more resources.
Batch count 4 — run maximum 4 iterations simultaneously: files 1,2,3,4 → then 5,6,7,8 → then 9,10. Prevents overloading the system.
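The grouping above is a simplified model — in practice ADF starts a new iteration as soon as a slot frees up, rather than waiting for a whole batch to finish. The chunked view can still be sketched in Python:

```python
# Simplified model of Batch count = 4: at most four iterations in flight at once.
files = [f"store_ST{n:03d}_sales.csv" for n in range(1, 11)]
batch_count = 4

# Chunk the list into groups of up to batch_count items.
batches = [files[i:i + batch_count] for i in range(0, len(files), batch_count)]
print([len(b) for b in batches])  # [4, 4, 2]
```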
Step 8 — Add Copy Activity INSIDE the ForEach
Click the "+ (Add activity)" button that is inside the ForEach box on the canvas.
ForEach activity box — showing the '+' button inside the box (not on the main canvas)
You are now inside the loop — a new blank canvas area opens labeled with the ForEach name.
ForEach inner canvas — a new blank canvas area showing you are inside the loop
From the left panel → drag a "Copy data" activity onto this inner canvas.
Copy data activity placed inside the ForEach inner canvas
Step 9 — Configure Source with @item()
Click the Copy activity inside the ForEach → configure:
General Tab
Source Tab
Select ds_src_blob_store_sales. A Dataset properties section appears. Click inside the file_name value field → "Add dynamic content".
Source tab — ds_src_blob_store_sales selected, Dataset properties section showing file_name field with 'Add dynamic content' link
Under "ForEach iterator" on the right → click "Item". Expression becomes:
Dynamic content editor — @item() expression, 'Item' option highlighted under ForEach iterator section
Click "OK".
What does @item() mean?
@item() only works inside a ForEach loop. It returns the current item being processed.
Iteration 1: @item() = "store_ST001_sales.csv"
Iteration 2: @item() = "store_ST002_sales.csv"
Iteration 3: @item() = "store_ST003_sales.csv" ... and so on
Source tab complete — file_name Dataset property showing @item() value
Step 10 — Configure Sink with @item()
Click Sink tab → select ds_sink_adls_store_sales → in the file_name Dataset property → "Add dynamic content" → click "Item" → expression: @item() → "OK".
Sink tab — ds_sink_adls_store_sales selected, file_name Dataset property showing @item()
Both source and sink now use @item() — the same file name is used for reading and writing:
Same file name — different container and folder. Clean.
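Per iteration, the single @item() value resolves into two different full paths — sketched here in Python:

```python
# How one iteration resolves: @item() feeds the file_name parameter of
# both datasets, so source and sink differ only in container/folder.
item = "store_ST003_sales.csv"   # value of @item() on iteration 3

source_path = f"landing/store_sales/{item}"  # ds_src_blob_store_sales
sink_path = f"raw/sales/{item}"              # ds_sink_adls_store_sales

print(source_path, "->", sink_path)
```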
Step 11 — Validate and Return to Main Canvas
Click the back arrow at the top left of the inner canvas to return to the main pipeline canvas.
Back arrow at top left — returning to main pipeline canvas from ForEach inner canvas
Main pipeline canvas — ForEach_store_files activity showing '1 activity' label inside it
Click "Validate" in the top toolbar.
Validation successful message — 'Your pipeline has been validated. No errors were found.'
If you see errors:
Dataset property file_name is not set → Copy activity → Source or Sink tab → add @item() to the file_name property
Items expression is required → ForEach activity → Settings tab → Items field → add @pipeline().parameters.store_files
Step 12 — Debug
Click "Debug". A dialog appears asking for parameter values — the default array should be pre-filled. If not, paste it:
Debug parameter dialog — store_files parameter with the JSON array value pre-filled
Click "OK". Watch the ForEach run. Click the 👓 glasses icon next to the ForEach in the Output tab to see individual iterations.
ForEach run details — showing all 10 iterations, each with status, file name, and duration
All 10 iterations completed — every row showing green checkmark and duration
Step 13 — Verify All 10 Files in ADLS
Go to Azure Portal → stfreshmartdev → Containers → raw → sales.
raw/sales/ directory — showing all 10 store CSV files listed with file sizes and timestamps
Click on any file → "Edit" to preview the data.
One store file open in preview — showing the 5 rows of data for that store
Step 14 — Publish
Click "Publish all" → the panel lists 3 new items (1 pipeline, 2 datasets) → click "Publish".
Successfully published message — 3 new items published
Bonus — Test With a Smaller Array
Click Debug → change the store_files value to just 3 files. The pipeline runs only 3 iterations. This is how professionals test with a subset before running the full load.
Debug dialog — store_files with only 3 files in the array for a quick test run
ForEach showing only 3 iterations — faster test run completed
What Was Added in Project 02
| Item | Name | What It Does |
|---|---|---|
| Dataset | ds_src_blob_store_sales | Parameterized source — file name is dynamic |
| Dataset | ds_sink_adls_store_sales | Parameterized sink — file name is dynamic |
| Pipeline | pl_copy_all_store_sales | Contains ForEach + Copy |
| Parameter | store_files (Array) | List of filenames to process |
| Activity | ForEach_store_files | Loops through the file list |
| Activity | copy_store_file | Copies one file per iteration — inside ForEach |
Key Concepts Reference
| Concept | What It Is | When You Use It |
|---|---|---|
| ForEach Activity | Loops through a list of items | When you have multiple files/tables to process |
| Array Parameter | A list of values passed to a pipeline | When the list of files may change |
| @item() | Current item in a ForEach loop | Inside ForEach — to get the current loop value |
| @pipeline().parameters.X | Read a pipeline parameter | Anywhere in the pipeline |
| @dataset().X | Pass a value to a dataset parameter | In Copy activity Source/Sink dataset properties |
| Dynamic expression | @{ } syntax — evaluated at runtime | Whenever a value needs to be dynamic, not fixed |
| Batch count | How many ForEach iterations run in parallel | Balance speed vs resource usage |
| Parameterized dataset | Dataset where the file path uses parameters | When same dataset is used for many different files |
Common Mistakes
Placing Copy activity OUTSIDE the ForEach
Fix: Delete it, click the "+" INSIDE the ForEach box, re-add it
Forgetting to set Dataset properties in Source or Sink
Fix: Copy activity → Source tab → Dataset properties → set file_name to @item()
Wrong Array format in parameter default
Fix: Must be: ["file1.csv","file2.csv"] — double quotes, square brackets, no trailing comma
Sequential = ON with large file lists
Fix: Turn Sequential OFF and set Batch count to 4 or 5
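For the array-format mistake, you can sanity-check the value with any JSON parser before pasting it into the parameter default — ADF's Array parameter expects valid JSON:

```python
import json

good = '["store_ST001_sales.csv","store_ST002_sales.csv"]'
assert json.loads(good) == ["store_ST001_sales.csv", "store_ST002_sales.csv"]

# Formats ADF (and any JSON parser) rejects:
invalid = []
for bad in ["['a.csv','b.csv']",       # single quotes
            '["a.csv","b.csv",]']:     # trailing comma
    try:
        json.loads(bad)
    except json.JSONDecodeError:
        invalid.append(bad)

print(len(invalid))  # 2 — both rejected
```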
What Comes Next
Right now the file list is hardcoded as the parameter's default array. What if the file name includes today's date?
store_ST001_sales_20240115.csv
store_ST001_sales_20240116.csv ← tomorrow
store_ST001_sales_20240117.csv ← day after
In Project 03 you will learn Parameterized Pipelines with Variables — pass run_date at trigger time, use a Set Variable activity to build the folder path, and ADF constructs the correct file names automatically every night.
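The idea can be previewed in Python — building date-stamped names from a run_date value (the naming pattern here is my assumption, following the examples above):

```python
# Sketch: derive the nightly file list from a run_date instead of
# hardcoding the names in the parameter default.
run_date = "20240115"   # would arrive as a pipeline parameter at trigger time

store_files = [f"store_ST{n:03d}_sales_{run_date}.csv" for n in range(1, 11)]
print(store_files[0])
```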
🎯 Key Takeaways
- ✓ ForEach loops through an array — one Copy activity handles all files instead of duplicating activities per file
- ✓ @item() returns the current loop value — it only works inside a ForEach activity
- ✓ Parameterized datasets use @dataset().file_name so the same dataset works for any file
- ✓ The store_files Array parameter holds the list of files — passed from outside the pipeline before it runs
- ✓ Sequential OFF + Batch count 4 runs 4 files simultaneously — much faster than sequential for large lists
- ✓ Pass a smaller test array at Debug time to validate the pipeline on 3 files before running all 10
- ✓ Adding a new store? Just add the filename to the array parameter — the pipeline never needs to change
- ✓ Variables are different from parameters — you will use them properly in Project 03 where they are genuinely needed