This project provides a comprehensive guide for building an ETL (Extract, Transform, Load) pipeline using SQL Server Integration Services (SSIS) in Visual Studio 2019.
The primary goal of this project is to provide a basic yet practical solution for anyone looking to build a new ETL pipeline using SSIS, with AdventureWorks 2019 as the sample data source.
Make sure the following components are installed and properly configured:
-
Visual Studio 2019
Note: Visual Studio 2017 is also supported, but SSIS integration may behave differently. -
SQL Server Integration Services (SSIS) Extension for VS 2019
👉 Install SSIS for Visual Studio 2019 -
SQL Server (any supported edition)
-
SQL Server Management Studio (SSMS)
-
AdventureWorks 2019 Sample Database
📦 Download AdventureWorks2019.bak
The dataset simulates a sample retail system for a bicycle company. The objective is to build an end-to-end ETL pipeline to generate a variety of reports:
- Sales reports by employee
- Sales reports by region
- Number of orders per employee
- Number of orders per region
- Sales by product subcategory
- Sales by product and region
Build the staging area, design the data warehouse, and create SSIS packages to perform initial extraction, transformation, and loading of data.
- Import source data into SSMS
- Design a data warehouse schema to support reporting
- Create the data warehouse
AVW_datawarehouse_createDatabase.sql
- Create the staging database
AVW_staging_createDatabase.sql
- Extract data into the staging area
- Transform source data into dimensions and facts
- Load dimension tables (Level 1)
- Load dimension tables (Level 2)
- Load dimension tables (Level 3)
- Load SalesOrder fact table
- Load Product fact table
- Truncate all dimension and fact tables in the staging area to prepare for future ETL runs
Incrementally extract only the new, updated, or deleted records from the source and process them.
-
New & Updated Records:
UseSlowly Changing Dimension (SCD)
andLookup Transformation
to compare incoming data with existing staging data. -
Deleted Records:
Mark records as deleted by using anIsDeleted
column. Set it to1
before loading, and back to0
if matched with source data. Remaining1
values indicate deletions.
- Set
IsDeleted = 1
in staging for all records before extraction
- Extract new/updated/deleted records using:
- Use
Execute SQL Task
to: - Delete records in staging with
IsDeleted = 1
- Transform staging data into dimension and fact formats
- Update
IsDeleted = 1
in DW for records not matched
- Load changes to DW using
SCD
andLookup
, but apply only soft deletes (do not delete rows) - Truncate all staging dimension/fact tables again to prepare for next cycle
All SSIS packages are available in the /Package-SSIS
directory.
If you have any questions or encounter any issues while implementing the ETL flow, feel free to reach out via email.
Thank you for your interest and happy building!