Friday 7 December 2012

Replacing a Cursor With an SSIS Package


A cursor can be replaced in several ways, depending on the situation. In this case, each source row carries a number that determines how many times that row must be written to the destination.

The source table and script look like this:


The Number of Nights column tells us how many times each row needs to be inserted into the destination table, so the destination should look like the following image after the load is complete. Notice that the number of times each row appears in the destination table matches its number of nights.
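
The expected outcome can be sketched in a few lines of Python. This is only an illustration of the target result, not part of the package; the guest names and values are hypothetical sample data, since the original table image is not reproduced here.

```python
# Hypothetical sample rows: (guest name, number of nights).
source_rows = [("Smith", 3), ("Jones", 1), ("Lee", 2)]


def expand_rows(rows):
    """Repeat each row once per night, mimicking the desired destination table."""
    expanded = []
    for guest, nights in rows:
        # A row with N nights must appear N times in the destination.
        expanded.extend([(guest, nights)] * nights)
    return expanded


destination_rows = expand_rows(source_rows)
for row in destination_rows:
    print(row)
```

Each row appears as many times as its number of nights, which is the property the package has to reproduce.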


You could do this with a cursor that loops through each row, but a cursor is very slow. For millions of rows it would be a very long process. The power of SSIS lies in the batch loads it performs in data flows, so a small SSIS package can do the same work. Here is an image of the package Control Flow you can create to perform this kind of cursor work.


This SSIS package will have two variables, intCounter and intNumOfNights. The counter variable will increment during the loop. The number of nights variable will hold the maximum number of nights from the source table.


The first task in the package is an Execute SQL Task. It retrieves the maximum number of nights and saves it in the number of nights variable. This will control the number of times the loop runs.

The query in the Execute SQL Task is:


The result set is set to Single row, and intNumOfNights is mapped under Result Set.
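
As a rough sketch of what this step does, the snippet below runs a MAX query against an in-memory SQLite database and stores the single value in a variable. The table name `Reservations`, the column names, and the sample rows are assumptions made for illustration; the original query is not reproduced here, and SQLite stands in for SQL Server only so the example can run on its own.

```python
import sqlite3

# In-memory stand-in for the source database (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Reservations (GuestName TEXT, NumberOfNights INTEGER)")
conn.executemany(
    "INSERT INTO Reservations VALUES (?, ?)",
    [("Smith", 3), ("Jones", 1), ("Lee", 2)],
)

# One row, one column: comparable to a Single row result set
# mapped to the intNumOfNights variable.
(int_num_of_nights,) = conn.execute(
    "SELECT MAX(NumberOfNights) FROM Reservations"
).fetchone()

print(int_num_of_nights)
```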



The For Loop Container will loop from 1 to the maximum number of nights. The image below shows how this is set up. This assumes the lowest number of nights is 1.
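
A For Loop Container is driven by three expressions: InitExpression, EvalExpression, and AssignExpression. The sketch below mirrors that structure in Python. The exact expressions used in the package are an assumption here, chosen to match the description of looping from 1 to the maximum number of nights.

```python
def for_loop_passes(int_num_of_nights):
    """Return the counter value seen on each pass of the loop."""
    passes = []
    int_counter = 1                          # InitExpression: start at 1
    while int_counter <= int_num_of_nights:  # EvalExpression: continue while true
        passes.append(int_counter)           # the Data Flow would run here
        int_counter = int_counter + 1        # AssignExpression: increment
    return passes


print(for_loop_passes(3))
```

With a maximum of 3 nights the loop body runs three times, once for each counter value from 1 to 3.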


The only thing left is the Data Flow. The source is an OLE DB Source with the following SQL query.


The question mark is a parameter and is mapped to the intCounter variable, so the query selects only rows whose number of nights is greater than or equal to the counter.


The destination is an OLE DB Destination. No special setup is needed for this component; just map the source columns to the proper destination columns.
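
To see why the loop plus the parameterized source gives the right row counts, here is a minimal emulation of the whole package using Python and an in-memory SQLite database. The table names `Source` and `Destination`, the column names, and the sample rows are hypothetical, and each `INSERT ... SELECT` stands in for one batch run of the Data Flow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Source (GuestName TEXT, NumberOfNights INTEGER)")
conn.execute("CREATE TABLE Destination (GuestName TEXT, NumberOfNights INTEGER)")
conn.executemany(
    "INSERT INTO Source VALUES (?, ?)",
    [("Smith", 3), ("Jones", 1), ("Lee", 2)],
)

# Step 1: Execute SQL Task equivalent, fetch the loop's upper bound.
(max_nights,) = conn.execute("SELECT MAX(NumberOfNights) FROM Source").fetchone()

# Step 2: For Loop Container equivalent, one set-based load per counter value.
for int_counter in range(1, max_nights + 1):
    conn.execute(
        "INSERT INTO Destination (GuestName, NumberOfNights) "
        "SELECT GuestName, NumberOfNights FROM Source "
        "WHERE NumberOfNights >= ?",  # the ? plays the role of intCounter
        (int_counter,),
    )

# Each guest should now appear once per night.
counts = dict(
    conn.execute("SELECT GuestName, COUNT(*) FROM Destination GROUP BY GuestName")
)
print(counts)
```

A row with N nights passes the filter on passes 1 through N and is skipped afterwards, so it is written exactly N times. The number of statements executed depends on the maximum number of nights, not on the number of rows.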


This package produces the results shown in the first two table images. The parameter in the Data Flow source prevents any row from being loaded too many times: once the counter exceeds a row's number of nights, that row is no longer selected. The SSIS package runs much faster than the SQL cursor because the cursor processes one row at a time, while the package loads rows in batches in the data flow.

