Tuesday 11 December 2012

Sequence Container in SSIS 2008


The Sequence Container defines a control flow that is a subset of the control flow in a package. Sequence containers group the package into multiple separate control flows, each containing one or more tasks and containers that run within the overall package control flow. There are some benefits of using a Sequence container which are mentioned below:
  • Provides the facility of disabling groups of tasks to focus debugging on one subset of the package control flow.
  • Managing multiple tasks in one location by setting properties on a Sequence Container instead of setting properties on the individual tasks.
  • Provides scope for variables that a group of related tasks and containers use.

If a package has many tasks then it is easier to group the tasks in Sequence Containers and you can collapse and expand Sequence Containers.
Note: You can also create task groups which collapse and expand using the Group box this is a design-time feature that has no properties or run-time behavior.

Now I'll explain you how to use sequence Container in a package.
  1. To begin, right click on SSIS Packages folder in Solution Explorer and click New SSIS Package. Rename it with SequenceContainer.dtsx as shown below:


  2. Add a package variable DayOfWeek


  3. Drag and drop Script Task from toolbox. This task will be used to initialize DayOfWeek variable with current day of week. Rename Task name as Script Task - Set DayOfWeek.

  4. Double click on Script Task to open Script Task Editor. Enter User::DayOfWeek in ReadWriteVariables property.


  5. Click on Edit Script... and write below code in Main() function of ScriptMain.cs:


  6. Drag and Drop 7 Sequence Container tasks from Toolbox and rename the task on Week Days e.g. "SC - Sunday", "SC - Monday" etc. I am not placing any controls flow items inside Sequence Container to narrow down the example and focus more on how to use sequence container.

  7. Connect all these Sequence Containers to Script Task. Now double click on green arrow to open Precedence Constraint Editor. Select Evaluation operator as Expression and Constraint and Expression as @DayOfWeek == "Sunday". Click OK to close and save changes. The Expression for remaining task will be differ as per week day e.g. @DayOfWeek == "Monday" for Monday and so on.


  8. Save the package. Now right click on the package in Solution Explorer and execute the package.


  9. This package will execute only one sequence container as per expression set for the precedence constraints.


  10. This is just an example of how we can use Sequence Container. But we can use this in many ways which depends on the requirement.

Friday 7 December 2012

Replacing a Cursor With a SSIS Package


This can be done several ways depending on the situation. In this situation there is a number on each row that determines the number of times a row needs to be written to the destination.

The source table & script looks like this:--->


The Number of Nights column tells us how many times this row needs to be inserted into the destination table. So the Destination should look like the following image after the load is complete. Notice the number of nights matches the number of times the row appears on the destination table.


This can be performed by using a cursor to loop through each row, but this is very slow. If you needed to perform this for millions of rows it would be a very long process. The power of SSIS is in the batch loads it performs in data flows. You can perform this using a small SSIS package. Here is an image of the package Control Flow you can create to perform this kind of cursor work.


This SSIS package will have two variables, intCounter and intNumOfNights. The counter variable will increment during the loop. The number of nights variable will hold the maximum number of nights from the source table.


The first task in the package is an Execute SQL Task. It retrieves the maximum number of nights and saves it in the number of nights variable. This will control the number of times the loop runs.

The query in the Execute SQL Task is:


The result set is single row and intNumofNights is mapped under result set.



The For Loop Container will loop from 1 to the max number of nights. The image below shows how this is set up. This is assuming the lowest number of nights will be 1.


The only thing left is the Data Flow. The source will be an OLEDB source with the following SQL query.


The question mark is a parameter and is mapped to the intCounter variable. This will only select rows that have the number of nights greater than or equal to the counter.


The destination is an OLEDB Destination. No special setup needed for this task, just map the source columns to the proper destination columns.


This package will give you the results in the first two table images. The parameter in the Data Flow source prevents it from loading a row too many times. The SSIS package will perform much faster than the SQL cursor because the cursor is row by row and the SSIS package performs the data flow in batch.


Thursday 29 November 2012

Conditional Split Transformation in SSIS 2008

One very useful Control Flow feature in SSIS 2008 is the ‘Conditional Split’. Sometimes new SSIS users can have difficulties in finding out detail about the expressions used in the Conditional Split. The purpose of this blog is to provide some advice about the Conditional Split: its use and how it works, when to use it, and some examples of the expression language. For people who are new (and not-so-new!) to SSIS it can sometimes be difficult to decide which component to use, and it isn’t always immediately clear why the Conditional Split component is useful.

How does the Conditional Split transform work? Briefly, as a Data Flow component, the Conditional Split is a decision maker. The component tests each row, and then decides what its destination should be. In other words, the ‘Conditional Split’ tests each individual inbound row of data against a series of different conditions, which have been defined by the package creator. Then, the Conditional Split component funnels each row of the data up into different outputs, dependent on the result of the test against each condition.

In order to understand its purpose a bit better, here are some examples where it might be useful:

Directing ‘bad’ rows towards a separate place for further examination. A recent example is where timesheet data has been imported from one source to a different destination. The source system permitted timesheets to be submitted without customer name, project name, or submission date; in contrast, the target system regards these timesheets as incorrect. In order to ensure that the import from source to target went as smoothly as possible, the records were filtered for NULLS in the client, project or submission date fields and directed to a separate location for review. I could then take these output files, and ask the customer for default values for these fields, for inserting at a separate time.

Differentiating records between ‘new’ and ‘update’ records. ‘New’ records can be inserted into the appropriate place; ‘update’ records can have fields updated in the existing data, rather than inserted as new information. To divide data rows into ‘new’ rows and ‘update’ rows for differentiating between ‘new’ information and information that might need to be updated. For example, if you have a ‘new’ customer, then you may want to treat that differently from situations where an existing customer simply needs an updated profile. This is called ‘Slowly Changing Dimensions’ (SCD).

When should the Conditional Split component be used, instead of the Slowly Changing Dimension component? This is an important point, and it depends on the quality of the data. Note: the SCD component does not handle NULLs. It is possible to use the SSIS 2008 SCD to handle data with NULLs in it, so, in this case, the Conditional Split component should be used.

Prior to creating the package, it is possible to test the data for NULLs by using the Data Profiling Feature in SSIS, and it is recommended here that this investigation is done prior to writing the package. If the data does contain NULLS, then it is recommended to use the Conditional Split component. Choosing the correct and most efficient component for a particular purpose is half the battle; in the words of Mary Poppins, ‘well begun is half done’!

In order to set up the conditions, it is necessary to use the SSIS Expressions language, which applies to the various transformations within SSIS. In the Conditional Split transformation, it is only possible to use expressions that produce a TRUE or FALSE result. So, for example, the following expressions would be legal:

Implicitly, this means that the Conditional Split transform can have more than one output destination. Due to this, it is important to consider what the ‘default’ situation will be, and where the ‘default’ rows should be directed. Here is an example of the Conditional Split transformation, using the ISNULL expression as an example:-


To conclude, here are some valuable tips in using the Conditional Split transformation:

If the Conditional Split component isn’t working properly, then it is possible to ‘daisy chain’ them in order to clarify the output. The ‘daisy chain’ method can help since it is possible to use Data Viewers in order to actually see what data is going through each flow. This mechanism can illuminate the decision behind the decisions made by the Conditional Split.

If the expressions do not result in TRUE or FALSE outcomes, then the Conditional Split component will not work properly. Examples of expressions in this category include increment statements, e.g. @Counter = 1, and other valid expressions which do not return TRUE or FALSE, such as GETDATE().

Saturday 17 November 2012

SQL SERVER - DDL, DML, DCL, TCL Introduction

DML

DML is abbreviation of Data Manipulation Language. It is used to retrieve, store, modify, delete, insert and update data in database.

  • SELECT 
  • UPDATE 
  • INSERT 

DDL

DDL is abbreviation of Data Definition Language. It is used to create and modify the structure of database objects in database.

  • CREATE
  • ALTER
  • DROP 

DCL


DCL is abbreviation of Data Control Language. It is used to create roles, permissions, and referential integrity as well it is used to control access to database by securing it.

  • GRANT
  • REVOKE 

TCL

TCL is abbreviation of Transactional Control Language. It is used to manage different transactions occurring within a database.

  •  COMMIT
  •  ROLLBACK