Tuesday 7 March 2017

Types of Dimensions

 

1.       Slowly Changing Dimensions(SCD):-

 

Dimension attributes that change slowly over a period of time rather than changing regularly is grouped as SCDs.  Attributes like name, address can change but not too often.

 

SCD-1:- Old value of attribute is overwritten by new values of attribute and no history kept.

 

EMID

EMPNAME

DEPARTMENT

1

Pratik

COE

2

Surya

IT

 

EMPLOYEEID

NAME

DEPARTMENT

1

Pratik

Finance

2

Surya

IT

 

SCD-2:- In this type we tracks historical data by creating multiple records for a given Natural key (business key) in the dimensional tables with separate surrogate key and/or different version numbers. Unlimited history is preserved for each insert.

 

EMPID

EMPNAME

Location

1

Pratik

Noida

2

Surya

Delhi

 

Converting above table into SCD-2 table using natural key, surrogate key, startdate, enddate, flag

First iteration:-

EMPCODE

EMPID

EMPNAME

Location

Startdate

Enddate

Flag

1001

1

Pratik

Noida

01-Jan-15

31-Dec-9999

Y

1002

2

Surya

Delhi

01-Mar-16

31-Dec-9999

Y

 

EMPCODE

EMPID

EMPNAME

Location

Startdate

Enddate

Flag

1001

1

Pratik

Noida

01-Jan-15

NULL

Y

1002

2

Surya

Delhi

01-Mar-16

NULL

Y

 

 

Second Iteration:-

 

EMPCODE

EMPID

EMPNAME

Location

Startdate

Enddate

Flag

1001

1

Pratik

Noida

01-Jan-15

01-Feb-16

N

1002

2

Surya

Delhi

01-Mar-16

31-Dec-9999

Y

1003

1

Pratik

Pune

01-Feb-16

31-Dec-9999

Y

 

Third Iteration:-

EMPCODE

EMPID

EMPNAME

Location

Startdate

Enddate

Flag

1001

1

Pratik

Noida

01-Jan-15

01-Feb-16

N

1002

2

Surya

Delhi

01-Mar-16

31-Dec-9999

Y

1003

1

Pratik

Pune

01-Feb-16

01-Mar-16

N

1004

1

Pratik

Goa

01-Mar-16

31-Dec-9999

Y

 

SCD-3:-In this type, we tracks changes using separate columns and preserves limited history.it is limited to how many columns we want to add in dimension table.

 

EMID

EMPNAME

City

1

Pratik

Delhi

2

Surya

Noida

 

EMID

EMPNAME

Old City

Current City

1

Pratik

Delhi

Mumbai

2

Surya

Noida

 

 

The limitation of Slowly Changing Dimension Type III over Type II is that in Type III we have the immediate previous data only, whereas in Type II we have the complete history. For example if an employee changes 3 cities there will be 3 records in the table with proper effective dates whereas in Type III implementation only the current and the previous city will be stored.

 

2.       Junk Dimensions:-

A junk dimension is a collection of random transaction codes flags and/or text attributes that are unrelated to any particular dimension. The reason to create junk dimensions is to lighten the ETL part.

 

 

MaritialStatusId

Description

1

Single

2

Married

 

GenderId

Description

1

Male

2

Female

 

Fact Table:-

 

Gender_Married_ID

Gender Status

Married Status

1

Male

Single

2

Male

Married

3

Female

Single

4

Female

Married

 

 

New Fact Table:-

 

EmpID

Name

DateofBirth

Gender_Married_ID

1001

Pratik

01-Jan-89

1

1001

Surya

01-Feb-90

2

1001

Simran

01-Mar-85

3

1001

Neha

01-Dec-84

4

 

 

3.       Role-playing Dimensions:-

 

Dimensions which are often used for multiple purposes within the same database are called role-playing dimensions.

 

RDBMS Table:-

 

 

EmpID

EmpName

DOBID

DOJID

DOEID

1001

Ajay

1989-01-01

2013-01-01

2015-01-01

1002

Vijay

1985-04-01

2013-01-01

NULL

1003

Amit

1988-12-12

2013-01-01

NULL

1004

Ankit

1982-06-04

2013-01-01

2016-01-01

 

 

 

 

 

DateKey

Date

DayOfMonth

DayName

Month

MonthName

Quarter

Year

20130101

2013-01-01

1

Tuesday

1

January

1

2013

20130102

2013-01-02

2

Wednesday

1

January

1

2013

20130103

2013-01-03

3

Thursday

1

January

1

2013

20130104

2013-01-04

4

Friday

1

January

1

2013

 

                                                                                           EMPLOYEE FACT

EMPID
EMPNAME
DEPTID
DESIGNATION
DOBID
DOJID
DOEID
LOCATIONID
ORGID
 


        DIMDATE

DateKey
Date
DayOfMonth
DayName
Month
MonthName
Quarter
Year
                                                                                                        

 

 

 

 

 

 

 

 

EmpID

EmpName

DOBID

DOJID

DOEID

1001

Ajay

19890101

20130101

20150101

1002

Vijay

19850401

20130101

NULL

1003

Amit

19881212

20130101

NULL

1004

Ankit

19820604

20130101

20160101

 

 

 

4.       Conformed Dimensions:-

 

The dimension that is connected across many fact tables.

 

 

 

EMPID
EMPNAME
DEPTID
DESIGNATIONID
DOBID
DOJID
DOEID
LOCATIONID
ORGID
EMPID
TotalLeave
PlannedLeave
UnPlannedLeave
     LEAVE FACT                                         EMPLOYEE DIM

EMPID
TOTALSALES
PROFIT

                                                                                                                          SALES FACT

 

 

 

 

 

 

 

 

 

 

 

 


5.       Degenerate Dimensions:-

 

A degenerate dimension is a dimension which is derived from the fact table and doesn't have its own dimension table.

 

 

OrderID

OrderNumber

ITEMCOUNT

SALESCOUNT

TAX

1

500675

2

500

36

2

500676

4

200

26

 

 

 

 

No comments:

Post a Comment