Sunday 17 April 2011

SCD – Type 1

                                                            

Slowly Changing Dimensions (SCDs) are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule

For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension?

You could sum or average the sales by salesperson, but if you use that to compare the performance of salesmen, that might give misleading information. If the salesperson that was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new sales person, but that creates problems also.

Dealing with these issues involves SCD management methodologies:

Type 1:

The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name. (Assuming you won't ever need to know how it used to be misspelled in the past.)

Here is an example of a database table that keeps supplier information:

Supplier_Key

Supplier_Code

Supplier_Name

Supplier_State

123

ABC

Acme Supply Co

CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, the joins will perform better on an integer than on a character string.

Now imagine that this supplier moves their headquarters to Illinois. The updated table would simply overwrite this record:

Supplier_Key

Supplier_Code

Supplier_Name

Supplier_State

123

ABC

Acme Supply Co

IL

The obvious disadvantage to this method of managing SCDs is that there is no historical record kept in the data warehouse. You can't tell if your suppliers are tending to move to the Midwest, for example. But an advantage to Type 1 SCDs is that they are very easy to maintain.

Explanation with an Example:

Source Table: (01-01-11) Target Table: (01-01-11)

Emp no

Ename

Sal

101

A

1000

102

B

2000

103

C

3000

 

Emp no

Ename

Sal

101

A

1000

102

B

2000

103

C

3000

The necessity of the lookup transformation is illustrated using the above source and target table.

Source Table: (01-02-11) Target Table: (01-02-11)

Emp no

Ename

Sal

Empno

Ename

Sal

101

A

1000

101

A

1000

102

B

2500

102

B

2500

103

C

3000

103

C

3000

104

D

4000

104

D

4000

  • In the second Month we have one more employee added up to the table with the Ename D and salary of the Employee is changed to the 2500 instead of 2000.

Step 1: Is to import Source Table and Target table.

  • Create a table by name emp_source with three columns as shown above in oracle.
  • Import the source from the source analyzer.
  • In the same way as above create two target tables with the names emp_target1, emp_target2.
  • Go to the targets Menu and click on generate and execute to confirm the creation of the target tables.
  • The snap shot of the connections using different kinds of transformations are shown below.

clip_image002

Step 2: Design the mapping and apply the necessary transformation.

  • Here in this transformation we are about to use four kinds of transformations namely Lookup transformation, Expression Transformation, Filter Transformation, Update Transformation. Necessity and the usage of all the transformations will be discussed in detail below.

Look up Transformation: The purpose of this transformation is to determine whether to insert, Delete, Update or reject the rows in to target table.

  • The first thing that we are goanna do is to create a look up transformation and connect the Empno from the source qualifier to the transformation.
  • The snapshot of choosing the Target table is shown below.

clip_image004

  • What Lookup transformation does in our mapping is it looks in to the target table (emp_table) and compares it with the Source Qualifier and determines whether to insert, update, delete or reject rows.
  • In the Ports tab we should add a new column and name it as empno1 and this is column for which we are gonna connect from the Source Qualifier.
  • The Input Port for the first column should be unchked where as the other ports like Output and lookup box should be checked. For the newly created column only input and output boxes should be checked.
  • In the Properties tab (i) Lookup table name ->Emp_Target.

(ii)Look up Policy on Multiple Mismatch -> use First Value.

(iii) Connection Information ->Oracle.

  • In the Conditions tab (i) Click on Add a new condition

(ii)Lookup Table Column should be Empno, Transformation port should be Empno1 and Operator should ‘=’.

Expression Transformation: After we are done with the Lookup Transformation we are using an expression transformation to check whether we need to insert the records the same records or we need to update the records. The steps to create an Expression Transformation are shown below.

  • Drag all the columns from both the source and the look up transformation and drop them all on to the Expression transformation.
  • Now double click on the Transformation and go to the Ports tab and create two new columns and name it as insert and update. Both these columns are gonna be our output data so we need to have check mark only in front of the Output check box.
  • The Snap shot for the Edit transformation window is shown below.

clip_image006

  • The condition that we want to parse through our output data are listed below.

Input à IsNull(EMPNO1)

Output à iif(Not isnull (EMPNO1) and Decode(SAL,SAL1,1,0)=0,1,0) .

  • We are all done here .Click on apply and then OK.

Filter Transformation: we are gonna have two filter transformations one to insert and other to update.

  • Connect the Insert column from the expression transformation to the insert column in the first filter transformation and in the same way we are gonna connect the update column in the expression transformation to the update column in the second filter.
  • Later now connect the Empno, Ename, Sal from the expression transformation to both filter transformation.
  • If there is no change in input data then filter transformation 1 forwards the complete input to update strategy transformation 1 and same output is gonna appear in the target table.
  • If there is any change in input data then filter transformation 2 forwards the complete input to the update strategy transformation 2 then it is gonna forward the updated input to the target table.
  • Go to the Properties tab on the Edit transformation

(i) The value for the filter condition 1 is Insert.

(ii) The value for the filter condition 1 is Update.

  • The Closer view of the filter Connection is shown below.

clip_image008

Update Strategy Transformation: Determines whether to insert, delete, update or reject the rows.

  • Drag the respective Empno, Ename and Sal from the filter transformations and drop them on the respective Update Strategy Transformation.
  • Now go to the Properties tab and the value for the update strategy expression is 0 (on the 1st update transformation).
  • Now go to the Properties tab and the value for the update strategy expression is 1 (on the 2nd update transformation).
  • We are all set here finally connect the outputs of the update transformations to the target table.

Step 3: Create the task and Run the work flow.

  • Don’t check the truncate table option.
  • Change Bulk to the Normal.
  • Run the work flow from task.

Step 4: Preview the Output in the target table.

clip_image010

No comments :

Post a Comment