Wednesday, 20 April 2011

LOOKUP TRANSFORMATION

                                                           
  • Passive Transformation
  • Can be Connected or Unconnected. Dynamic lookup is connected.
  • Use a Lookup transformation in a mapping to look up data in a flat file or a relational table, view, or synonym.
  • We can import a lookup definition from any flat file or relational database to which both the PowerCenter Client and Server can connect.
  • We can use multiple Lookup transformations in a mapping.

The Power Center Server queries the lookup source based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup source column values based on the lookup condition. Pass the result of the lookup to other transformations and a target.

We can use the Lookup transformation to perform following:

  • Get a related value: EMP has DEPTNO but DNAME is not there. We use Lookup to get DNAME from DEPT table based on Lookup Condition.
  • Perform a calculation: We want only those Employees who’s SAL > Average (SAL). We will write Lookup Override query.
  • Update slowly changing dimension tables: Most important use. We can use a Lookup transformation to determine whether rows already exist in the target.

1. LOOKUP TYPES

We can configure the Lookup transformation to perform the following types of lookups:

  • Connected or Unconnected
  • Relational or Flat File
  • Cached or Un cached

Relational Lookup:

When we create a Lookup transformation using a relational table as a lookup source, we can connect to the lookup source using ODBC and import the table definition as the structure for the Lookup transformation.

  • We can override the default SQL statement if we want to add a WHERE clause or query multiple tables.
  • We can use a dynamic lookup cache with relational lookups.

Flat File Lookup:

When we use a flat file for a lookup source, we can use any flat file definition in the repository, or we can import it. When we import a flat file lookup source, the Designer invokes the Flat File Wizard.

Cached or Un cached Lookup:

We can check the option in Properties Tab to Cache to lookup or not. By default, lookup is cached.

Connected and Unconnected Lookup

Connected Lookup

Unconnected Lookup

Receives input values directly from the pipeline.

Receives input values from the result of a :LKP expression in another transformation.

We can use a dynamic or static cache.

We can use a static cache.

Cache includes all lookup columns used in the mapping.

Cache includes all lookup/output ports in the lookup condition and the lookup/return port.

If there is no match for the lookup condition, the Power Center Server returns the default value for all output ports.

If there is no match for the lookup condition, the Power Center Server returns NULL.

If there is a match for the lookup condition, the Power Center Server returns the result of the lookup condition for all lookup/output ports.

If there is a match for the lookup condition,the Power Center Server returns the result of the lookup condition into the return port.

Pass multiple output values to another transformation.

Pass one output value to another transformation.

Supports user-defined default values

Does not support user-defined default values.


clip_image053

2 .LOOKUP T/F COMPONENTS

Define the following components when we configure a Lookup transformation in a mapping:

  • Lookup source
  • Ports
  • Properties
  • Condition

1. Lookup Source:

We can use a flat file or a relational table for a lookup source. When we create a Lookup t/f, we can import the lookup source from the following locations:

  • Any relational source or target definition in the repository
  • Any flat file source or target definition in the repository
  • Any table or file that both the Power Center Server and Client machine can connect to The lookup table can be a single table, or we can join multiple tables in the same database using a lookup SQL override in Properties Tab.

2. Ports:

Ports

Lookup

Type


Number

Needed

Description


I

Connected

Unconnected

Minimum 1

Input port to Lookup. Usually ports used for Join condition are Input ports.

O

Connected

Unconnected


Minimum 1

Ports going to another transformation from Lookup.

L

Connected

Unconnected

Minimum 1

Lookup port. The Designer automatically Designates each column in the lookup source as a lookup (L) and output port (O).

R Unconnected 1 Only

Return port. Use only in unconnected Lookup t/f only.


3. Properties Tab

Options

Lookup Type


Description

Lookup SQL Override

Relational

Overrides the default SQL statement to query the lookup table.

Lookup Table Name

Relational

Specifies the name of the table from which the transformation looks up and caches values.

Lookup Caching Enabled

Flat File, Relational

Indicates whether the Power Center Server caches lookup values during the session.

Lookup Policy on Multiple Match

Flat File, Relational

Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. Options: Use First Value or Use Last Value or Use Any Value or Report Error

Lookup Condition

Flat File, Relational

Displays the lookup condition you set in the Condition tab.

Connection Information

Relational

Specifies the database containing the lookup table.

Source Type

Flat File, Relational

Lookup is from a database or flat file.

Lookup Cache Directory Name

Flat File, Relational

Location where cache is build.

Lookup Cache Persistent

Flat File, Relational

Whether to use Persistent Cache or not.

Dynamic Lookup Cache

Flat File, Relational

Whether to use Dynamic Cache or not.

Recache From Lookup Source

Flat File, Relational

To rebuild cache if cache source changes and we are using Persistent Cache.

Insert Else Update

Relational

Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert.

Lookup Data Cache Size

Flat File, Relational

Data Cache Size

Lookup Index Cache Size

Flat File, Relational

Index Cache Size

Cache File Name Prefix

Flat File, Relational

Use only with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files.

 

 

Some other properties for Flat Files are:

  • Date time Format
  • Thousand Separator
  • Decimal Separator
  • Case-Sensitive String Comparison
  • Null Ordering
  • Sorted Input

4: Condition Tab

We enter the Lookup Condition. The Power Center Server uses the lookup condition to test incoming values. We compare transformation input values with values in the lookup source or cache, represented by lookup ports.

  • The data types in a condition must match.
  • When we enter multiple conditions, the Power Center Server evaluates each condition as an AND, not an OR.
  • The Power Center Server matches null values.
  • The input value must meet all conditions for the lookup to return a value.
  • =, >, <, >=, <=, != Operators can be used.
  • Example: IN_DEPTNO = DEPTNO

In_DNAME = 'DELHI'

Tip: If we include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance.

Note:

1. We can use = operator in case of Dynamic Cache.

2. The Power Center Server fails the session when it encounters multiple keys for a Lookup transformation configured to use a dynamic cache.

clip_image131clip_image133

3. Connected Lookup Transformation

Example: To create a connected Lookup Transformation

  • EMP will be source table. DEPT will be LOOKUP table.
  • Create a target table CONN_Lookup_EXAMPLE in target designer. Table should contain all ports of EMP table plus DNAME and LOC as shown below.
  • Create the shortcuts in your folder.

Creating Mapping:

1. Open folder where we want to create the mapping.

2. Click Tools -> Mapping Designer.

3. Click Mapping-> Create-> Give name. Ex: m_CONN_LOOKUP_EXAMPLE

4. Drag EMP and Target table.

5. Connect all fields from SQ_EMP to target except DNAME and LOC.

6. Transformation-> Create -> Select LOOKUP from list. Give name and click

Create.

7. The Following screen is displayed.

8. As DEPT is the Source definition, click Source and then Select DEPT.

9. Click Ok.

clip_image135

10. Now Pass DEPTNO from SQ_EMP to this Lookup. DEPTNO from SQ_EMP will be named as DEPTNO1. Edit Lookup and rename it to IN_DEPTNO in ports tab.

11. Now go to CONDITION tab and add CONDITION.

DEPTNO = IN_DEPTNO and Click Apply and then OK.

Link the mapping as shown below:

12. We are not passing IN_DEPTNO and DEPTNO to any other transformation from LOOKUP; we can edit the lookup transformation and remove the OUTPUT check from them.

13. Mapping -> Validate

14. Repository -> Save

  • Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
  • Make sure to give connection information for all tables.
  • Make sure to give connection for LOOKUP Table also.

We use Connected Lookup when we need to return more than one column from Lookup table.There is no use of Return Port in Connected Lookup.

SEE PROPERTY TAB FOR ADVANCED SETTINGS

clip_image137

4. Unconnected Lookup Transformation

An unconnected Lookup transformation is separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.

Steps to configure Unconnected Lookup:

  1. Add input ports.
  2. Add the lookup condition.
  3. Designate a return value.
  4. Call the lookup from another transformation.

Example: To create a unconnected Lookup Transformation

  • EMP will be source table. DEPT will be LOOKUP table.
  • Create a target table UNCONN_Lookup_EXAMPLE in target designer. Table should contain all ports of EMP table plus DNAME as shown below.
  • Create the shortcuts in your folder.

Creating Mapping:

1. Open folder where we want to create the mapping.

2. Click Tools -> Mapping Designer.

3. Click Mapping-> Create-> Give name. Ex: m_UNCONN_LOOKUP_EXAMPLE

4. Drag EMP and Target table.

5. Now Transformation-> Create -> Select EXPRESSION from list. Give name

and click Create. Then Click Done.

6. Pass all ports from SQ_EMP to EXPRESSION transformation.

7. Connect all fields from EXPRESSION to target except DNAME.

8. Transformation-> Create -> Select LOOKUP from list. Give name and click

Create.

9. Follow the steps as in Connected above to create Lookup on DEPT table.

10. Click Ok.

11. Now Edit the Lookup Transformation. Go to Ports tab.

12. As DEPTNO is common in source and Lookup, create a port IN_DEPTNO

ports tab. Make it Input port only and Give Datatype same as DEPTNO.

13. Designate DNAME as Return Port. Check on R to make it.

clip_image139clip_image141

14. Now add a condition in Condition Tab.

DEPTNO = IN_DEPTNO and Click Apply and then OK.

15. Now we need to call this Lookup from Expression Transformation.

16. Edit Expression t/f and create a new output port out_DNAME of data type as DNAME. Open the Expression editor and call Lookup as given below:

We double click Unconn in bottom of Functions tab and as we need only

DEPTNO, we pass only DEPTNO as input.

17. Validate the call in Expression editor and Click OK.

18. Mapping -> Validate

19. Repository Save.

  • Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
  • Make sure to give connection information for all tables.
  • Make sure to give connection for LOOKUP Table also.

5. Lookup Caches

We can configure a Lookup transformation to cache the lookup table. The Integration Service (IS) builds a cache in memory when it processes the first row of data in a cached Lookup transformation.

The Integration Service also creates cache files by default in the $PMCacheDir. If the data does not fit in the memory cache, the IS stores the overflow values in the cache files. When session completes, IS releases cache memory and deletes the cache files.

  • If we use a flat file lookup, the IS always caches the lookup source.
  • We set the Cache type in Lookup Properties.

Lookup Cache Files

1. Lookup Index Cache:

  • Stores data for the columns used in the lookup condition.

2. Lookup Data Cache:

  • For a connected Lookup transformation, stores data for the connected output ports, not including ports used in the lookup condition.
  • For an unconnected Lookup transformation, stores data from the return port.

Types of Lookup Caches:

1. Static Cache

By default, the IS creates a static cache. It caches the lookup file or table and Looks up values in the cache for each row that comes into the transformation.The IS does not update the cache while it processes the Lookup transformation.

2. Dynamic Cache

To cache a target table or flat file source and insert new rows or update existing rows in the cache, use a Lookup transformation with a dynamic cache.

The IS dynamically inserts or updates data in the lookup cache and passes data to the target. Target table is also our lookup table. No good for performance if table is huge.

3. Persistent Cache

If the lookup table does not change between sessions, we can configure the Lookup transformation to use a persistent lookup cache.

The IS saves and reuses cache files from session to session, eliminating the time Required to read the lookup table.

4. Recache from Source

If the persistent cache is not synchronized with the lookup table, we can Configure the Lookup transformation to rebuild the lookup cache.If Lookup table has changed, we can use this to rebuild the lookup cache.

clip_image143clip_image145

5. Shared Cache

  •  Unnamed cache: When Lookup transformations in a mapping have compatible caching structures, the IS shares the cache by default. You can only share static unnamed caches.
  •  Named cache: Use a persistent named cache when we want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.

Building Connected Lookup Caches

We can configure the session to build caches sequentially or concurrently.

  • When we build sequential caches, the IS creates caches as the source rows enter the Lookup transformation.
  • When we configure the session to build concurrent caches, the IS does not wait for the first row to enter the Lookup transformation before it creates caches. Instead, it builds multiple caches concurrently.

1. Building Lookup Caches Sequentially:

clip_image002

2. Building Lookup Caches Concurrently:

clip_image002[6]

  • To configure the session to create concurrent caches

Edit Session -> In Config Object Tab-> Additional Concurrent Pipelines for

Lookup Cache Creation -> Give a value here (Auto By Default)

Note: The IS builds caches for unconnected Lookups sequentially only.

No comments :

Post a Comment