Lookup transformations are used to lookup a set of values in another table.Lookups slows down the performance.
1. To improve performance, cache the lookup tables. Informatica can cache all the lookup and reference tables; this makes operations run very fast. (Meaning of cache is given in point 2 of this section and the procedure for determining the optimum cache size is given at the end of this document.)
2. Even after caching, the performance can be further improved by minimizing the size of the lookup cache. Reduce the number of cached rows by using a sql override with a restriction.
Cache: Cache stores data in memory so that Informatica does not have to read the table each time it is referenced. This reduces the time taken by the process to a large extent. Cache is automatically generated by Informatica depending on the marked lookup ports or by a user defined sql query.
Example for caching by a user defined query: -
Suppose we need to lookup records where employee_id=eno.
‘employee_id’ is from the lookup table, EMPLOYEE_TABLE and ‘eno’ is the
input that comes from the from the source table, SUPPORT_TABLE.
We put the following sql query override in Lookup Transform
‘select employee_id from EMPLOYEE_TABLE’
If there are 50,000 employee_id, then size of the lookup cache will be 50,000.
Instead of the above query, we put the following:-
‘select emp employee_id from EMPLOYEE_TABLE e, SUPPORT_TABLE s
where e. employee_id=s.eno’
If there are 1000 eno, then the size of the lookup cache will be only 1000.But here the performance gain will happen only if the number of records in SUPPORT_TABLE is not huge. Our concern is to make the size of the cache as less as possible.
3. In lookup tables, delete all unused columns and keep only the fields that are used in the mapping.
4. If possible, replace lookups by joiner transformation or single source qualifier.Joiner transformation takes more time than source qualifier transformation.
5. If lookup transformation specifies several conditions, then place conditions that use equality operator ‘=’ first in the conditions that appear in the conditions tab.
6. In the sql override query of the lookup table, there will be an ORDER BY clause. Remove it if not needed or put fewer column names in the ORDER BY list.
7. Do not use caching in the following cases: -
-Source is small and lookup table is large.
-If lookup is done on the primary key of the lookup table.
8. Cache the lookup table columns definitely in the following case: -
-If lookup table is small and source is large.
9. If lookup data is static, use persistent cache. Persistent caches help to save and reuse cache files. If several sessions in the same job use the same lookup table, then using persistent cache will help the sessions to reuse cache files. In case of static lookups, cache files will be built from memory cache instead of from the database, which will improve the performance.
10. If source is huge and lookup table is also huge, then also use persistent cache.
11. If target table is the lookup table, then use dynamic cache. The Informatica server updates the lookup cache as it passes rows to the target.
12. Use only the lookups you want in the mapping. Too many lookups inside a mapping will slow down the session.
13. If lookup table has a lot of data, then it will take too long to cache or fit in memory. So move those fields to source qualifier and then join with the main table.
14. If there are several lookups with the same data set, then share the caches.
15. If we are going to return only 1 row, then use unconnected lookup.
16. All data are read into cache in the order the fields are listed in lookup ports. If we have an index that is even partially in this order, the loading of these lookups can be speeded up.
17. If the table that we use for look up has an index (or if we have privilege to add index to the table in the database, do so), then the performance would increase both for cached and un cached lookups.
No comments :
Post a Comment