What is ETL Setup/Performance Tuning? - Definition from Trenovision


ETL performance tuning: The data load using ETL in a typical data warehouse is usually a very time-consuming process. A data warehousing system whose ETL process finishes only just on time is going to have problems, because jobs often fail to start on time due to factors beyond the control of the data warehousing team. As a result, it is always a good idea for the data warehousing group to tune the ETL process as much as possible.
 


ETL Setup/Performance Tuning

 

  • The first step of ETL performance tuning is to identify performance bottlenecks. Bottlenecks can occur in the source or target database, in a mapping or transformation, or in the system. The strategy is to identify a performance bottleneck, eliminate it, and then identify the next one, until you are satisfied with the performance.
  • Run test sessions
  • Study performance details and thread statistics
  • Monitor system performance
  • Eliminate source and target database issues
  • Eliminate mapping issues
  • Eliminate transformation issues



Mapping Issues

Mapping-level optimization may take time to implement, but it can significantly boost session performance. In general, you optimize a mapping by reducing the number of transformations and deleting unnecessary links between them. Configure the mapping with the fewest transformations and expressions that do the most work possible. Here are some simple steps (a SQL sketch follows the list):

  • Optimize flat file sources
  • Configure single-pass reading
  • Optimize simple pass-through mappings
  • Optimize filters
  • Optimize data type conversions
  • Optimize expressions
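
As one illustration of optimizing filters and data type conversions, both can be pushed into the extraction query so that the mapping handles fewer rows and simpler ports. A minimal sketch, assuming a hypothetical orders source table:

    -- Hypothetical source table and columns: filter and convert at
    -- extraction time instead of inside the mapping.
    SELECT order_id,
           CAST(order_amount AS DECIMAL(12,2)) AS order_amount,
           order_date
    FROM   orders
    WHERE  order_status = 'SHIPPED';  -- rows filtered before the mapping sees them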



Transformation Issues

You can optimize the following types of transformations in a mapping:

  • Aggregator Transformation
  • Joiner Transformation
  • Lookup Transformation
  • Sequence Generator Transformation
  • Source Qualifier Transformation

Aggregator Transformation

Here are some guidelines to optimize the performance of an Aggregator transformation (a sketch follows the list):

  • Group by simple columns
  • Use sorted input
  • Use incremental aggregation
  • Filter data before you aggregate it
  • Limit port connections
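
For the sorted-input and early-filter guidelines, the source database can do the work before the Aggregator sees a row. A minimal sketch, reusing the hypothetical orders table from earlier:

    -- Pre-filter, pre-aggregate, and sort in the extraction query so the
    -- Aggregator receives sorted, already-reduced input.
    SELECT   customer_id,
             SUM(order_amount) AS total_amount,
             COUNT(*)          AS order_count
    FROM     orders
    WHERE    order_status = 'SHIPPED'   -- filter before aggregating
    GROUP BY customer_id
    ORDER BY customer_id;               -- sorted input for downstream steps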



Joiner Transformation

Joiner transformations can slow performance because they need additional space at run time. Some guidelines:

  • Designate as the master source the source with fewer duplicate key values
  • Designate as the master source the source with fewer rows
  • Perform joins in a database when possible (see the sketch below)



Lookup Transformation

  • If the lookup table is on the same database as the source table in your mapping and caching is not feasible, join the tables in the source database rather than using a Lookup transformation (see the sketch below)
  • Use the optimal database driver
  • Cache lookup tables
  • Optimize the lookup condition
  • Index the lookup table
  • Optimize multiple lookups
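
Two of these guidelines sketched in SQL, with hypothetical names: an index on the lookup-condition column, and an outer join that replaces an uncached lookup.

    -- Index the columns used in the lookup condition.
    CREATE INDEX idx_customers_id ON customers (customer_id);

    -- If caching is not feasible and both tables share a database,
    -- an outer join in the extraction query replaces the lookup.
    SELECT o.order_id,
           o.order_amount,
           c.customer_segment
    FROM   orders o
    LEFT JOIN customers c
           ON c.customer_id = o.customer_id;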



Sequence Generator Transformation

  • The Number of Cached Values property determines how many values the Integration Service caches at one time. Make sure the number of cached values is not too small; consider configuring it to a value greater than 1,000.

Source Qualifier Transformation

  • Use the Select Distinct option for the Source Qualifier transformation if you want the Integration Service to select unique values from a source. Select Distinct filters unnecessary data earlier in the data flow, which can improve performance (a sketch follows).
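
The generated query corresponds to a plain SELECT DISTINCT, shown here with hypothetical names:

    -- Let the source database remove duplicates before the data flow.
    SELECT DISTINCT customer_id,
                    customer_name
    FROM   customers;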

Identifying Target Bottlenecks

The most common performance bottleneck occurs when the Informatica Server writes to a target database. Target bottlenecks can be identified by configuring the session to write to a flat file target. If session performance increases significantly when writing to a flat file, there is a target bottleneck.
 

Identifying Source Bottlenecks

Performance bottlenecks can occur when the Informatica Server reads from a source database. If the session reads from a relational source, use the following methods:

  • Add a Filter transformation after the Source Qualifier and set the filter condition to false so that no data is processed past the Filter transformation. If the time it takes to run the new session remains about the same, there is a source bottleneck.
  • In a test mapping, remove all the transformations; if performance is similar, there is a source bottleneck.
  • Source bottlenecks can also be identified by executing the read query directly against the source database (see the sketch below). Measure the query execution time and the time it takes for the query to return the first row.
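
Running the read query by hand is straightforward in a SQL*Plus session; the query itself is hypothetical, and SET TIMING ON reports the elapsed time per statement:

    -- Time the session's read query directly against the source database.
    SET TIMING ON
    SELECT order_id, order_amount, order_date
    FROM   orders
    WHERE  order_date >= DATE '2020-01-01';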

Identifying Mapping Bottlenecks

  • Mapping bottlenecks can be identified by using a Filter transformation in the mapping. If there is no source bottleneck, add a Filter transformation before each target definition and set the filter condition to false so that no data is loaded into the target tables. If the time it takes to run the new session is about the same as the original session, there is a mapping bottleneck.

Identifying a Session Bottleneck

  • Use Collect Performance Details. Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low (0-20%) BufferInput_efficiency and BufferOutput_efficiency counter values also indicate a session bottleneck.

Identifying a System Bottleneck

  • If there is no source, target, mapping, or session bottleneck, there may be a system bottleneck.
  • Use system tools to monitor CPU usage, memory usage, and paging.
  • On Windows: Task Manager.
  • On UNIX: system tools like sar and iostat. For example, sar -u reports user CPU usage, idle time, and I/O wait time.



Query optimization

  • Understand how the database is executing the query (see the EXPLAIN PLAN sketch below)
  • Retrieve as little data as possible
  • Store intermediate results
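
For the first point, most databases can show the optimizer's plan. An Oracle sketch, reusing the hypothetical orders table (EXPLAIN PLAN and DBMS_XPLAN.DISPLAY are standard Oracle features):

    -- Capture and display the execution plan for a query.
    EXPLAIN PLAN FOR
      SELECT   customer_id, SUM(order_amount)
      FROM     orders
      GROUP BY customer_id;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);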

Below are several specific query optimization strategies; SQL sketches of a few of them follow the list.

  • Use Index
    Using an index is the first strategy one should use to speed up a query. In fact, this strategy is so important that index optimization warrants its own discussion.
  • Aggregate Table
    Pre-populate tables at higher levels of summarization so that less data needs to be parsed.
  • Vertical Partitioning
    Partition the table by columns. This strategy decreases the amount of data a SQL query needs to process.
  • Horizontal Partitioning
    Partition the table by data value, most often time. This strategy decreases the amount of data a SQL query needs to process.
  • Denormalization
    Denormalization combines multiple tables into a single table. This speeds up query performance because fewer table joins are needed.
  • Server Tuning
    Each server has its own parameters, and tuning server parameters so that the server can take full advantage of the hardware resources can significantly speed up query performance.
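
Sketches of three of these strategies in SQL, all with hypothetical names (the partitioning syntax shown is Oracle's):

    -- Use Index: index the columns your WHERE clauses filter on.
    CREATE INDEX idx_orders_date ON orders (order_date);

    -- Aggregate Table: pre-populate a summary table at a higher level.
    CREATE TABLE orders_daily_agg AS
    SELECT   order_date,
             SUM(order_amount) AS total_amount,
             COUNT(*)          AS order_count
    FROM     orders
    GROUP BY order_date;

    -- Horizontal Partitioning: split the table by time ranges.
    CREATE TABLE orders_part (
      order_id     NUMBER,
      order_date   DATE,
      order_amount NUMBER(12,2)
    )
    PARTITION BY RANGE (order_date) (
      PARTITION p2019 VALUES LESS THAN (DATE '2020-01-01'),
      PARTITION p2020 VALUES LESS THAN (DATE '2021-01-01')
    );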



ETL

  • Incremental load: Since the ETL is designed to load the DWH, and loading normally happens during non-business hours, check the ETL's incremental loading to evaluate performance (a sketch follows this list):
    • Stage load
    • Data load volume
    • Time check vs. data
  • Check General Logs
    • Reference Table Manager – Look into this log if you have issues with RTM operation
    • Preview – This is DTM log for preview operation
    • Model Repository Service – This logs all MRS operations
    • Profile – This is DTM logs for profile runs
    • Data Integration Service – This logs all DIS operations
    • Deployed Mapping – This is DTM log for running deployed mappings
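
A minimal incremental-extraction sketch, assuming a hypothetical control table that records when each load finished:

    -- Pull only the rows changed since the last successful load.
    SELECT order_id, order_amount, last_update_ts
    FROM   orders
    WHERE  last_update_ts > (SELECT MAX(load_end_ts)
                             FROM   etl_load_control);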

Target loading(Data warehouse loading)

  • When ETL sessions write into data warehouse tables, you can perform the following tasks to increase performance:
    • Increase Checkpoint Intervals
    • Use Bulk Loading
    • Minimize Deadlocks
    • Increase database network packet size
    • Optimize Target Database

Increase Checkpoint Intervals

Integration Service performance slows each time the service waits for the database to perform a checkpoint. To increase performance, consider increasing the database checkpoint interval. When you increase the checkpoint interval, you increase the likelihood that the database performs checkpoints only as necessary, when the size of the database log file reaches its limit.
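
How the interval is set depends on the database. As one hedged example, Oracle exposes a LOG_CHECKPOINT_INTERVAL parameter (measured in redo blocks); the value below is illustrative only, so confirm an appropriate setting with your DBA:

    -- Oracle sketch: raise the checkpoint interval so checkpoints
    -- occur less often during the load (value is illustrative).
    ALTER SYSTEM SET LOG_CHECKPOINT_INTERVAL = 100000 SCOPE = BOTH;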

Use Bulk Load

When bulk loading is enabled, the Integration Service bypasses the database log, which speeds performance. Without writing to the database log, however, the target database cannot perform rollback, so you may not be able to perform recovery. When you use bulk loading, weigh the importance of improved session performance against the ability to recover an incomplete session.
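
At the database level this corresponds to a direct-path, minimally logged load. An Oracle sketch with hypothetical table names (the APPEND hint requests a direct-path insert):

    -- Direct-path insert: bypasses the conventional, fully logged
    -- insert path, trading recoverability for load speed.
    ALTER TABLE fact_sales NOLOGGING;

    INSERT /*+ APPEND */ INTO fact_sales
    SELECT * FROM stg_sales;

    COMMIT;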

Minimize Deadlocks

To improve session performance, you can increase the number of target connection groups the Integration Service uses to write to the targets in a session. To use a different target connection group for each target in a session, use a different database connection name for each target instance. You can specify the same connection information for each connection name.

Increase Database Network Packet Size

  • Increase the network packet size to allow larger packets of data to cross the network at one time. How you increase it depends on the database you use:
  • Oracle: you can increase the database server network packet size in the listener.ora and tnsnames.ora files (a sketch follows).
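
In Oracle Net, the packet size is governed by the session data unit (SDU). A hedged tnsnames.ora sketch with hypothetical host and service names; the value should be matched on the server side in listener.ora:

    # tnsnames.ora sketch: raise the session data unit (SDU) size.
    DWH =
      (DESCRIPTION =
        (SDU = 32767)
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = dwh))
      )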

Optimize Target Database

  • If the target database is Oracle, you can optimize it by checking the storage clause, space allocation, and rollback or undo segments.
  • When you write to an Oracle database, check the storage clause for the database objects. The database should also store table and index data in separate tablespaces, preferably on different disks (a sketch follows).
  • When you write to an Oracle database, it uses rollback or undo segments during loads. Make sure you have appropriate tablespaces, and that the rollback or undo segments have appropriate storage clauses.
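
A minimal sketch of the tablespace separation, with hypothetical tablespace and table names:

    -- Keep table data and index data in separate tablespaces,
    -- ideally on different disks.
    CREATE TABLE fact_orders (
      order_id   NUMBER,
      order_date DATE,
      amount     NUMBER(12,2)
    ) TABLESPACE dwh_data;

    CREATE INDEX idx_fact_orders_date
      ON fact_orders (order_date)
      TABLESPACE dwh_index;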

Performance in terms of Data retrieval

System Limitations

  • Each system has finite capacity
    • volume of data can be measured easily
    • elapsed time is fairly easily captured
    • how fast can data be delivered?
    • how fast can data be written?
    • is the network a bottleneck?
  • Need baseline measures
    • taken under controlled conditions
    • standard data set
    • known amount of data
    • statistically significant amount of data
    • reproducible test(s)
  • Concerns on disk delivery
    • Local disk can deliver data at a finite rate
    • The rate depends mainly on rotational speed
    • 4,000 rpm
    • 5,400 rpm
    • 7,200 rpm
    • 10,000 rpm
    • Exacerbated by contention and fragmentation (especially on Windows NT)

Keep Jobs Simple

  • Don’t do unnecessary reference lookups, validations
  • Don’t process more rows than necessary
    • perform selection at extraction time
    • perform aggregation early in job
  • Don’t process more columns than necessary
    • don’t select (extract) columns you don’t need
    • just because it’s in the table definition doesn’t mean you have to use it
  • Parallel Processing
    • Job design can take advantage of parallel processing capabilities
    • Independent paths through job are run as separate processes
    • Independent jobs can run simultaneously

Sorting

  • Sorted data substantially improves the performance of the Aggregator stage
  • Prefer to sort at extraction time
    • ORDER BY clause in SELECT
    • perform UNIX sort in before-stage routine
  • Can use Sort stage
    • not as fast
    • memory hog

Limiting Processing

  • Select only needed rows from source
    • change data capture
    • selection criteria
    • extraction through views/stored procedures (see the sketch after this list)
  • Select/process only needed columns
  • Pre-sort data to be aggregated
  • Avoid unnecessary repeated execution of functions and traversal of data
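
A sketch of extraction through a view, with hypothetical names: the view applies the selection criteria and exposes only the needed columns, so the ETL job never reads the rest.

    -- The view limits both rows and columns before extraction.
    CREATE VIEW v_orders_recent AS
    SELECT order_id, customer_id, order_amount
    FROM   orders
    WHERE  order_date >= DATE '2020-01-01';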

 

Performance Tuning: Other Areas to Look Into

  • Optimized use of the tool
  • Better memory management
  • Better resource management
  • Reduction in processing time

Approach for Performance Tuning

  • Parallel Processing
  • Partitioning
  • Distributed Processing
  • Memory Management

Performance Tuning – Parallel Processing

  • Pipeline Parallelism
  • Parallel Load and Extract

 

  • When to Use
    • DSS Environment
    • Complex queries on very large tables
  • Supported Systems
    • SMP, clusters, MPPs
    • Sufficient I/O bandwidth
    • Underutilized CPU (up to 30%)
    • Sufficient memory to support extra processes (sorts, hashing, I/O buffers)

Performance Tuning – Partitioning

  • Data Partitioning
    • Range
    • Hash
    • Round-Robin
  • Continuous ETL
    • Large data volume
    • Windowing technique
  • Set Processing
    • Saves time for background processing
    • Processing many records at a time
    • Re-usable Caching

Performance Tuning – Distributed Processing

  • Hub-and-spoke architecture to n-tier architecture
    • Reduce network traffic
    • Load distribution
    • Load delegation
    • All types of sources
  • Distributed architecture

 


Performance Tuning in Various ETL Tools

 

Performance Tuning – AbInitio

  • Defining multiple nodes to achieve parallel processing.
  • Data is split into smaller partitions to achieve parallel processing.
  • Different partition and collection techniques to improve performance of the job.
  • Buffering techniques to improve performance.
  • Data stored as datasets helps jobs execute faster, because datasets are stored in native format. This also helps the restart capabilities of the job.
  • Built-in sorting and aggregation capabilities to achieve optimized performance.
  • Native calls to source and target avoid performance problems due to ODBC.
  • The monitor facility helps in identifying congestion points and in further tuning of the jobs.

Performance Tuning – AbInitio

  • Partitioning techniques
  • Memory management
  • Optimized usage of components
  • Reducing Phases
  • Using lookups which are in-memory Hash tables
  • Loading data in direct/parallel mode using native database load utilities.

Performance Tuning – SAS

  • Parallel processing for sorting and aggregations through SAS Enterprise ETL Server (released with SAS 9).
  • SAS Intelligence Storage (released with SAS 9) offers secure, high-performance data storage for large, multi-user communities using multiple business intelligence and analytic applications. It provides fast access to small subsets (summarized datasets) of very large tables.
  • The SAS SPDS server (based on RAID technology) offers performance-improvement techniques like partitioning and parallel processing, indexing, and SQL planner optimizations.
  • Base SAS provides system options like MEMSIZE, SORTSIZE, BUFSIZE, BUFNO, CATCACHE, and COMPRESS that can be used to monitor and optimize job performance.


