Joiner Transformation
Joiner transformation is an active and connected transformation. It provides the option of creating joins in the Informatica. By using the joiner transformation, the created joins are similar to the joins in databases.
The joiner transformation is used to join two heterogeneous sources. The joiner transformation joins sources on the basis of a condition that matches one or more pairs of columns between the two sources.
The two input pipelines include a master and a detail pipeline. We need to join the output of the joiner transformation with another source to join more than two sources. And to join n number of sources in mapping, we need n-1 joiner transformations.
In joiner transformation, there are two sources which we are using for joins, such as:
- Master Source
- Detail Source
In the properties of joiner transformation, we can select which data source can be a Master source and which source can be a detail source.
During execution, the master source is cached into the memory for joining purpose. So it is necessary to select the source with less number of records as the master source.
Configuring Joiner Transformation
In Informatica, we configure the following properties of joiner transformation, such as:
- Case-Sensitive String Comparison: The integration service uses this option when we are performing joins on string columns. By default, the case sensitive string comparison option is checked.
- Cache Directory: Directory used to cache the master or detail rows. The default directory path is $PMCacheDir. We can override this value as well.
- Join Type: The type of join to be performed as Master Outer Join, Detail Outer Join, Normal Join, or Full Outer Join.
- Tracing Level: It is used to track the Level of tracing in the session log file.
- Joiner Data Cache Size: It tells the size of the data cache. And Auto is the default value of the data cache size.
- Joiner Index Cache Size: It tells the size of the index cache. And Auto is the default value of the index cache size.
- Sorted Input: This option is used when the input data is in sorted order. And it gives better performance.
- Master Sort Order: It gives the sort order of the master source data. If the master source data is sorted in ascending order, then we choose Ascending. We have to enable the Sorted Input option if we choose Ascending. And Auto is the default value for this option.
- Transformation Scope: We can select the transformation scope as All Input or Row.
Types of Joins
In Informatica, the following joins can be created using joiner transformation, such as:
- Master outer join
In Master outer join, all records from the Detail source are returned by the join, and only matching rows from the master source are returned. - Detail outer join
In detail outer join, only matching rows are returned from the detail source, and all rows from the master source are returned. - Full outer join.
In full outer join, all records from both the sources are returned. Master outer and Detail outer joins are equivalent to left outer joins in SQL. - Normal join
In normal join, only matching rows are returned from both the sources.
Example
In the following example, we will join emp and dept tables using joiner transformation in the following steps:
Step 1: Create a new target table EMP_DEPTNAME in the database using the below script and import the table in Informatica targets.
Step 2: Create a new mapping and import source tables “EMP” and “DEPT” and target table, which we created in the previous step.
Step 3: From the transformation menu, select create option and,
- Select joiner transformation
- Enter transformation name “jnr_emp_dept”
- Select create option
Step 4: Drag and drop all the columns from both the source qualifiers to the joiner transformation.
Step 5: Double click on the joiner transformation, then in the edit transformation window:
- Select the condition tab.
- Click on the add new condition icon.
- Select deptno in master and detail columns list.
Step 6: Then, in the same window:
- Select the properties tab.
- Select normal Join as join type.
- Click on the OK button.
For performance optimization, we assign the master source to the source table pipeline, which is having less number of records. To perform this task:
Step 7: Double click on the joiner transformation to open the edit properties window, and then
- Select the ports tab.
- Select any column of a particular source that you want to make a master.
- Click on the OK button.
Step 8: Link the relevant columns from the joiner transformation to the target table.
Now save the mapping and execute it after creating a session and workflow for it. The join will be created using Informatica joiner, and relevant details will be fetched from both the tables.
Sorted Input
When both the Master and detail source are sorted on the ports specified in the join condition, then use the sorted input option in the joiner properties tab.
We can improve the performance by using the sorted input option as the integration service performs the join by minimizing the number of disk IOs. It gives excellent performance when we are working with large data sets.
here are some steps to configuring the sorted input option, such as:
- Sort the master and detail source either by using the source qualifier transformation or sorter transformation.
- Sort both the source on the ports to be used in join conditions either in ascending or descending order.
- Specify the Sorted Input option in the joiner transformation properties tab.
Blocking Transformation
The joiner Transformation is called as the blocking transformation. The integration service blocks and unblocks the source data depending on whether the joiner transformation is configured for sorted input or not.
Unsorted Joiner Transformation
In the case of unsorted joiner transformation, the integration service first reads all the master rows before it reads the detail rows.
The integration service blocks the detail source while it caches all the master rows. Once it reads all the master rows, then it unblocks the detail source and understands the details rows.
Sorted Joiner Transformation
The blocking logic may or may not possible in case of sorted joiner transformation. The integration service uses blocking logic if it can do so without blocking all sources in the target load order group. Otherwise, it does not use blocking logic.
How to Improve Joiner Transformation Performance?
Below are some important points to improve the performance of a joiner transformation, such as:
- If possible, perform joins in a database. Performing joins in a database is faster than performing joins in a session.
- We can improve the session performance by configuring the Sorted Input option in the joiner transformation properties tab.
- Specify the source with fewer rows and with fewer duplicate keys as the Master and the other source as detail.
Limitations of Joiner Transformation
Here are the following limitations of joiner transformation, such as:
- We cannot use joiner transformation when the input pipeline contains an update strategy transformation.
- We cannot connect a sequence generator transformation directly to the joiner transformation.