ODI- Load Data Directly From The Source To The Target Without Creating Any Temporary Table


It is useful to understand why and when ODI 10g and ODI 11g create temporary tables. The descriptions below apply to standard cases.

C$ Collect tables

A C$_ table is created by an LKM in order to transfer data from a source server to the staging area.

This will allow ODI to join data sets coming from heterogeneous source data servers. For example, if as sources you are using an Excel spreadsheet and an Oracle table to load a Teradata database, ODI will retrieve the content of the Oracle table and the Excel file into 2 collect tables, both hosted on the staging area. This will allow ODI to join the data coming from Excel and Oracle.

If the Integration Interface contains several source Datastores which are joined on the Source, the C$_ table will contain the result of the join operation. If the join is set to be executed on the staging area, ODI will create 2 distinct C$_ tables.
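To make the mechanism concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the staging area: two "collect" tables receive rows from two hypothetical heterogeneous sources, and the join is then performed inside the staging engine, just as ODI does with C$_ tables. The table and column names are invented for illustration.

```python
import sqlite3

# In-memory database standing in for the staging area.
stage = sqlite3.connect(":memory:")

# C$_ collect tables: one per heterogeneous source (e.g. Excel and Oracle).
stage.execute("CREATE TABLE 'C$_CUSTOMERS' (CUST_ID INTEGER, NAME TEXT)")
stage.execute("CREATE TABLE 'C$_ORDERS' (ORDER_ID INTEGER, CUST_ID INTEGER, AMOUNT REAL)")

# The LKMs would extract this data from each source; here we insert sample rows directly.
stage.executemany("INSERT INTO 'C$_CUSTOMERS' VALUES (?, ?)",
                  [(1, "Acme"), (2, "Globex")])
stage.executemany("INSERT INTO 'C$_ORDERS' VALUES (?, ?, ?)",
                  [(10, 1, 99.5), (11, 2, 42.0)])

# Once both data sets sit in the same engine, the join is a plain SQL join.
rows = stage.execute("""
    SELECT c.NAME, o.AMOUNT
    FROM 'C$_CUSTOMERS' c JOIN 'C$_ORDERS' o ON o.CUST_ID = c.CUST_ID
    ORDER BY o.ORDER_ID
""").fetchall()
print(rows)  # [('Acme', 99.5), ('Globex', 42.0)]
```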

Notice that when the source Datastores are on the same server as the target Datastore, or on the same server as the staging area, there is no LKM to set at the Integration Interface level.

I$ Integration tables

Most of the time, the I$_ tables are created by the IKMs whenever there is a need for updating a target table or running a flow control.

If this is not required, creating the I$_ table is of no use, since ODI can retrieve the data directly from the collect tables. The I$_ table has an almost identical structure to the target table, so that it can hold the data that ODI is about to insert into, or use to update, the target table.
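As an illustration of the incremental-update flow an IKM typically implements with an I$_ table (again using sqlite3 as a stand-in engine, with invented table names): the flow is staged in I$_, existing target rows are updated from it, and the remaining rows are inserted.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE TARGET (ID INTEGER PRIMARY KEY, VAL TEXT)")
db.execute("INSERT INTO TARGET VALUES (1, 'old')")

# I$_ flow table: almost identical structure to the target.
db.execute('CREATE TABLE "I$_TARGET" (ID INTEGER PRIMARY KEY, VAL TEXT)')
db.executemany('INSERT INTO "I$_TARGET" VALUES (?, ?)',
               [(1, 'new'), (2, 'added')])

# Update the target rows that already exist...
db.execute("""UPDATE TARGET
              SET VAL = (SELECT VAL FROM "I$_TARGET" i WHERE i.ID = TARGET.ID)
              WHERE ID IN (SELECT ID FROM "I$_TARGET")""")
# ...then insert the rows that do not exist yet.
db.execute("""INSERT INTO TARGET
              SELECT * FROM "I$_TARGET" i
              WHERE i.ID NOT IN (SELECT ID FROM TARGET)""")

rows = db.execute("SELECT * FROM TARGET ORDER BY ID").fetchall()
print(rows)  # [(1, 'new'), (2, 'added')]
```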

E$ Error tables

From an Integration Interface, E$_ tables are created and populated by the CKM if the IKM option called Flow Control is set to yes.

So the C$_, I$_ and E$_ tables are of no use only when all of the following conditions are met:

Only one source Datastore, not journalized
Loading strategy = Truncate/Insert or basic Insert (no updates or deletes)
No flow nor static control is required

In such a case, it is more efficient to load data directly into the target table.
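The decision above can be summarized as a simple predicate. This is only a sketch; the function and parameter names are illustrative and not part of any ODI API.

```python
def can_load_target_directly(source_count, journalized, strategy,
                             flow_control, static_control):
    """Return True when no C$_, I$_ or E$_ work table is needed."""
    return (source_count == 1          # a single source Datastore
            and not journalized        # not journalized (no CDC)
            and strategy in ("TRUNCATE_INSERT", "INSERT")  # no updates/deletes
            and not flow_control       # no E$_ flow control
            and not static_control)    # no static control either

print(can_load_target_directly(1, False, "INSERT", False, False))              # True
print(can_load_target_directly(2, False, "INSERT", False, False))              # False: join needs C$_
print(can_load_target_directly(1, False, "INCREMENTAL_UPDATE", False, False))  # False: update needs I$_
```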

Solution 1: Use the existing “IKM SQL to SQL Append”

The simplest solution is to set the Staging Area to the same Logical Schema as the one that hosts the Source Datastore Model, even if the source Datastore resides on a technology that does not offer any transformation engine (for example, the File or Excel technologies).

At the Flow tab level, click on the Target and select the “IKM SQL to SQL Append”.

Even if this solution works very well, it is not the most efficient one in terms of execution time.
Solution 2: Write your own IKM, using database load utilities (Oracle SQL*Loader, Microsoft SQLServer BCP…)

When dealing with Flat Files for example, using a database loader utility such as SQLLDR is more efficient.

In this example, using the “LKM File to Oracle (SQLLDR)” is not a solution because the LKM will create a C$_ table.

Unfortunately, no Integration Knowledge Module (IKM) exists that uses SQLLDR.

This is a typical situation where a customer specific Knowledge Module is required.

The example below illustrates how to create such an IKM, from the “IKM SQL to SQL Append”.
1. Duplicate the “IKM SQL to SQL Append” and rename the copy as “IKM File to SQL Append (SQLLDR)”

2. Edit this new IKM and, from the “Details” tab, remove the following command lines, then click on Apply to validate the updates:
Lock journalized table
Cleanup journalized table

Those two KM lines are of no use as we are dealing with flat files.

3. Right click on the “IKM File to SQL Append (SQLLDR)” from the Knowledge Module tree and add 3 new options:
LOA_DIRECT
LOA_DISCARDMAX
LOA_ERRORS

The definition of those new options should be strictly identical to the options having the same name under the “LKM File to Oracle (SQLLDR)”.

When creating the new options, do not set any value in the “Position” field; ODI will fill it automatically.

4. Edit the “IKM File to SQL Append (SQLLDR)” and add a new KM line called “Generate CTL file”.

From the “Command on Target” tab, set the technology to ODI Tools.

Do not close the window.

5. Right click on the “LKM File to Oracle (SQLLDR)”, select “Open in a new window” and go to the Details tab.

6. Right click on the command line called “Generate CTL file” and select the code written in the “Command” field of the “Command on Target” tab.

7. Copy the code and paste it into the “Command” field of the “Command on Target” Tab of the new KM line created during step 4 and click on Apply.

8. Launch the expression editor and search for the following section of code:
INTO TABLE <%=snpRef.getTable("L", "COLL_NAME", "W")%>
(
<%=snpRef.getColList("", "[CX_COL_NAME]\tPOSITION([FILE_POS]:[FILE_END_POS])", ",\n\t", "", "")%>
)
and replace it with:
INTO TABLE <%=snpRef.getTable("L", "TARG_NAME", "A")%>
(
<%=snpRef.getColList("", "[COL_NAME]\tPOSITION([FILE_POS]:[FILE_END_POS])", ",\n\t", "", "")%>
)
9. Search for this other piece of code:
INTO TABLE <%=snpRef.getTable("L", "COLL_NAME", "W")%>
FIELDS TERMINATED BY X'<%=snpRef.getSrcTablesList("", "[FILE_SEP_FIELD]", "", "")%>'
(
<%=snpRef.getColList("", "[CX_COL_NAME]", ",\n\t", "", "")%>
)
and replace it with:
INTO TABLE <%=snpRef.getTable("L", "TARG_NAME", "A")%>
FIELDS TERMINATED BY X'<%=snpRef.getSrcTablesList("", "[FILE_SEP_FIELD]", "", "")%>'
(
<%=snpRef.getColList("", "[COL_NAME]", ",\n\t", "", "")%>
)
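For a delimited file, the modified template above would generate a control file along these lines. This sketch builds a sample by hand; the table name, columns and file names are invented, and only the shape of the output matters.

```python
# Sketch: the SQL*Loader control file the modified template would emit
# for a delimited source file. TARGET_CUSTOMERS and the columns are invented.
target_table = "TARGET_CUSTOMERS"
columns = ["CUST_ID", "CUST_NAME", "CITY"]
field_sep_hex = "2C"  # hex code for ',' as SQL*Loader expects after X''

ctl = "\n".join([
    "LOAD DATA",
    "INFILE 'customers.dat'",
    "APPEND",
    "INTO TABLE %s" % target_table,
    "FIELDS TERMINATED BY X'%s'" % field_sep_hex,
    "(",
    "\t" + ",\n\t".join(columns),
    ")",
])
print(ctl)
```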
10. Click on Apply at the Expression Editor level and KM line level.

11. Move up the new KM line right after the one called “Delete target table”.

12. Add a new KM line called “Call sqlldr via Jython”. From the “Command on Target” tab, set the technology to Jython. The command field of the “Command on Target” tab should have exactly the same content as the corresponding step of the “LKM File to Oracle (SQLLDR)”.
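The Jython step essentially shells out to sqlldr. Below is a minimal sketch of the command such a step assembles; the credentials, paths and default option values are placeholders, and the real KM step derives them from snpRef substitutions and the KM options added in step 3.

```python
import subprocess

def build_sqlldr_command(userid, ctl_file, log_file, bad_file,
                         errors=50, discardmax=1, direct=True):
    """Assemble the sqlldr command line the KM step would execute."""
    return ["sqlldr",
            "userid=%s" % userid,
            "control=%s" % ctl_file,
            "log=%s" % log_file,
            "bad=%s" % bad_file,
            "errors=%d" % errors,           # from the LOA_ERRORS option
            "discardmax=%d" % discardmax,   # from the LOA_DISCARDMAX option
            "direct=%s" % ("true" if direct else "false")]  # LOA_DIRECT

cmd = build_sqlldr_command("scott/tiger@ORCL", "/tmp/load.ctl",
                           "/tmp/load.log", "/tmp/load.bad")
print(" ".join(cmd))
# In the real KM step you would run the command and fail the session
# on a non-zero return code, e.g.:
#   rc = subprocess.call(cmd)
#   if rc != 0:
#       raise Exception("sqlldr failed with return code %d" % rc)
```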

13. Click on Apply at both the new KM line and KM level.

14. Override the existing Description and Restriction of the “IKM File to SQL Append (SQLLDR)” with the ones of the “LKM File to Oracle (SQLLDR)”.

15. To use the IKM in an Integration Interface, set the Staging Area to the same Logical Schema as the one that hosts the Source Datastore Model.
Thanks,
~KKT~
