Building a List in Talend to Use as a Filter in Another Component

This is an answer to a question about Talend that was posted on StackOverflow. I wasn’t able to post as many pictures as I needed due to house rules in place, so I have moved the whole thing here and linked back to it. The general gist of the question was that there was a mahoosive table that would have eaten too much memory if its entire contents were dragged into Talend, so the user wanted to be able to take some values from a different table, string them out into a list, pop them into a context variable and then squirt them back into another SQl query so that he ended up with a smaller set of results to work with. With me so far? OK, read on!

Hi

This should be possible. I’m not working in MySQL but I have something roughly equivalent here that I think you should be able to adapt to your needs.

As you can see, I’ve got some data coming out of the table and getting filtered by tFilterRow_1 to only show the rows I’m interested in.

main

The next step is to limit it to just the field I want to use in the variable. I’ve used tMap_3 rather than a tFilterColumns because the field I’m using is a string and I wanted to be able to concatenate single quotes around it but if you’re using an integer you might not need to do that. And of course if you have a lot of repetition you might also want to get a tUniqueRows in there as well to save a lot of unnecessary repetition

xmap1

The next step is the one that does the magic. I’ve got a list like this:
‘A1’
‘A2’
‘B1’
‘B2’
etc, and I want to turn it into ‘A1′,’A2′,’B1′,’B2’ so I can slot it into my where clause. For this, I’ve used tAggregateRow_1, selecting “list” as the aggregate function to use.

aggreg

Next up, we want to take this list and put it into a context variable (I’ve already created the context variable in the metadata – you know how to do that, right? If not, here’s a quick rundown). Use another tMap component, feeding into a tContextLoad widget. tContextLoad always has two columns in its schema, so map the output of the tAggregateRows to the “value” column and enter the name of the variable in the “key”. In this example, my context variable is called MyList

xmap2

Now your list is loaded as a text string and stored in the context variable ready for retrieval. So open up a new input and embed the variable in the sql code like this

“SELECT distinct MY_COLUMN
from MY_SECOND_TABLE where the_selected_row in (“+
context.MyList+”)”

It should be as easy as that, and when I whipped it up it worked first time, but let me know if you have any trouble and I’ll see what I can do.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s