Using tMemorizeRows to Compare Data

By Matt Irvin

This job demonstrates the tMemorizeRows component so that you can use it in your own applications. Here we use it to compare each row against the one before it, specifically to check the start and end dates of items. The output flags any rows that may be in error.

Step 1: Prepare the data

Create the expirations.csv file and save it somewhere Talend can load it from. The fields are Id, Name, Date Available, and Expiration Date. Our job will go through these items and make sure that the next “shipment” of a product becomes available before the previous “shipment” expires. Only one pair of rows in this data set breaks that rule, but feel free to edit it.

image 1

The boxed dates are the ones that don’t overlap, so they should trigger output.

Step 2: Component Setup

For this job, you’ll need a tFileInputDelimited, a tSortRow, a tMemorizeRows, and a tJavaRow component, set up as shown below:

image 2

Step 3: Schema Setup

Open the component tab of tFileInputDelimited. Click the “…” button next to File name/Stream to select the location of your expirations.csv file. Set Header to 1 so that the header row is skipped. The other default options do not need to be changed.

image 3

The schema for the input should have four columns: “Id” as an Integer, and “Name”, “Available”, and “Expiration” as Strings. You could also type “Available” and “Expiration” as Dates, but for the purposes of this demonstration, leave them as Strings for now.

image 4

Step 3: Sort and Memorize

We want to be sure that, as the data is fed to tMemorizeRows, rows for the same product are grouped together so that each product is checked only against itself. In a more complicated situation we would also sort by date, but since rows are usually inserted in date order, we can skip that sorting step here.

Simply sort the “Name” column alphabetically in ascending order.

image 5
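The effect of the sort can be pictured in plain Java. The sketch below is only an illustration, not what tSortRow runs internally: the SortSketch class, the Row holder, and the sample values are all made up. It relies on List.sort being a stable sort, so rows that share a name keep their incoming order. Whether tSortRow gives the same guarantee is not covered here, so if your data might arrive out of date order, adding the date as a second sort criterion is the safer choice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of grouping rows by Name; row values are made up.
class SortSketch {
    // Minimal row holder mirroring the Name and Available columns.
    static class Row {
        final String name;
        final String available;

        Row(String name, String available) {
            this.name = name;
            this.available = available;
        }
    }

    // Returns a copy sorted by name, ascending. List.sort is stable, so rows
    // sharing a name stay in their original relative order.
    static List<Row> sortByName(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.name));
        return sorted;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("Milk", "01-01-2020"));
        rows.add(new Row("Eggs", "01-01-2020"));
        rows.add(new Row("Milk", "10-01-2020"));
        for (Row r : sortByName(rows)) {
            System.out.println(r.name + " " + r.available);
        }
    }
}
```

After sorting, the two Milk rows sit next to each other, which is what lets a two-row memory compare them.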

Then, open the tMemorizeRows component. Here, we want to save the current row, and the one that comes right before it, so in the “Row count to memorize” text box, enter “2”.

Then, we want to memorize all columns except Id, so check the boxes for Name, Available, and Expiration.

image 6

Step 4: MemorizeRows in Logic

Open the tJavaRow component next. This is where we access the memorized rows and compare them with the row currently passing through. The syntax for reading a memorized value is ((String[]) globalMap.get("tMemorizeRows_1_{Column}"))[row]. Since we memorize two rows, row = 0 returns the value from the current row, and row = 1 returns the value from the previous one.
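To make the indexing concrete, here is a plain-Java sketch that imitates this behavior outside of Talend. The MemorizeSketch class and its memorize helper are hypothetical stand-ins, not Talend's generated code. Only the key name tMemorizeRows_1_Name and the index convention (0 = current, 1 = previous) come from this tutorial.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for tMemorizeRows with "Row count to memorize" = 2.
// This is not Talend's generated code; it only mimics the index convention.
class MemorizeSketch {
    static final Map<String, Object> globalMap = new HashMap<>();

    // Shift the remembered value down one slot and store the new value at index 0.
    static void memorize(String key, String value) {
        String[] slots = (String[]) globalMap.get(key);
        if (slots == null) {
            slots = new String[2];
            globalMap.put(key, slots);
        }
        slots[1] = slots[0]; // what was current becomes the previous row
        slots[0] = value;    // the incoming value becomes the current row
    }

    public static void main(String[] args) {
        String key = "tMemorizeRows_1_Name";
        memorize(key, "Eggs");
        memorize(key, "Milk");
        String[] names = (String[]) globalMap.get(key);
        // Prints: current=Milk, previous=Eggs
        System.out.println("current=" + names[0] + ", previous=" + names[1]);
    }
}
```

In this sketch, index 1 is null until a second row has arrived, so a check such as name.equals(slots[1]) simply returns false on the first row.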

The code below first checks that the current Name equals the Name in the previous row. It then converts the date strings into Date objects and checks whether the Expiration date of the previous row falls before the Available date of the current row. If it does, a message is printed.

image 7

Here’s the code:

// Only compare rows that belong to the same product.
if (row3.Name.equals(((String[]) globalMap.get("tMemorizeRows_1_Name"))[1])) {
    // Index 1 is the previous row, index 0 is the current row.
    if (TalendDate.parseDate("dd-MM-yyyy", ((String[]) globalMap.get("tMemorizeRows_1_Expiration"))[1]).before(TalendDate.parseDate("dd-MM-yyyy", ((String[]) globalMap.get("tMemorizeRows_1_Available"))[0]))) {
        System.out.println(row3.Name + " has a stock gap on " + ((String[]) globalMap.get("tMemorizeRows_1_Expiration"))[1]);
    }
}

Step 5: Run the Job

Go to the Run tab and run the job. You should see a message that Milk has a stock gap, along with the date on which the gap begins.

image 8
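If you want to sanity-check the comparison outside Talend Studio, the same logic can be sketched in plain Java. Treat this as an illustration under assumptions: GapCheckSketch, hasGap, and the sample dates are hypothetical, and SimpleDateFormat stands in for TalendDate.parseDate. The dd-MM-yyyy pattern is the one used in the tJavaRow code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Plain-Java version of the tJavaRow check; names and dates are illustrative.
class GapCheckSketch {
    // Parse a dd-MM-yyyy string, the pattern used in the tJavaRow code.
    static Date parse(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("dd-MM-yyyy");
        format.setLenient(false); // reject impossible dates such as 31-02-2020
        return format.parse(text);
    }

    // True when the previous row is the same product and expired before
    // the current row became available.
    static boolean hasGap(String prevName, String prevExpiration,
                          String currName, String currAvailable) throws ParseException {
        if (prevName == null || !prevName.equals(currName)) {
            return false; // first row, or a different product
        }
        return parse(prevExpiration).before(parse(currAvailable));
    }

    public static void main(String[] args) throws ParseException {
        // Made-up values: the previous batch expires two days before the next arrives.
        boolean gap = hasGap("Milk", "08-01-2020", "Milk", "10-01-2020");
        System.out.println("Milk gap: " + gap); // prints "Milk gap: true"
    }
}
```

Because before is a strict comparison, a batch that expires on the same day the next one becomes available is not reported as a gap.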
