Extract from XML to Flat Files

This tutorial demonstrates how to extract sample data from an XML file and write each XML record to a separate output file. The example file we will use is a simple one with the following format:

<root>
  <field1>Test Data 1</field1>
  <field1>Test Data 2</field1>
  <field1>Test Data 3</field1>
</root>

First, we read the data from the source XML file using the tFileInputXML component.

In our case the XPath query and the mapping are quite simple, but they could be far more complex depending on your use case.
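To make the loop query concrete outside of Talend, here is a minimal plain-Java sketch using the JDK's javax.xml.xpath API. It assumes a loop query of /root/field1, which matches the sample file above; the class and method names are made up for illustration, and this is not the code Talend generates for tFileInputXML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathLoopSketch {

    // Returns the text content of every node matched by loopQuery, in document order.
    // Hypothetical helper: the name and signature are illustrative only.
    static List<String> readFields(String xml, String loopQuery) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    loopQuery, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath query or XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>"
                + "<field1>Test Data 1</field1>"
                + "<field1>Test Data 2</field1>"
                + "<field1>Test Data 3</field1>"
                + "</root>";
        // Assumed loop query: one result per <field1> element under <root>.
        for (String value : readFields(xml, "/root/field1")) {
            System.out.println(value);
        }
    }
}
```

Each matched node corresponds to one row in the flow that leaves the input component.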

After connecting the data flow from the tFileInputXML component to the tFlowToIterate component, we want to iterate over each record extracted from the XML. We do this by connecting an iterate link from the tFlowToIterate component to a tFixedFlowInput component.

In the tFixedFlowInput component we can access the value of the current row in the iteration using the globalMap.get method. A shortcut for picking up values written into the tFlowToIterate component is to type “tFlow” and press Ctrl+Space; this brings up the data available in that component, and you can select the field you want.
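As a rough illustration, the expression entered in tFixedFlowInput typically has the form ((String)globalMap.get("row1.field1")). The sketch below imitates that lookup with an ordinary HashMap standing in for the job's globalMap. The row name row1 is an assumption (use whatever your flow is called), and the key format assumes tFlowToIterate is left on its default key/value setting.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {

    // Stand-in for the job-level globalMap; in a real job Talend provides this object.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Mirrors the expression you would type into tFixedFlowInput.
    // "row1.field1" is an assumed key: <flow name>.<column name>.
    static String currentField() {
        return (String) globalMap.get("row1.field1");
    }

    public static void main(String[] args) {
        String[] rows = {"Test Data 1", "Test Data 2", "Test Data 3"};
        for (String row : rows) {
            // Simulates tFlowToIterate publishing the current row before each iteration.
            globalMap.put("row1.field1", row);
            System.out.println(currentField());
        }
    }
}
```

The cast is needed because globalMap stores values as Object, so each lookup has to be cast back to the column's type.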

Next, we write the data to an output file and name that file after the string value of the current row from the XML file. In our case this generates three output files.
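For readers who want to see the same idea outside the Talend designer, the following plain-Java sketch writes one file per row and names each file after the row value. The .txt extension, the output directory, and the class and method names are assumptions for illustration; in the job itself the file name is simply an expression on the output component.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OneFilePerRowSketch {

    // Writes each row to its own file named "<row value>.txt" inside outputDir.
    // Hypothetical helper; the ".txt" extension is an assumption.
    static List<Path> writeRows(Path outputDir, List<String> rows) {
        try {
            Files.createDirectories(outputDir);
            List<Path> written = new ArrayList<>();
            for (String row : rows) {
                Path target = outputDir.resolve(row + ".txt");
                byte[] content = (row + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
                Files.write(target, content); // creates the file or overwrites an existing one
                written.add(target);
            }
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("xml-rows");
        List<String> rows = Arrays.asList("Test Data 1", "Test Data 2", "Test Data 3");
        for (Path file : writeRows(dir, rows)) {
            System.out.println(file);
        }
    }
}
```

If the row values can contain characters that are not valid in file names (a slash, for example), they would need to be sanitized before being used this way.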

 

This should give you an idea of how to turn rows of data into individual variables and outputs.

  1. Neha

    Hi Admin,

    I have a requirement where each row from my database has to be transformed into a separate .txt file. How can I achieve this? Please help.

    Regards,
    Neha Mishra

  2. lucian

    I’m interested in doing the reverse: extracting to XML from a flat, unstructured file (SWIFT MT format). I’m thinking of something with regex, but I don’t know if it’s possible. Any ideas?
    Thanks

    • admin

      Hi Lucian, try using the tFileInputDelimited -> tFileOutputXML components. If you need a complex XML output, you can look at the tFileOutputMSXML or tAdvancedFileOutputXML components.

  3. Prajan

    Hi, I have a use case where I'll get a list of XML files (each XML file will have a different schema). Based on the schema, I need to map each file to the corresponding RDBMS table (the table already exists) and insert the records. Can you tell me whether this is possible and, if so, give me some idea of how it can be achieved?

    • admin

      Hi Prajan. I suggest the following:

      1. Iterate over each file with the tFileList component, connected via an iterate link to a tJava component.
      2. Connect the tJava component to your multiple tFileInputXML components via “If” links. The condition of each If link should route the job based on the file name of the current file from the tFileList component in step 1. This assumes you have predefined XML file name rules for each file schema type.
      3. Build out your separate mappings from tFileInputXML -> DB Table Output from there.

  4. Prajan

    Thank you so much for your suggestion. I will try to implement the steps and get back to you. Once again, thank you.

  5. Sowmya

    Hi,
    Can you explain the difference between flow to iterate and iterate to flow?

  6. Sowmya

    Hi,
    I have inserted 10 rows into the DB using a sequence. If I re-run the job, the rows are inserted, but the id starts from 1 again. I want it to start from 11 and end at 20. Can you please explain how I can overcome this issue?
