Our client maintained a database of company information, including street addresses aggregated from a variety of sources. To reduce the number of duplicate entries created at load time, we built an integrated solution in Talend Open Studio that transforms each street address into a standardized form based on USPS specifications.
This job contains the following features:
- Connection to Oracle database
- Extraction of street address data from loading table
- Custom Java string manipulation of address data
- Look-up and replacement of specific stop words in company and address data, applied in a specified order
- Loading of standardized data into review table
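
As a rough illustration of the string-manipulation and ordered look-up steps, the following is a minimal sketch in plain Java, not the code used in the actual Talend job. The class name `AddressStandardizer`, the method `standardize`, and the handful of replacement rules are all hypothetical; in the job itself the rules would presumably come from a look-up table rather than being hard-coded. The sketch shows why order matters: the multi-word phrase "POST OFFICE BOX" has to be handled before the single word "OFFICE", or the phrase would be partially rewritten first.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names and rules are illustrative assumptions.
class AddressStandardizer {

    // LinkedHashMap keeps insertion order, so rules are applied
    // in exactly the order they are listed here.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("POST OFFICE BOX", "PO BOX"); // phrase first
        RULES.put("OFFICE", "OFC");             // then the single word
        RULES.put("STREET", "ST");
        RULES.put("AVENUE", "AVE");
        RULES.put("BOULEVARD", "BLVD");
        RULES.put("SUITE", "STE");
    }

    static String standardize(String raw) {
        if (raw == null) {
            return "";
        }
        // Normalize case, drop periods and commas, collapse whitespace.
        String s = raw.toUpperCase(Locale.ROOT)
                      .replaceAll("[.,]", " ")
                      .replaceAll("\\s+", " ")
                      .trim();
        // Replace whole words or phrases only, one rule at a time, in order.
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            String pattern = "\\b" + Pattern.quote(rule.getKey()) + "\\b";
            s = s.replaceAll(pattern, Matcher.quoteReplacement(rule.getValue()));
        }
        return s;
    }
}
```

Under these assumed rules, `AddressStandardizer.standardize("123 Main Street, Suite 4")` yields `123 MAIN ST STE 4`, and `"Post Office Box 12"` becomes `PO BOX 12` rather than `POST OFC BOX 12`. Two source records that differ only in case, punctuation, or abbreviation then compare as equal, which is what makes duplicates detectable before loading.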