Getting Started with SAP Data Quality (Part 3)

Data Quality, part of SAP Data Services, is a powerful tool for cleansing and matching customers, businesses, postal addresses, and much more. However, the wide range of options and capabilities can be intimidating for new users.

If you’ve followed the steps in the previous posts in this series, you should have functioning cleanse transforms generating a basic set of customer and address data, as well as a general understanding of the contents of those fields and how they are parsed and standardized. The next step in many cases is to feed the resulting data to one or more Match transforms.

For the purposes of the Match transforms, some of these fields are more critical than others. In this post, we will briefly review the output fields from the Address Cleanse and Data Cleanse transforms, focusing on those used by downstream Match transforms.

Address Cleanse Output

The Address Cleanse transform generates many output columns, but we want the ones that give the match engine the most flexibility and clarity. This means we generally avoid the combined fields and opt for discrete elements instead. For example, instead of the full PRIMARY_ADDRESS, we will generate PRIMARY_NAME, PRIMARY_NUMBER, PRIMARY_TYPE, and so on. These are, in fact, the fields the Match transform expects to see. Less granular columns have to be assigned to the generic ADDRESS_DATA field or to custom fields for the Match transform to parse, which often duplicates work the cleanse has already performed.
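
To illustrate why discrete elements give a matcher more to work with, here is a minimal Python sketch. It is not SAP Data Services code and does not reproduce how the Match transform scores records. The field names come from this post, while the sample values, the helper functions, and the 0.8 threshold are assumptions made up for the illustration.

```python
from difflib import SequenceMatcher

FIELDS = ("PRIMARY_NUMBER", "PRIMARY_NAME", "PRIMARY_TYPE")

# Hypothetical cleansed records: same street, different house numbers.
rec_a = {"PRIMARY_NUMBER": "100", "PRIMARY_NAME": "MAIN", "PRIMARY_TYPE": "ST"}
rec_b = {"PRIMARY_NUMBER": "1000", "PRIMARY_NAME": "MAIN", "PRIMARY_TYPE": "ST"}


def similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1] (stand-in for a real match rule)."""
    return SequenceMatcher(None, a, b).ratio()


def combined_address(rec: dict) -> str:
    """Rebuild a single address line, as a combined field would present it."""
    return " ".join(rec[f] for f in FIELDS)


def combined_score(a: dict, b: dict) -> float:
    """Compare the two records as one opaque string each."""
    return similarity(combined_address(a), combined_address(b))


def discrete_match(a: dict, b: dict, name_threshold: float = 0.8) -> bool:
    """Apply a different rule per element: exact house number and street
    type, fuzzy street name. The threshold is an arbitrary example value."""
    if a["PRIMARY_NUMBER"] != b["PRIMARY_NUMBER"]:
        return False
    if a["PRIMARY_TYPE"] != b["PRIMARY_TYPE"]:
        return False
    return similarity(a["PRIMARY_NAME"], b["PRIMARY_NAME"]) >= name_threshold


if __name__ == "__main__":
    # The combined strings differ by one character, so they score above 0.9.
    print(f"combined score: {combined_score(rec_a, rec_b):.2f}")
    # The discrete rule rejects the pair because the house numbers differ.
    print(f"discrete match: {discrete_match(rec_a, rec_b)}")
```

With a single combined string, "100 MAIN ST" and "1000 MAIN ST" look nearly identical. With separate elements, each one can be held to its own standard, which is the kind of flexibility the discrete output fields are meant to provide.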

Getting Started with SAP Data Quality (Part 1)
Getting Started with SAP Data Quality (Part 2)

About Bruce Labbate
Bruce is a business intelligence consultant specializing in data warehousing, data quality, and ETL development. Bruce delivers customized SAP Data Services solutions for customers across all industries. With Decision First Technologies, Bruce utilizes Data Integrator, Data Quality, Information Design Tool, and a variety of database technologies, including SQL Server, DB2, Oracle, Netezza, and HANA.

