SAP Information Steward is a web-based application that provides users with a centralized environment to analyze, assess, categorize, monitor and improve enterprise data quality. The tool consists of five core functionalities: Data Insight, Metadata Management, Metapedia, Cleansing Package Builder and Match Review. In this article, we will go over the basic functionality of each area, followed by our best practices for setting up an omni-channel environment with Information Steward for better data management.
Assumptions: Prior to building projects in SAP Information Steward, connections must first be set up via the Central Management Console (CMC). The discussion below assumes such connections have already been established.
When you launch the Information Steward web application, the five core functionalities mentioned above appear as the top-level options.
Data Insight can be used to profile data, create validation rules, and view scorecards that provide a near real-time understanding of data quality. Six types of profiling can be performed in Data Insight: address profiling, column profiling, dependency profiling, redundancy profiling, uniqueness profiling and content type profiling.
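To make the idea of profiling concrete, the sketch below computes the kind of basic column statistics a profiling run reports (null count, distinct count, value lengths). This is a hypothetical illustration of the concept, not Information Steward's actual implementation or output format.

```python
# Illustrative sketch only: approximates the metrics column profiling reports.
# Not Information Steward's code or exact metric set.

def profile_column(values):
    """Return basic profiling metrics for one column of data."""
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "count": len(values),                 # total rows profiled
        "nulls": len(values) - len(non_null), # missing or blank values
        "distinct": len(set(non_null)),       # unique non-null values
        "min_length": min((len(str(v)) for v in non_null), default=0),
        "max_length": max((len(str(v)) for v in non_null), default=0),
    }

# Example: profiling a sample "city" column with a null and a blank
stats = profile_column(["Chicago", "Dallas", "Chicago", None, ""])
print(stats)
```

Running this against a real table column gives a quick sense of completeness and cardinality, which is the same question a scorecard answers at a glance.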
Address profiling is used to gauge the percentage of address data that is valid, correctable or invalid based on the engine selected. Valid addresses are deliverable as-is. Invalid means the Data Services address cleansing transform cannot correct the addresses; detailed reasons can be determined by running the transform. Correctable addresses can be corrected automatically by the address cleansing transform.
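The valid/correctable/invalid breakdown can be sketched as follows. Here `classify` is a toy stand-in for the Data Services address cleansing engine, which actually performs this judgment; the categories simply mirror the definitions above.

```python
# Hypothetical sketch of address profiling's category breakdown.
# classify() is a toy stand-in for the address cleansing engine's verdict.

def classify(address):
    """Toy rule: street suffix = deliverable; non-empty = repairable."""
    if "St" in address or "Ave" in address:
        return "valid"        # deliverable as-is
    if address.strip():
        return "correctable"  # the cleansing transform could repair it
    return "invalid"          # the transform cannot correct it

def address_profile(addresses):
    """Return the percentage of addresses falling in each category."""
    counts = {"valid": 0, "correctable": 0, "invalid": 0}
    for a in addresses:
        counts[classify(a)] += 1
    total = len(addresses)
    return {k: round(100.0 * v / total, 1) for k, v in counts.items()}

result = address_profile(["100 Main St", "42 Oak Ave", "somewhere", ""])
```

The percentages produced here correspond to the valid/correctable/invalid figures the address profiling report surfaces.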
Metadata Management is used to categorize data across all systems and achieve an in-depth understanding of data relationships. Data lineage analysis helps users understand how data flows between systems.
Metapedia allows users to define standard business terms for words, phrases or concepts, associate those terms with metadata objects, and apply them to Data Insight quality rules and scorecards.
Cleansing Package Builder sets standards for cleansing data, and packages can be published for use against various source systems in DQM projects. A common strategy for building a new source-focused cleansing package is to start from the default PERSON_FIRM package.
Opening the package displays the rules it contains. Any modified package can be saved and published as a new package for use by Data Cleanse transforms within Data Services Designer.
Match Review is used to review match results and approve or reject matched sets. This gives end users detailed insight into data quality and a basis for deciding which actions to take.
Omni-channel data management process
Information Steward allows users to provide data governance, but it does not fix the underlying data by itself. Setting up an omni-channel data management process lets users analyze, review and fix data to eliminate potential data problems. With the integration of Data Services Designer, such a process can be established.
After profiling the data and building a customized cleansing package, real-time or batch jobs can be developed that use the package to perform additional data merging and matching. The results can then be published to a predefined database table in the format required by the Information Steward Match Review. Note that before establishing the connection in Information Steward, the connection to the database itself must be set up.
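The shaping step above can be sketched as flattening match groups into one row per candidate record, the kind of flat layout a Match Review configuration maps its columns against. The column names used here (MATCH_GROUP_ID, SOURCE_RECORD_ID, and so on) are illustrative assumptions, not the product's required schema.

```python
# Hypothetical sketch: flatten match results into a review-table layout.
# Column names are illustrative, not Information Steward's required schema.

def to_review_rows(match_groups):
    """Emit one flat row per record in each matched group."""
    rows = []
    for group_id, records in match_groups.items():
        for rec in records:
            rows.append({
                "MATCH_GROUP_ID": group_id,    # which duplicate set this row belongs to
                "SOURCE_RECORD_ID": rec["id"], # key back to the source system
                "SOURCE_SYSTEM": rec["system"],
                "MATCH_SCORE": rec["score"],   # similarity score from the match job
            })
    return rows

groups = {
    "G1": [
        {"id": "C-001", "system": "CRM", "score": 95},
        {"id": "E-774", "system": "ERP", "score": 91},
    ],
}
rows = to_review_rows(groups)
```

In practice a batch job would write these rows into the predefined database table that the Match Review configuration reads from.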
The following screenshots illustrate the steps required to set up Match Review so end users can perform review and approval tasks. Start by clicking Manage in the upper-right corner, then select Match Review Configurations.
As the image below illustrates, several columns are required and must be mapped properly for the match review to be set up correctly.
The last step is to assign permissions determining who can act as a reviewer versus an approver.
Once setup is complete, the task must be run to display results to end users.
After a task is created, reviewers and approvers can open it and select the sections they need for further action.
When a user approves a match group, the decision is captured in the corresponding action table. A real-time or batch job can then be built that uses this table and the approvers' responses to fix the data in the original system.
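A minimal sketch of that downstream job is shown below: read the action table, act only on approved groups, and apply the fix to the source records. The table and field names (STATUS, MASTER_ID, DUPLICATE_ID) are hypothetical assumptions for illustration, not Information Steward's actual action-table schema.

```python
# Hypothetical sketch of a batch job consuming the Match Review action table.
# STATUS / MASTER_ID / DUPLICATE_ID are assumed names, not the real schema.

def apply_approved_actions(action_rows, source):
    """Merge approved duplicates into their master; skip rejected groups."""
    for row in action_rows:
        if row["STATUS"] != "APPROVED":
            continue  # only act on groups the approvers confirmed
        survivor = row["MASTER_ID"]
        duplicate = row["DUPLICATE_ID"]
        # Toy "fix": repoint the duplicate at the surviving master record
        source[duplicate] = {"merged_into": survivor}
    return source

actions = [
    {"STATUS": "APPROVED", "MASTER_ID": "C-001", "DUPLICATE_ID": "E-774"},
    {"STATUS": "REJECTED", "MASTER_ID": "C-002", "DUPLICATE_ID": "E-775"},
]
source = {"C-001": {}, "E-774": {}, "C-002": {}, "E-775": {}}
result = apply_approved_actions(actions, source)
```

A real job would issue the equivalent updates against the source system's database rather than an in-memory dictionary, but the control flow is the same: approvals drive the fixes, rejections leave the data untouched.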
With this method, the first batch job can be rerun to obtain the latest result sets, and updating the task refreshes the data under review. As reviewers and approvers complete their tasks, the downstream job applies the corresponding actions in the original source system. This closes the loop and completes the omni-channel data management process.
Protiviti understands that an organization’s data is essential to its success, yet as companies generate more and more data every day, the question of how to manage and transform that data into valuable business intelligence becomes more difficult to answer. Our data management team helps organizations through the entire information lifecycle, including strategy, management and reporting, to ensure decision-makers have the right information at the right time. Protiviti’s data management project experience includes:
- Identifying data issues
- Designing data warehouses and data strategy
- Defining data mapping logic
- Testing and UAT support