In Part 1 of this blog series, I discussed Information Views and provided some tips on how to use them more effectively. In this part, I will be discussing two design strategies and the pros and cons of each.
We’ll look at two methodologies:
- A “real-time centric” design will excel at operational reporting, but performance may be a challenge with increasing data volumes and significant transformation.
- A “storage centric” design will always scale well and offer exceptional performance.
Real-time Centric Design
In terms of HANA, this design style is where natural OLTP tables are utilized as the basis for creating real-time OLAP multi-dimensional models. When using these tables as the source, all join, calculation and transformation logic has to be implemented in the SAP HANA information views. The operations are conducted at runtime each time a query is executed. Generally, this is a very resource expensive way of designing a solution because the work has to be repeated over and over again. It also has a tendency to consume a lot of memory on the SAP HANA system. In some cases, more memory is required to compute the model than is actually required to store the data. It is sometime called ELT (Extract, Load, Transform) and it is a required design technique to achieve true zero latency real-time analytics.
- Products like SAP HANA Live ERP reporting utilize this concept heavily
- Effective for real-time as you can report directly against normalized source tables or replicated source tables in an SAP HANA ‘Sidecar’ format
- No physical ETL is required, but all transformations and processing occur at each query’s runtime
- Great for moderately complex operational reporting
- Excels for reporting scenarios where real-time is the highest concern and you are facing only one transaction table joined to several descriptive tables
- Use with care as the Calculation Engine is the slowest engine in SAP HANA
- Extreme table sizes + heavy transformations = slow performance
- May not scale well
- More difficult to do true business intelligence reporting using this design paradigm
- Many BI concepts require components and concepts that are difficult to model using only Calculation Views. This is due mostly to the number of projections that are required to manipulate multiple datasets for complex comparison activities
In terms of HANA, this design style is where the natural OLTP tables are converted into conformed Dimensions and Fact tables. To produce these new tables, an ETL process is executed. Most of the calculations and transformation logic is conducted in the ETL process. The SAP HANA OLAP model (or Information Views) are then only responsible for aggregating and joining the dimensions to the facts. Because most of the calculation and transformation logic is only calculated a few times each day with the ETL process, the SAP HANA models are not as expensive to execute. Generally, they use considerably less memory and CPU power when executed. Because the OLTP tables have to be copied into new structure, we call this a storage heavy model. It’s a model that re-persists some of the more complex transformations into new tables.
- Exceptional performance
- Scales incredibly well
- The majority of processing is in the OLAP engine; this is the fastest engine in SAP HANA
- ETL/ELT is required for physical transformations
- Data is physically modeled into a star schema design using ETL/ELT processing
- Analytic Views are used extensively
- Real-time is more difficult and complex but still possible with SAP HANA Smart Data Integration
- Multiple sources will likely need a latency based ETL approach. This will render some sources to be batch based in nature. Many customers just run batches every 5 – 15 minutes if near real-time is needed
For pure performance and scalability, stay in the OLAP engine. A storage heavy design paradigm ensures this is maintained and scaling is very linear.
ETL/ELT processing and physically moving data is required for a storage heavy design paradigm. This is technically a drawback, but keep in mind the ETL/ELT becomes simpler as you are really only transforming basic structures. Use SAP HANA to transform what is not required to support a basic star schema.
You can read more about modeling techniques in my article “Mastering SAP HANA Data Modeling for Maximum Performance.”
Don Loden is an information management and information governance professional with experience in multiple verticals. He is an SAP-certified applications associate on SAP EIM products. He has more than 15 years of information technology experience in the following areas: ETL architect, development, and tuning; logical and physical data modeling; and mentoring on data warehouse, data quality, information governance, and ETL concepts. Don speaks globally and mentors on information management, governance, and quality. He authored book SAP Information Steward: Monitoring Data in Real-Time and is the co-author of two books: Implementing SAP HANA, as well as Creating SAP HANA Information Views. Don has also authored numerous articles for publications such as SAPinsider magazine, Tech Target and Information Management magazine.