HANA News Blog

SAP HANA data statistics for the optimizer

Jens Gleichmann • 28. Mai 2025

Popular misconcept: HANA needs no statistics

In the world of modern database management systems, query performance is not just a matter of hardware—it’s about smart execution plans. At the core of generating these plans lies a critical component: optimizer statistics. Every DBA is aware of the optimizer data statistics. Without this details the optimizer can only generate wrong execution plans. There are multiple optimizer statistics depending on the database:

system statistics (CPU, I/O, memory)
workload statistics
noworkload statistics
table statistics
column statistics
index statistics

Besides the different objects also the collect method is important. There are dynamic or runtime statistics and histograms. As always up-to-date and accurate statistics will lead to better execution plans depending on the costs. If a table is changing too frequently accurate and up-to-date statistics are not possible and would generate too heavy overhead. It must be in balance and this is depending on the different workloads.

Accurate statistics help the optimizer decide on:

Join order
Join method (e.g., hash join vs. nested loop)
Index usage
Degree of parallelism

Poor decisions can lead to excessive memory usage and CPU spikes.

The note 2100010 - SAP HANA: Popular Misconceptions deserves an update regarding the optimizer stats. Many "experts" talk about HANA as magical database which tunes itself and needs no optimizer statistics. The truth is that also HANA relies on different statistic data to generate an optimal execution plan. For virtual/remote tables like for SDA scenarios the manual statistics are used since HANA 1.0 SPS 06. There are different types of data statistics:

Statistics type	Source	Details	Created
HISTOGRAM	explicit	Creates a data statistics object that helps the query optimizer estimate the data distribution in a single-column data source. If you specify multiple columns in <data_sources>, then multiple data statistics objects (HISTOGRAM) are created--one per column specified.	manually
SAMPLE	explicit, runtime	When beneficial, the SQL optimizer generates system SAMPLE data statistics objects automatically on column and row store tables. However, this behavior can incur a cost to performance. You can avoid this cost by creating SAMPLE data statistics objects explicitly (in advance). Creating them explicitly is especially useful in situations where sampling live table data is expensive (for example, very large tables). (default: 1000 records)	automatically on column and row store tables
SIMPLE	explicit, runtime	Simple column statistics likemin, max, null count, count, and distinct count for a single-column data source. When beneficial, the SQL optimizer maintains system SIMPLE data statistics objects automatically on column and row store tables only. Superset of RECORD COUNT statistics	automatically on column and row store tables
SKETCH	explicit	Column group statistics, e.g. number of combination of values of multiple columns Useful for selectivity estimations when more than one column is evaluated	manually
TOPK	explicit, runtime	Column statistics including most frequent column values Mainly used for join-size estimation Superset of SIMPLE statistics	automatically on column store tables
RECORD_COUNT	explicit, runtime	Record count on table level	automatically on column and row store tables

runtime: Automatic ad-hoc calculation

explicit: Manual creation via Syntax CREATE STATISTICS

For SAMPLE statistics, explicit statistics take precedence over runtime statistics.

For other statistics types, runtime statistics take precedence over explicit statistics. If you want to change this behavior, you can force the use of explicit statistics instead of runtime statistics using the following parameter: indexserver.ini -> [datastatistics] -> dev_force_use_non_runtime_datastatistics = true

Although HANA is an in-memory database not always all data is loaded into the memory. In some cases only a partition or some columns are loaded. The runtime statistics can only be created if all data is loaded. This happens during the prepare and compile phase. In some cases e.g. in the context of NSE this can take significant longer.

If the needed column is not fully loaded the data statistics are estimated. This approach eliminates column load overhead, but the data statistics estimations may be imprecise, imposing the risk that a bad execution plan is picked. The distinct count of a column is in this case set to the row count, representing a unique column. This can be far away from the real distinct count and unfavorable join orders and execution engines can be the consequence. In this case it can make sense to use manual statistics which have also to be updated frequently.

If you create data statistics manually you can choose between different refresh types:

Refresh type	Detail
AUTO	specifies that the data statistics object is refreshed automatically when underlying data changes. AUTO is only supported on column store, extended store, and multistore tables.
MANUAL	specifies that the database statistics object is not refreshed until a rebuild is explicitly requested by a REFRESH STATISTICS statement.
DEFAULT	specifies that the database server decides the best refresh strategy based on the data source. For example, for data statistics objects on column store data sources, the database server applies AUTO for the default.

Examples:

CREATE STATISTICS ON TABLE_NAME.COLUMN TYPE SAMPLE SAMPLE SIZE 10000 REFRESH TYPE MANUAL;

CREATE STATISTICS ON TABLE_NAME.COLUMN TYPE TOPK BUCKETS 1500 REFRESH TYPE AUTO;

Details:

2800028 - FAQ: SAP HANA Optimizer Statistics

2000002 - FAQ: SAP HANA SQL Optimization

< Älterer Beitrag

E-Mail

SAP HANA News by XLC

HANA OS maintenance

von Jens Gleichmann • 27. Mai 2025

Please notice that when you want to run HANA 2.0 SPS07, you need defined OS levels. As you can see RHEL7 and SLES12 are not certified for SPS07. The SPS07 release of HANA is the basis for the S/4HANA release 2023 which is my recommended go-to release for the next years. Keep in mind that you have to go to SPS07 when you are running SPS06 because it will run out of maintenance end of 2023.

HANA performance degradation after upgrade to SPS07+SPS08

Ongoing HANA HEX performance issues with SPS07+SPS08

von Jens Gleichmann • 27. Mai 2025

With SPS06 and even stronger in SPS07 the HEX engine was pushed to be used more often. This results on the one hand side in easy scenario to perfect results with lower memory and CPU consumption ending up in faster response times. But in scenarios with FAE (for all entries) together with FDA (fast data access), it can result in bad performance. After some customers upgraded their first systems to SPS07 I recommended to wait for Rev. 73/74. But some started early with Rev. 71/72 and we had to troubleshoot many statement. If you have similar performance issues after the upgrade to SPS07 feel free to contact us! Our current recommendation is to use Rev. 74 with some workarounds. The performance degradation is extreme in systems like EWM and BW with high analytical workload.

The HANA Optimizer is one of the core components of the HANA database. It determines how SQL is exec

SAP HANA Optimizer - the heart of the performance

von Jens Gleichmann • 25. Mai 2025

A database optimizer behaves similarly to the navigation system in your car. You use various parameters to determine which route you want to take to reach a particular destination. Potential additional costs for tolls and ferries also play a role, as do other limitations such as the vehicle's height, length, and width. From these input parameters, the navigation system determines the optimal route based on current traffic information (traffic volume, construction sites, congestion, accidents, closures, etc.), weather conditions, and the length of the route. This means that with exactly identical input parameters, different routes, costs, and thus different travel times can arise depending on the given conditions.

Is NSE worth the effort or is the better question: Do you know your cost per GB of RAM?

Is the effort for NSE worth it?

von Jens Gleichmann • 18. April 2025

Most of our presentations on data tiering projects end with these typical questions: How much we will save? How fast can it be implemented? Is the effort worth it over time? My counter question: "Do you know how much 1 GB of memory costs your company per month or year?" => how much memory we have to save to be beneficial?

Buchtipp: SAP HANA Deepdive: Optimierung und Stabilität im Betrieb

von Jens Gleichmann und Matthias Sander • 30. März 2025

Unser erster Buch mit dem Titel "SAP HANA Deepdive: Optimierung und Stabilität im Betrieb" ist erschienen.

More time to switch from BSoH to S/4HANA

SAP ERP Extended Maintenance until 2033 - but with conditions

von Jens Gleichmann • 7. Februar 2025

Recently handelsblatt released an article with a new SAP RISE option called SAP ERP, private edition, transition option. This option includes a extended maintenance until the end 2033. This means 3 years more compared to the original on-prem extended maintenance. This statement was confirmed by SAP on request of handelsblatt, but customers receive more details, such as the price, in the first half of the year. This is a quite unusual move of SAP without any official statement on the news page. Just to raise more doubts? Strategy? However a good move against the critics and the ever shorter timeline. Perhaps it is also a consequence of the growing shortage of experts for operating and migrating the large number of systems.

Smart SAP HANA Optimization: Applying NSE for Better Efficiency

von Matthias Sander • 15. Januar 2025

When it comes to optimizing SAP HANA, the balance between performance and cost efficiency is critical. I am happy to share a success story where we used the Native Storage Extension (NSE) to significantly optimize memory usage while being able to adjust the sizing at the end. The Challenge: Our client was operating on a 4 TB memory SAP HANA system, where increasing data loads were driving up costs and memory usage. They needed a solution to right-size their system without compromising performance or scalability. The client wanted to use less hardware in the future. The Solution: We implemented NSE to offload less frequently accessed data from memory. The activation was customized based on table usage patterns: 6 tables fully transitioned to NSE 1 table partially transitioned (single partition) 1 table transitioned by specific columns

SAP HANA NSE - a technical deepdive with Q&A

HANA NSE - technical details Q&A

von Jens Gleichmann • 6. Januar 2025

SAP NSE was introduced with HANA 2.0 SPS04 and based on a similar approach like data aging. Data aging based on a application level approach which has a side effect if you are using a lot of Z-coding. You have to use special BADI's to access the correct data. This means you have to adapt your coding if you are using it for Z-tables or using not SAP standard functions for accessing the data in your Z-coding. In this blog we will talk about the technical aspects in more detail.

The SAP Enterprise Cloud Services Private Cloud Customer Center (PC3) - a new digital delivery

3 years of RISE with SAP: ECS realignment

von Jens Gleichmann • 5. Januar 2025

The SAP Enterprise Cloud Services Private Cloud Customer Center (PC3) - a new digital delivery engagement model dedicated to manage service delivery for RISE with SAP S/4HANA Cloud, private edition customers.

Proactive maintenance for SAP RISE will start now in 2025

SAP RISE: Proactive Maintenance