DHAMB_PROFILING_JOB - SAP Data Intelligence - Profiling

This documentation is copyright by SAP AG.

Purpose

This report provides the SAP Data Intelligence profiling functionality for ABAP datasets. It is scheduled internally by the ABAP Profiler operator, processes exactly one dataset (table, view, or CDS view), and determines the following information:

  • The total number of records for the specified dataset.
  • The minimum, maximum, and average number of values for each column, as well as the average length of values for String columns.
  • The percentage of null and blank values.
  • The ten values that occur most frequently for each column.
  • Whether a column is rarely populated with values (sparsely populated) or whether a value in a column is repeated often (low cardinality).
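
To make the list above concrete, the following is a minimal Python sketch of how such per-column statistics could be computed. It is not the report's ABAP implementation; the function name `profile_column`, the result keys, and the treatment of whitespace-only strings as blank are assumptions made for illustration.

```python
from collections import Counter


def profile_column(values):
    """Compute simple profiling statistics for one column (illustrative only)."""
    total = len(values)

    def is_blank(v):
        # Assumption: a blank value is an empty or whitespace-only string.
        return isinstance(v, str) and v.strip() == ""

    nulls = sum(1 for v in values if v is None)
    blanks = sum(1 for v in values if is_blank(v))
    populated = [v for v in values if v is not None and not is_blank(v)]

    stats = {
        "total_records": total,
        "null_percent": 100.0 * nulls / total if total else 0.0,
        "blank_percent": 100.0 * blanks / total if total else 0.0,
        # The ten most frequent values, most common first.
        "top_values": Counter(populated).most_common(10),
    }

    # Minimum, maximum, and average only make sense for numeric values.
    numbers = [v for v in populated if isinstance(v, (int, float))]
    if numbers:
        stats["min"] = min(numbers)
        stats["max"] = max(numbers)
        stats["avg"] = sum(numbers) / len(numbers)

    # Average length is reported for string values only.
    strings = [v for v in populated if isinstance(v, str)]
    if strings:
        stats["avg_length"] = sum(len(s) for s in strings) / len(strings)

    return stats
```

For example, `profile_column(["A", "B", "A", None, ""])` counts five records, of which 20 percent are null and 20 percent are blank, with "A" as the most frequent value.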

You can access these results in the SAP Data Intelligence Metadata Explorer, and evaluate them there.

Integration

The report is called by the ABAP Profiler operator in SAP Data Intelligence.

When the ABAP Profiler operator calls the report, it profiles a dataset using the values specified in the operator and saves the results in a JSON file. The JSON file is then saved in a database table so that the results can be viewed and evaluated in SAP Data Intelligence.
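
The flow described above (serialize the results to JSON, then store the JSON in a table) can be sketched in Python as follows. This is an illustration only: the real JSON layout and the database table are not specified here, so the function `store_profiling_result`, the dictionary standing in for the table, and the `test_mode` flag (modeled on the report's test mode, which only shows the JSON) are all hypothetical.

```python
import json


def store_profiling_result(result_table, dataset_name, result, test_mode=False):
    """Serialize a profiling result to JSON and store it, unless in test mode.

    result_table is a plain dict standing in for the database table
    (an assumption for this sketch).
    """
    payload = json.dumps(result, sort_keys=True)
    if not test_mode:
        # Keyed by dataset name so a later run replaces the earlier result.
        result_table[dataset_name] = payload
    # The JSON is returned in both cases so that it can be inspected.
    return payload
```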

Prerequisites

Features

Selection

Under General Settings, you can view the following:

  • Sparsely Populated Percentage
The percentage of null and initial values in a column at which the column is considered sparsely populated.
  • Low Cardinality Percentage
The percentage at which a column is considered to have a low cardinality.
  • Graph ID
The ID of the SAP Data Intelligence graph that schedules this report.
  • Dataset Name
The name of the dataset that will be profiled. You can determine the name of the dataset in SAP Data Intelligence. The name follows a path-like naming convention, for example: /TABLES/SD/SLS/VBAK
  • Run Report in Test Mode
You can run the report in test mode to only view the generated JSON file. No profiling data is saved for the dataset.
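
As a rough illustration of how the two percentage settings above might be applied, here is a short Python sketch. The exact formulas used by the report are not documented here, so the comparisons below are assumptions: a column is treated as sparsely populated when its share of null and initial values reaches the Sparsely Populated Percentage, and as low cardinality when its share of distinct values is at or below the Low Cardinality Percentage. The function and parameter names are hypothetical.

```python
def classify_column(total, empty, distinct, sparse_pct, low_card_pct):
    """Flag a column as sparsely populated and/or low cardinality.

    total    -- number of records in the column
    empty    -- number of null or initial values
    distinct -- number of distinct values
    """
    if total == 0:
        # Nothing to classify in an empty dataset.
        return {"sparse": False, "low_cardinality": False}
    empty_pct = 100.0 * empty / total
    distinct_pct = 100.0 * distinct / total
    return {
        # Assumed rule: many empty values means sparsely populated.
        "sparse": empty_pct >= sparse_pct,
        # Assumed rule: few distinct values relative to the record count.
        "low_cardinality": distinct_pct <= low_card_pct,
    }
```

With thresholds of 90 and 1, a column of 1,000 records holding 950 empty values and 5 distinct values would be flagged as both sparsely populated and low cardinality under these assumed rules.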

Under Data Sampling Settings, you can view the following:

  • Sample Size (in MB)
The size of the sample set of data, in MB.
  • Sampling Threshold (in MB)
The minimum size of the dataset for data sampling. If a dataset is smaller than this number, then no sampling will take place.
  • Package Size (in Records)
For data sampling, the report processes the dataset in packages. You can adjust the number of records processed in a package.
  • Max. Records to Read
The maximum number of records to read if data sampling is active.
  • Activate Data Sampling for SAP HANA
If you select this checkbox, the report uses a sample set of data for the profiling. Note that this option is only relevant for SAP HANA databases. Note that for all other databases, data sampling will take place.
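
The interaction of the sampling threshold, package size, and maximum record count can be sketched in Python as follows. This is a simplified model under stated assumptions, not the report's logic: the helper names `should_sample` and `read_packages` are hypothetical, and the sketch does not model how the Sample Size setting is applied or how records are selected for the sample.

```python
from itertools import islice


def should_sample(dataset_mb, threshold_mb):
    """Sampling applies only if the dataset is not smaller than the threshold."""
    return dataset_mb >= threshold_mb


def read_packages(records, package_size, max_records):
    """Yield lists of at most package_size records, stopping after max_records."""
    # Cap the total number of records read across all packages.
    limited = islice(iter(records), max_records)
    while True:
        package = list(islice(limited, package_size))
        if not package:
            return
        yield package
```

For instance, `list(read_packages(range(10), 4, 9))` yields three packages holding 4, 4, and 1 records, because reading stops after nine records.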





