Wednesday, 12 September 2012

Classic and Extended Star Schema Comparisons

In Classic star schema, dimension and master data table are same. But in Extend star schema, dimension and master data table are different. (Master data resides outside the Infocube and dimension table, inside Infocubecube).

In Classic star schema we can analyze only 16 angles (perspectives) whereas in extended star schema we can analyze in 16*248 angles. Plus the performance is faster to that extent.

Below are some of the basic differences between the two.

Classic Star Schema               SAP BW Star Schema
Cube                                         InfoCube
Fact Table                                 Key Figure or KPI
Dimension Attribute                   Characteristic, Attributes, Hierarchy Node
Dimension Table                        Dimension Table, Master Data Table, External Table, SID Table
-                                                Standard Business Content
-                                                Hierarchies
-                                                MultiCube
-                                                Remote Cube

Advantages of Classic Star Schema


Data access runs performantly due to the small number of joining operations.

There are only join operations between the fact tables and the involved dimension tables.

Disadvantages of Classic Star Schema


Redundant entries exist in the dimension tables

Historization of dimensions is not easy to model. The dimension changing may be done slowly.
No multi language capability
It is difficult to model some hierarchy’s dimension.
Query performance is also made worse, since aggregates and fact data are stored in the same table (fact table).

Advantages of the SAP BW Star Schema


Faster access to data than via long alpha-numeric keys. SAP BW use automatically generated INT4 keys for SID and Dimension ID

Can model in easy way: Historizing, multi-lingual, and shared dimensions. It is happen because of the excavation of master data from the dimension tables using the SID technique.
The query performance is improved here as aggregated key figures can be stored in their own fact tables.

OLAP and OLTP


Online Transaction Processing (OLTP) refers to a class of systems that facilitate and managetransaction-oriented applications, typically for data entry and retrieval transaction processing.

On Line Analytical Processing (OLAP), a series of protocols used mainly for business reporting. Using OLAP, businesses can analyze data in all manner of different ways planning, simulation, data warehouse reporting, and trend analysis.

OLAP and OLTP are two absolutely different systems since they have different purpose and environments. OLAP for analytical compare to OLTP for transactional.

Difference between OLAP and OLTP

Target 


OLTP is used in operative environment to get efficiency through automation of businessprocesses. OLAP is used in informative environment, usually used by management to support in decisions making.


Priorities 


As transactional system, OLTP has high availability and higher data volume.OLAP as analytical system is very simple data and has flexible data access.


Level of detail

OLTP stores data in a very high level of detail, whereas OLAP stores data in aggregation.

Age of data 


OLTP data are current data. It means the data stored in OLTP with minimal history. OLAP data are historical data.


Database operation


Frequent data changes are a feature of operative system. So, in OLTP system we can read, add, change, delete or refresh data. In OLAP, we only can read the data since they are frozen after a certain point for analysis purpose.


Integration of data from various applications (system) 


Since the OLTP system is for operation, it has minimal integration with other applications. In contrast to the OLTP system, OLAP need high integration of information from many application or system because it used for analysis.


Normalization in database 


Due to reduction in data redundancy, normalization is very high requirement in OLTP. In OLAP, typically de-normalized with fewer tables; use of extended star schema and lower performance.

SAP BI Terminology

InfoArea

Info Area is like “Folder” in Windows. InfoArea is used to organize InfoCubes, InfoObjects, MultiProviders, and InfoSets in SAP BW.

InfoObject Catalog 

Similar to InfoArea, InfoObject Catalog is used to organize the InfoObject based on their type. So we will have InfoObjects Catalogs of type Characteristics & KeyFigures.

Info Objects

It is the bsic unit or object in SAP BI used to create any structures in SAP BI. 
Each field in the source system is referred as InfoObject on SAP BI.
We have 5 types of Info Objects: Characteristic, KeyFigure, Time Characteristic, Unit Characteristic, and Technical Characteristic.

Data Source

Data Source defines Transfer Structure.
Transfer Structure indicates what fields and in what sequence are they being transferred from the source system. 
We have 4 types of data source:
Attr: used to load master data attr 
Text: Used to load text data 
Hier: used to load hierarchy data 
Transcation data: used to load transaction data to Info cube or ODS.

Source System

Source system is an application from where SAP BW extracts the data. 
We use Source system connection to connect different OLTP applications to SAP BI.
We have different adapters / connectors available:
SAP Connection Automatic
SAP Connection Manually 
My Self Connection
Flat file Interface
DB connect
External Systems with BAPI

Info Package

Info package is used to schedule the loading process. 
Info package is specific to data source. 
All properties what we see in the InfoPackage depends on the properties of the DataSource.

Administrator Workbench (AWB) in SAP BW

The Administrator Workbench (AWB) is the tool for data warehouse management in SAP BW (Business Information Warehouse). Using AWB we can manage, control and monitor all relevant objects and processes in SAP BW including scheduling, data load monitoring and metadata maintenance. Following are the functions that we can perform using AWB:modeling, monitoring, transport connection, documents, business content, translation, and metadata repository.

Modeling

Here we can create and maintenance objects relevant to the data staging process in SAP BW. For example, create InfoProvider, InfoObject, InfoSource, maintenance and define source system and PSA.

Transaction code (T-Code): RSA1


Monitoring

In monitoring Function area, we can monitor and control data loading process and other data process in SAP BW.

Transaction code (T-Code): RSMON


Transport Connection

This function is used to maintenance and move object between SAP systems: development to quality assurance (QA/Test) and QA to production.


Document

The document function area, we can maintenance links for one or more documents in various formats, versions and languages for SAP BW Objects.


Business Content

Pre-configured information models based on metadata is maintenance in this area. Business Content Function are provides us a selection of information that we can use to fulfill our tasks.

Transaction code (T-Code): RSORBCT.


Translation

The use of translation function area is to translate short and long texts belonging to SAP BW- objects.


Metadata Repository

In the HTML-based SAP BW- Metadata Repository, all SAP BW meta objects and the corresponding links to each other are managed centrally. Together with an integrated Metadata Repository browser, a search function is available enabling a quick access to the meta objects. In addition, metadata can also be exchanged between different systems, HTML pages can be exported, and graphics for the objects can be displayed.

Transaction code: RSOR.


Modeling, monitoring, transport connection, documents, business content, translation, and metadata repository are tasks or functions that we can do in SAP BW Administrator Workbench (AWB). 

More About Star Schema

SAP BW Star schema is based on the extended of the classic star schema (aka “snowflake” schema). The enhancement comes from the fact that the dimension table does not contain master data information. This master data information is stored in separate tables, called master data tables. In SAP BW Star Schema, the distinction is made between two self-contained areas: infocube and surrogate ID (SID) tables.


Infocube

Infocubes are the central objects on which reports and analysis are based in SAP BW. It describes a self-contained data set within a business area, for which we can define queries.


Infocube consist of a central fact table and several surrounding dimension tables. In SAP BW star schema , the facts in a fact table refers to key figure and the dimension attributes refer to characteristics.

In contrast to classic star schema concepts, characteristics are not component the dimension tables since the characteristic values are not stored in the dimension tables. They are stored in master data tables. There are foreign keys that replace the characteristics as the component of dimension table, i.e.: SID stand for Surrogate ID. In picture above, these keys are given the prefix SID_.

Each Dimension table has a generated primary key, called the dimension key. (In Picture above, the keys are given the prefix DIM_ID_..). As the classic star schema, the primary key of the fact table is made up of dimension keys.

Master data Table or Surrogate ID (SID) Table
In SAP BW, additional information about characteristic is referred to as master data. Master data information is stored in separate tables called master data tables. There are 3 types of master data, i.e.: Attributes, Text and Hierarchies.


Here, SID tables play an important role in linking the data warehouse information structured to the subject-oriented infocubes. The master data tables, text tables, and hierarchy tables are not directly linked to the associated dimension tables. These tables are infocube independent and joined with the dimension tables using SID table. The SID therefore provides the link to the dimension tables.

The connection between Infocube and Master Data Tables



Picture above illustrates the connection between master data table and infocube. Master data tables are connected to an infocube by way of the SID tables. The picture above also explains that in SAP BW star schema, the master data is independent infocube and can be used by several infocubes at the same time.