
Friday, 14 September 2012

Creating Process Chains for DSO

A process chain is a sequence of processes that are scheduled to wait in the background for an event. Some of these processes trigger a separate event that can, in turn, start other processes. It looks similar to a flow chart. You define the list of InfoPackages/DTPs that are needed to load the data, say a delta or full load. Then you schedule the process chain to run hourly, daily, monthly, and so on, depending on the requirement.
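
To picture how the event mechanism works, here is a minimal sketch in Python (illustrative only, not SAP code; all process and event names are invented): each process raises an event when it finishes, and any process waiting on that event runs next.

# Minimal sketch of the event-driven idea behind a process chain.
# Illustrative Python, not SAP code; all names are invented.

# Each process declares the event it waits for and the event it raises.
processes = [
    {"name": "start",               "waits_for": None,          "raises": "started"},
    {"name": "execute_infopackage", "waits_for": "started",     "raises": "data_loaded"},
    {"name": "execute_dtp",         "waits_for": "data_loaded", "raises": "dso_updated"},
    {"name": "activate_dso_data",   "waits_for": "dso_updated", "raises": "chain_done"},
]

def run_chain(processes):
    """Fire the start event, then run every process whose event arrives."""
    pending = list(processes)
    events = {None}          # the start process waits for "no event"
    while pending:
        runnable = [p for p in pending if p["waits_for"] in events]
        if not runnable:
            break            # nothing can run: the chain is stuck (error case)
        for p in runnable:
            print(f"running {p['name']}")
            events.add(p["raises"])   # finishing a process triggers its event
            pending.remove(p)

run_chain(processes)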

Use: If you want to load and schedule a hierarchy automatically, you can include this as an application process in a process chain.

Note: Before you define process chains, you need to have the objects ready, i.e. DSOs, InfoCubes, etc.

Steps to Create Process Chains 

Step 1: Go to transaction RSPC. Click on Create. Give the process chain a name and a description.


Step 2: Give the start process a name and a description and press Enter.

Note: Each process chain should start with a Start Process 


Step 3: The next screen shows the scheduling options.

There are two options:

Direct Scheduling
Start Using Meta Chain or API

In my example I have chosen Direct Scheduling, as I am processing only one chain, i.e. the DSO. Click on "Change Selections".


In the screenshot below, you can give the scheduling details (Immediate, Date/Time, etc.) and click on Save.


The screenshot below indicates that we have started the process.


Step 4:

Click on the Process Types icon as shown in the figure below. You will get a list of options.
In my example I am scheduling for a DSO. To process it, we need the InfoPackage and the DTPs for the corresponding DSO.
Open the tree Load Process and Post Processing. We need to drag and drop "Execute InfoPackage".



Step 5: Once you drag and drop "Execute InfoPackage", we get the popup below. We need to key in the InfoPackage name. To do this, press F4, choose your InfoPackage, and press Enter.


Step 6: Once you drag and drop the InfoPackage, the corresponding DTPs and the corresponding active data table are automatically included.


Step 7: Save + Activate and Execute the Process.

Step 8: Once the process is completed, we can see the whole chain turn green, which indicates that the process completed successfully.

Note: In case of errors in any step, the affected step is displayed in red.


Step 9: We can check whether the data was successfully updated by right-clicking on the DataStore object.


Step 10: Selecting Administer Data Target leads you to InfoProvider administration.


Step 11: Click on the "Active Data" tab to see the data that was successfully uploaded.


Note:

Similarly, the process can be done for cubes, master data, and transactional data.
When you create process chains, they are stored in "Not Assigned" by default. If you want to create your own folders:

a) Click on the icon “Display Components” on your toolbar => choose F4 => Create




b) Give an appropriate name for the folder => Enter


c) SAVE + ACTIVATE

Wednesday, 12 September 2012

Data Warehousing: Step by Step

Administrator Workbench (AWB) in SAP BW

The Administrator Workbench (AWB) is the tool for data warehouse management in SAP BW (Business Information Warehouse). Using the AWB we can manage, control, and monitor all relevant objects and processes in SAP BW, including scheduling, data load monitoring, and metadata maintenance. The following are the functions that we can perform using the AWB: modeling, monitoring, transport connection, documents, business content, translation, and metadata repository.

Modeling

Here we can create and maintain objects relevant to the data staging process in SAP BW: for example, create InfoProviders, InfoObjects, and InfoSources, and maintain and define source systems and the PSA.

Transaction code (T-Code): RSA1


Monitoring

In the Monitoring function area, we can monitor and control the data loading process and other data processes in SAP BW.

Transaction code (T-Code): RSMON


Transport Connection

This function is used to maintain and move objects between SAP systems: from development to quality assurance (QA/test), and from QA to production.


Documents

In the Documents function area, we can maintain links to one or more documents in various formats, versions, and languages for SAP BW objects.


Business Content

Preconfigured information models based on metadata are maintained in this area. The Business Content function area provides us with a selection of information that we can use to fulfill our tasks.

Transaction code (T-Code): RSORBCT.


Translation

The Translation function area is used to translate the short and long texts belonging to SAP BW objects.


Metadata Repository

In the HTML-based SAP BW Metadata Repository, all SAP BW meta objects and their links to each other are managed centrally. Together with the integrated Metadata Repository browser, a search function is available, enabling quick access to the meta objects. In addition, metadata can be exchanged between different systems, HTML pages can be exported, and graphics for the objects can be displayed.

Transaction code: RSOR.


Modeling, monitoring, transport connection, documents, business content, translation, and metadata repository are the tasks and functions that we can perform in the SAP BW Administrator Workbench (AWB).

More About Star Schema

The SAP BW star schema is an extension of the classic star schema (it is sometimes loosely called a "snowflake" schema). The enhancement comes from the fact that the dimension tables do not contain master data information. This master data information is stored in separate tables, called master data tables. In the SAP BW star schema, a distinction is made between two self-contained areas: the InfoCube and the surrogate ID (SID) tables.


Infocube

InfoCubes are the central objects on which reports and analyses are based in SAP BW. An InfoCube describes a self-contained data set within a business area, for which we can define queries.


An InfoCube consists of a central fact table and several surrounding dimension tables. In the SAP BW star schema, the facts in the fact table are the key figures, and the dimension attributes are the characteristics.

In contrast to the classic star schema, characteristics are not components of the dimension tables, since the characteristic values are not stored there. They are stored in master data tables. Foreign keys, called surrogate IDs (SIDs), replace the characteristics as components of the dimension tables. In the picture above, these keys are given the prefix SID_.

Each dimension table has a generated primary key, called the dimension key. (In the picture above, these keys are given the prefix DIM_ID_.) As in the classic star schema, the primary key of the fact table is made up of the dimension keys.

Master Data Table or Surrogate ID (SID) Table
In SAP BW, additional information about a characteristic is referred to as master data. Master data information is stored in separate tables called master data tables. There are three types of master data: attributes, texts, and hierarchies.


Here, SID tables play an important role in linking the data warehouse information structures to the subject-oriented InfoCubes. The master data tables, text tables, and hierarchy tables are not directly linked to the associated dimension tables. These tables are InfoCube-independent and are joined to the dimension tables via the SID tables. The SID therefore provides the link to the dimension tables.

The connection between Infocube and Master Data Tables



The picture above illustrates the connection between the master data tables and the InfoCube. Master data tables are connected to an InfoCube by way of the SID tables. The picture also shows that in the SAP BW star schema, the master data is independent of the InfoCube and can be used by several InfoCubes at the same time.

Extended Star Schema

The BW extended star schema differs from the basic star schema: in the extended star schema, the fact table is connected to the dimension tables, the dimension tables are connected to the SID tables, and the SID tables are connected to the master data tables.

The fact table and the dimension tables are inside the cube.

The SID tables and the master data tables are outside the cube.

One fact table can be connected to a maximum of 16 dimension tables, and one dimension table can be assigned a maximum of 248 SID tables (248 characteristics).

When we load transaction data into an InfoCube, the system generates DIM IDs based on the SIDs and uses the DIM IDs in the fact table.

Each characteristic can have its own master data tables (ATTR, TEXT, HIER). The attribute table stores the attribute data, the text table stores descriptions in multiple languages, and the hierarchy table stores the parent-child data.
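
To make the table linkage concrete, here is a small Python sketch of the lookup path (illustrative only; the table contents and field names are invented): a fact row carries a DIM ID, the dimension row carries a SID, and the SID resolves to a characteristic value and its master data.

# Illustrative sketch of the extended star schema (not SAP code).
# Fact table -> dimension table (DIM ID) -> SID table -> master data table.

# Master data table (attributes) for the characteristic "customer".
customer_attributes = {"C100": {"city": "Berlin"}, "C200": {"city": "Pune"}}

# SID table: surrogate integer keys for characteristic values.
customer_sid = {1: "C100", 2: "C200"}

# Dimension table: generated DIM IDs pointing at SIDs.
customer_dim = {10: {"SID_CUSTOMER": 1}, 11: {"SID_CUSTOMER": 2}}

# Fact table: DIM IDs plus key figures.
fact_table = [
    {"DIM_ID_CUSTOMER": 10, "revenue": 500.0},
    {"DIM_ID_CUSTOMER": 11, "revenue": 300.0},
]

def resolve(fact_row):
    """Follow DIM ID -> SID -> characteristic value -> master data."""
    sid = customer_dim[fact_row["DIM_ID_CUSTOMER"]]["SID_CUSTOMER"]
    value = customer_sid[sid]              # characteristic value
    attrs = customer_attributes[value]     # master data attributes
    return {"customer": value, **attrs, "revenue": fact_row["revenue"]}

for row in fact_table:
    print(resolve(row))

Because the SID and master data tables sit outside the cube, several InfoCubes could share customer_sid and customer_attributes, which is exactly the point of the extended star schema.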


Fact Table

The fact table contains dimension IDs and key figures.
Maximum dimension IDs: 16
Maximum key figures: 233
The dimension IDs in the fact table are connected to the dimension tables.
A fact table must have at least one dimension ID.

Dimension Table

A dimension table contains a dimension ID column and SID columns.
One column is used for the dimension ID.
There is a maximum of 248 SID columns.
We can assign a maximum of 248 characteristics to one dimension.

VirtualProviders

InfoSet

MultiProviders

More About DSO

Open Hub Destination

Definition

The open hub destination is the object that allows you to distribute data from a BI system to non-SAP data marts, analytical applications, and other applications. It ensures controlled distribution across multiple systems.

The open hub destination defines the target to which the data is transferred.

In earlier releases, the open hub destination was part of the InfoSpoke. It is now an independent object that provides more options as a result of its integration into the data flow.

The open hub service previously provided with the InfoSpoke can still be used. We recommend, however, that you use the new technology to define new objects.

The following figure outlines how the open hub destination is integrated into the data flow:

Use

Database tables (in the database for the BI system) and flat files can act as open hub destinations. You can extract the data from a database to non-SAP systems using APIs and a third-party tool.

Structure

The open hub destination contains all the information about a data target: the type of destination, the name of the flat file or database table and its properties, and the field list and its properties.

BI objects such as InfoCubes, DataStore objects, InfoObjects (attributes or texts), and InfoSets can function as open hub data sources. Note that DataSources may not be used as the source.

Integration

You can use the data transfer process to update data to the open hub destination. This involves transforming the data. Not all rule types are available in the transformation for an open hub destination: Reading master data, time conversion, currency translation, and unit conversion are not available.

InfoProviders

Definition

Generic term for BI objects into which data is loaded or that display views of data. You analyze this data in BEx queries.

Use

InfoProviders are different metaobjects in the data basis that can be seen within query definition as uniform data providers. Their data can be analyzed in a uniform way. The type of data staging and the degree of detail or "proximity" to the source system in the data flow differ from InfoProvider to InfoProvider. However, in the BEx Query Designer, they are all seen as uniform objects.

The following graphic shows how InfoProviders are integrated in the dataflow:

Structure

The term InfoProvider encompasses objects that physically contain data, such as InfoCubes, DataStore objects, and InfoObjects:

Staging is used to load data into these InfoProviders.
InfoProviders can also be objects that do not physically store data but which display logical views of data, such as VirtualProviders, MultiProviders, and InfoSets.

The following figure gives an overview of the BI objects that can be used in analysis and reporting. They are divided into InfoProviders that contain data and InfoProviders that only display logical views and do not contain any data. In BEx, the system accesses an InfoProvider; it is not important how the data is modeled.

InfoObject

Definition

Business evaluation objects are known in BI as InfoObjects. They are divided into characteristics (for example, customers), key figures (for example, revenue), units (for example, currency or amount unit), time characteristics (for example, fiscal year), and technical characteristics (for example, request number).

Use

InfoObjects are the smallest units of BI. Using InfoObjects, information is mapped in a structured form. This is required for constructing InfoProviders.
InfoObjects with attributes or texts can themselves also be InfoProviders (if used in a query).

Structure

Characteristics are sorting keys, such as company code, product, customer group, fiscal year, period, or region. They specify classification options for the dataset and are therefore reference objects for the key figures. In the InfoCube, for example, characteristics are stored in dimensions. These dimensions are linked by dimension IDs to the key figures in the fact table. The characteristics determine the granularity (the degree of detail) at which the key figures are kept in the InfoCube. In general, an InfoProvider contains only a subset of the characteristic values from the master data table. The master data includes the permitted values for a characteristic, which are known as the characteristic values.

The key figures provide the values that are reported on in a query. Key figures can be quantity, amount, or number of items. They form the data part of an InfoProvider.

Units are also required so that the values for the key figures have meaning. Key figures of type amount are always assigned a currency key, and key figures of type quantity also receive a unit of measurement.

Time characteristics describe the time reference of business events. The complete time characteristics provided by SAP are:

calendar day (0CALDAY)
calendar week (0CALWEEK)
calendar month (0CALMONTH)
calendar quarter (0CALQUARTER)
calendar year (0CALYEAR)
fiscal year (0FISCYEAR) and
fiscal period (0FISCPER).
Incomplete time characteristics: 0CALMONTH2, 0CALQUART1, 0HALFYEAR1, 0WEEKDAY1, 0FISCPER3.
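
As a small illustration, the Python sketch below derives several of these time characteristics from a single calendar day (assuming ISO week numbering for 0CALWEEK; the fiscal characteristics are skipped because they depend on the fiscal year variant):

# Sketch: deriving time characteristics from a calendar day (illustrative only).
from datetime import date

day = date(2012, 9, 14)

calday     = day.strftime("%Y%m%d")                    # 0CALDAY     -> "20120914"
calmonth   = day.strftime("%Y%m")                      # 0CALMONTH   -> "201209"
calquarter = f"{day.year}{(day.month - 1) // 3 + 1}"   # 0CALQUARTER -> "20123"
calyear    = str(day.year)                             # 0CALYEAR    -> "2012"
calweek    = f"{day.isocalendar()[0]}{day.isocalendar()[1]:02d}"  # 0CALWEEK (ISO assumption)

print(calday, calmonth, calquarter, calyear, calweek)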

Technical characteristics have only an organizational meaning within BI. An example is the request number in the InfoCube, which is obtained as an ID when loading requests. It helps you to find the request again.

Special features of characteristics:

If characteristics have attributes, texts, or hierarchies at their disposal, they are referred to as master data-bearing characteristics. Master data is data that remains unchanged over a long period of time. Master data contains information that is always needed in the same way. References to this master data can be made in all InfoProviders. You also have the option of creating characteristics with references. A reference characteristic provides the attributes, master data, texts, hierarchies, data type, length, number and type of compounded characteristics, lowercase-letters setting, and conversion routines for the new characteristic.

A hierarchy is always created for a characteristic. This characteristic is the basic characteristic for the hierarchy (basic characteristics are characteristics that do not reference other characteristics). Like attributes, hierarchies provide a structure for the values of a characteristic. Company location is an example of an attribute for Customer. You use this, for example, to form customer groups for a specific region. You can also define a hierarchy to make the structure of the Customer characteristic clearer.

Special features of key figures:

A key figure is assigned additional properties that influence the way data is loaded and how it is displayed in the query. These include the assignment of a currency or unit of measure, the setting of aggregation and exception aggregation, and the number of decimal places in the query.
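
To illustrate the difference between standard aggregation and exception aggregation, here is a rough Python sketch (the data, names, and choice of rule are invented): revenue is summed as usual, while a stock figure uses "last value" over the day, since summing stock levels over days would be meaningless.

# Sketch: standard aggregation (SUM) vs. exception aggregation (LAST over time).
# Illustrative only; data and names are invented.

records = [
    {"material": "M1", "day": "20120910", "revenue": 100.0, "stock": 40},
    {"material": "M1", "day": "20120911", "revenue": 150.0, "stock": 35},
    {"material": "M1", "day": "20120912", "revenue": 120.0, "stock": 50},
]

# Standard aggregation: the key figure is summed over the dropped characteristic.
total_revenue = sum(r["revenue"] for r in records)

# Exception aggregation: when aggregating over "day", take the last value
# instead of the sum.
last_stock = max(records, key=lambda r: r["day"])["stock"]

print(f"revenue (SUM): {total_revenue}")        # 370.0
print(f"stock (LAST over day): {last_stock}")   # 50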

Integration

InfoObjects can be part of the following objects:

1. Component of an InfoSource
    An InfoSource is a set of InfoObjects that logically belong together and are updated in InfoProviders.
2. Composition of an InfoProvider:
    An InfoProvider consists of a number of InfoObjects.
    In an InfoCube, the characteristics, units, and time characteristics form the basis of the key fields, and the key figures form the data part of the fact table of the InfoCube.
    In a DataStore object, characteristics generally form the key fields, but they can also be included in the data part, together with the key figures, units and time characteristics.
3. Attributes for InfoObjects

Transformation

Use
The transformation process allows you to consolidate, cleanse, and integrate data. You can semantically synchronize data from heterogeneous sources.

When you load data from one BI object into a further BI object, the data is passed through a transformation. A transformation converts the fields of the source into the format of the target.

Features

You create a transformation between a source and a target. The BI objects DataSource, InfoSource, DataStore object, InfoCube, InfoObject and InfoSet serve as source objects. The BI objects InfoSource, InfoObject, DataStore object and InfoCube serve as target objects.

The following figure illustrates how the transformation is integrated in the dataflow:


A transformation consists of at least one transformation rule. Various rule types, transformation types, and routine types are available. These allow you to create very simple to highly complex transformations:

● Transformation rules: Transformation rules map any number of source fields to at least one target field. You can use different rule types for this.
● Rule type: A rule type is a specific operation that is applied to the relevant fields using a transformation rule.
For more information, see Rule Type.
● Transformation type: The transformation type determines how data is written into the fields of the target.
For more information, see Aggregation Type.
● Rule group: A rule group is a group of transformation rules. Rule groups allow you to combine various rules.
For more information, see Rule Group.
● Routine: You use routines to implement complex transformation rules yourself. Routines are available as a rule type. There are also routine types that you can use to implement additional transformations.
For more information, see Routines in the Transformation.
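
The Python sketch below mimics a transformation using three common rule types: direct assignment, constant, and routine (illustrative only; a real BW transformation is defined graphically in the Workbench or in ABAP, and all field names here are invented).

# Sketch of a transformation: source fields -> target fields via rule types.
# Illustrative only; not how BW implements transformations internally.

def routine_uppercase(source_row):
    """Routine rule type: arbitrary logic implemented by the developer."""
    return source_row["customer_name"].strip().upper()

# One entry per target field: (rule type, rule detail).
transformation_rules = {
    "CUSTOMER":  ("direct_assignment", "customer_id"),   # copy a source field
    "COUNTRY":   ("constant", "DE"),                     # fixed value
    "CUST_NAME": ("routine", routine_uppercase),         # custom logic
}

def transform(source_row):
    target_row = {}
    for target_field, (rule_type, detail) in transformation_rules.items():
        if rule_type == "direct_assignment":
            target_row[target_field] = source_row[detail]
        elif rule_type == "constant":
            target_row[target_field] = detail
        elif rule_type == "routine":
            target_row[target_field] = detail(source_row)
    return target_row

print(transform({"customer_id": "C100", "customer_name": " acme gmbh "}))
# {'CUSTOMER': 'C100', 'COUNTRY': 'DE', 'CUST_NAME': 'ACME GMBH'}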

Persistent Staging Area

Purpose

The Persistent Staging Area (PSA) is the inbound storage area in BI for data from the source systems. The requested data is saved, unchanged from the source system.

Request data is stored in the transfer structure format in transparent, relational database tables in BI. The data format remains unchanged, meaning that no summarization or transformations take place, as is the case with InfoCubes.

When loading flat files, the data does not remain completely unchanged, since it is adjusted by conversion routines where necessary (for example, the date 31.12.1999 is converted to 19991231 in order to ensure uniformity of the data).
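
A conversion of this kind is simple; here is a minimal Python sketch of the date normalization just described (illustrative only, not SAP's actual conversion routine):

# Sketch of the date normalization described above:
# external format DD.MM.YYYY -> internal format YYYYMMDD.
def convert_date(external: str) -> str:
    day, month, year = external.split(".")
    return f"{year}{month}{day}"

print(convert_date("31.12.1999"))  # "19991231"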

You determine the PSA transfer method in transfer rule maintenance.

If you set the PSA when you are extracting data, you get improved performance if you use TRFCs for loading the data. The temporary storage facility in the PSA also allows you to check and change the data before the update into the data targets. In contrast to a data request with IDocs, a data request via the PSA also gives you various options for further updating data to the data targets. Coupling the load process with further processing in BI also contributes to improved load performance. If errors occur when data is processed further, the operative system is not affected.

The PSA provides the backup status for the ODS (until the overall staging process is confirmed). The duration of data storage in the PSA is medium-term, since the data can still be used for reorganization. For updates to ODS objects, however, data is stored only short-term.

In the PSA tree of the Administrator Workbench, a PSA is displayed for every InfoSource. You reach the PSA tree in the Administrator Workbench using either Modeling or Monitoring. In the PSA tree, the requested data records for an InfoSource appear under the source system they belong to, divided according to request.

Features

The data records in BI are transferred to the transfer structure when you load data with the transfer method PSA. One TRFC is performed for each data package. Data is written from the transfer structure to the PSA table and stored there. A transparent PSA table is created for each transfer structure that is activated. Each PSA table has the same structure as its respective transfer structure, plus key fields for the request ID, the data package number, and the data record number.

Since the requested data is stored unchanged in the PSA, it may contain errors if it contained errors in the source system. Once the requested data records have been written to the PSA table, you can check the data for the request and change incorrect data records.
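
The Python sketch below models a PSA table as described: the transfer-structure fields plus the technical key of request ID, data package number, and data record number, followed by the correction of a faulty record (illustrative only; the field contents are invented).

# Sketch: a PSA table keyed by request, data package, and record number.
# Illustrative only; the transfer-structure fields are invented.

psa_table = {
    # (request_id, package_no, record_no): transfer-structure fields
    ("REQU_001", 1, 1): {"customer": "C100", "amount": "500"},
    ("REQU_001", 1, 2): {"customer": "C2O0", "amount": "300"},  # typo: letter O
}

# Data is stored unchanged, so errors from the source survive in the PSA.
# Before the further update, faulty records can be checked and corrected:
psa_table[("REQU_001", 1, 2)]["customer"] = "C200"

for key, fields in sorted(psa_table.items()):
    print(key, fields)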

Depending on the type of update, data is transferred from the PSA table into the communication structure using the transfer rules. From the communication structure, the data is updated to the corresponding data target.

Using partitioning, you can separate the dataset of a PSA table into several smaller, physically independent, and redundancy-free units. This separation can mean improved performance when you update data from the PSA. In the BW Customizing Implementation Guide, under Business Information Warehouse → Connections to Other Systems → Maintain Control Parameters for Data Transfer, you determine the number of data records from which you want to create a partition. Only data records from a complete request are stored in a partition. The specified value is a threshold value.
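
The Python sketch below illustrates this threshold rule (the threshold value and request sizes are invented): requests are appended to the current partition until the record count reaches the threshold, and a request is never split across partitions.

# Sketch of PSA partitioning: only complete requests go into a partition,
# and the configured record count is a threshold, not a hard limit.
def partition_requests(requests, threshold):
    """requests: list of (request_id, record_count) in load order."""
    partitions, current, current_size = [], [], 0
    for request_id, record_count in requests:
        if current and current_size >= threshold:
            partitions.append(current)       # threshold reached: new partition
            current, current_size = [], 0
        current.append(request_id)           # a request is never split
        current_size += record_count
    if current:
        partitions.append(current)
    return partitions

print(partition_requests([("R1", 600), ("R2", 500), ("R3", 200)], threshold=1000))
# [['R1', 'R2'], ['R3']]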

As of SAP BW 3.0, you can use the PSA to load hierarchies from the DataSources released for this purpose. The corresponding DataSources will be delivered with Plug-In (-A) 2001.2, at the earliest. You can also use a PSA to load hierarchies from files.

Constraints

The number of fields is limited to a maximum of 255 when using TRFCs to transfer data. The length of the data record is limited to 1962 bytes when you use TRFCs.

Data transfer with IDocs cannot be used in connection with the PSA.

Modeling

Purpose

The tool you use for modeling is the Data Warehousing Workbench. Depending on your analysis and reporting requirements, different BI objects are available to you for integrating, transforming, consolidating, cleaning up, and storing data. BI objects allow efficient extraction of data for analysis and interpretation purposes.

Process Flow

The following figure outlines how BI objects are integrated into the dataflow:



Data that logically belongs together is stored in the source system as DataSources. DataSources are used for extracting data from a source system and transferring it into the BI system.

The Persistent Staging Area (PSA) in the BI system is the inbound storage area for data from the source systems. The requested data is saved, unchanged from the source system.

The transformation specifies how the data (key figures, time characteristics, characteristics) is updated and transformed from the source, into an InfoProvider or InfoSource. The transformation rules map the fields of the source to at least one InfoObject in the target. The information is mapped in structured form using the InfoObjects.

You need to use an InfoSource if you want to execute two transformations one after the other.

Subsequently, the data can be updated to further InfoProviders. The InfoProvider provides the data that is evaluated in queries. You can also distribute data to other systems using the open hub destination.
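
As a closing illustration, the Python sketch below strings the staging steps named above together as plain functions: extract from a DataSource, store unchanged in the PSA, transform, and update an InfoProvider (all names are invented and the logic is deliberately trivial).

# Sketch of the dataflow: DataSource -> PSA -> transformation -> InfoProvider.
# Illustrative only; all names are invented.

def extract_from_datasource():
    """DataSource: data that logically belongs together in the source system."""
    return [{"customer_id": "C100", "amount": "500"}]

def store_in_psa(rows):
    """PSA: inbound storage; data is saved unchanged from the source."""
    return list(rows)

def transform(rows):
    """Transformation: map source fields to target InfoObjects."""
    return [{"CUSTOMER": r["customer_id"], "AMOUNT": float(r["amount"])} for r in rows]

def update_infoprovider(rows):
    """InfoProvider: provides the data that queries evaluate."""
    return list(rows)

print(update_infoprovider(transform(store_in_psa(extract_from_datasource()))))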