QPR ProcessAnalyzer ETL Tutorial

From QPR ProcessAnalyzer Wiki
Jump to: navigation, search

Introduction to QPR ProcessAnalyzer ETL Tutorial

The QPR ProcessAnalyzer ETL system enables data extraction, transformation and loading based on SQL queries implemented by scripts. Scripts are written in SQL and consist of standard SQL commands, QPR ProcessAnalyzer commands and special parameters related to QPR ProcessAnalyzer. In this tutorial, you will learn the basics of building ETL scripts for extracting data and how to utilize that data for different purposes in QPR ProcessAnalyzer.

In general, ETL (Extract, Transform, Load) refers to:

  • Extracting data from the source system
  • Transforming the data to a specific data format
  • Loading the data to the target database, or other system.

Scripts are useful for performing several database related operations. More specifically, with the QPR ProcessAnalyzer ETL system it is possible to, for example:

  • extract data from a source system to data tables
  • load data from data tables and create events and case attributes to new or existing models based on this data
  • read data from existing models and create reports
  • import data from existing models and create new models extending the current features of QPR ProcessAnalyzer

Prerequisites for QPR ProcessAnalyzer ETL Tutorial

This tutorial is intended for users familiar with the basics of SQL. For studying SQL, please see, for example, http://www.w3schools.com/sql/default.asp.

For completing the tutorial, you need to have QPR ProcessAnalyzer Pro installed and have the SAP OrderToCash sample Excel file available for importing into QPR ProcessAnalyzer.

Principles of QPR ProcessAnalyzer ETL Scripts

Here is an example of a simple QPR ProcessAnalyzer ETL script that prints data taken from the SQL table #Customers: to a QPR ProcessAnalyzer client Excel sheet. (We will explain later on in this tutorial how this kind of a table can be created in the ETL environment.)

SELECT * FROM #Customers; 
(SELECT 'Title', 'Customers') UNION ALL 
(SELECT 'MaximumCount', '0'); 
--#ShowReport 

This script consists of SQL statements (lines 1-3) and a QPR ProcessAnalyzer command (line 4) that take the results of the preceding SQL SELECT statements as parameters.

Let’s go through this example in more detail:

Line 1:
SELECT * FROM #Customers; 

SELECT is one of the most basic SQL commands you can use. The SELECT statement is used to extract data from a table. You can define the criteria for what data is selected for extraction. On line 1 of the example above, the SQL statement selects all the columns from the #Customers table. In addition, this produces the first input argument for the QPR ProcessAnalyzer command —#ShowReport by defining what to print.

Note on SQL syntax: You may wonder why a temporary table #Customers is used instead of an ordinary table (The preceding hash symbol (#) indicates that it is a temporary table.) In the QPR ProcessAnalyzer Pro ETL system only temporary tables can be used. They exist only during the execution of the script so they are not stored permanently.

Lines 2-3:
(SELECT 'Title', 'Customers') UNION ALL 
(SELECT 'MaximumCount', '0'); 

These lines create the second input argument for the command --#ShowReport by giving a label for the sheet and defining how many lines to print. ‘0’ means that all rows will be printed.

SQL uses the UNION operator to combine the result-set of two or more SELECT statements. UNION ALL is used to select all the values (including duplicate values) whereas UNION is used to select only the distinct values.

Line 4:
--#ShowReport 

When writing scripts with the QPR ProcessAnalyzer ETL system, the lines in the script that start with "--#" (without the quotes and without any preceding blank spaces at the beginning of the line) are treated as QPR ProcessAnalyzer Commands and SQL ignores these lines as comments. The supported QPR ProcessAnalyzer commands and some examples of how to use them in scripts are listed in Data Extraction, Transformation, and Loading.

To create scripts, you need to have Administrator rights in the project and RunScripts rights for all projects.

Things to Note about SQL Syntax

  • It is important to follow carefully the syntax when writing SQL queries. QPR ProcessAnalyzer uses the Transact-SQL syntax.
  • SQL is not case sensitive, so you can write either SELECT or select. However, the parameter values and the QPR ProcessAnalyzer commands given as comments are case sensitive. For example, on lines 2-3 of the example above typing 'Maximumcount' instead of 'MaximumCount' would cause the QPR ProcessAnalyzer to omit the parameter, or in some cases, lead to an error. Also, typing --#Showreport instead of --#ShowReport would cause an error as well.
  • Some of the examples in this tutorial use parentheses around SELECT statements for the sake of clarity.
  • Semicolon is used to end SQL statements. It is not always required, but in some cases the missing semicolon may cause SQL errors due to ambiguity. We recommend using a semicolon especially at the end of SELECT statements preceding the QPR ProcessAnalyzer ETL commands (see lines 1-3 in the example above) to avoid SQL errors.
  • SQL data table field names may contain white spaces, special characters, or reserved words but then they must be referred to by using brackets ([]) around them. This situation happens frequently in QPR ProcessAnalyzer ETL scripts since the case and event attribute data contain fields that have extra spaces.

Getting Started with the Tutorial

Before you start creating your first script in QPR ProcessAnalyzer, create a new empty project and a model and import sample data manually into it:

  1. On the QPR tab in Excel, click Project Workspace.
  2. On the Project Workspace dialog, click the New button and select New Project...
  3. Give the new project the Name “ETL Tutorial” and click OK.
  4. Click the New button and select New Model…
  5. Give the model the Name “ETL Example Model” and click OK.
  6. On the QPR tab, click Help and then select Sample Files.
  7. Select the file SAP_OrderToCash and click Open.
  8. On the QPR tab, click Project Workspace.
  9. Select Import… and Data Table. Click Next.
    Etl import1.png
  10. Select Current Worksheet as the source and click Next.
    Etl import2.png
  11. Select Create New and enter the Table Name as “ETL Event Data”. Click Import.
    Etl import3.png
  12. Click OK.
    Etl import4.png
  13. Now you need to import the case attribute data in a similar manner. In the SAP_OrderToCash file, go to the sheet “Case Attribute Data”. Open Project Workspace.
  14. Select Import… and Data Table. Click Next.
    EtlImport5.png
  15. Select Current Worksheet as the source and click Next.
    EtlImport6.png
  16. Select Create New and enter the Table Name as “ETL Case Attribute Data”. Click Import.
    EtlImport7.png
  17. Click OK.
    EtlImport8.png
Now you have successfully imported the sample data by creating two data tables under the “ETL Tutorial” Project.
EtlImport9.png
Next you can start creating your own scripts by following the step-by-step exercises in the links below. We recommend completing them in the given order.

Exercises for QPR ProcessAnalyzer ETL Tutorial

Exercise 1: Creating Your First ETL Script
Exercise 2: Creating a Report in Excel
Exercise 3: Showing a Data Table
Exercise 4: Creating a Model from Data Tables
Exercise 5: Copying a Script to Another Context and Deleting a Script

Further Reading

W3Schools SQL tutorial
Transact-SQL Reference
SQL in Wikipedia