SQL Scripting for ETL: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
(16 intermediate revisions by the same user not shown)
Line 1: Line 1:
This page describes how to extract data from source systems and load data into QPR ProcessAnalyzer, i.e. how to do ETL (extract, transform and load). The supported commands and their descriptions are listed in [[QPR ProcessAnalyzer Scripting Commands]]. It is possible to load raw data into QPR ProcessAnalyzer, and do the data transformation and loading into QPR ProcessAnalyzer Service via scripts using temporary database tables, so that the resulting transformed data can be used for analyses in QPR ProcessAnalyzer. Scripts can be written in the [[Manage Scripts in QPR ProcessAnalyzer Excel Client#Script Manager|Manage Scripts]] dialog in the Excel Client. The script consists of SQL statements and QPR ProcessAnalyzer commands that take the preceding SQL statements as parameters.
This page describes how to extract data from source systems and load data into QPR ProcessAnalyzer, i.e. how to do ETL (extract, transform and load). The supported commands and their descriptions are listed in [[QPR ProcessAnalyzer Scripting Commands]]. It is possible to load raw data into QPR ProcessAnalyzer, and do the data transformation and loading into QPR ProcessAnalyzer Service via scripts using temporary database tables, so that the resulting transformed data can be used for analyses in QPR ProcessAnalyzer. Scripts can be written in the [[Manage Scripts in QPR ProcessAnalyzer#Script Manager|Manage Scripts]] dialog in the Excel Client. The script consists of SQL statements and QPR ProcessAnalyzer commands that take the preceding SQL statements as parameters.


__TOC__
__TOC__


== Script Variables ==
== Script Variables ==
The scripts have variables available listed in the table below. Even when a script context hasn't been selected, the variables get their values from a default filter. The default filter is the filter that was last used by the user in the model the user has last opened. If the user hasn't opened any model before, the default filter is taken from the last created model to which the user has access rights.
Scripts have the below listed variables available.


{| class="wikitable"
{| class="wikitable"
Line 10: Line 10:
! Description
! Description
|-
|-
|@_CurrentScriptId
|@_ProjectId (BIGINT)
|Id the the script being run. BIGINT. The difference between @_ScriptId and @_CurrentScriptId is apparent with nested scripts. @_ScriptId gives the id of the script that was started and which can call other scripts, @_CurrentScriptId gives the id of the script that is being run after the script has been called by the original script. That is, if you have nested scripts, @_ScriptId doesn't change, whereas @_CurrentScriptId is updated when the original script calls some other script.
|Id of the project in which context the script is run. Undefined if project context has not been specified in script execution parameters.
|-
|-
|@_DatabaseId
|@_ModelId (BIGINT)
|the Id of the database in use. GUID.
|Id of the model in which context the script is run. Undefined if model context has not been specified in script execution parameters.
|-
|-
|@_EngineScriptingVersion
|@_FilterId (BIGINT)
|The scripting version that identifies the functionalities available from QPR ProcessAnalyzer server. A script created for newer (bigger) scripting version doesn't necessarily work on a QPR ProcessAnalyzer server supporting older (smaller) version. INT.
|Id of the filter in which context the script is run. Undefined if filter context has not been specified in script execution parameters.
|-
|-
|@_ExceptionDetails
|@_UserId (INT)
|if there was an exception when running the script, contains the details that would have been displayed, including the system stack trace, NVARCHAR(MAX), otherwise NULL.
|Id of the user running the script.
|-
|-
|@_ExceptionMessage
|@_ExceptionOccurred (INT)
|if there was an exception when running the script, contains a message that would have been displayed, NVARCHAR(MAX), otherwise NULL.
|If there was an exception when running the script, the value is '''1''', otherwise '''0'''. INT
|-
|-
|@_ExceptionOccurred
|@_ExceptionMessage (NVARCHAR(MAX))
|if there was an exception when running the script, then this value is 1, otherwise 0. INT
|If there was an exception when running the script, contains the message of the exception.
|-
|-
|@_ExceptionType
|@_ExceptionDetails (NVARCHAR(MAX))
|if there was an exception when running the script, shows the C# class name for the exception, NVARCHAR(MAX), otherwise NULL.
|If there was an exception when running the script, contains the details of the exception, including the system stack trace.
|-
|-
|@_FilterId
|@_ExceptionType (NVARCHAR(MAX))
|the Id of the filter in whose context the script is being run. BIGINT. Undefined if filter context has not been specified in script execution parameters.
|If there was an exception when running the script, shows the C# class name for the exception.
|-
|-
|@_ModelId
|@_ScriptId (BIGINT)
|the Id of the model in whose context the script is being run. BIGINT. Undefined if model context has not been specified in script execution parameters.
|Id of the script that originally called to star the script run.
|-
|-
|@_ProjectId
|@_CurrentScriptId (BIGINT)
|the Id of the project in whose context the script is being run. BIGINT. Undefined if project context has not been specified in script execution parameters.
|Id of the script where the execution currently is. @_ScriptId gives the id of the script that was started and which can call other scripts, @_CurrentScriptId gives the id of the script that is being run after the script has been called by the original script. That is, if you have nested scripts, @_ScriptId doesn't change, whereas @_CurrentScriptId is updated when the original script calls some other script.
|-
|-
|@_QPRProcessAnalyzerVersion
|@_QPRProcessAnalyzerVersion (NVARCHAR(64))
|the QPR Process Analyzer core dll version as string in the format: <major>.<minor>.<build>.<revision>. NVARCHAR(64).
|QPR ProcessAnalyzer version.
|-
|@_ScriptId
|Id of the script that was started. BIGINT.
|-
|@_ScriptingVersion
|The scripting version number that identifies the version which was used when script was saved. Can be used for indicating, for example, the version for which the script was originally planned. A script created for newer (bigger) scripting version doesn't necessarily work on a PA server supporting older (smaller) version. INT.
|-
|@_UserId
|the Id of the user running the script. INT.
|-
|#_ViewSettings
|The Settings shown in the right-hand pane in Excel used to generate the selected view. Note that this doesn't include analysis results or QPR ProcessAnalyzer Server related information, such as FilterId, TotalEventCount, or DatabaseId. Represented as a table two column table variable. The columns are:
* '''Name''': the name of the parameter. NVARCHAR(440)
* '''Value''': the value of the parameter. NVARCHAR(MAX)
|-
|#_Selection
|Objects user has currently selected on the analysis. Supported objects are the event types and flows in the Flowchart or Path analyses and the Excel cells in the Duration, Profiling, Cases, Variations, Event Types, or Flows analyses. Any filters already applied are included in the selection. The maximum amount for the selection is 1000 separate objects and 10000 cells and their values. Represented as a four column table variable, where columns are:
* '''Type''': type of the selected object. INT. Types are described in the table below.
* '''IntValue''': integer representation of the selection value (used for object IDs etc.). Data type is BIGINT.
* '''Value''': string representation of the selected value. Used for selections that require more complex representation than a single integer value. Data type is NVARCHAR(MAX).
* '''Name''': the unique name for the selection value. See the table above for the list of name used for each type. Note that there can be multiple rows with the same name if the selection consists of multiple "objects". Data type is NVARCHAR(MAX).
|-
|-
|}
|}


<div id="ObjectSelectionTypes"></div>
== Script Operation Parameters ==
{|
It's possible to pass parameters to scripts. If not passed, variables get their values from a default filter. The default filter is the filter that was last used by the user in the model the user has last opened. If the user hasn't opened any model before, the default filter is taken from the last created model to which the user has access rights.
|STYLE="vertical-align: top; width: 50%;"|
{| class="wikitable"
! Type
! Description
|-
|0
|Common. Values for the Common Type are represented as a Value-IntValue pair. See the table on the right for descriptions.
|-
|2
|Activity. Has the following names and values:
*SelectedActivities (IntValue): One activity/event type id in every row.
|-
|6
|Activity Path. Has the following names and values:
*SelectedActivityPaths (Value): One activity path in every row.
|-
|8
|Attribute Value. Has the following names and values:
*SelectedAttributeValues (Value): One attribute value in every row.
|-
|10
|Case. Has the following names and values:
*SelectedCases (IntValue): One case id in every row.
|-
|12
|Count. Has the following names and values:
*SelectedCounts (IntValue): One variation event type count in every row.
|-
|14
|Duration. Has the following names and values:
*SelectedDurations (IntValue): One duration in every row.
|-
|16
|Transition. Has the following names and values:
*SelectedTransition (Value): One transition/flow in every row.
|-
|18
|Variation. Has the following names and values:
*SelectedVariations (IntValue): One variation in every row.
|-
|20
|Trends. Has the following names and values:
*SelectedEventTypeTrends (Value): One selected event type trend with all its selected period identifiers in every row.
*SelectedTrendPeriodLevel: Selected period level for trend selection, possible values are 0 = Day, 1 = Week, 2 = Month, 3 = Quarter, 4 = Year.
*SelectedTrendStartDate: Selected start date in ISO format (e.g. 2016-01-28) for the trend selection.
|}
|STYLE="vertical-align: top; padding-left: 20px"|
{| class="wikitable"
! Common Type Value
! Description
|-
|NumberOfActivities
|The amount of activities in the selection.
|-
|NumberOfActivityPaths
|The amount of activity paths in the selection.
|-
|NumberOfAttributeValues
|The amount of attribute values in the selection.
|-
|NumberOfCases
|The amount of cases in the selection.
|-
|NumberOfCounts
|See the "Group By Event Count" entry in the [[Variation Analysis_(PAPO)]] page.
|-
|NumberOfDurations
|The amount of durations in the selection.
|-
|NumberOfTransitions
|The amount of transitions in the selection.
|-
|NumberOfVariations
|The amount of variations in the selection.
|-
|DurationGranularity
|The duration time unit used in the selection. See the "Group By" entry in the [[Duration AnalysisFlowchart_Analysis_(PAPO)|Duration analysis]] page for reference.
|-
|DurationMaximum
|The duration limit in the selection. See the "Duration Limit" entry in the [[Duration AnalysisFlowchart_Analysis_(PAPO)|Duration analysis]] page for reference.
|-
|ReversedPaths
|The value is "1" if the "Predecessors" Direction was selected in the [[Path Analysis in QPR ProcessAnalyzer Excel Client]], otherwise "0".
|-
|SelectionType
|What was selected. The value for this parameter is shown on the "Value" column instead of the IntValue column.
|}
|}


== Script Variable Examples ==
Parameters of an operation are the following:
* FilterId: Specifies the id of the filter in which to operate.
* ModelId: Specifies the id of the model in which to operate.
* ProjectId: Specifies the id of the project in which to operate.
* ProjectName: Specifies the name of the project in which to operate (ignored if ProjectId defined). Can not be used when importing data since the possible new project name resulting from the import operation is given using ProjectName parameter.
* ModelName: Specifies the name of the project in which to operate (ignored if ModelId defined). Can not be used when importing any model data, such as events, since the possible new model name resulting from the import operation is given using ModelName parameter.
* FilterName: Specifies the name of the project in which to operate (ignored if FilterId defined). If multiple context parameters have been defined, they all must point to the same context (intersection of all the defined contexts is not empty).


The following script fragment defines the ProjectId, ModelId, and FilterId parameters by using the script variables:
'''Model import parameters''' are parameters in which the whole script is being run and the parameters change when import operations are made during the operation of the script. When a script run is started, script parameters are copied as model import parameters. When running import operations during the script execution (e.g. ImportEvents or ImportDataTable) the target model or datatable of the import replace the earlier value in the model import parameters.
<pre>SELECT @_ProjectId as ProjectId, @_ModelId as ModelId, @_FilterId as FilterId;</pre>


The following script gets the View Settings currently in use in the Settings pane in Excel and shows them as a table in a new Excel sheet:
When running a script, parameters can be given either for the entire script run or explicitly for each operation. When running any operation, the used parameter values are determined primarily from the parameters given to the operation, or (if not given) from the model import parameters.
<pre>(SELECT * FROM #_ViewSettings)
(SELECT 'SheetName', 'ViewSettings')
--#ShowReport</pre>


The following script gets the current selection on the analysis and shows the data as a table in a new Excel sheet:
== Script Variable Examples ==
<pre>(SELECT * FROM #_Selection)
The following script command defines ProjectId, ModelId, and FilterId variables by using the script variables:
(SELECT 'SheetName' , 'Selection')
<pre>SELECT @_ProjectId as ProjectId, @_ModelId as ModelId, @_FilterId as FilterId;</pre>
--#ShowReport</pre>


The following script gets various information about the environment and shows them in a new Excel sheet:
The following script gets various information about the environment:
<pre>SELECT  @_QPRProcessAnalyzerVersion as QPRProcessAnalyzerVersion, @_ScriptingVersion as ScriptingVersion, @_EngineScriptingVersion as EngineScriptingVersion, @_UserId as Userid, @_DatabaseId as DatabaseId
<pre>SELECT  @_QPRProcessAnalyzerVersion as QPRProcessAnalyzerVersion, @_UserId as Userid
(SELECT 'SheetName', 'Info')
(SELECT 'SheetName', 'Info')
--#ShowReport</pre>
--#ShowReport</pre>
Line 188: Line 80:
@_ExceptionDetails 'ExceptionDetails'
@_ExceptionDetails 'ExceptionDetails'
INTO #PACommandExceptions</pre>
INTO #PACommandExceptions</pre>
== Using temporary tables in SQL Server ==
When transforming data in QPR ProcessAnalyzer Pro (i.e. when connected to the QPR ProcessAnalyzer Service), only temporary tables ('''#''') should be used. Note that '''global''' temporary tables ('''##''') cannot be used in the SQL scripts, as using them is not allowed. When using QPR ProcessAnalyzer Xpress, these limitations do not apply.
== Print Command in SQL Server ==
The [http://msdn.microsoft.com/en-us/library/ms176047.aspx Print SQL statement] can be used to generate log entries into the script execution log.


== Running Scripts ==
== Running Scripts ==
A script can be run in the following ways:
Scripts can be run in the following ways:
 
* Running scripts from the [[Manage Scripts in QPR ProcessAnalyzer|Manage Scripts]] dialog
*# On the QPR tab, click '''Manage Scripts'''.
*# From the dialog, select the context in which the script you wish to run exists.
*# Select the script you wish to run.
*# Click '''Run'''.
* Running scripts from the '''Ribbon'''
* Running scripts from the '''Ribbon'''
*# On the '''QPR''' tab of the ribbon, click '''Run'''.
*# On the '''QPR''' tab of the ribbon, click '''Run'''.
*# Select the script to be run from the opening '''Script Gallery'''.
*# Select the script to be run from the opening '''Script Gallery'''.
* Running scripts from the '''[[Manage Scripts in QPR ProcessAnalyzer Excel Client|Manage Scripts]]''' dialog
*# On the QPR tab of the ribbon, click '''Manage Scripts'''.
*# From the dialog, select the context in which the script you wish to run exists.
*# Select the script you wish to run.
*# Click '''Run'''.


== Terminating Scripts ==
== Terminating Scripts ==
Line 218: Line 103:
Note that terminating a script will not revert any changes the script has already done in the database before the '''Cancel''' button is clicked.
Note that terminating a script will not revert any changes the script has already done in the database before the '''Cancel''' button is clicked.


== Things to Note About Scripts ==
== Notes for Scripting ==
When writing and using scripts, take the following things into account:
When writing and using scripts, take the following points into account:
* When transforming data in QPR ProcessAnalyzer, '''temporary tables''' can be used (temporary tables names start with '''#'''). Note that using the '''global temporary tables''' (tables names start with '''##''') is not allowed.
* The [http://msdn.microsoft.com/en-us/library/ms176047.aspx Print SQL statement] can be used to generate log entries into the script execution log.
* Only those lines in the script that start with "--#" (without the quotes) are treated as QPR ProcessAnalyzer Commands, i.e. if there are leading whitespaces before the command, the line is treated as a comment.
* Only those lines in the script that start with "--#" (without the quotes) are treated as QPR ProcessAnalyzer Commands, i.e. if there are leading whitespaces before the command, the line is treated as a comment.
* If you don't define a value for the MaximumCount parameter, 1000 will be used as default, i.e. only the 1000 first rows from a given table or model will be used.
* If you don't define a value for the MaximumCount parameter, 1000 will be used as default, i.e. only the 1000 first rows from a given table or model will be used.

Revision as of 09:57, 11 June 2020

This page describes how to extract data from source systems and load data into QPR ProcessAnalyzer, i.e. how to do ETL (extract, transform and load). The supported commands and their descriptions are listed in QPR ProcessAnalyzer Scripting Commands. It is possible to load raw data into QPR ProcessAnalyzer, and do the data transformation and loading into QPR ProcessAnalyzer Service via scripts using temporary database tables, so that the resulting transformed data can be used for analyses in QPR ProcessAnalyzer. Scripts can be written in the Manage Scripts dialog in the Excel Client. The script consists of SQL statements and QPR ProcessAnalyzer commands that take the preceding SQL statements as parameters.

Script Variables

Scripts have the below listed variables available.

Variable name Description
@_ProjectId (BIGINT) Id of the project in which context the script is run. Undefined if project context has not been specified in script execution parameters.
@_ModelId (BIGINT) Id of the model in which context the script is run. Undefined if model context has not been specified in script execution parameters.
@_FilterId (BIGINT) Id of the filter in which context the script is run. Undefined if filter context has not been specified in script execution parameters.
@_UserId (INT) Id of the user running the script.
@_ExceptionOccurred (INT) If there was an exception when running the script, the value is 1, otherwise 0. INT
@_ExceptionMessage (NVARCHAR(MAX)) If there was an exception when running the script, contains the message of the exception.
@_ExceptionDetails (NVARCHAR(MAX)) If there was an exception when running the script, contains the details of the exception, including the system stack trace.
@_ExceptionType (NVARCHAR(MAX)) If there was an exception when running the script, shows the C# class name for the exception.
@_ScriptId (BIGINT) Id of the script that originally called to star the script run.
@_CurrentScriptId (BIGINT) Id of the script where the execution currently is. @_ScriptId gives the id of the script that was started and which can call other scripts, @_CurrentScriptId gives the id of the script that is being run after the script has been called by the original script. That is, if you have nested scripts, @_ScriptId doesn't change, whereas @_CurrentScriptId is updated when the original script calls some other script.
@_QPRProcessAnalyzerVersion (NVARCHAR(64)) QPR ProcessAnalyzer version.

Script Operation Parameters

It's possible to pass parameters to scripts. If not passed, variables get their values from a default filter. The default filter is the filter that was last used by the user in the model the user has last opened. If the user hasn't opened any model before, the default filter is taken from the last created model to which the user has access rights.

Parameters of an operation are the following:

  • FilterId: Specifies the id of the filter in which to operate.
  • ModelId: Specifies the id of the model in which to operate.
  • ProjectId: Specifies the id of the project in which to operate.
  • ProjectName: Specifies the name of the project in which to operate (ignored if ProjectId defined). Can not be used when importing data since the possible new project name resulting from the import operation is given using ProjectName parameter.
  • ModelName: Specifies the name of the project in which to operate (ignored if ModelId defined). Can not be used when importing any model data, such as events, since the possible new model name resulting from the import operation is given using ModelName parameter.
  • FilterName: Specifies the name of the project in which to operate (ignored if FilterId defined). If multiple context parameters have been defined, they all must point to the same context (intersection of all the defined contexts is not empty).

Model import parameters are parameters in which the whole script is being run and the parameters change when import operations are made during the operation of the script. When a script run is started, script parameters are copied as model import parameters. When running import operations during the script execution (e.g. ImportEvents or ImportDataTable) the target model or datatable of the import replace the earlier value in the model import parameters.

When running a script, parameters can be given either for the entire script run or explicitly for each operation. When running any operation, the used parameter values are determined primarily from the parameters given to the operation, or (if not given) from the model import parameters.

Script Variable Examples

The following script command defines ProjectId, ModelId, and FilterId variables by using the script variables:

SELECT @_ProjectId as ProjectId, @_ModelId as ModelId, @_FilterId as FilterId;

The following script gets various information about the environment:

SELECT  @_QPRProcessAnalyzerVersion as QPRProcessAnalyzerVersion, @_UserId as Userid
(SELECT 'SheetName', 'Info')
--#ShowReport

Exception Handling

In general, scripts are meant to be developed in such a way that in the end you can run the scripts without any errors. However, sometimes there may be some system issues (timeouts SAP etc.) that can cause for example data extraction scripts to fail. For these kind of situations and for development time and troubleshooting purposes, you can use the CatchOperationExceptions parameter and the @_ExceptionOccurred, @_ExceptionType, @_ExceptionMessage, and @_ExceptionDetails script variables with the QPR ProcessAnalyzer script commands to handle exceptions in ProcessAnalyzer. Out of these, the @_ExceptionOccurred is handy for example in defining some other command to be run in case there was an exception. For SQL errors, the TRY-CATCH error handling should be used.

Note that the CatchOperationExceptions parameter is in effect only for the command it is used with, i.e. it isn't in effect in child scripts or scripts that are run via the --#Exit command. In addition, when there are multiple ProcessAnalyzer script commands in the script, the @_ExceptionOccurred, @_ExceptionType, @_ExceptionMessage, and @_ExceptionDetails script variables get updated each time, i.e. the variable values are available only until the next ProcessAnalyzer command is executed. To use the same variable values in multiple ProcessAnalyzer commands in the script, place the values into a temporary table:

SELECT
@_ExceptionOccurred 'ExceptionOccurred',
@_ExceptionType 'ExceptionType',
@_ExceptionMessage 'ExceptionMessage',
@_ExceptionDetails 'ExceptionDetails'
INTO #PACommandExceptions

Running Scripts

Scripts can be run in the following ways:

  • Running scripts from the Manage Scripts dialog
    1. On the QPR tab, click Manage Scripts.
    2. From the dialog, select the context in which the script you wish to run exists.
    3. Select the script you wish to run.
    4. Click Run.
  • Running scripts from the Ribbon
    1. On the QPR tab of the ribbon, click Run.
    2. Select the script to be run from the opening Script Gallery.

Terminating Scripts

Scripts can be terminated in the following ways:

  • The user running the script can click the Cancel button when the script is running.
  • The script can use the --#Exit command to stop script execution.
  • A QPR ProcessAnalyzer Administrator user can terminate scripts in the Operation Log.
  • The SQL Server System Administrator can kill the session that uses the script by using e.g. SQL Server Management Studio.
  • The Internet Information Services Administrator can recycle the application pool if the script has caused it to hang. This may also cause other requests by other users being processed at the same time to be aborted.
  • The Windows Administrator can kill the w3wp.exe process processing a problematic script. This may also cause other requests by other users being processed at the same time to be aborted.

Note that terminating a script will not revert any changes the script has already done in the database before the Cancel button is clicked.

Notes for Scripting

When writing and using scripts, take the following points into account:

  • When transforming data in QPR ProcessAnalyzer, temporary tables can be used (temporary tables names start with #). Note that using the global temporary tables (tables names start with ##) is not allowed.
  • The Print SQL statement can be used to generate log entries into the script execution log.
  • Only those lines in the script that start with "--#" (without the quotes) are treated as QPR ProcessAnalyzer Commands, i.e. if there are leading whitespaces before the command, the line is treated as a comment.
  • If you don't define a value for the MaximumCount parameter, 1000 will be used as default, i.e. only the 1000 first rows from a given table or model will be used.
  • When doing more advanced operations with scripts, you may run into the error messages such as: "The data types sql_variant and varchar are incompatible in the add operation.", "Argument data type sql_variant is invalid for argument 1 of like function.", "Argument data type sql_variant is invalid for argument 1 of left function.". This is because case attributes, event attributes, and data inside data tables are sql_variant type data. In order to use them with more advanced operations (e.g. Add), you need to CONVERT or CAST them into some other data type before the operations. See this example.
  • For certain characters in attribute values, you need to use escaping in order to have them interpreted correctly in the script. For more information, see Escaping for Attribute Values.

Supported Transport Layer Security Protocols

The following transport layer security protocols are supported when using for example the CallWebService, ImportOdbcQuery, ImportOleDbQuery, ImportSapQuery, ImportSqlQuery, and SendEmail script commands:

  • TLS 1.0
  • TLS 1.1
  • TLS 1.2

SSL2 and SSL3 are not supported.

See Also