
Uploading CSV files to BICS Visual Analyzer and Data Visualization Cloud Service


Introduction

This post details a new feature introduced in Version 2.1 of the Oracle BICS Data Sync tool.

Currently BICS Visual Analyzer (VA) and Data Visualization Cloud Service (DVCS) users may upload Microsoft Excel Workbooks (.XLSX) but not Comma Separated Values (CSV) files. This is an issue for use cases that use a CSV file produced from a data extraction utility, particularly if the CSV file is updated regularly.

This post provides an easy way to upload CSV files to BICS or DVCS as a Data Set for the above use case.

The Data Sync tool also provides the following advantages:

* The ability to schedule periodic loads of the file(s)
* The ability for the Data Sync Load to be triggered by the successful completion of an event.

Prerequisites

If necessary, download and install the Oracle BICS Data Sync utility from the Oracle Technology Network (http://www.oracle.com/technetwork/middleware/bicloud/downloads/index.html), along with the accompanying installation documentation, the BICS Data Sync Getting Started Guide.

Other A-Team Chronicles Blogs detail how to perform the installation, for example: Configuring the Data Sync Tool for BI Cloud Service (BICS)

Steps

Create a BICS/DVCS Target Connection

If you already have a BICS or DVCS connection, proceed to Create a Project and/or a Job below.

The Data Sync installation may have created a connection named Target with a connection type of Oracle (BICS). If so, edit this one or create a new one. As shown in the figure below, enter the User and Password for the BICS or DVCS you want to upload to. Enter the URL of the service and click on Test Connection.

Note: The connection type of Oracle (BICS) is the correct type for DVCS also.


Note: The URL is the URL shown in your browser minus the “/va” and everything following. An example is shown in the figure below.


Create a Project and/or a Job

If you already have a project that contains a job whose primary target is the BICS or DVCS connection, proceed to Create a File Data Task below.

From the Menu Bar, select File > Projects > Create a New Project, enter a name and click OK as shown below.


Create a File Data Task

Under the Menu Bar, select the Project group, select the new project name, select the File Data tab below the Project group and click New as shown below.


Select the CSV File Location, accept the File Name, assign a Logical Name (with no spaces) and click Next as shown below.


Edit or accept the Import Options and click Next as shown below.

Note: This step imports only the column metadata in the file (data type, length, etc.) and not the actual data. The sampling size is usually sufficient.


Check the Create New box, enter a Data Set name, select Data Set for the Output Option and click Next as shown below.


Edit or accept the Map Columns settings and click OK as shown below.


Update and Run the Data Sync Job and Review the Results

Under the Menu Bar, select the Jobs group, select the Jobs tab below the Jobs group, right-click on the job name and click Update Job as shown below.


To the right of the Jobs group, click on Run Job as shown below.


The job should run quickly. Select the History tab and the job will show completed. Click on the Tasks tab below the job status line and the task will show the number of records uploaded as shown below.


View the Cloud Service Data Set

Log into the BICS VA or DVCS, click on New Project, select Data Sets as the Source and the uploaded Data Set created from the CSV file will be displayed as shown below.


Summary

This post describes a method of using the Oracle Data Sync utility to upload a CSV file to either BICS or DVCS as a Data Set that may be used in VA / DVCS projects.

Additional information on Data Sync, including scheduling and triggering Data Sync jobs, may be found on OTN at http://www.oracle.com/technetwork/middleware/bicloud/downloads/index.html.

For more BICS best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

 

 


Installing Data Sync in Compute for Cloud to Cloud Loading into BICS


For other A-Team articles about BICS and Data Sync, click here

Introduction

The Data Sync tool provides the ability to extract from both on-premises and cloud data sources, and to load that data into BI Cloud Service (BICS) and other relational databases.  In some use cases, both the source databases and the target may be in ‘the Cloud’.  Rather than run the Data Sync tool ‘On-Premise’ to extract data down from the cloud, only to load it back up again, this article outlines an approach where the Data Sync tool is installed and run in an Oracle Compute Instance in the Cloud.  In this way all data movement and processing happens in ‘the cloud’ and no on-premise install is required.

 

Main Article

In this example Data Sync will be installed into its own Instance in Oracle Compute.

In theory you could install into any existing compute instance, for example JCS, DBCS, etc., although then the Data Sync tool would share the same file system as other applications.  This could be a problem, for example, in the case of a restore where files may be overwritten.  Where possible, it is therefore recommended that a separate Compute Instance is created for Data Sync.

Create Compute Instance

1. In Compute, choose a suitable Image, Shape and Storage for the planned workload.  It is recommended to give Data Sync at least 8 GB of memory.  It is suggested NOT to select the ‘minimal’ image as that will require additional packages to be loaded later.

2. In this example the OL-6.6-20GB-x11-RD image was used, along with a general purpose oc4 shape with 15 GB of memory and 20 GB of storage:


3. Once created, obtain the Public IP from the instance.


 

Create SSH Session and Install VNC

We will set up an SSH connection and a VNC session on the Compute Instance for Data Sync to run in. When the user disconnects from the session, Data Sync will continue to operate.  It will also allow multiple developers to connect to VNC and share the same session from anywhere in the world.

There are many SSH tools; in this case the free Windows tool Putty will be used, although other tools can be configured in a similar manner.  Putty can be downloaded from here.

1. Open Putty and set up a connection using the Public IP of the Instance obtained in the previous section and port 22.


2. Expand the ‘Connection’ / ‘SSH’ / ‘Auth’ menu item.  Browse in the ‘Private key file for authentication’ section to the Private Key companion to the Public Key used in the creation of the Compute Instance in the previous section.


3. Return to the ‘Session’ section, give the session a name and save it.  Then hit ‘Open’ to start the connection to the Compute Instance.


4. For the ‘Login as’ user, enter ‘opc’ and when prompted for the ‘Passphrase’, use the passphrase for the SSH Key.

If the connection is successful, then a command prompt should appear after these have been entered:


5. As the opc user, edit sshd_config.

sudo vi /etc/ssh/sshd_config

Uncomment all instances of X11Forwarding and change the following word to be ‘yes’


6. Save the file, and then restart sshd by running the following command:

sudo /etc/init.d/sshd restart

7. Switch to the Oracle user

sudo -su oracle

8. Run the following command to prevent the Window Manager from displaying a lock screen:

gconftool-2 -s -t bool /apps/gnome-screensaver/lock_enabled false

9. Start VNC server with the following command:

vncserver :1 -depth 16 -alwaysshared -geometry 1200x750 -s off

10. Figure out which port VNC is using

We’re going to use SSH port forwarding.  To do this, we need to confirm the port that is being used by VNC.

Typically the port is 5900 + N, where N is the display number.

In the screenshot below when VNC was started, it shows the screen is number 1 (the value after the ‘:’ in “d32f4d : 1” ) so in this case the port is 5901.  This will typically be the port number, but if other VNC sessions are already running, then it may be different.

To test this, run this command:

netstat -anp | grep 5901

This should confirm the process listening on that port – in this case, VNC:


11. Type ‘exit’ and press return once to exit the oracle user, then type ‘exit’ and press return again to close the putty session.

 

Create SSH Tunnel and Start VNC Session

1. Create the SSH Tunnel

Open putty again and load the saved session from earlier.  Open the ‘Connection’ / ‘SSH’ / ‘Tunnel’ menu item.

We need to create an SSH tunnel to forward VNC traffic from the local host to port 5901 on the Compute Instance.

In this example we enter the Local Port also as 5901, and then in the Destination, the IP address of the Compute Instance, followed by a ‘:’ and then 5901.  Select ‘Add’ to set up the tunnel.

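If you prefer a command-line SSH client to Putty, an equivalent tunnel can be opened with OpenSSH. This is only a sketch; the key path and IP address below are placeholders for your own values:

ssh -i /path/to/private_key -L 5901:localhost:5901 opc@<compute_instance_public_ip>

Leave that session open while the VNC viewer is in use, just as with the Putty session.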

2. Return back to the top ‘Session’ menu and ‘Save’ the session again to capture the changes, then Open the session again and connect as ‘opc’ and enter the passphrase.


3.  If a VNC client is not installed on the user’s machine, download one.  In this case the free viewer from RealVNC, which can be downloaded from here, is being used.

4. Open VNC viewer and for the target, enter ‘localhost:5901’.  VNC will attempt to connect to the local port 5901, which will then be redirected by SSH to port 5901 on the target.


Anytime a VNC session is going to be used, the putty session must be open (although some VNC tools will also set up the SSH session for you, in which case you can use that if preferred).

5. Enter the VNC password and the session will be connected.  If there is an error message within the VNC session stating ‘Authentication is Required to set the network proxy used for downloading packages’, then click ‘Cancel’ to ignore it.

 

Install Data Sync Software in Compute Instance

1. Within the connected VNC session, open a Terminal session


2. To turn on copy and paste between the client and the VNC session, enter:

vncconfig -nowin &

 

3. Download the Data Sync and JDK Software

Open Firefox within the VNC session and download the required software.

Data Sync can be found here:  http://www.oracle.com/technetwork/middleware/bicloud/downloads/index.html

JDK downloads can be found here:  http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

For the JDK, select one of the Linux x64 versions.

4. Plan where to install the software.

Take a look at the file system and see which location makes the most sense in your scenario.  In this example we are using the /home/oracle directory, in a sub-directory we created called ‘datasync’.  Depending on the configuration of the Compute Instance and its storage, there may be better choices.

5. Extract both the JDK and Data Sync software to that directory.


6. Edit the ‘config.sh’ file to point to the location of the JDK

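As a rough sketch of steps 5 and 6 (the archive names below are placeholders for whatever versions were actually downloaded, and the JAVA_HOME variable name should be confirmed against your copy of config.sh):

cd /home/oracle/datasync
# extract the JDK and the Data Sync distribution (archive names are examples only)
tar -xzf /home/oracle/Downloads/jdk-8uNNN-linux-x64.tar.gz
unzip /home/oracle/Downloads/BICSDataSync.zip
# then edit config.sh so that it points at the extracted JDK, for example:
#   JAVA_HOME=/home/oracle/datasync/jdk1.8.0_NNN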

7. Start Data Sync by running

./datasync.sh

 

Then go through the standard steps for setting up and configuring the Data Sync tool.

For more information on setting up Data Sync, see this article.

For information on setting up Data Sync to source from Cloud OTBI environments, see this article.

Other Data Sync documentation can be found here.

 

Once the VNC session has been set up, then other users can also connect.  They will just need to complete the following steps from above:

Create SSH Session and Install VNC, Steps 1, 2 & 3

Create SSH Tunnel and Start VNC Session, Steps 1 & 2

 

Summary

This article walked through the steps to create a Compute Instance, accessible through VNC over SSH, and then to install Data Sync into it, for loading scenarios where an on-premise footprint is not required.

For other A-Team articles about BICS and Data Sync, click here.

Oracle GoldenGate: Testing the Extract’s maximum read performance in extreme environments


MOS Doc ID 2193584.1

Version 1.2 10/14/16

If you have a requirement to process over a Terabyte of redo per hour, you may want to first run a simple test to figure out if your system is capable of reading that much data before you spend a large amount of time trying to tune your Oracle GoldenGate (OGG) configuration.

This method is a simple way to test your system’s ability to read a very large amount of data.  This test only covers the read and not the rest of the processing. Once you determine that the extract can keep up with the read rate, you can move down the chain to the next step in the process.  That will be covered in another paper.

In order to test the extract’s maximum read rate, all you need to do is create the built-in heartbeat table and create an extract with no map statements in it.  In the latest version of OGG, when you create the heartbeat table in GGSCI, OGG will create the heartbeat table and add a job to update the heartbeat every minute.  The extract process automatically adds the heartbeat to the OGG processes; no map statement is required.

First step: install OGG.  The install process is detailed in the OGG documentation.  This process will assume that the OGG software has been installed according to the documentation.

Second step: Check that you have the minimal setup configured.  You need to check that the database privileges, supplemental logging, and Streams pool size are set correctly.  If any of these values are incorrect or not set to the minimum required, please review the OGG install guide for correct settings.  To check this, issue the following commands in SQLPLUS –

Check Supplemental Logging –

col supplemental_log_data_min format A15 heading 'Minimum|supplemental|log data'
col force_logging format A10 heading 'force|logging'
SQL> SELECT supplemental_log_data_min, force_logging FROM v$database;

Minimum
supplemental    force
log data        logging
--------------- ----------
YES             NO

If the result is NO for either or both properties, refer to the OGG install documentation to set them.  Note: you can also enable force logging at a tablespace level.  See the OGG install documentation for more details.
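For reference, these settings are typically enabled with commands along the following lines; confirm the exact requirements for your release against the OGG install guide:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
SQL> ALTER DATABASE FORCE LOGGING;
-- or, to force logging for a single tablespace instead:
SQL> ALTER TABLESPACE users FORCE LOGGING;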

Check OGG user privileges –

col USERNAME format A10 heading 'User Name'
col PRIVILEGE_TYPE format A10 heading 'PRIVILEGE|TYPE'
col GRANT_SELECT_PRIVILEGES format A10 heading 'GRANT|SELECT|PRIVILEGES'
col CREATE_TIME format A30 heading 'CREATE TIME'
SQL>
SQL> select * from dba_goldengate_privileges;
                      GRANT
           PRIVILEGE  SELECT
User Name  TYPE       PRIVILEGES CREATE TIME
---------- ---------- ---------- ------------------------------
GGADMIN    *          YES        08-JUL-16 11.56.51.792606 AM

Check to make sure Streams Pool size and GoldenGate replication parameters are set –

set linesize 130
col name format a30
col value format a10
col description format a40

select name, value, description, ISSYS_MODIFIABLE
from v$parameter
where
        name like 'enable_goldengate_replication'
        or name like 'streams_pool_size';

NAME                           VALUE      DESCRIPTION                              ISSYS_MOD
------------------------------ ---------- ---------------------------------------- ---------
streams_pool_size              2147483648 size in bytes of the streams pool        IMMEDIATE
enable_goldengate_replication  TRUE       goldengate replication enabled           IMMEDIATE

If enable_goldengate_replication is not set to true you will not be able to start OGG.  If the streams pool is not sized to at least the minimum recommended size, it can cause performance issues.   Please check OGG documentation for recommended streams_pool_size.
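If either value needs to be changed, it can be set with ALTER SYSTEM; the 2G figure below is purely an example and should be sized per the OGG documentation for your redo volume:

SQL> ALTER SYSTEM SET enable_goldengate_replication = TRUE SCOPE=BOTH;
SQL> ALTER SYSTEM SET streams_pool_size = 2G SCOPE=BOTH;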

 

Switch the log files.

SQL> ALTER SYSTEM SWITCH LOGFILE;

Third Step:   Add the heartbeat table.  This feature is part of OGG 12.2 functionality.

Verify that GGSCHEMA name is in the GLOBALS file –

$ (slc09ujv)[a11204s] /scratch/oracle/OGG12.2\> more GLOBALS
GGSCHEMA ggadmin

If GGSCHEMA is not in the GLOBALS file, please add it.

Enable the Heartbeat functionality by executing the GGSCI command ‘ADD HEARTBEATTABLE’.

GGSCI (slc09ujv) 1> dblogin userid ggadmin password ggs
Successfully logged into database.

GGSCI (slc09ujv as ggadmin@a11204s) 2> add heartbeattable
2016-10-04 12:05:02  INFO    OGG-14001  Successfully created heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-10-04 12:05:02  INFO    OGG-14032  Successfully added supplemental logging for heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-10-04 12:05:02  INFO    OGG-14000  Successfully created heartbeat table ["GG_HEARTBEAT"].
2016-10-04 12:05:02  INFO    OGG-14033  Successfully added supplemental logging for heartbeat table ["GG_HEARTBEAT"].
2016-10-04 12:05:02  INFO    OGG-14016  Successfully created heartbeat history table ["GG_HEARTBEAT_HISTORY"].
2016-10-04 12:05:02  INFO    OGG-14023  Successfully created heartbeat lag view ["GG_LAG"].
2016-10-04 12:05:02  INFO    OGG-14024  Successfully created heartbeat lag history view ["GG_LAG_HISTORY"].
2016-10-04 12:05:02  INFO    OGG-14003  Successfully populated heartbeat seed table with [A11204S].
2016-10-04 12:05:02  INFO    OGG-14004  Successfully created procedure ["GG_UPDATE_HB_TAB"] to update the heartbeat tables.
2016-10-04 12:05:02  INFO    OGG-14017  Successfully created procedure ["GG_PURGE_HB_TAB"] to purge the heartbeat history table.
2016-10-04 12:05:02  INFO    OGG-14005  Successfully created scheduler job ["GG_UPDATE_HEARTBEATS"] to update the heartbeat tables.
2016-10-04 12:05:02  INFO    OGG-14018  Successfully created scheduler job ["GG_PURGE_HEARTBEATS"] to purge the heartbeat history table.

Fourth Step: Configure a Manager process-

 

If you don’t already have a manager process set up, you will need to create one.  For the purposes of this test a very simple one-line parameter file is all that is needed –

GGSCI> edit params mgr
port 7809

Start the manager process –

GGSCI> start mgr

Fifth step: Create an extract

Add the extract process –

./ggsci
ADD EXTRACT ext_test INTEGRATED TRANLOG, BEGIN NOW

Add the trail –

./ggsci
ADD EXTTRAIL ./dirdat/ET, EXTRACT ext_test

 

Register the Extract –

./ggsci
DBLOGIN USERID ggadmin PASSWORD ggs
register extract ext_test database

Create an extract parameter file –

GGSCI> edit params ext_test

extract ext_test
userid ggadmin, password ggs
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 200)
exttrail ./dirdat/ET

Start the extract process –

GGSCI> start extract ext_test

Sending START request to MANAGER ...
EXTRACT EXT_TEST starting

 

Monitoring performance

 

Once you have started the extract you can monitor the performance using the following query –

set verify off
set linesize 200
set pagesize 80
col extract_name format a8 heading 'Extract|Name'
col Run_time_HR format 99,999.99 heading 'Run Time'
col mined_GB format 999,999.99 heading 'Total GB|mined'
col sent_GB format 999,999.99 heading 'Total GB|sent'
col Sent_GB_Per_HR format 999,999.99 heading 'Total GB|Per HR'
col capture_lag Heading 'Capture|Lag|seconds'
col Current_time Heading 'Current|Time'
col extract_name format a8 heading 'Extract|Name'
col GB_Per_HR format 999,999.99 heading 'GB Mined|Per HR'
alter session set nls_date_format='YYYY-MM-DD HH24:Mi:SS';

select
        EXTRACT_NAME,
        TO_CHAR(sysdate, 'HH24:MI:SS MM/DD/YY') Current_time,
        ((SYSDATE-STARTUP_TIME)*24) Run_time_HR ,
        (SYSDATE- capture_message_create_time)*86400 capture_lag,
        BYTES_OF_REDO_MINED/1024/1024/1024 mined_GB,
        (BYTES_OF_REDO_MINED/1024/1024/1024)/((SYSDATE-STARTUP_TIME)*24) GB_Per_HR,
        BYTES_SENT/1024/1024/1024 sent_GB,
        (BYTES_SENT/1024/1024/1024)/((SYSDATE-STARTUP_TIME)*24) Sent_GB_Per_HR
   from gv$goldengate_capture;
                                       Capture
Extract  Current                           Lag    Total GB    GB Mined    Total GB    Total GB
Name     Time                Run Time  seconds       mined      Per HR        sent      Per HR
-------- ----------------- ---------- -------- ----------- ----------- ----------- -----------
EXT_TEST 09:30:52 10/06/16        .56        2         .00         .01         .00         .00

The output is in GB per hour.  If the columns “SENT_GB” and “SENT_GB_PER_HR” are blank, then the process is not running.

Clean up of the heartbeat table and the heartbeat scheduler

In order to clean up your environment after you are done with the test, you may want to remove the heartbeat table and the DBMS Scheduler job that updates the heartbeat table.  Please note that it is a best practice to use the heartbeat table in an OGG environment.

To remove the Heartbeat table and the DBMS scheduler, issue the following commands in GGSCI –

GGSCI (slc09ujv) 1> dblogin userid ggadmin password ggs
Successfully logged into database.

GGSCI (slc09ujv as ggadmin@a11204s) 2> delete heartbeattable

2016-10-06 14:13:57  INFO    OGG-14007  Heartbeat seed table ["GG_HEARTBEAT_SEED"] dropped.
2016-10-06 14:13:57  INFO    OGG-14009  Heartbeat table ["GG_HEARTBEAT"] dropped.
2016-10-06 14:13:57  INFO    OGG-14011  Heartbeat history table ["GG_HEARTBEAT_HISTORY"] dropped.
2016-10-06 14:13:57  INFO    OGG-14026  Heartbeat lag view ["GG_LAG"] dropped.
2016-10-06 14:13:57  INFO    OGG-14028  Heartbeat lag history view ["GG_LAG_HISTORY"] dropped.
2016-10-06 14:13:57  INFO    OGG-14013  Procedure ["GG_UPDATE_HB_TAB"] dropped.
2016-10-06 14:13:57  INFO    OGG-14020  Procedure ["GG_PURGE_HB_TAB"] dropped.
2016-10-06 14:13:57  INFO    OGG-14015  Scheduler job ["GG_UPDATE_HEARTBEATS"] dropped.
2016-10-06 14:13:57  INFO    OGG-14022  Scheduler job ["GG_PURGE_HEARTBEATS"] dropped.

GGSCI (slc09ujv as ggadmin@a11204s) 3>

At this point the heartbeat table and DBMS scheduler job have been removed from the database.

 

Oracle GoldenGate: Apply to Apache Flume File Roll Sink


Introduction

This is part 1 of a two-part article demonstrating how to replicate data in near-real time from an on-premises database to Oracle Storage Cloud Service.

In this article we shall demonstrate Oracle GoldenGate functionality to capture transactional data in near real-time from an on-premises Oracle Database and apply the records as delimited text data to an Apache Flume File Roll Sink. The Flume Sink may be on-premises or located in the Cloud; however, we will be using an on-premises server for this demonstration. A subsequent article will demonstrate Oracle Storage Cloud Service functionality to store the Flume data files in the Cloud for later use and analysis by Oracle Big Data tools.

We used the Oracle Big Data Lite Virtual Machine as the test bed for this article. The VM image is available for download on the Oracle Technology Network website.

Main Article

Simply put, Apache Flume is a data ingestion mechanism for collecting, aggregating, and transporting large amounts of streaming data from various sources to a centralized data store. The basic architecture of a Flume Agent is depicted below.

[Figure: basic architecture of a Flume Agent]

Data Generator

A data generator is any data feed; such as Twitter, Facebook, or in our case the Oracle GoldenGate Big Data Adapter, that creates data to be collected by the Flume Agent.

Flume Agent

The Flume Agent is a JVM daemon process that receives events from data generator clients or other agents and forwards them to a destination sink or agent. The Flume Agent is comprised of:

Source

The source component receives data from the data generators and transfers it to one or more channels in the form of Flume events.

Channel

A channel is a transient store which receives Flume events from the source and buffers them until they are consumed by sinks.

Sink

A sink consumes Flume events from the channels and delivers them to the destination. The destination of the sink may be another agent or an external repository; such as HDFS, HBase, or in our case text files in a Linux file system.

Oracle GoldenGate

Now that we have an understanding of Apache Flume, we can begin setting up Oracle GoldenGate for data capture and delivery. The first step is to install Oracle GoldenGate for my source database, Oracle 12c in this case, alter the database settings to enable data capture, and create a GoldenGate Change Data Capture Extract that will retrieve near real-time transactional data from Oracle Redo.

The Oracle GoldenGate architecture we’ll be configuring is depicted below.

[Figure: Oracle GoldenGate architecture for this configuration]

Oracle GoldenGate Source

The Oracle database was altered to support Oracle GoldenGate data capture, and the Container Database user c##ggadmin was created as part of my setup procedures. These requirements are covered in detail in the various Oracle GoldenGate Installation Guides, so I am not covering the steps in detail. To set up the database:

1. Start SQL*Plus as sysdba via the command:
   a) sqlplus / as sysdba
2. Make sure you are in the root container:
   a) SELECT SYS_CONTEXT ('USERENV', 'CON_NAME') FROM DUAL;
3. Execute the following commands:
   a) create user c##ggadmin identified by Oracle1 container=all;
   b) grant connect, resource, dba to c##ggadmin;
   c) exec dbms_goldengate_auth.grant_admin_privilege('C##GGADMIN',container=>'all');
   d) grant dba to c##ggadmin container=all;
   e) alter database force logging;
   f) alter database add supplemental log data;
   g) shutdown immediate;
   h) startup mount;
   i) alter database archivelog;
   j) alter database open;
   k) alter system set enable_goldengate_replication=true scope=both;

By default Oracle Database only logs the changed column information for update operations. Since we are applying data to Flume for further downstream analysis, I want to force Oracle to log all table data when an update occurs. This is done via the GGSCI add schematrandata command.

In my Big Data Lite virtual machine, GoldenGate for Oracle Database is installed at /u01/ogg. I go to that directory, start the GGSCI command interpreter, establish a database connection, and execute the command add schematrandata allcols.

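The screenshot is not reproduced here; the GGSCI commands are along these lines, assuming the pluggable database is named orcl and the source schema is moviedemo:

GGSCI> dblogin userid c##ggadmin password Oracle1
GGSCI> add schematrandata orcl.moviedemo allcols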

Execute the command edit param emov to configure the CDC Extract. The configuration file should look like the one below (for more information on the listed parameter settings, refer to the Oracle GoldenGate Reference Guide at Oracle Technology Network).

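The parameter file itself appears only as an image in the original post; a minimal sketch of an integrated CDC Extract for this setup might look like the following (the PDB name orcl and the single MOVIE table are illustrative assumptions):

EXTRACT emov
USERID c##ggadmin, PASSWORD Oracle1
-- capture full before/after images for downstream formatting
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
-- local trail read by the pflume data pump
EXTTRAIL ./dirdat/tm
-- PDB name and table list are assumptions for illustration
TABLE orcl.moviedemo.movie;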

Execute the ggsci status all command. If the EMOV Extract does not exist, create it and register it with the database as shown below (you will need to start the GoldenGate Manager first).

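Sketched in GGSCI, under the same assumptions, the creation and registration might look like:

GGSCI> dblogin userid c##ggadmin password Oracle1
GGSCI> add extract emov, integrated tranlog, begin now
GGSCI> register extract emov database container (orcl)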

Next create the ./dirdat/tm Extract Trail. This series of disk files will contain data retrieved from Oracle Redo by the EMOV CDC Extract.

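As a sketch, the GGSCI command is:

GGSCI> add exttrail ./dirdat/tm, extract emov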

To create the Extract Data Pump, execute the GGSCI command edit param pflume. The configuration file should look like the one below.

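The screenshot is not reproduced; a minimal sketch of a pass-through Data Pump for this setup might be (the target Manager port 7801 is an assumption):

EXTRACT pflume
-- loopback to the Manager of the ogg-bd instance; port 7801 is an assumption
RMTHOST localhost, MGRPORT 7801
RMTTRAIL ./dirdat/pf
-- pass records through without database lookups
PASSTHRU
TABLE orcl.moviedemo.movie;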

The rmthost option defines the DNS name or IP address of a target server running Oracle GoldenGate and the port where the Oracle GoldenGate Manager is listening for incoming connections. In this case, I am defining a loopback for the Oracle GoldenGate Big Data Adapter instance installed on Big Data Lite machine, so I could have excluded the Extract Data Pump. However, Extract Data Pumps are required whenever data is to be transmitted over a network, so it is a good practice to always configure them in your test systems.

Save and close the file, then we can register the Extract Data Pump and its associated Trail file in our Oracle GoldenGate instance.

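Sketched in GGSCI:

GGSCI> add extract pflume, exttrailsource ./dirdat/tm
GGSCI> add rmttrail ./dirdat/pf, extract pflume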

 

Now we can configure the Oracle GoldenGate target instance.

Oracle GoldenGate Target

In my Big Data Lite virtual machine, GoldenGate Generic with the Oracle GoldenGate Big Data Adapter is installed at /u01/ogg-bd. I go to that directory, start the GGSCI command interpreter, and start the Oracle GoldenGate Manager.


Execute the command edit param rflume to configure the Flume Apply Replicat. The configuration file should look like the one below.

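The screenshot is not reproduced; a minimal sketch of this Replicat parameter file, built around the two settings explained next plus an assumed MAP statement, might be:

REPLICAT rflume
-- initialize the Java module using the properties file described later
TARGETDB LIBFILE libggjava.so SET property=dirprm/flume_frs.properties
-- small transaction groups so the flume channel is not overwhelmed
GROUPTRANSOPS 100
-- MAP statement is an assumption for illustration
MAP orcl.moviedemo.*, TARGET moviedemo.*;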

These parameter settings specify the runtime options for this Replicat process; however, you may be unfamiliar with these two:

TARGETDB LIBFILE libggjava.so SET property=dirprm/flume_frs.properties

This parameter serves as a trigger to initialize the Java module. The SET clause specifies the location and name of a Java properties file for the Java module. The Java properties file location may be specified as either an absolute path, or path relative to the Replicat executable location. In the configuration above, I used a relative path for the Oracle GoldenGate instance installed at /u01/ogg-bd.

GROUPTRANSOPS 100

This parameter controls the number of SQL operations that are contained in a Replicat transaction. The default setting is 1000, which is the best practice setting for a production environment. However, my test server is very small and I do not want to overwhelm my flume channel; so I elected to reduce the number of operations the Replicat will apply in a transaction.

Save and close the file. Register the Replicat with the Oracle GoldenGate instance being sure to specify that it will read from the Trail file ./dirdat/pf.

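In GGSCI this might look like:

GGSCI> add replicat rflume, exttrail ./dirdat/pf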

To create the flume_frs.properties file, exit GGSCI, change to the dirprm directory, and edit the file with vi or your favorite text editor. The file contents should look similar to the following:

 

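The file appears only as an image in the original post; assembled from the settings explained below, its contents would read roughly as follows:

gg.handlerlist=flumehandler
gg.handler.flumehandler.type=flume
gg.handler.flumehandler.RpcClientPropertiesFile=custom-flume-rpc.properties
gg.handler.flumehandler.format=delimitedtext
gg.handler.flumehandler.format.fieldDelimiter=|
gg.handler.flumehandler.mode=tx
gg.handler.flumehandler.EventMapsTo=tx
gg.handler.flumehandler.PropagateSchema=true
gg.handler.flumehandler.includeTokens=false
gg.classpath=dirprm/:/usr/lib/flume-ng/lib/*:
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar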

 

The flume_frs.properties file defines the Java Virtual Machine settings for the Oracle Big Data Flume Adapter:

gg.handlerlist=flumehandler

Defines the name of the handler configuration properties in this file.

gg.handler.flumehandler.type=flume

Defines the type of this handler.

gg.handler.flumehandler.RpcClientPropertiesFile=custom-flume-rpc.properties

Specifies the name of the Flume Agent configuration file. This file defines the Flume connection information for Replicat to perform remote procedure calls.

gg.handler.flumehandler.format=delimitedtext

Sets the format of the output. Supported formatters are: avro_row, avro_op, delimitedtext, xml, and json.

gg.handler.flumehandler.format.fieldDelimiter=|

The default delimiter is an unprintable character, I changed it to be a vertical bar via this setting.

gg.handler.flumehandler.mode=tx

Sets the operating mode of the Java Adapter. In tx mode, output is written in transactional groups defined by the Replicat GROUPTRANSOPS setting.

gg.handler.flumehandler.EventMapsTo=tx

Defines whether each flume event would represent an operation or a transaction, based upon the setting of gg.handler.flumehandler.mode.

gg.handler.flumehandler.PropagateSchema=true

Defines whether, or not, the Flume handler publishes schema events.

gg.handler.flumehandler.includeTokens=false

When set to true, includes token data from the source trail files in the output. When set to false excludes the token data from the source trail files in the output.

gg.classpath=dirprm/:/usr/lib/flume-ng/lib/*:

Specifies user-defined Java classes and packages used by the Java Virtual Machine to connect to Flume and run. The classpath setting must include (1) the directory location containing the Flume Agent configuration file and (2) a list of the Flume client jars required for the Big Data Adapter to work with Flume. In this example, the Flume client jars are installed at /usr/lib/flume-ng/lib and we use a wildcard, *, to load all of the jars.

The Flume client library versions must match the version of Flume to which the Flume Handler is connecting.

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

Sets the JVM runtime memory allocation and the location of the Oracle GoldenGate Java Adapter dependencies (ggjava.jar) file.

Save and close the file.

To create the Flume Agent configuration file, custom-flume-rpc.properties, edit the file with vi or your favorite text editor. The file contents should look similar to the following:

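The image is not reproduced; using the standard Flume RPC client property names, a file matching the description below might look like this sketch:

# default (Avro) RPC client pointing at the local Flume Agent
client.type=default
hosts=h1
hosts.h1=localhost:41414
# send events to Flume in batches of 100
batch-size=100
# connection and request time-outs in milliseconds
connect-timeout=2000
request-timeout=2000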

This file contains the settings Oracle GoldenGate will use to connect to the Flume Agent. In this example, the Oracle GoldenGate Big Data Adapter will attempt an Avro connection to a Flume Agent running on the local machine and listening for connections on port 41414. Data will be sent to Flume in batches of 100 events. Connection and requests to the Flume Agent will time-out and fail if there is no response within 2000 ms.

Flume Agent

As previously shown, the Oracle GoldenGate Big Data Adapter is configured to connect to a Flume Agent on the local machine, listening on port 41414. To configure the Flume Agent, I created the file /home/oracle/flume/flume_frs.conf; with the following settings:

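The configuration file appears only as an image; based on the description that follows, a sketch of flume_frs.conf might look like this (the component names r1, c1, k1 and the channel capacity are illustrative choices, and the sink directory is the output directory created later in the article):

oggfrs.sources = r1
oggfrs.channels = c1
oggfrs.sinks = k1

# Avro RPC source listening on port 41414 of the local host
oggfrs.sources.r1.type = avro
oggfrs.sources.r1.bind = 0.0.0.0
oggfrs.sources.r1.port = 41414
oggfrs.sources.r1.channels = c1

# in-memory channel buffering the events
oggfrs.channels.c1.type = memory
oggfrs.channels.c1.capacity = 1000

# file roll sink writing to the local file system, rolling every 120 seconds
oggfrs.sinks.k1.type = file_roll
oggfrs.sinks.k1.channel = c1
oggfrs.sinks.k1.sink.directory = /u01/ogg-db/flumeOut/movidemo/movie
oggfrs.sinks.k1.sink.rollInterval = 120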

In this file, we create a single Flume Agent named oggfrs. The oggfrs Agent will consist of an Avro RPC source that listens on port 41414 of our local host, a channel that buffers event data in memory, and a sink that writes events to files in a directory of the local file system. The sink will close the existing file and create a new one every 120 seconds.

You will notice that I set the sink directory to include the Oracle source schema and table name. This is a personal preference as the File Roll Sink does not provide a mechanism for naming the output files. Currently, the File Roll Sink names the files based upon the current timestamp when the file is created. The Flume source code sets the file name as follows:

public class PathManager {

  private long seriesTimestamp;
  private File baseDirectory;
  private AtomicInteger fileIndex;

  private File currentFile;

  public PathManager() {
    seriesTimestamp = System.currentTimeMillis();
    fileIndex = new AtomicInteger();
  }

  public File nextFile() {
    currentFile = new File(baseDirectory, seriesTimestamp + "-"
        + fileIndex.incrementAndGet());

    return currentFile;
  }

  public File getCurrentFile() {
    if (currentFile == null) {
      return nextFile();
    }

    return currentFile;
  }

  // remaining methods of the class are omitted here
}

We could write a Java module to override the Flume PathManager class; however, that is beyond the scope of this article.

For more information on setting up Apache Flume Agents, refer to the Apache documentation.

Save and close the file.

We are now ready to start the Flume Agent. In my /home/oracle/flume directory, I created a shell script to start the agent, with the following commands:

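The script appears only as an image; a sketch of such a start script, assuming the paths used in this article, might be:

#!/bin/bash
# start the oggfrs agent defined in flume_frs.conf
flume-ng agent --name oggfrs \
  --conf /home/oracle/flume \
  --conf-file /home/oracle/flume/flume_frs.conf \
  -Dflume.root.logger=INFO,console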

After creating the output directory /u01/ogg-db/flumeOut/movidemo/movie, execute the script to start the agent. Upon a successful agent startup, you will see output similar to the following on the terminal screen:


You will also see the files created by the File Roll Sink in the output directory.


The File Roll Sink will create new files at the interval specified in the Flume Agent configuration file, even when there is no Oracle GoldenGate activity.

Start Oracle GoldenGate and Test

To start and test our configuration, go to the target Oracle GoldenGate environment, /u01/ogg-bd, and make sure the Oracle GoldenGate Manager is in the RUNNING state.


Go to the source Oracle GoldenGate environment, /u01/ogg, start GGSCI and start the emov and pflume Extract Groups. Executing the status all command should show each is in the RUNNING state.


On the Oracle GoldenGate target, start the rflume Replicat. Executing the command status rflume should show it in the RUNNING state.


 

The Flume Agent will show the Replicat connect to the Flume Source.


Now let’s generate some data in the Oracle Database and verify it flows through to the Flume File Roll Sink. In SQL Developer, I connect to the Oracle Pluggable Database, select the moviedemo schema, and execute the following:

INSERT INTO "MOVIEDEMO"."MOVIE" (MOVIE_ID, TITLE, YEAR, BUDGET, GROSS, PLOT_SUMMARY) VALUES ('1173971','Jack Reacher: Never Go Back', '2016','96000000','0','Jack Reacher must uncover the truth behind a major government conspiracy in order to clear his name. On the run as a fugitive from the law, Reacher uncovers a potential secret from his past that could change his life forever.');
INSERT INTO "MOVIEDEMO"."MOVIE" (MOVIE_ID, TITLE, YEAR, BUDGET, GROSS, PLOT_SUMMARY) VALUES ('1173972','Boo! A Madea Holloween', '2016','0','0','Madea winds up in the middle of mayhem when she spends a haunted Halloween fending off killers, paranormal poltergeists, ghosts, ghouls and zombies while keeping a watchful eye on a group of misbehaving teens.');
INSERT INTO "MOVIEDEMO"."MOVIE" (MOVIE_ID, TITLE, YEAR, BUDGET, GROSS, PLOT_SUMMARY) VALUES ('1173973','Like Stars on Earth', '2007','0','1204660','An eight-year-old boy is thought to be lazy and a trouble-maker, until the new art teacher has the patience and compassion to discover the real problem behind his struggles in school.');
commit;

Executing the GGSCI command view report rflume on the Oracle GoldenGate target, shows that data was captured and sent to the Oracle GoldenGate Replicat.


We will also see a file containing data in our Flume File Roll Sink output directory:

This is a delimited text file, so we can view the contents of the file using cat.


 

In the next part of this series we shall take the delimited text files that contain data, move them to Oracle Storage Cloud Service, and analyze the contents with some of the Oracle Big Data analysis tools.

Summary

In this article we demonstrated the functionality of Oracle GoldenGate to capture database transactions and apply this data to an Apache Flume Agent configured with a File Roll Sink.

Continue to the next part of this article: Uploading a file to Oracle storage cloud service using REST API

ODI on Compute Cloud Service: Step by Step Installation


Introduction

We have seen in Connect ODI to Oracle Database Cloud Service (DBCS) how to connect on-premises ODI to DBCS.
But it is also possible to deploy ODI in the Cloud – either on PaaS (on JCS) or on IaaS (on Compute Cloud Service).
In cases where a JEE ODI Agent is not needed, deploying ODI on Compute is a good alternative.

We are describing here step by step instructions to deploy ODI in this environment.

Prerequisites

    • We already have a Database Cloud Service Instance up and running (please refer to this Oracle By Example tutorial for detailed steps: Creating a Database Cloud Service Instance).
    • We also have our Private and Public SSH Keys stored safely.
    • Of course, we have access to the Compute Cloud Service Console.

Let’s connect to the Compute Cloud Services console

Storage Volumes

Based on Best Practices for Using Oracle Compute Cloud Service we are going to create 2 distinct Storage Volumes.

  • One Bootable Storage, which will contain the OS image. We will select an Oracle Linux Image, version 6.6. To see the detail of Oracle Linux Images please refer to About Oracle-Provided Linux Images
  • One Software Storage (to store ODI install and other software if needed).

In the Compute Cloud Services console, we go on the Storage tab and click on Create Storage Volume:

 

 

  • For the Bootable Storage:
      • We select the Oracle Linux Boot Image: OL-6.6-20GB-x11-RD

  • For the Software Storage:
      • We select Boot Image: none
      • Size: 100 GB

Compute Instance

We can now create the Compute Instance:

First, we need to select the image we want to deploy on our Cloud Instance.
As explained above while creating the storage volume, we want to install an Oracle Linux 6.6, so we are going to select OL-6.6-20GB-x11-RD

We will now click on each “tab” of the creation wizard.

On Shape, we select oc4

In the Instance tab, we enter the Name, Description, DNS Hostname prefix and select our SSH key.

We click on Storage and on Attach Existing Volume


We select the volumes we have created: Linux_Boot as “Boot Drive” and Linux_Soft. To keep the instance clean, we delete the default image (in our example CF_LINUX_boot)

We are almost there – we click on Review and check we have attached the SSH key and the Storage – then we can click on Create


We can go to Orchestrations to see that the instance is starting.


We go back to “Instances” … we wait for a while (a few minutes) and then we see our instance created and running.


We note the IP address; it will be used in the next step.

Connection through SSH

In order to access our new Compute Instance, we can log in using SSH as the opc user.

Refer to Accessing an Oracle Linux Instance Using SSH for more details.

From Windows, we create a new Putty Session to connect to the newly created Compute Service. We enter:

  • Host Name= the above IP
  • Port = 22


Then we enter the Private SSH Key in Connection/SSH/Auth

We go back to Sessions and Save.

The first time we open the connection, we get a security alert:


Click yes – then we are in!!


Storage Volume Handling

When we created the Compute Instance, the Boot Storage was automatically mounted.
We now need to mount the additional Storage Volume we have created (Linux_Soft) and attach a folder to it. We will install ODI in that folder.
Refer to Mounting a Storage Volume on a Linux Instance for more details.

First, let’s create a new folder that we will use to store all downloads and ODI installation.

sudo mkdir /u01
sudo chmod 755 /u01

As the Storage Volume we want to mount, Linux_Soft, is on disk 3, the device name will be /dev/xvdd.

We can check the devices on the instance:

ls /dev/xvd*

We create the file system on xvdd

sudo mkfs -t ext4 /dev/xvdd

And finally mount it as u01

sudo mount /dev/xvdd /u01

We create a folder to store downloaded software

sudo mkdir /u01/backup
sudo chmod 755 /u01/backup

VNC Server set-up

As user opc, we edit sshd_config:

sudo vi /etc/ssh/sshd_config

To forward the application display to our local Windows machine, we change all occurrences of X11Forwarding to yes:

X11Forwarding yes

We restart sshd by running the following command:

sudo /etc/init.d/sshd restart

We run the following command to prevent the Window Manager from displaying a lock screen:

gconftool-2 -s -t bool /apps/gnome-screensaver/lock_enabled false

To ensure that the $TMP and $TMPDIR directories are accessible and have write permissions before starting the VNC server, we edit the .bashrc file and set the following values:

vi ~opc/.bashrc

export TMPDIR=/tmp

export TEMPDIR=/tmp

export TMP=/tmp

export TEMP=/tmp

We rerun the profile (notice the leading period and space):

[OS]$ . ~opc/.bashrc

Observe the current temp environment variables to make sure they all point to the new /tmp. At a command prompt, re-enter this command:

env | grep tmp

We start VNC server with the following command:

vncserver :1 -depth 16 -alwaysshared -geometry 1200x750 -s off

We stop the SSH connection and update it to add a Tunnel to connect to our Compute Cloud Service through SSH tunneling. We click on Connection/SSH/Tunnels and add the following tunnel:
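The screenshot is not reproduced here; the tunnel simply forwards local port 5901 to port 5901 on the Compute Instance, roughly:

Source port: 5901
Destination: <compute_instance_public_ip>:5901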

Session/save and … we open our updated connection.

Once we are connected, we can now launch our favorite VNC client to connect to our VNC Server:


Now we are ready to download and install ODI!

JDK and ODI Downloads

The first thing to do is to download the required JDK certified with ODI.

As our goal is to install ODI 12.2.1.0.0 we can see in the Certification Matrix that the minimum JDK version is 1.8.0_77. At the date of writing (Sept 2016), the best selection is jdk-8u101-linux-x64.rpm

At this point ODI is available for download at http://www.oracle.com/technetwork/middleware/data-integrator/downloads/index.html

If you have difficulties connecting to the Oracle web site from the VNC Server, a workaround is to force the DNS to another value than the default one. (cf. https://community.oracle.com/message/13903890#13903890).

JDK Install

We move the downloaded files to /u01/backup

mv /home/opc/Downloads/* /u01/backup

We go to /u01/backup and install the package using:

sudo rpm -ivh jdk-8u101-linux-x64.rpm

We check the JDK is installed:

java -version

java version “1.8.0_101”

ODI Install

Now we are ready to install ODI – refer to Installation Guide for Oracle Data Integrator for details

java -jar fmw_12.2.1.1.0_odi.jar


The Oracle Home must be under the /u01 directory to have enough space and to take advantage of the Linux_Soft Storage Volume we have mounted.


Repository Creation

Refer to Creating the Master and Work Repository Schemas for details on how to launch RCU.
In this case we are using a DBCS Instance sharing the same Domain as the Compute Instance.


Conclusion

We have seen the steps to install ODI on a Compute Instance – which is as easy as installing ODI on any Linux server. We will detail the connections between the ODI Cloud instance and other DBCS instances in a future article.

For more ODI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for ODI.

Acknowledgements

Special thanks to my A-Team fellows Richard Williams and Roland Koenn, A-Team Cloud Architects, for their help and support.

References

Connect ODI to Oracle Database Cloud Service (DBCS)

Creating a Database Cloud Service Instance

Best Practices for Using Oracle Compute Cloud Service

 

Oracle GoldenGate: How to Configure On-Premise to GoldenGate Cloud Services (GGCS) Replication with Corente VPN


Introduction

This document will walk you through how to configure Oracle GoldenGate replication between On-Premise to GoldenGate Cloud Service (GGCS) on Oracle Public Cloud (OPC) via Virtual Private Network (VPN) using Corente Services Gateway (CSG).

The high level steps for this replication configuration are as follows:

  • Creation of SSH Public/Private Key Files
  • Provisioning of Database Cloud Service (DBCS) which is a pre-requisite of GGCS
  • Provisioning of GoldenGate Cloud Service (GGCS)
  • On-Premise Corente Services Gateway Configuration and Setup
  • Provisioning of Compute Instance for OPC Corente Services Gateway
  • On-Premise and OPC Corente VPN Tunnel configuration
  • GGCS VPN Tunnel Configuration via Generic Routing Encapsulation (GRE) protocol
  • On-Premise and GGCS GoldenGate Sample Replication Configuration

Note: Provisioning Resources in this article requires Oracle Cloud and Corente VPN credentials. If you don’t have one, please contact your Oracle Sales Representative.

The following assumptions have been made during the writing of this article:

  • The reader has a general understanding of Windows and Unix platforms.
  • The reader has basic knowledge of Oracle GoldenGate products and concepts.
  • The reader has a general understanding of Cloud Computing Principles
  • The reader has basic knowledge of Oracle Cloud Services
  • The reader has a general understanding of Network Computing Principles

Main Article

The GoldenGate Cloud Service (GGCS) is a cloud-based real-time data integration and replication service, which provides seamless and easy data movement from various on-premises relational databases to databases in the cloud with sub-second latency while maintaining data consistency and offering fault tolerance and resiliency.

GoldenGate Cloud Service (GGCS) Architecture Diagram:

[Diagram: GGCS architecture]

In a typical implementation of On-Premise to GGCS, the connectivity is accomplished through the use of SSH, since this is the only port opened by default on the cloud. The On-Premise server communicates directly to the GGCS server through the use of SOCKS proxy.

However, in cases where the security policy dictates otherwise or the client doesn’t want to use SSH, a VPN connection between On-Premise and the OPC can be used as an alternative. Currently, GGCS has been certified with the Corente Services Gateway for VPN connectivity.

Corente VPN Service Architecture Diagram:

[Diagram: Corente VPN Service architecture]

GGCS Corente VPN Deployment Architecture diagram depicted in this article:

[Diagram: GGCS Corente VPN deployment architecture used in this article]

GoldenGate Connectivity Flow:

  • On-Premise Network to OPC Network: GGCS Instance can be reached via GRE IP address 172.16.201.3
  • OPC Network to On-Premise Network: On-Premise OGG VM Server can be reached via IP address 192.168.201.51

The complete document can be found on the Oracle Support site under the document ID: 2198461.1

 

BICS Data Sync – Running Post Load Procedures Against DBCS and Oracle RDBMS


For other A-Team articles about BICS and Data Sync, click here

Introduction

The Data Sync tool provides the ability to extract from both on-premises and cloud data sources, and to load that data into BI Cloud Service (BICS) and other relational databases. In the recent 2.2 release of Data Sync, the functionality to run Post Load SQL and Stored Procedures was added.

Currently this functionality is only available for Oracle DBCS or Oracle DB target databases – it will NOT work for a Schema Service database target – although this article provides details of a workaround when the target is a schema service database.

This article will walk through an example to set up both a post load SQL command, and to execute a stored procedure on the target database.

 

Download The Latest Version of Data Sync Tool

Be sure to download and install the latest version of the Data Sync Tool from OTN through this link.

For further instructions on configuring Data Sync, see this article.  If a previous version of Data Sync is being upgraded, use the documentation on OTN.

 

Main Article

This article will present a simple use case that can be expanded for real world load scenarios.

A Post Load Processing session will be set up to run both a SQL statement, and a stored procedure on the target database once the load has completed.

 

Create Target Summary Table

In this example, a summary table will be loaded with a row of data once the underlying fact table has been refreshed in BICS.  Because the summary table only exists in the target database, we need to create it as a target in data sync.

1. Under ‘Project’ / ‘Target Tables/ Data Sets’, select ‘New’

In this example the summary table is called ‘AUDIT_EVENT_SUMMARY‘, and consists of just 2 fields.

 


An ‘AUDIT_RECORD_COUNT‘ numeric field, and a ‘CAPTURE_DATE‘ date field.


2. Create the fields as shown, then ‘Save’

 

Create Post Load Processing Process

Now that we have the target table defined, we can set up the post-load SQL, and the stored procedure.

1. From ‘Project’ / ‘Post Load Processing’ select ‘New’


2. Enter an appropriate name, hit ‘Save’, then select the ‘SQL Source Tables’ tab


Data Sync offers the ability to execute the SQL and Stored Procedure either at the end of the entire load process, or after the load completion of one or more individual tables.  This is controlled within the ‘SQL Source Tables’ section.

If the post load processing is to be run after all tables have been loaded, then no source tables need to be added.  If this section is left empty, then by default the data sync tool will run the post load processing only after all tables are loaded.

If the post load processing can be run after one or more tables have been loaded, then that dependency can be set up here.

3. Select ‘Add/Remove’ and then the ‘Go’ search button to generate a list of table sources being used.


In this example we will trigger the load after the fact table (‘AUDIT_EVENT_DBAAS’) has been loaded.

4. Select the table, then hit ‘Add’, and finally ‘Save’ to close out of the screen.


There is a ‘SQL Target Tables’ tab as well.  This is useful if the target table needs to be truncated as part of the update process.

Truncating and reloading tables with indexes and large record volumes can result in performance issues.  The data sync tool will handle this by having the target database perform the following steps:

 

  • Truncate the table
  • Drop all indexes
  • Insert the data
  • Re-create the indexes
  • Analyze the table

If the target table is always going to be loaded incrementally, then select the ‘Truncate for Full Load’ check box, else ‘Truncate Always’.

For demonstration purposes, we will select our target summary table.

5. Select ‘Add/Remove’


6. Select ‘Go’ to list the available target tables


7. Select the table(s) and ‘Add’.  Then choose the appropriate option as to whether to ‘Truncate Always’ or only ‘Truncate For Full Load’.


The next steps will be used to define the SQL and Stored Procedure.

8. Select ‘OK’ to return to the ‘Edit’ tab, hit ‘Save’, and then select the radio button within the ‘SQL(s)/Stored Procedure(s)’ box


9. In the next screen select ‘Add’, enter an appropriate name, and then select whether this step is to run a ‘SQL’ statement, or a ‘Stored Procedure’.  In this first example we will set up a post load SQL command.


10. There is also the option to run this post load process on just an ‘Initial Load’, an ‘Incremental Load’ or ‘Both’.  In this example we select ‘Both’.


11. In the section below, as shown, enter the valid SQL statement to be run on the target database.  In this case a single row is added to the summary table that we had created previously.

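The SQL in the screenshot is not reproduced; a statement of this kind, using the summary and fact tables from this example, might look like the following sketch:

INSERT INTO AUDIT_EVENT_SUMMARY (AUDIT_RECORD_COUNT, CAPTURE_DATE)
SELECT COUNT(*), SYSDATE
FROM AUDIT_EVENT_DBAAS;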

12. Click ‘OK’ to return to the previous screen.

To create a Stored Procedure follow similar steps.  In this example we will set up the post load processing entry to run both the SQL, and a Stored Procedure.

13. Select ‘Add’, enter a suitable name, and select the ‘Stored Procedure’ type.

14. Enter the name of the procedure in the entry box.  You do not need to type in ‘execute’ – the data sync tool will take care of that – just enter the name of the stored procedure, then click ‘OK’ and ‘OK’ again to exit out of the Post Load Processing set-up.


 

When the Job is next run, the SQL and Procedure will be run after the fact table has been loaded.

It is possible to set up multiple post load processes, with different dependencies.  Each will be run independently once the source tables defined have been loaded.

 

Summary

This article walked through the steps to create a Post Load SQL and Stored Procedure within the Data Sync tool.

For other A-Team articles about BICS and Data Sync, click here.

BICS Data Sync – Running Post Load Procedures against a Schema Service DB


For other A-Team articles about BICS and Data Sync, click here

Introduction

The Data Sync tool provides the ability to extract from both on-premises and cloud data sources, and to load that data into BI Cloud Service (BICS) and other relational databases. In the recent 2.2 release of Data Sync, the functionality to run Post Session SQL and Stored Procedures was added. This allows, for instance, the Data Sync tool to call a stored procedure to update summary tables and materialized views in the target database once the underlying data load has been completed.

As of the time of writing, this functionality is only available when the target database is an Oracle DBCS or standalone Oracle database.  It does NOT work with the standard BICS Schema Service target database.

This article provides steps for a viable workaround to run post session commands in a Schema Service target.

(for details on how to run this new functionality with a DBCS or standard Oracle DB target – see this article)

 

Main Article

Download The Latest Version of Data Sync Tool

Be sure to download and install the latest version of the Data Sync Tool from OTN through this link.

For further instructions on configuring Data Sync, see this article.  If a previous version of Data Sync is being upgraded, use the documentation on OTN.

Process Overview

Once the main data load has been completed, a single row will be inserted into a status table in the schema service database.  That will trigger the stored procedure to be run.

This solution will provide two triggering methods.  The choice of which to use will depend on the type of stored procedure that needs to be run once the data load has completed.

The current version of the Data Sync tool does not allow us to control the order that the load steps occur in. This means that we do not have the ability to make sure that the status table – that will trigger the stored procedure – is only loaded after all other table loads are complete.

As a workaround we will use 2 jobs. The first will load the data. Once that finishes, the second job will be triggered. This will load the single row into the status table, and that will trigger the post-load stored procedure to be run.

 

Create the Target Summary Table used to Trigger Post Session Stored Procedure

For this demonstration, a simple target table ‘DS_LOAD_STATUS’ will be created in the Schema Service database with two fields: ‘LOAD_STATUS’ and ‘STATUS_DATE’. The make-up of this table is not important. The main point is that a table needs to exist in the schema service database that can be loaded last.  The two different trigger methods will be discussed next, but both will use the existence of a new row in this DS_LOAD_STATUS table to trigger the post session stored procedure.

1. To create the DS_LOAD_STATUS table, run the following example SQL in the ‘SQL Workshop’ tool within Apex for the Schema Service database accompanying the BICS environment.

CREATE table "DS_LOAD_STATUS" (
"STATUS_DATE" DATE,
"LOAD_STATUS" VARCHAR2(50)
)

 

Create the Triggering Mechanism

Two different methods are shown below.  Method 2 will work for all cases.  Method 1, which is slightly simpler, will work for specific cases.

Method 1

If the post session stored procedure does not include any DDL statements (for example, truncate, drop, create indexes, tables, etc) – so it is using only ‘select’, ‘insert’, ‘update’ and ‘delete’ commands – then the simplest method is to create an On-Insert trigger on the status table.  When a row is added, the trigger fires, and the stored procedure is run.

In this case, it is assumed that a stored procedure, named ‘POST_SESSION_STEPS’, has already been created.

The following SQL will create the ‘on insert’ trigger against the DS_LOAD_STATUS table so that after a row is inserted, this stored procedure is called.

create or replace trigger "DS_LOAD_TRIGGER_SP"
AFTER
insert on "DS_LOAD_STATUS"
for each row
begin
POST_SESSION_STEPS;
end;
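Before wiring this into Data Sync, the trigger can be sanity-checked by inserting a row manually from SQL Workshop. This is purely an illustrative test and assumes POST_SESSION_STEPS can safely be run ad hoc:

-- Manual test: inserting a row fires the trigger, which runs POST_SESSION_STEPS
insert into DS_LOAD_STATUS (STATUS_DATE, LOAD_STATUS) values (sysdate, 'MANUAL_TEST');
commit;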

 

Method 2

If the stored procedure does use DDL statements, then the use of a table on-insert trigger may not run smoothly.  In that case a scheduled database job will be created, which will look for a new row in the status table.  Once the new row is recognized, this job will execute the post load stored procedure.

Once again it is assumed that a stored procedure named ‘POST_SESSION_STEPS’, has already been created.

This process contains two steps.  First, a short stored procedure is created which evaluates a condition – in this case whether a new row has recently been added to the status table – and, if the condition is true, executes the main stored procedure.

The SQL below creates this procedure, called ‘CHECK_POST_SESSION_CONDITION‘, which will check whether a new row has been added to the DS_LOAD_STATUS table within the last 5 minutes.

create or replace procedure CHECK_POST_SESSION_CONDITION as
V_ROW_COUNT INTEGER;
begin
select count(*) into V_ROW_COUNT from DS_LOAD_STATUS
where STATUS_DATE > sysdate - interval '5' minute;  -- checking for a row inserted in the last 5 minutes
IF V_ROW_COUNT >= 1 THEN
POST_SESSION_STEPS; -- post session procedure
END IF;
END;

The final step is to create a scheduled job that runs every 5 minutes checking the condition above.

begin CLOUD_SCHEDULER.CREATE_JOB (
JOB_NAME => 'POST_SESSION_DS_LOAD_JOB',
JOB_TYPE => 'STORED_PROCEDURE',
JOB_ACTION => 'CHECK_POST_SESSION_CONDITION', -- run the CHECK_POST_SESSION_CONDITION procedure
REPEAT_INTERVAL => 'freq=minutely; interval=5' ); -- run the job every 5 minutes
END;
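Once created, the job can be verified from SQL Workshop. A minimal check, assuming the standard USER_SCHEDULER_JOBS view is accessible in the Schema Service database, is below:

-- Confirm the scheduled job exists and see when it last ran and will next run
select job_name, state, last_start_date, next_run_date
from user_scheduler_jobs
where job_name = 'POST_SESSION_DS_LOAD_JOB';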

 

Set up Second Job in Data Sync

All remaining steps will be carried out in the environment where data sync is installed.

In this scenario, a Data Sync Job already exists which will load the desired data into the BICS schema service database and is named ‘Main_Load’.

If this job has never been run, run it now.  A successful load is important so that the ‘Signal’ file can be created.  This is the mechanism that will be used to trigger the second job, which will then load the status table, which will in turn trigger the post-load process.

We need to create a new Project for the second job.

3. Do this by selecting ‘Projects’ from the ‘File’ menu.

Cursor

4. Choose an appropriate name.

Cursor

In this example, the target table and its trigger were created in the earlier steps.  We need to set up this table as a target for Data Sync to load to.

5. Under ‘Project’, select ‘Target Tables/Data Sets’ and then ‘New’.  In the table name enter the exact name of the existing target table – in this case ‘DS_LOAD_STATUS‘.

Cursor

6. Select the ‘Table Columns’ sub-tab, and enter the column names and correct data types to match what was created in step 1.

Cursor

We also need to define a source to create the data for this DS_LOAD_STATUS table.  If a suitable table already exists in the source database, that may be used.  In this example we will base the data on a SQL statement.

7. Under ‘Project’ / ‘Relational Data’ select ‘Data from SQL’.  Provide a name for the source, and select to load into an existing target.  Use the search drop down to select the ‘DS_LOAD_STATUS’ table created in step 1.  Select the source connection and enter the SQL.

Cursor

In this case it is a simple select statement that will return one row, with a value of ‘LOAD_COMPLETE’ for the LOAD_STATUS field, and the current time and date, for the STATUS_DATE.

select
sysdate as STATUS_DATE,
'LOAD_COMPLETE' as LOAD_STATUS
from dual

 

8. Select the newly created source, and then edit the Load Strategy.  In this case, because it’s a status table, we have chosen to always append the new row, and never delete existing data.

Cursor

9. Give the Job a suitable name in the ‘Jobs’ / ‘Jobs’ menu area, and then ‘Run’ the job.

Cursor

Make sure the job runs successfully before continuing.

 

Create Data Sync Trigger Mechanism

The Data Sync tool creates ‘Signal’ files whenever a job starts and successfully finishes. These files are stored in the /log/jobSignal sub-directory. Take a look in this directory.

In our case we see 4 files, as this image shows. The important one for our purpose is the one that shows when the Main_Load job has completed. In this case that Signal File is named ‘Main_Load_CompletedSignal.txt’. This is the file we will have Data Sync check for, and when it finds it, it will trigger the second job.

 

Cursor

To set up Data Sync to automatically trigger a job, we need to edit the ‘on_demand_job.xml’ file in the /conf-shared directory.

10. Open this file with a text editor.

Cursor

11. An entry needs to be added to the <OnDemandMonitors> section.

The syntax is:

<TriggerFile job=$JOB_NAME file=$FILE_TO_TRIGGER_JOB></TriggerFile>

In this example the full syntax will be:

<TriggerFile job="POST_LOAD_JOB" file="C:\Users\oracle\Desktop\BICSDataSync_V2_2\log\jobSignal\Main_Load_CompletedSignal.txt"> </TriggerFile>

12. Change the pollingIntervalInMinutes to the desired check interval. In this case we set it to 1, so that Data Sync will check for the existence of the Signal file every minute.  The entry should look similar to this.

Screenshot_10_27_16__5_36_PM

13. Save the updated on_demand_job.xml

14. Test that the process is working.

Re-open the original Project and run the Main_Load job.  Monitor the jobSignal directory.  Shortly after the Main_Load job finishes, the Signal file – in this case ‘Main_Load_CompletedSignal.txt’ – is found.  The Data Sync tool deletes the file so that the process will not run again, and starts the POST_LOAD_JOB created in step 9.

Screenshot_10_27_16__5_41_PM

15. As an additional check, go to the schema service database in Apex, and confirm that the DS_LOAD_STATUS table has had a new entry added, and that the ‘post-load’ stored procedure has been successfully run.
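A quick, illustrative way to perform this check in SQL Workshop is to look at the most recent status rows:

-- Confirm the status row arrived; the post-load stored procedure should have run off the back of it
select LOAD_STATUS, STATUS_DATE
from DS_LOAD_STATUS
order by STATUS_DATE desc;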

 

Object_Browser

Summary

This article walked through an approach to run a post-load stored procedure with the Data Sync tool and a schema service database target.

For other A-Team articles about BICS and Data Sync, click here.


Loading Data into Oracle BI Cloud Service using BI Publisher Reports and SOAP Web Services

$
0
0

Introduction

This post details a method of loading data that has been extracted from Oracle Business Intelligence Publisher (BIP) into the Oracle Business Intelligence Cloud Service (BICS). The BIP instance may either be Cloud-Based or On-Premise.

It builds upon the A-Team post Using Oracle BI Publisher to Extract Data from Oracle Sales and ERP Clouds. This post uses SOAP web services to extract data from an XML-formatted BIP report.

The method uses the PL/SQL language to wrap the SOAP extract, XML parsing commands, and database table operations. It produces a BICS staging table which can then be transformed into star-schema object(s) for use in modeling.  The transformation processes and modeling are not discussed in this post.

Additional detailed information, including the complete text of the procedure described, is included in the References section at the end of the post.

Rationale for using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods e.g. Java, ETL tools, etc. require a platform outside of BICS to run on.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deployed in an enterprise production environment.

For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing – Performance

* Scheduling

* Code re-usability and Maintenance

The steps below depict how to load a BICS table.

About the BIP Report

The report used in this post is named BIP_DEMO_REPORT and is stored in a folder named Shared Folders/custom as shown below:

BIP Report Location

The report is based on a simple analysis with three columns and output as shown below:

BIP Demo Analysis

Note: The method used here requires all column values in the BIP report to be NOT NULL for two reasons:

1. The XPATH parsing command signals either the end of a row or the end of the data when a null result is returned.

2. All columns being NOT NULL ensures that the result set is dense and not sparse. A dense result set ensures that each column is represented in each row. Additional information regarding dense and sparse result sets may be found in the Oracle document Database PL/SQL Language Reference.

One way to ensure a column is not null is to use the IFNull function in the analysis column definition as shown below:

BIP IFNULL Column Def

Call the BIP Report

The SOAP API request used here is similar to the one detailed in Using Oracle BI Publisher to Extract Data from Oracle Sales and ERP Clouds.

The SOAP API request should be constructed and tested using a SOAP API testing tool e.g. SoapUI.

This step uses the APEX_WEB_SERVICE package to issue the SOAP API request and store the XML result in an XMLTYPE variable. The key inputs to the package call are:

* The URL for the Report Request Service

* The SOAP envelope the Report Request Service expects.

* Optional Headers to be sent with the request

* An optional proxy override

Note: Two other BI Publisher reports services exist in addition to the one shown below. The PublicReportService_v11 should be used for BI Publisher 10g environments and the ExternalReportWSSService should be used when stringent security is required. An example URL is below:

https://hostname/xmlpserver/services/v2/ReportService

An example Report Request envelope is below:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v2="http://xmlns.oracle.com/oxp/service/v2">
<soapenv:Header/>
<soapenv:Body>
<v2:runReport>
<v2:reportRequest>
<v2:byPassCache>true</v2:byPassCache>
<v2:flattenXML>false</v2:flattenXML>
<v2:reportAbsolutePath>/custom/BIP_DEMO_REPORT.xdo</v2:reportAbsolutePath>
<v2:sizeOfDataChunkDownload>-1</v2:sizeOfDataChunkDownload>
</v2:reportRequest>
<v2:userID>'||P_AU||'</v2:userID>
<v2:password>'||P_AP||'</v2:password>
</v2:runReport>
</soapenv:Body>
</soapenv:Envelope>

An example of setting a SOAP request header is below:

apex_web_service.g_request_headers(1).name := 'SOAPAction';
apex_web_service.g_request_headers(1).value := '';

An example proxy override is below:

www-proxy.us.oracle.com

 Putting this together, example APEX statements are below:

apex_web_service.g_request_headers(1).name := 'SOAPAction';
apex_web_service.g_request_headers(1).value := '';
f_xml := apex_web_service.make_request(
p_url => p_report_url,
p_envelope => l_envelope,
p_proxy_override => l_proxy_override );

Note: The SOAP header used in the example above was necessary for the call to the BI Publisher 11g implementation used in a demo Sales Cloud instance. If it were not present, the error LPX-00216: invalid character 31 (0x1F) would appear. This message indicates that the response received from the server was encoded in a gzip format which is not a valid xmltype data type.

Parse the BIP Report Result Envelope

This step parses the XML returned by the SOAP call for the data stored in the tag named reportBytes that is encoded in Base64 format.

The XPATH expression used below should be constructed and tested using an XPATH testing tool e.g. freeformatter.com

This step uses the APEX_WEB_SERVICE package to issue the parsing command and store the result in a CLOB variable. The key inputs to the package call are:

* The XML returned from BIP SOAP call above

* The XML Path Language (XPATH) expression to find the reportBytes data

An example of the Report Response envelope returned is below:

<soapenv:Envelope xmlns:soapenv=”http://schemas.xmlsoap.org/soap/envelope/” xmlns:xsd=”http://www.w3.org/2001/XMLSchema” xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance”><soapenv:Body><runReportResponse xmlns=”http://xmlns.oracle.com/oxp/service/v11/PublicReportService”><runReportReturn>        <reportBytes>PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCEtLUdlbmVyYXRlZCBieSBPcmFjbGUgQkkgUHVibGlzaGVyIDEyLjIuMS4xLjAgLURhdGFlbmdpbmUsIGRhdGFtb2RlbDpfY3VzdG9tX0JJUF9ERU1PX01PREVMX3hkbSAtLT4KPERBVEFfRFM+PFNBVy5QQVJBTS5BTkFMWVNJUz48L1NBVy5QQVJBTS5BTkFMWVNJUz4KPEdfMT4KPENPTFVNTjA+QWNjZXNzb3JpZXM8L0NPTFVNTjA+PENPTFVNTjE+NTE2MTY5Ny44NzwvQ09MVU1OMT48Q09MVU1OMj40ODM3MTU8L0NPTFVNTjI+CjwvR18xPgo8R18xPgo8Q09MVU1OMD5BdWRpbzwvQ09MVU1OMD48Q09MVU1OMT43MjM3MzYyLjM8L0NPTFVNTjE+PENPTFVNTjI+NjI3OTEwPC9DT0xVTU4yPgo8L0dfMT4KPEdfMT4KPENPTFVNTjA+Q2FtZXJhPC9DT0xVTU4wPjxDT0xVTU4xPjY2MTQxMDQuNTU8L0NPTFVNTjE+PENPTFVNTjI+NDAzNzQ0PC9DT0xVTU4yPgo8L0dfMT4KPEdfMT4KPENPTFVNTjA+Q2VsbCBQaG9uZXM8L0NPTFVNTjA+PENPTFVNTjE+NjMyNzgxOS40NzwvQ09MVU1OMT48Q09MVU1OMj40Nzg5NzU8L0NPTFVNTjI+CjwvR18xPgo8R18xPgo8Q09MVU1OMD5GaXhlZDwvQ09MVU1OMD48Q09MVU1OMT44ODA3NzUzLjI8L0NPTFVNTjE+PENPTFVNTjI+NjU1MDY1PC9DT0xVTU4yPgo8L0dfMT4KPEdfMT4KPENPTFVNTjA+SW5zdGFsbDwvQ09MVU1OMD48Q09MVU1OMT40MjA4ODQxLjM5PC9DT0xVTU4xPjxDT0xVTU4yPjY2MTQ2OTwvQ09MVU1OMj4KPC9HXzE+CjxHXzE+CjxDT0xVTU4wPkxDRDwvQ09MVU1OMD48Q09MVU1OMT43MDAxMjUzLjI1PC9DT0xVTU4xPjxDT0xVTU4yPjI2OTMwNTwvQ09MVU1OMj4KPC9HXzE+CjxHXzE+CjxDT0xVTU4wPk1haW50ZW5hbmNlPC9DT0xVTU4wPjxDT0xVTU4xPjQxMjAwOTYuNDk8L0NPTFVNTjE+PENPTFVNTjI+NTI3Nzk1PC9DT0xVTU4yPgo8L0dfMT4KPEdfMT4KPENPTFVNTjA+UGxhc21hPC9DT0xVTU4wPjxDT0xVTU4xPjY2Njk4MDguODc8L0NPTFVNTjE+PENPTFVNTjI+Mjc4ODU4PC9DT0xVTU4yPgo8L0dfMT4KPEdfMT4KPENPTFVNTjA+UG9ydGFibGU8L0NPTFVNTjA+PENPTFVNTjE+NzA3ODE0Mi4yNTwvQ09MVU1OMT48Q09MVU1OMj42MzcxNzQ8L0NPTFVNTjI+CjwvR18xPgo8R18xPgo8Q09MVU1OMD5TbWFydCBQaG9uZXM8L0NPTFVNTjA+PENPTFVNTjE+Njc3MzEyMC4zNjwvQ09MVU1OMT48Q09MVU1OMj42MzMyMTE8L0NPTFVNTjI+CjwvR18xPgo8L0RBVEFfRFM+</reportBytes><reportContentType>text/xml</reportContentType><reportFileID xsi:nil=”true”/><reportLocale xsi:nil=”true”/></runReportReturn></runReportResponse></soapenv:Body></soapenv:Envelope>

An example of the XPATH expression to retrieve just the value of reportBytes is below:

//*:reportBytes/text()

Putting these together, an example APEX statement is below:

f_report_bytes := apex_web_service.parse_xml_clob( p_xml => f_xml, p_xpath => '//*:reportBytes/text()' );

Decode the Report Bytes Returned

This step uses the APEX_WEB_SERVICE package to decode the Base64 result from above into a BLOB variable and then uses the XMLTYPE function to convert the BLOB into a XMLTYPE variable.

Decoding of the Base64 result should first be tested with a Base64 decoding tool e.g. base64decode.org

An example of the APEX decode command is below:

f_blob := apex_web_service.clobbase642blob(f_base64_clob);

 An example of the XMLTYPE function is below:

f_xml := xmltype (f_blob, 1);

The decoded XML output looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Oracle BI Publisher 12.2.1.1.0 -Dataengine, datamodel:_custom_BIP_DEMO_MODEL_xdm -->
<DATA_DS><SAW.PARAM.ANALYSIS></SAW.PARAM.ANALYSIS>
<G_1>
<COLUMN0>Accessories</COLUMN0><COLUMN1>5161697.87</COLUMN1><COLUMN2>483715</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Audio</COLUMN0><COLUMN1>7237362.3</COLUMN1><COLUMN2>627910</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Camera</COLUMN0><COLUMN1>6614104.55</COLUMN1><COLUMN2>403744</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Cell Phones</COLUMN0><COLUMN1>6327819.47</COLUMN1><COLUMN2>478975</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Fixed</COLUMN0><COLUMN1>8807753.2</COLUMN1><COLUMN2>655065</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Install</COLUMN0><COLUMN1>4208841.39</COLUMN1><COLUMN2>661469</COLUMN2>
</G_1>
<G_1>
<COLUMN0>LCD</COLUMN0><COLUMN1>7001253.25</COLUMN1><COLUMN2>269305</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Maintenance</COLUMN0><COLUMN1>4120096.49</COLUMN1><COLUMN2>527795</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Plasma</COLUMN0><COLUMN1>6669808.87</COLUMN1><COLUMN2>278858</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Portable</COLUMN0><COLUMN1>7078142.25</COLUMN1><COLUMN2>637174</COLUMN2>
</G_1>
<G_1>
<COLUMN0>Smart Phones</COLUMN0><COLUMN1>6773120.36</COLUMN1><COLUMN2>633211</COLUMN2>
</G_1>
</DATA_DS>

Create a BICS Table

This step uses a SQL command to create a simple staging table that has 20 identical varchar2 columns. These columns may be transformed into number and date data types in a future transformation exercise that is not covered in this post.

A When Others exception block allows the procedure to proceed if an error occurs because the table already exists.

A shortened example of the create table statement is below:

execute immediate 'create table staging_table ( c01 varchar2(2048), … , c20 varchar2(2048) )';
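A minimal sketch of that pattern, with the column list shortened for brevity, might look like the following (the handler simply continues, as described above):

begin
  execute immediate 'create table staging_table ( c01 varchar2(2048), c02 varchar2(2048) )'; -- shortened column list
exception
  when others then
    null; -- most likely the table already exists, so continue with the load
end;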

Load the BICS Table

This step uses SQL commands to truncate the staging table and insert rows from the BIP report XML content.

The XML content is parsed using an XPATH command inside two LOOP commands.

The first loop processes the rows by incrementing a subscript.  It exits when the first column of a new row returns a null value.  The second loop processes the columns within a row by incrementing a subscript. It exits when a column within the row returns a null value.

The following XPATH examples are for a data set that contains 11 rows and 3 columns per row:

//G_1[2]/*[1]/text()          -- Returns the value of the first column of the second row

//G_1[2]/*[4]/text()          -- Returns a null value for the 4th column signaling the end of the row

//G_1[12]/*[1]/text()        -- Returns a null value for the first column of a new row signaling the end of the data set

After each row is parsed, it is inserted into the BICS staging table.
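A hedged sketch of this nested-loop parsing is shown below. The variable names, the three-column insert, and the anonymous block wrapper are illustrative assumptions; the complete procedure is linked in the References section, and f_xml is assumed to be populated by the decode step above.

declare
  f_xml    xmltype;  -- assumed to be populated by the decode step shown earlier
  f_row    pls_integer := 1;
  f_col    pls_integer;
  f_value  clob;
  type t_vals is table of varchar2(2048) index by pls_integer;
  f_values t_vals;
begin
  loop
    -- end of data: the first column of the next row returns null
    exit when apex_web_service.parse_xml_clob(
                p_xml   => f_xml,
                p_xpath => '//G_1[' || f_row || ']/*[1]/text()' ) is null;
    f_col := 1;
    loop
      f_value := apex_web_service.parse_xml_clob(
                   p_xml   => f_xml,
                   p_xpath => '//G_1[' || f_row || ']/*[' || f_col || ']/text()' );
      exit when f_value is null;  -- end of the current row
      f_values(f_col) := f_value;
      f_col := f_col + 1;
    end loop;
    -- insert the parsed row; only the three demo columns are populated here
    insert into staging_table (c01, c02, c03)
    values (f_values(1), f_values(2), f_values(3));
    f_row := f_row + 1;
  end loop;
  commit;
end;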

An image of the staging table result is shown below:

BIP Table Output

 

Summary

This post detailed a method of loading data that has been extracted from Oracle Business Intelligence Publisher (BIP) into the Oracle Business Intelligence Cloud Service (BICS).

Data was extracted and parsed from an XML-formatted BIP report using SOAP web services wrapped in the Oracle PL/SQL APEX_WEB_SERVICE package.

A BICS staging table was created and populated. This table can then be transformed into star-schema objects for use in modeling.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Complete Text of Procedure Described

Using Oracle BI Publisher to Extract Data from Oracle Sales and ERP Clouds

Database PL/SQL Language Reference

Reference Guide for the APEX_WEB_SERVICE

Soap API Testing Tool

XPATH Testing Tool

Base64 Decoding and Encoding Testing Tool

Oracle GoldenGate: Working With Tokens and Environment Variables

$
0
0

Introduction

Oracle GoldenGate contains advanced functionality that exposes a wealth of information users may leverage. In this article we shall discuss three of these: TOKENS, which are user defined data written to Oracle GoldenGate Trails; the Column Conversion Function @TOKEN, which is used to retrieve the token data from the Oracle GoldenGate Trail; and the Column Conversion Function @GETENV, which is used to get information about the Oracle GoldenGate environment. We will demonstrate the use of each as data is replicated between an Oracle 12c Multi-tenant Database and a MySQL Community Server 5.7 Database.

Main Article

What are Oracle GoldenGate Tokens?

Tokens are labels used to identify user defined data stored in the Oracle GoldenGate Trail Record Header. Tokens are defined via the Extract TABLE parameter and must consist of a name identifying the token and the token data. The token data character string may be up to 2000 bytes in length and may be either user specified text enclosed within single quotes or the results of an Oracle GoldenGate Column Conversion Function.

When using tokens in the replication stream, the Extract Data Pump cannot be in PASSTHRU mode.

Token data may be used in the COLMAP clause of a Replicat MAP statement, within a SQLEXEC, a UserExit, or a Macro. To retrieve the token data from the Oracle GoldenGate Trail, use the Column Conversion Function @TOKEN as input to any of the previously mentioned parameters.

Demonstration Environment

We will be replicating the sample Oracle HR database EMPLOYEES table to MySQL.

The Oracle table specifications are:

CREATE TABLE "HR"."EMPLOYEES"
( "EMPLOYEE_ID" NUMBER(6,0),
"FIRST_NAME" VARCHAR2(20 BYTE),
"LAST_NAME" VARCHAR2(25 BYTE) CONSTRAINT "EMP_LAST_NAME_NN" NOT NULL ENABLE,
"EMAIL" VARCHAR2(25 BYTE) CONSTRAINT "EMP_EMAIL_NN" NOT NULL ENABLE,
"PHONE_NUMBER" VARCHAR2(20 BYTE),
"HIRE_DATE" DATE CONSTRAINT "EMP_HIRE_DATE_NN" NOT NULL ENABLE,
"JOB_ID" VARCHAR2(10 BYTE) CONSTRAINT "EMP_JOB_NN" NOT NULL ENABLE,
"SALARY" NUMBER(8,2),
"COMMISSION_PCT" NUMBER(2,2),
"MANAGER_ID" NUMBER(6,0),
"DEPARTMENT_ID" NUMBER(4,0),
CONSTRAINT "EMP_SALARY_MIN" CHECK (salary > 0) ENABLE,
CONSTRAINT "EMP_EMAIL_UK" UNIQUE ("EMAIL"),
CONSTRAINT "EMP_EMP_ID_PK" PRIMARY KEY ("EMPLOYEE_ID"),
CONSTRAINT "EMP_DEPT_FK" FOREIGN KEY ("DEPARTMENT_ID")
REFERENCES "HR"."DEPARTMENTS" ("DEPARTMENT_ID") ENABLE,
CONSTRAINT "EMP_JOB_FK" FOREIGN KEY ("JOB_ID")
REFERENCES "HR"."JOBS" ("JOB_ID") ENABLE,
CONSTRAINT "EMP_MANAGER_FK" FOREIGN KEY ("MANAGER_ID")
REFERENCES "HR"."EMPLOYEES" ("EMPLOYEE_ID") ENABLE
);

The MySQL table specifications are:

CREATE TABLE `EMPLOYEES` (
`EMPLOYEE_ID` decimal(6,0) NOT NULL,
`FIRST_NAME` varchar(20) DEFAULT NULL,
`LAST_NAME` varchar(25) NOT NULL,
`EMAIL` varchar(25) NOT NULL,
`PHONE_NUMBER` varchar(20) DEFAULT NULL,
`HIRE_DATE` date NOT NULL,
`JOB_ID` varchar(10) NOT NULL,
`SALARY` decimal(8,2) DEFAULT NULL,
`COMMISSION_PCT` decimal(2,2) DEFAULT NULL,
`MANAGER_ID` decimal(6,0) DEFAULT NULL,
`DEPARTMENT_ID` varchar(45) DEFAULT NULL,
PRIMARY KEY (`EMPLOYEE_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `OGG_TOKENS` (
`EMPLOYEE_ID` decimal(6,0) NOT NULL,
`OGG_TOKEN_NAME` varchar(200) NOT NULL,
`OGG_TOKEN_DATA` varchar(2000) DEFAULT NULL,
PRIMARY KEY (`EMPLOYEE_ID`, `OGG_TOKEN_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Token Example

In my source Integrated Extract, I will create the token tkn-example-test and replicate the data downstream to MySQL. In the Integrated Extract parameter file, I add the token to the TABLE statement:

extract epdborcl
userid c##ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
exttrail ./dirdat/ea
logallsupcols
updaterecordformat compact
reportcount every 60 seconds, rate
table pdborcl.tpc.*;
table pdborcl.hr.countries;
table pdborcl.hr.departments;
table pdborcl.hr.job_history;
table pdborcl.hr.jobs;
table pdborcl.hr.locations;
table pdborcl.hr.regions;
table pdborcl.hr.employees,
tokens (tkn-example-test = 'Example token data set on Integrated Extract')
;

When data for the employees table is captured, the token id and data will be written to the Oracle GoldenGate Trail. Using logdump and setting the option usertoken detail, we can see the token data:

Hdr-Ind    :     E  (x45)     Partition  :     .  (x0c)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :    60  (x003c)   IO Time    : 2016/11/21 11:27:22.001.435
IOType     :   134  (x86)     OrigNode   :   255  (xff)
TransInd   :     .  (x00)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :        272       AuditPos   : 11324432
Continued  :     N  (x00)     RecCount   :     1  (x01)
2016/11/21 11:27:22.001.435 GGSUnifiedUpdate     Len    60 RBA 2568
Name: PDBORCL.HR.EMPLOYEES  (TDR Index: 1)
After  Image:                                             Partition 12   GU b
0000 001c 0000 000a 0000 0000 0000 0000 00cf 000a | ………………..
000a 0000 0000 0000 0000 003c 0000 000a 0000 0000 | ………..<……..
0000 0000 00cf 000a 000a 0000 0000 0000 0000 00d2 | ………………..
User tokens:   62 bytes
tkn-example-test    : Example token data set on Integrated Extract

To send data downstream to the MySQL GoldenGate instance, I create an Extract Data Pump with the following settings:

extract pmysql
rmthost 192.168.120.46, mgrport 15000
rmttrail ./dirdat/om
userid c##ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
table pdborcl.hr.*;

We can verify the data delivery to the target GoldenGate instance by viewing the Remote GoldenGate Trail with logdump:

Hdr-Ind    :     E  (x45)     Partition  :     .  (x0c)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :    60  (x003c)   IO Time    : 2016/11/21 11:27:22.001.435
IOType     :   134  (x86)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :        272       AuditPos   : 20475408
Continued  :     N  (x00)     RecCount   :     1  (x01)
2016/11/21 11:27:22.001.435 GGSUnifiedUpdate     Len    60 RBA 3903
Name: PDBORCL.HR.EMPLOYEES  (TDR Index: 1)
After  Image:                                             Partition 12   GU s
0000 001c 0000 000a 0000 0000 0000 0000 00cf 0007 | ………………..
000a 0000 0000 0000 0098 9680 0000 000a 0000 0000 | ………………..
0000 0000 00cf 0007 000a 0000 0000 0000 00a7 d8c0 | ………………..
User tokens:   62 bytes
tkn-example-test    : Example token data set on Integrated Extract

To apply the data to the MySQL OGG_TOKENS table, I use the COLMAP parameter and @TOKEN Column Conversion Function in the Replicat parameter file:

replicat ro12hr
targetdb hr@localhost, userid ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
reportcount every 60 seconds, rate
insertupdates
map pdborcl.hr.employees, target hr.OGG_TOKENS,
colmap (usedefaults,
OGG_TOKEN_NAME = 'tkn-example-test',
OGG_TOKEN_DATA = @token('tkn-example-test')
);
noinsertupdates
map pdborcl.hr.*, target hr.*;

To verify that the token information is inserted into the target table, we can use MySQL Workbench to query the target, which returns:

# EMPLOYEE_ID, OGG_TOKEN_NAME, OGG_TOKEN_DATA
‘208’, ‘tkn-example-test’, ‘Example token data set on Integrated Extract’

This simple demonstration is not very useful beyond showing how to set and retrieve a token. Tokens become valuable when you want to record information about the Oracle GoldenGate environment that is useful for creating history tables, monitoring details about the replication environment, or recording details about the database and operating system environment. To obtain this information we use the @GETENV Column Conversion Function.

 

@GETENV Column Conversion Function

The @GETENV Column Conversion Function is used to obtain information about the Oracle GoldenGate environment. The information returned by @GETENV may be used as input to SQLEXEC queries and Stored Procedures, the COLMAP option of TABLE and MAP, TOKENS, and the UserExit GET_ENV_VALUE function.

There are too many options available for us to cover in this short article; however, the Oracle GoldenGate Windows and Unix Reference Guide provides an in-depth list of all supported function options and their use.

For our demonstration, we shall use @GETENV to do the following: (1) record any lag in each replication group, (2) get the current Julian timestamp in each replication group and use that information to compute lag, (3) get each replication group name, type, and process id, and (4) get details about the source Oracle GoldenGate environment, database environment, server, and operating system.

In the target MySQL database, create two tables for this data:

CREATE TABLE OGG_LAG_DATA (
ROW_TS timestamp(6) NOT NULL,
EXT_NAME varchar(8) NULL,
EXT_TYPE varchar(50) NULL,
EXT_PID varchar (50) NULL,
EXT_LAG_SEC bigint NULL,
DP_NAME varchar(8) NULL,
DP_TYPE varchar(50) NULL,
DP_PID varchar (50) NULL,
DP_LAG_SEC bigint NULL,
REP_NAME varchar(8) NULL,
REP_TYPE varchar(50) NULL,
REP_PID varchar (50) NULL,
REP_LAG_SEC bigint NULL,
SRC_COMMIT_TS timestamp(6) NULL,
EXT_JTS bigint NULL,
EXT_LAG_JTS bigint NULL,
DP_JTS bigint NULL,
DP_LAG_JTS bigint NULL,
REP_JTS bigint NULL,
REP_LAG_JTS bigint NULL,
PRIMARY KEY (ROW_TS)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE OGG_ENV_DATA (
ROW_TS timestamp(6) NOT NULL,
SOURCE_SERVER varchar(100) NULL,
SOURCE_OS_TYPE varchar(100) NULL,
SOURCE_OS_VERSION varchar(100) NULL,
SOURCE_HARDWARE varchar(100) NULL,
SOURCE_GG_VERSION varchar(100) NULL,
SOURCE_DB_NAME varchar(100) NULL,
SOURCE_DB_INSTANCE varchar(100) NULL,
SOURCE_DB_TYPE varchar(100) NULL,
SOURCE_DB_VERSION varchar(100) NULL,
TARGET_DB_NAME varchar(100) NULL,
TARGET_DB_VERSION varchar(100) NULL,
PRIMARY KEY (ROW_TS)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

To record information about the source environment, modify the Integrated Extract and Extract Data Pump configuration:

extract epdborcl
userid c##ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
exttrail ./dirdat/ea
logallsupcols
updaterecordformat compact
reportcount every 60 seconds, rate
table pdborcl.tpc.*;
table pdborcl.hr.countries;
table pdborcl.hr.departments;
table pdborcl.hr.job_history;
table pdborcl.hr.jobs;
table pdborcl.hr.locations;
table pdborcl.hr.regions;
table pdborcl.hr.employees,
tokens (
tkn-ext-group = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
tkn-ext-type = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
tkn-ext-pid = @GETENV ('GGENVIRONMENT', 'PROCESSID'),
tkn-ext-lag = @GETENV ('LAG', 'SEC'),
tkn-ext-jts = @GETENV ('JULIANTIMESTAMP')
);

In the EPDBORCL Integrated Extract parameter file, we will use @GETENV to set tokens for information about the GoldenGate operating environment, the processing lag in seconds, and the current system time (in Julian Timestamp format) when the Extract processed its latest record. We’ll use this timestamp downstream in the Replicat as an alternative method for computing lag.

extract pmysql
rmthost 192.168.120.46, mgrport 15000
rmttrail ./dirdat/om
userid c##ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
table pdborcl.hr.employees,
tokens (
tkn-dp-group = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
tkn-dp-type = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
tkn-dp-pid = @GETENV ('GGENVIRONMENT', 'PROCESSID'),
tkn-dp-lag = @GETENV ('LAG', 'SEC'),
tkn-dp-jts = @GETENV ('JULIANTIMESTAMP')
);
table pdborcl.hr.*;

In the PMYSQL Extract Data Pump, we use the same settings as the Integrated Extract to gather information about its operating environment.

In the target MySQL GoldenGate instance, configure the Replicat to apply the token data, return information about its operating environment, and perform lag calculations from the Julian Timestamps recorded by the Integrated Extract and Extract Data Pump.

replicat ro12hr
targetdb hr@localhost, userid ggadmin, password AACAAAAAAAAAAAHAAIFBOIYAMCGIMARE, encryptkey default
reportcount every 60 seconds, rate
insertupdates
insertdeletes
map pdborcl.hr.employees, target hr.OGG_LAG_DATA,
colmap (usedefaults,
ROW_TS = @date ('yyyy-mm-dd hh:mi:ss.ffffff', 'JTS', @getenv ('JULIANTIMESTAMP') ),
EXT_NAME = @token ('tkn-ext-group'),
EXT_TYPE = @token ('tkn-ext-type'),
EXT_PID = @token ('tkn-ext-pid'),
EXT_LAG_SEC = @token ('tkn-ext-lag'),
DP_NAME = @token ('tkn-dp-group'),
DP_TYPE = @token ('tkn-dp-type'),
DP_PID = @token ('tkn-dp-pid'),
DP_LAG_SEC = @token ('tkn-dp-lag'),
REP_NAME = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
REP_TYPE = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
REP_PID = @GETENV ('GGENVIRONMENT', 'PROCESSID'),
REP_LAG_SEC = @GETENV ('LAG', 'SEC'),
SRC_COMMIT_TS = @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
EXT_JTS = @token ('tkn-ext-jts'),
EXT_LAG_JTS = @datediff ('SS', @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
@date ('yyyy-mm-dd hh:mi:ss.ffffff', 'JTS', @token ('tkn-ext-jts') )
),
DP_JTS = @token ('tkn-dp-jts'),
DP_LAG_JTS = @datediff ('SS', @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
@date ('yyyy-mm-dd hh:mi:ss.ffffff', 'JTS', @token ('tkn-dp-jts') )
),
REP_JTS = @GETENV ('JULIANTIMESTAMP'),
REP_LAG_JTS = @datediff ('SS', @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
@date ('yyyy-mm-dd hh:mi:ss.ffffff', 'JTS', @getenv ('JULIANTIMESTAMP') )
)
);
map pdborcl.hr.employees, target hr.OGG_ENV_DATA,
colmap (usedefaults,
ROW_TS = @date ('yyyy-mm-dd hh:mi:ss.ffffff', 'JTS', @getenv ('JULIANTIMESTAMP') ),
SOURCE_SERVER = @GETENV ('GGFILEHEADER', 'HOSTNAME'),
SOURCE_OS_TYPE = @GETENV ('GGFILEHEADER', 'OSTYPE'),
SOURCE_OS_VERSION = @GETENV ('GGFILEHEADER', 'OSVERSION'),
SOURCE_HARDWARE = @GETENV ('GGFILEHEADER', 'HARDWARETYPE'),
SOURCE_GG_VERSION = @GETENV ('GGFILEHEADER', 'GGVERSIONSTRING'),
SOURCE_DB_NAME = @GETENV ('GGFILEHEADER', 'DBNAME'),
SOURCE_DB_INSTANCE = @GETENV ('GGFILEHEADER', 'DBINSTANCE'),
SOURCE_DB_TYPE = @GETENV ('GGFILEHEADER', 'DBTYPE'),
SOURCE_DB_VERSION = @GETENV ('GGFILEHEADER', 'DBVERSIONSTRING'),
TARGET_DB_NAME = @GETENV ('DBENVIRONMENT', 'DBNAME'),
TARGET_DB_VERSION = @GETENV ('DBENVIRONMENT', 'DBVERSION')
);
noinsertupdates
noinsertdeletes
map pdborcl.hr.*, target hr.*;

In the Replicat, @TOKEN is used to retrieve tokens in the Remote Extract Trail set by the Oracle Integrated Extract and Extract Data Pump. The @DATE Column Conversion Function converts the current server timestamp, in Julian Timestamp format, into a MySQL timestamp, which is then applied to the ROW_TS column of the target tables.

We use @GETENV to return information about the Replicat operating environment, retrieve the source record commit timestamp from the GoldenGate Trail record header, retrieve information about the source server and database from the GoldenGate Trail file header, and retrieve information about the target database environment.

@DATEDIFF computes the difference in seconds between the source record commit timestamp and the Julian Timestamp tokens recorded for each record by Integrated Extract, Extract Data Pump, and Replicat. @DATE is used to convert the Julian Timestamp to the designated timestamp format.

INSERTUPDATES and INSERTDELETES tells the Replicat to convert any update or delete operations against the source EMPLOYEES table into insert operations for the target OGG_LAG_DATA and OGG_ENV_DATA tables. These settings are toggled off via NOINSERTUPDATES and NOINSERTDELETES before the wildcard MAP statement; which ensures all source insert, update, and deletes are applied correctly to the remaining target HR tables.

We can use MySQL Workbench to review the test results:

select ROW_TS, EXT_NAME, EXT_PID, EXT_LAG_SEC, EXT_JTS, EXT_LAG_JTS from OGG_LAG_DATA;
# ROW_TS, EXT_NAME, EXT_PID, EXT_LAG_SEC, EXT_JTS, EXT_LAG_JTS
‘2016-11-21 14:56:38.970890’, ‘EPDBORCL’, ‘17253’, ‘0’, ‘212346515057710776’, ‘0’
‘2016-11-21 14:56:38.972303’, ‘EPDBORCL’, ‘17253’, ‘0’, ‘212346515057710839’, ‘0’
‘2016-11-21 14:56:38.974375’, ‘EPDBORCL’, ‘17253’, ‘0’, ‘212346515057710839’, ‘0’
‘2016-11-21 14:56:38.976066’, ‘EPDBORCL’, ‘17253’, ‘0’, ‘212346515057710839’, ‘0’

select ROW_TS, DP_NAME, DP_PID, DP_LAG_SEC, DP_JTS, DP_LAG_JTS from OGG_LAG_DATA;
# ROW_TS, DP_NAME, DP_PID, DP_LAG_SEC, DP_JTS, DP_LAG_JTS
‘2016-11-21 14:56:38.970890’, ‘PMYSQL’, ‘17763’, ’98’, ‘212346515155563697’, ’98’
‘2016-11-21 14:56:38.972303’, ‘PMYSQL’, ‘17763’, ’98’, ‘212346515155603197’, ’98’
‘2016-11-21 14:56:38.974375’, ‘PMYSQL’, ‘17763’, ’98’, ‘212346515155603197’, ’98’
‘2016-11-21 14:56:38.976066’, ‘PMYSQL’, ‘17763’, ’98’, ‘212346515155603197’, ’98’

select ROW_TS, REP_NAME, REP_PID, REP_LAG_SEC, REP_JTS, REP_LAG_JTS from OGG_LAG_DATA;
# ROW_TS, REP_NAME, REP_PID, REP_LAG_SEC, REP_JTS, REP_LAG_JTS
‘2016-11-21 14:56:38.970890’, ‘RO12HR’, ‘768’, ‘3141’, ‘212346518198970890’, ‘3141’
‘2016-11-21 14:56:38.972303’, ‘RO12HR’, ‘768’, ‘3141’, ‘212346518198972303’, ‘3141’
‘2016-11-21 14:56:38.974375’, ‘RO12HR’, ‘768’, ‘3141’, ‘212346518198974375’, ‘3141’
‘2016-11-21 14:56:38.976066’, ‘RO12HR’, ‘768’, ‘3141’, ‘212346518198976066’, ‘3141’
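As a hedged illustration of how this captured data might then be used, a simple aggregate query over the lag columns of OGG_LAG_DATA summarizes what was recorded:

-- Illustrative summary of the recorded lag figures
select min(ROW_TS) as first_row, max(ROW_TS) as last_row,
       max(EXT_LAG_SEC) as max_ext_lag_sec,
       max(DP_LAG_SEC)  as max_dp_lag_sec,
       max(REP_LAG_SEC) as max_rep_lag_sec
from OGG_LAG_DATA;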

select ROW_TS, SOURCE_SERVER, SOURCE_OS_TYPE, SOURCE_OS_VERSION, SOURCE_HARDWARE from OGG_ENV_DATA;
# ROW_TS, SOURCE_SERVER, SOURCE_OS_TYPE, SOURCE_OS_VERSION, SOURCE_HARDWARE
‘2016-11-21 14:44:46.198673’, ‘centos0ra12’, ‘Linux’, ‘#1 SMP Thu Mar 31 16:04:38 UTC 2016’, ‘x86_64’
‘2016-11-21 14:47:03.848363’, ‘centos0ra12’, ‘Linux’, ‘#1 SMP Thu Mar 31 16:04:38 UTC 2016’, ‘x86_64’
‘2016-11-21 14:47:03.849590’, ‘centos0ra12’, ‘Linux’, ‘#1 SMP Thu Mar 31 16:04:38 UTC 2016’, ‘x86_64’
‘2016-11-21 14:47:03.850831’, ‘centos0ra12’, ‘Linux’, ‘#1 SMP Thu Mar 31 16:04:38 UTC 2016’, ‘x86_64’

select ROW_TS, SOURCE_GG_VERSION from OGG_ENV_DATA;
# ROW_TS, SOURCE_GG_VERSION
‘2016-11-21 14:44:46.198673’, ‘12.2.Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO’
‘2016-11-21 14:47:03.848363’, ‘12.2.Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO’
‘2016-11-21 14:47:03.849590’, ‘12.2.Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO’
‘2016-11-21 14:47:03.850831’, ‘12.2.Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO’

select ROW_TS, SOURCE_DB_NAME, SOURCE_DB_INSTANCE, SOURCE_DB_TYPE, SOURCE_DB_VERSION from OGG_ENV_DATA;
# ROW_TS, SOURCE_DB_NAME, SOURCE_DB_INSTANCE, SOURCE_DB_TYPE, SOURCE_DB_VERSION
‘2016-11-21 14:44:46.198673’, ‘ORCL’, ‘orcl’, ‘ORACLE’, ‘12.1.Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production\nPL/SQL Release 12.’
‘2016-11-21 14:47:03.848363’, ‘ORCL’, ‘orcl’, ‘ORACLE’, ‘12.1.Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production\nPL/SQL Release 12.’
‘2016-11-21 14:47:03.849590’, ‘ORCL’, ‘orcl’, ‘ORACLE’, ‘12.1.Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production\nPL/SQL Release 12.’
‘2016-11-21 14:47:03.850831’, ‘ORCL’, ‘orcl’, ‘ORACLE’, ‘12.1.Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production\nPL/SQL Release 12.’

select ROW_TS, TARGET_DB_NAME, TARGET_DB_VERSION from OGG_ENV_DATA;
# ROW_TS, TARGET_DB_NAME, TARGET_DB_VERSION
‘2016-11-21 14:44:46.198673’, ‘hr’, ‘MySQL\nServer Version: 5.7.16\nClient Version: 5.6.14\nHost Connection: Localhost via UNIX socket\nProto’
‘2016-11-21 14:47:03.848363’, ‘hr’, ‘MySQL\nServer Version: 5.7.16\nClient Version: 5.6.14\nHost Connection: Localhost via UNIX socket\nProto’
‘2016-11-21 14:47:03.849590’, ‘hr’, ‘MySQL\nServer Version: 5.7.16\nClient Version: 5.6.14\nHost Connection: Localhost via UNIX socket\nProto’
‘2016-11-21 14:47:03.850831’, ‘hr’, ‘MySQL\nServer Version: 5.7.16\nClient Version: 5.6.14\nHost Connection: Localhost via UNIX socket\nProto’

 

Summary

In this article we presented Oracle GoldenGate TOKENS along with the @TOKEN and @GETENV Column Conversion Functions and demonstrated their use by replicating data between an Oracle Multi-tenant Database and a MySQL Community Server Database.

Loading Data into Oracle BI Cloud Service using BI Publisher Reports and REST Web Services


Introduction

This post details a method of loading data that has been extracted from Oracle Business Intelligence Publisher (BIP) into the Oracle Business Intelligence Cloud Service (BICS). The BIP instance may either be Cloud-Based or On-Premise.

It builds upon the A-Team post Extracting Data from Oracle Business Intelligence 12c Using the BI Publisher REST API. This post uses REST web services to extract data from an XML-formatted BIP report.

The method uses the PL/SQL language to wrap the REST extract, XML parsing commands, and database table operations. It produces a BICS staging table which can then be transformed into star-schema object(s) for use in modeling.  The transformation processes and modeling are not discussed in this post.

Additional detailed information, including the complete text of the procedure described, is included in the References section at the end of the post.

Rationale for using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods e.g. Java, ETL tools, etc. require a platform outside of BICS to run on.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deployed in an enterprise production environment.

For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing – Performance

* Scheduling

* Code Re-usability and Maintenance

The steps below depict how to load a BICS table.

About the BIP Report

The report used in this post is named BIP_DEMO_REPORT and is stored in a folder named Shared Folders/custom as shown below:

BIP Report Location

The report is based on a simple analysis with three columns and output as shown below:

BIP Demo Analysis

Note: The method used here requires all column values in the BIP report to be NOT NULL for two reasons:

* The XPATH parsing command signals either the end of a row or the end of the data when a null result is returned.

* All columns being NOT NULL ensures that the result set is dense and not sparse. A dense result set ensures that each column is represented in each row.

Additional information regarding dense and sparse result sets may be found in the Oracle document Database PL/SQL Language Reference.

One way to ensure a column is not null is to use the IFNull function in the analysis column definition as shown below:

BIP IFNULL Column Def

Call the BIP Report

The REST API request used here is similar to the one detailed in Extracting Data from Oracle Business Intelligence 12c Using the BI Publisher REST API. The REST API request should be constructed and tested using a REST API testing tool e.g. Postman

This step uses the APEX_WEB_SERVICE package to issue the REST API request and return the result in a CLOB variable. The key inputs to the package call are:

* The URL for the report request service

* Two request headers to be sent for authorization and content.

* The REST body the report request service expects.

* An optional proxy override

An example URL is below:

http://hostname/xmlpserver/services/rest/v1/reports/custom%2FBIP_DEMO_REPORT/run

Note: Any ASCII special characters used in a value within a URL, as opposed to syntax, needs to be referenced using its ASCII code prefixed by a % sign. In the example above, the slash (/) character is legal in the syntax but not for the value of the report location. Thus the report location, “custom/BIP_DEMO_REPORT” must be shown as custom%2FBIP_DEMO_REPORT where 2F is the ASCII code for a slash character.
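If the escaping needs to be done programmatically rather than by hand, one option (an assumption for illustration; the procedure described here can just as easily hard-code the escaped path) is the standard UTL_URL package:

declare
  l_escaped varchar2(200);
begin
  -- Escape reserved characters in the report path, e.g. '/' becomes %2F
  l_escaped := utl_url.escape('custom/BIP_DEMO_REPORT', true);
  dbms_output.put_line(l_escaped);  -- custom%2FBIP_DEMO_REPORT
end;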

An example request Authorization header is below.

apex_web_service.g_request_headers(1).name := 'Authorization';
apex_web_service.g_request_headers(1).value := 'Basic cHJvZG5leTpBZG1pbjEyMw==';

Note: The authorization header value is the string ‘Basic ‘ concatenated with a Base64 encoded representation of a username and password separated by a colon e.g.  username:password

Encoding of the Base64 result should first be tested with a Base64 encoding tool e.g. base64encode.org
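The encoded value can also be produced inside PL/SQL. A minimal sketch using the standard UTL_ENCODE and UTL_RAW packages is below; the username and password are placeholders:

declare
  l_auth_header varchar2(400);
begin
  -- Build the Basic authorization header value from username:password
  l_auth_header := 'Basic ' ||
    utl_raw.cast_to_varchar2(
      utl_encode.base64_encode(
        utl_raw.cast_to_raw('username:password')));
  apex_web_service.g_request_headers(1).name  := 'Authorization';
  apex_web_service.g_request_headers(1).value := l_auth_header;
end;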

An example of the Content-Type header is below:

apex_web_service.g_request_headers(2).name := 'Content-Type';
apex_web_service.g_request_headers(2).value := 'multipart/form-data; boundary="Boundary_1_1153447573_1465550731355"';

Note: The boundary value entered here in the header is for usage in the body below. The boundary text may be any random text not used elsewhere in the request.

An example of a report request body is below:

--Boundary_1_1153447573_1465550731355
Content-Type: application/json
Content-Disposition: form-data; name="ReportRequest"

{"byPassCache":true,"flattenXML":false}
--Boundary_1_1153447573_1465550731355--

An example proxy override is below:

www-proxy.us.oracle.com

 An example REST API call:

f_report_clob := apex_web_service.make_rest_request(
p_url => p_report_url,
p_body => l_body,
p_http_method => 'POST',
p_proxy_override => l_proxy_override );

Parse the BIP REST Result

The BIP REST result is the report XML data embedded in text with form-data boundaries.

This step uses the:

* INSTR function to determine the beginning and end of the embedded XML

* SUBSTR function to extract just the embedded XML and store it in a CLOB variable

* XMLTYPE.createXML function to convert and return the XML.

The key inputs to this step are:

* The CLOB returned from BIP REST call above

* The XML root name returned from the BIP report, e.g. DATA_DS

An example of the REST result returned is below:

--Boundary_2_1430729833_1479236681852

Content-Type: application/json

Content-Disposition: form-data; name=”ReportResponse”

{“reportContentType”:”text/xml”}

--Boundary_2_1430729833_1479236681852

Content-Type: application/octet-stream

Content-Disposition: form-data; filename="xmlp2414756005405263619tmp"; modification-date="Tue, 15 Nov 2016 19:04:41 GMT"; size=1242; name="ReportOutput"

<?xml version="1.0" encoding="UTF-8"?>

<!--Generated by Oracle BI Publisher 12.2.1.1.0 -Dataengine, datamodel:_custom_BIP_DEMO_MODEL_xdm -->

<DATA_DS><SAW.PARAM.ANALYSIS></SAW.PARAM.ANALYSIS>

<G_1>

<COLUMN0>Accessories</COLUMN0><COLUMN1>5161697.87</COLUMN1><COLUMN2>483715</COLUMN2>

</G_1>

<G_1>

         <COLUMN0>Smart Phones</COLUMN0><COLUMN1>6773120.36</COLUMN1><COLUMN2>633211</COLUMN2>

</G_1>

</DATA_DS>

--Boundary_2_1430729833_1479236681852--

Examples of the string functions to retrieve and convert just the XML are below. The f_report_clob variable contains the result of the REST call. The p_root_name variable contains the BIP report specific XML rootName.

To find the starting position of the XML, the INSTR function searches for the opening tag consisting of the root name prefixed with a ‘<’ character, e.g. <DATA_DS:

f_start_position := instr( f_report_clob, '<' || p_root_name );

To find the length of the XML, the INSTR function searches for the position of the closing tag consisting of the root name prefixed with the ‘</’ characters, e.g. </DATA_DS, determines and adds the length of the closing tag using the LENGTH function, and subtracts the starting position:

f_xml_length := instr( f_report_clob, '</' || p_root_name ) + length( '</' || p_root_name || '>' ) - f_start_position;

To extract the XML and store it in a CLOB variable, the SUBSTR function uses the starting position and the length of the XML:

f_xml_clob := substr(f_report_clob, f_start_position, f_xml_length );

To convert the CLOB into an XMLTYPE variable:

f_xml := XMLTYPE.createXML( f_xml_clob );

Create a BICS Table

This step uses a SQL command to create a simple staging table that has 20 identical varchar2 columns. These columns may be transformed into number and date data types in a future transformation exercise that is not covered in this post.

A When Others exception block allows the procedure to proceed if an error occurs because the table already exists.

A shortened example of the create table statement is below:

execute immediate 'create table staging_table ( c01 varchar2(2048), … , c20 varchar2(2048) )';

Load the BICS Table

This step uses SQL commands to truncate the staging table and insert rows from the BIP report XML content.

The XML content is parsed using an XPATH command inside two LOOP commands.

The first loop processes the rows by incrementing a subscript.  It exits when the first column of a new row returns a null value.  The second loop processes the columns within a row by incrementing a subscript. It exits when a column within the row returns a null value.

The following XPATH examples are for a data set that contains 11 rows and 3 columns per row:

//G_1[2]/*[1]/text()          -- Returns the value of the first column of the second row

//G_1[2]/*[4]/text()          -- Returns a null value for the 4th column signaling the end of the row

//G_1[12]/*[1]/text()        -- Returns a null value for the first column of a new row signaling the end of the data set

After each row is parsed, it is inserted into the BICS staging table.

An image of the staging table result is shown below:

BIP Table Output

Summary

This post detailed a method of loading data that has been extracted from Oracle Business Intelligence Publisher (BIP) into the Oracle Business Intelligence Cloud Service (BICS).

Data was extracted and parsed from an XML-formatted BIP report using REST web services wrapped in the Oracle PL/SQL APEX_WEB_SERVICE package.

A BICS staging table was created and populated. This table can then be transformed into star-schema objects for use in modeling.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Complete Text of Procedure Described

Extracting Data from Oracle Business Intelligence 12c Using the BI Publisher REST API

Database PL/SQL Language Reference

Reference Guide for the APEX_WEB_SERVICE

REST API Testing Tool

XPATH Testing Tool

Base64 decoding and encoding Testing Tool

 

 

Best Practices – Data movement between Oracle Storage Cloud Service and HDFS


Introduction

Oracle Storage Cloud Service should be the central place for persisting raw data produced from other PaaS services and also the entry point for data that is uploaded from the customer’s data center. Big Data Cloud Service (BDCS) supports data transfers between Oracle Storage Cloud Service and HDFS. Both Hadoop and Oracle provide various tools and Oracle engineered solutions for this data movement. This document outlines the various tools and describes best practices to improve data transfer usability between Oracle Storage Cloud Service and HDFS.

Main Article

Architectural Overview

 

new_oss_architecture

Interfaces to Oracle Storage Cloud Service

 

Interface: Resource

odcp: Accessing Oracle Storage Cloud Service Using Oracle Distributed Copy

Distcp: Accessing Oracle Storage Cloud Service Using Hadoop Distcp

Upload CLI: Accessing Oracle Storage Cloud Service Using the Upload CLI Tool

Hadoop fs -cp: Accessing Oracle Storage Cloud Service Using the Hadoop File System shell copy

Oracle Storage Cloud Software Appliance: Accessing Oracle Storage Cloud Service Using Oracle Storage Cloud Software Appliance

Application Programming platform:
Java Library – Accessing Oracle Storage Cloud Service Using Java Library
File Transfer Manager API – Accessing Oracle Storage Cloud Service Using File Transfer Manager API
REST API – Accessing Oracle Storage Cloud Service Using REST API

 

Oracle Distributed Copy (odcp)

Oracle Distributed Copy (odcp) is a tool for copying very large data files in a distributed environment between HDFS and an Oracle Storage Cloud Service.

  • How does it work?

odcp tool has two main components.

(a) odcp launcher script

(b) conductor application

odcp launcher script is a bash script serving as a launcher for the spark application which provides a fully parallel transfer of files.

Conductor application is an Apache Spark application to copy large files between HDFS and an Oracle Storage Cloud Service.

For end users it is recommended to use the odcp launcher script. The odcp launcher script simplifies the usage of the Conductor application by encapsulating the environment variable setup for Hadoop/Java, the spark-submit parameter setup, the invocation of the Spark application, and so on. Calling the Conductor application directly is the ideal approach when performing the data movement from a Spark application.

blog3

odcp takes the given input file (source) and splits it into smaller file chunks. Each input chunk is then transferred by one executor over the network to the destination store.

basic-flow

When all chunks are successfully transferred, executors take output chunks and merge them into final output files.

flow

  • Examples

Oracle Storage Cloud Service is based on Swift, the open-source OpenStack Object Store. The data stored in Swift can be used as the direct input to a MapReduce job by simply using the “swift://<URL>” to declare the source of the data. In a Swift File system URL, the hostname part of the URL identifies the container and the service to work with; the path identifies the name of the object.

Swift syntax:

swift://<MyContainer.MyProvider>/<filename>

odcp launcher script

Copy file from HDFS to Oracle Storage Cloud Service

odcp hdfs:///user/oracle/data.raw swift://myContainer.myProvider/data.raw

Copy file from Oracle Storage Cloud Service to HDFS:

odcp swift://myContainer.myProvider/data.raw hdfs:///user/oracle/odcp-data.raw

Copy directory from HDFS to Oracle Storage Cloud Service:

odcp hdfs:///user/data/ swift://myContainer.myProvider/backup

In case the system has more than 3 nodes, transfer speed can be increased by specifying a higher number of executors. For 6 nodes, use the following command:

odcp --num-executors=6 hdfs:///user/oracle/data.raw swift://myContainer.myProvider/data.raw

 

Highlights of the odcp launcher script options
--executor-cores: The number of executor cores. This specifies the thread count, which depends on the available vCPUs, and allows the copy to run in parallel based on that thread count. The default value is 30.
--num-executors: The number of executors. This will be the same as the number of physical nodes/VMs. The default value is 3.

 

Conductor application

Usage: Conductor [options] <source URI...> <destination URI>
<source URI...> <destination URI>
source/destination file(s) URI, examples:
hdfs://[HOST[:PORT]]/<path>
swift://<container>.<provider>/<path>
file:///<path>
-i <value> | --fsSwiftImpl <value>
swift file system configuration. Default taken from etc/hadoop/core-site.xml (fs.swift.impl)
-u <value> | --swiftUsername <value>
swift username. Default taken from etc/hadoop/core-site.xml fs.swift.service.<PROVIDER>.username)
-p <value> | --swiftPassword <value>
swift password. Default taken from etc/hadoop/core-site.xml (fs.swift.service.<PROVIDER>.password)
-i <value> | --swiftIdentityDomain <value>
swift identity domain. Default taken from etc/hadoop/core-site.xml (fs.swift.service.<PROVIDER>.tenant)
-a <value> | --swiftAuthUrl <value>
swift auth URL. Default taken from etc/hadoop/core-site.xml (fs.swift.service.<PROVIDER>.auth.url)
-P <value> | --swiftPublic <value>
indicates if all URLs are public - yes/no (default yes). Default taken from etc/hadoop/core-site.xml (fs.swift.service.<PROVIDER>.public)
-r <value> | --swiftRegion <value>
swift Keystone region
-b <value> | --blockSize <value>
destination file block size (default 268435456 B), NOTE: remainder after division of partSize by blockSize must be equal to zero
-s <value> | --partSize <value>
destination file part size (default 1073741824 B), NOTE: remainder after division of partSize by blockSize must be equal to zero
-e <value> | --srcPattern <value>
copies file when their names match given regular expression pattern, NOTE: ignored when used with --groupBy
-g <value> | --groupBy <value>
concatenate files when their names match given regular expression pattern
-n <value> | --groupName <value>
group name (use only with --groupBy), NOTE: slashes are not allowed
--help
display this help and exit

 

One can submit the Conductor application directly to a Spark deployment environment for execution using spark-submit. Below is an example of how to submit the Conductor application.

spark-submit
--conf spark.yarn.executor.memoryOverhead=600
--jars hadoop-openstack-spoc-2.7.2.jar,scopt_2.10-3.4.0.jar
--class oracle.paas.bdcs.conductor.Conductor
--master yarn
--deploy-mode client
--executor-cores <number of executor cores e.g. 5>
--executor-memory <memory size e.g. 40G>
--driver-memory <driver memory size e.g. 10G>
original-conductor-1.0-SNAPSHOT.jar
--swiftUsername <oracle username@oracle.com>
--swiftPassword <password>
--swiftIdentityDomain <storage ID assigned to this user>
--swiftAuthUrl https://<Storage cloud domain name e.g. storage.us2.oraclecloud.com:443>/auth/v2.0/tokens
--swiftPublic true
--fsSwiftImpl org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem
--blockSize <block size e.g. 536870912>
swift://<container.provider e.g. rstrejc.a424392>/someDirectory
swift://<container.provider e.g. rstrejc.a424392>/someFile
hdfs:///user/oracle/

  • Limitations

odcp consumes a significant amount of cluster resources. When running other Spark/MapReduce jobs in parallel with odcp, adjust the number of executors, the amount of memory available to the executors, or the number of executor cores using the --num-executors, --executor-memory, and --executor-cores options for better performance.

 

Distcp

Distcp (distributed copy) is a Hadoop utility used for inter/intra-cluster copying of large amounts of data in parallel. The Distcp command submits a regular MapReduce job that performs a file-by-file copy.

  • How does it work?

Distcp involves two steps:

(a) Building the list of files to copy (known as the copy list)

(b) Running a MapReduce job to copy files, with the copy list as input

distcp

The MapReduce job that does the copying has only mappers; each mapper copies a subset of the files in the copy list. By default, the copy list is a complete list of all files in the source directories given to Distcp.

 

  • Examples

 

Copying data from HDFS to Oracle Storage Cloud Service syntax:

hadoop distcp hdfs://<hadoop namenode>/<source filename> swift://<MyContainer.MyProvider>/<destination filename>

Allocation of JVM heap-size:   

export HADOOP_CLIENT_OPTS="-Xms<start heap memory size> -Xmx<max heap memory size>"

Setting timeout syntax:

hadoop distcp -Dmapred.task.timeout=<time in milliseconds> hdfs://<hadoop namenode>/<source filename> swift://<MyContainer.MyProvider>/<destination filename>

Hadoop getmerge syntax:

bin/hadoop fs -getmerge [-nl] <source directory> <destination directory>/<output filename>

The Hadoop getmerge command takes a source directory and a destination file as input and concatenates the source files into the destination local file. The -nl option can be set to add a newline character at the end of each file.

 

  • Limitations

For a large file copy, make sure the task has a termination strategy in case it does not read input, write output, or update its status string. The -Dmapred.task.timeout=<time in milliseconds> option can be used to set the maximum timeout value. For a 1 TB file, use -Dmapred.task.timeout=60000000 (approximately 16 hours) with the Distcp command.
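A concrete form of that command, combining the timeout syntax shown above with the suggested value and the example paths used earlier, might look like this:

hadoop distcp -Dmapred.task.timeout=60000000 hdfs:///user/oracle/data.raw swift://myContainer.myProvider/data.raw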

Distcp might run out of memory while copying very large files. To get around this, consider increasing the -Xmx JVM heap-size parameter before executing the hadoop distcp command. This value must be a multiple of 1024.

To improve the transfer speed for a very large file, split the file at the source and copy the split files to the destination. Once the files are successfully transferred, merge them at the destination, for example with the Hadoop getmerge command described above. A sketch of this workflow follows.
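A minimal sketch of that split/copy/merge workflow, assuming a local copy of the source file, hypothetical paths, and a Swift file system already configured in core-site.xml:

# Split the local source file into 10 GB chunks (hypothetical paths)
split -b 10G /data/source/data.raw /data/source/chunks/data.raw.part-
# Load the chunks into HDFS
hdfs dfs -put /data/source/chunks /user/oracle/chunks
# Copy the chunk directory in parallel with Distcp
hadoop distcp hdfs:///user/oracle/chunks swift://myContainer.myProvider/chunks
# Merge the chunks into a single local file at the destination side
hadoop fs -getmerge swift://myContainer.myProvider/chunks /tmp/data.raw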

Upload CLI

 

  • How does it work?

The Upload CLI tool is a cross-platform, Java-based command line tool that you can use to efficiently upload files to Oracle Storage Cloud Service. The tool optimizes uploads through segmentation and parallelization to maximize network efficiency and reduce overall upload time. If a large file transfer is interrupted, the Upload CLI tool maintains state and resumes from the point where the transfer was interrupted. The tool also retries automatically on failures.

  • Example:

Syntax of upload CLI:

java -jar uploadcli.jar -url REST_Endpoint_URL -user userName -container containerName file-or-files-or-directory

To upload a file named file.txt to a standard container myContainer in the domain myIdentityDomain as the user abc.xyz@oracle.com, run the following command:

java -jar uploadcli.jar -url https://foo.storage.oraclecloud.com/myIdentityDomain-myServiceName -user abc.xyz@oracle.com -container myContainer file.txt

When running the Upload CLI tool on a host that’s behind a proxy server, specify the host name and port of the proxy server by using the https.proxyHost and https.proxyPort Java parameters.

 

Syntax of upload CLI behind proxy server:

java -Dhttps.proxyHost=host -Dhttps.proxyPort=port -jar uploadcli.jar -url REST_Endpoint_URL -user userName -container containerName file-or-files-or-directory

  • Limitations

Upload CLI is a Java tool and will only run on hosts that satisfy the prerequisites of the uploadcli tool.

 

Hadoop fs -cp

 

  • How does it work?

Hadoop fs -cp is one of the Hadoop file system shell commands and is run from the command line interface of the source operating system. Hadoop fs -cp is not distributed across the cluster; the command transfers data byte by byte through the machine where it is issued.

  • Example

hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
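Because the command accepts full file system URIs, a copy from HDFS to Oracle Storage Cloud Service can be issued directly (a sketch reusing the container and provider names from the earlier examples):

hadoop fs -cp hdfs:///user/oracle/data.raw swift://myContainer.myProvider/data.raw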

 

  • Limitations

The byte-by-byte transfer takes a very long time to copy a large file from HDFS to Oracle Storage Cloud Service.

 

Oracle Storage Cloud Software Appliance

 

  • How does it work?

Oracle Storage Cloud Software Appliance is a product that facilitates easy, secure, reliable data storage and retrieval from Oracle Storage Cloud Service. Businesses can use Oracle Cloud Storage without changing their data center applications and workflows. Applications that use a standard file-based network protocol such as NFS to store and retrieve data can use Oracle Storage Cloud Software Appliance as a bridge between the object storage used by Oracle Storage Cloud Service and standard file storage. Oracle Storage Cloud Software Appliance caches frequently retrieved data on the local host, minimizing the number of REST API calls to Oracle Storage Cloud Service and enabling low-latency, high-throughput file I/O.

The application host instance can mount a directory exported by the Oracle Storage Cloud Software Appliance, which acts as a cloud storage gateway. This enables the application host instance to access an Oracle Cloud Storage container as a standard NFS file system, as sketched below.
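A minimal mount sketch from an application host; the appliance hostname, export name, and mount options are placeholders that depend on how the appliance was installed and configured:

sudo mkdir -p /mnt/oscsa
sudo mount -t nfs oscsa-host:/myContainerExport /mnt/oscsa
# Files written under /mnt/oscsa are synchronized by the appliance to the mapped container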

 

Architecture

blog2

 

  • Limitations

The appliance is ideal for backup and archive use cases that require the replication of infrequently accessed data to cloud containers. Read-only and read-dominated content repositories are ideal targets. Once an Oracle Storage Cloud Service container is mapped to a filesystem in Oracle Storage Cloud Software Appliance, other data movement tools such as the REST API, odcp, Distcp, or the Java library cannot be used for that container. Doing so would cause the data in the appliance to become inconsistent with the data in Oracle Storage Cloud Service.

 

Application Programming Platform

Oracle provides various Java library APIs to access Oracle Storage Cloud Service. The following summarizes the interfaces one can use programmatically to access Oracle Storage Cloud Service, with the corresponding documentation topic for each:

Java library: Accessing Oracle Storage Cloud Service Using Java Library

File Transfer Manager API: Accessing Oracle Storage Cloud Service Using File Transfer Manager API

REST API: Accessing Oracle Storage Cloud Service Using REST API


Java Library  

 

  • How does it work?

The Java library is useful for Java applications that prefer to use the Oracle Cloud Java API for Oracle Storage Cloud Service instead of the tools provided by Oracle and Hadoop. The Java library wraps the RESTful web service API, and most of the major RESTful API features of Oracle Storage Cloud Service are available through it. The Java library is available via the separate Oracle Cloud Service Java SDK.

 

java library

  • Example

Sample Code snippet

package storageupload;

import oracle.cloud.storage.*;
import oracle.cloud.storage.model.*;
import oracle.cloud.storage.exception.*;
import java.io.*;
import java.util.*;
import java.net.*;

public class UploadingSegmentedObjects {
    public static void main(String[] args) {
        try {
            // Configure the connection to the Oracle Storage Cloud Service instance
            CloudStorageConfig myConfig = new CloudStorageConfig();
            myConfig.setServiceName("Storage-usoracleXXXXX")
                    .setUsername("xxxxxxxxx@yyyyyyyyy.com")
                    .setPassword("xxxxxxxxxxxxxxxxx".toCharArray())
                    .setServiceUrl("https://xxxxxx.yyyy.oraclecloud.com");
            CloudStorage myConnection = CloudStorageFactory.getStorage(myConfig);
            System.out.println("\nConnected!!\n");

            // Create the container if none exists yet
            if (myConnection.listContainers().isEmpty()) {
                myConnection.createContainer("myContainer");
            }

            // Store the same local file under three different object names
            FileInputStream fis = new FileInputStream("C:\\temp\\hello.txt");
            myConnection.storeObject("myContainer", "C:\\temp\\hello.txt", "text/plain", fis);
            fis = new FileInputStream("C:\\temp\\hello.txt");
            myConnection.storeObject("myContainer", "C:\\temp\\hello1.txt", "text/plain", fis);
            fis = new FileInputStream("C:\\temp\\hello.txt");
            myConnection.storeObject("myContainer", "C:\\temp\\hello2.txt", "text/plain", fis);

            // List the objects now present in the container
            List<Key> myList = myConnection.listObjects("myContainer", null);
            Iterator<Key> it = myList.iterator();
            while (it.hasNext()) {
                System.out.println(it.next().getKey().toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

  • Limitations

The Java API cannot create an Oracle Storage Cloud Service archive container. An appropriate JRE version is required for the Java library.

 

File Transfer Manager API

 

  • How does it Work?

The File Transfer Manager (FTM) API is a Java library that simplifies uploading to and downloading from Oracle Storage Cloud Service. The File Transfer Manager provides both synchronous and asynchronous APIs to transfer files and provides a way to track operations when the asynchronous version is used. The library is available via the separate Oracle Cloud Service Java SDK.

 

  • Example

Uploading a Single File Sample Code snippet

FileTransferAuth auth = new FileTransferAuth(
    "email@oracle.com",                    // user name
    "xxxxxx",                              // password
    "yyyyyy",                              // service name
    "https://xxxxx.yyyyy.oraclecloud.com", // service URL
    "xxxxxx"                               // identity domain
);
FileTransferManager manager = null;
try {
    manager = FileTransferManager.getDefaultFileTransferManager(auth);
    String containerName = "mycontainer";
    String objectName = "foo.txt";
    File file = new File("/tmp/foo.txt");
    UploadConfig uploadConfig = new UploadConfig();
    uploadConfig.setOverwrite(true);
    uploadConfig.setStorageClass(CloudStorageClass.Standard);
    System.out.println("Uploading file " + file.getName() + " to container " + containerName);
    TransferResult uploadResult = manager.upload(uploadConfig, containerName, objectName, file);
    System.out.println("Upload completed successfully.");
    System.out.println("Upload result:" + uploadResult.toString());
} catch (ClientException ce) {
    System.out.println("Upload failed. " + ce.getMessage());
} finally {
    if (manager != null) {
        manager.shutdown();
    }
}

 

REST API

 

  • How does it work?

The REST API can be accessed from any application or programming platform that correctly and completely understands the Hypertext Transfer Protocol (HTTP). The REST API uses advanced facets of HTTP such as secure communication over HTTPS, HTTP headers, and specialized HTTP verbs (PUT, DELETE). cURL is one of the many applications that meet these requirements.

 

  • Example

cURL syntax:

curl -v -s -X PUT -H "X-Auth-Token: <Authorization Token ID>" "https://<Oracle Cloud Storage domain name>/v1/<storage ID associated with the user account>/<container name>"

 

Some Data Transfer Test results

The configuration used to measure performance and data transfer rates is as follows:

Test environment configuration:

- BDCS 16.2.5
- Hadoop Swift driver 2.7.2
- US2 production data center
- 3-node cluster running on BDA
- Each node has 256 GB memory / 30 vCPUs
- File size: 1TB (Terabyte)
- File contains all zeros

Results (# Interface: Source to Destination, Time, Comment):

1. odcp: HDFS to Oracle Storage Cloud Service, 54 minutes. Transfer rate: 2.47 Gb/sec (1.11 TB/hour).
2. hadoop Distcp: Oracle Storage Cloud Service to HDFS, failed. Not enough memory (after 1 hour).
3. hadoop Distcp: HDFS to Oracle Storage Cloud Service, failed.
4. hadoop Distcp: HDFS to Oracle Storage Cloud Service, 3 hours. Based on splitting the 1 TB file into 50 files of 10 GB each; each 10 GB file took 18 minutes (with a partition size of 256 MB).
5. Upload CLI: HDFS to Oracle Storage Cloud Service, 5 hours 55 minutes. Data was read from Big Data Cloud Service HDFS mounted using fuse_dfs.
6. hadoop fs -cp: HDFS to Oracle Storage Cloud Service, 11 hours 50 minutes 50 seconds. Parallelism 1; transfer rate: 250 Mb/sec.

 

Summary

One can draw the following conclusions from the above analysis.

Data file size and data transfer time are the two main factors in deciding the appropriate interface for data movement between HDFS and Oracle Storage Cloud Service.

There is no additional overhead for data manipulation and processing when using the odcp interface.

Uploading a file to Oracle Storage Cloud Service using REST API


Introduction

This is the second part of a two-part article which demonstrates how to upload data in near-real time from an on-premise Oracle database to Oracle Storage Cloud Service.

In the previous article of this series, we demonstrated Oracle GoldenGate functionality to write to a flat file using Apache Flume File Roll Sink. If you would like to read the first part in this article series please visit Oracle GoldenGate : Apply to Apache Flume File Roll Sink

In this article we demonstrate using the cURL command to upload the flat file to Oracle Storage Cloud Service.

We used the Oracle Big Data Lite Virtual Machine as the test bed for this article. The VM image is available for download on the Oracle Technology Network website.

Main Article

There are various tools available to access Oracle Storage Cloud Service. According to Best Practices – Data movement between Oracle Storage Cloud Service and HDFS , cURL REST interface is appropriate for this requirement.

cURL REST Interface

REST API

REST API is used to manage containers and objects in the Oracle Storage Cloud Service instance. Anyone can access the REST API from any application or programming platform that understands the Hypertext Transfer Protocol (HTTP) and has Internet connectivity.

cURL is one of the tools used to access the REST interface. cURL is an open source tool used for transferring data which supports various protocols including HTTP and HTTPS. cURL is typically available by default on most UNIX-like hosts. For information about downloading and installing cURL, see Quick Start.

Oracle Storage Cloud Service ( OSCS )

Oracle Storage Cloud Service enables applications to store and manage contents in the cloud. Stored objects can be retrieved directly by external clients or by applications running within Oracle Cloud (For example: Big Data Preparation Cloud Service).

A container is a storage compartment that provides a way to organize the data stored in Oracle Storage Cloud Service. Containers are similar to directories, but with a key distinction: unlike directories, containers cannot be nested.

Prerequisites

First, we need access to the Oracle Storage Cloud Service and information about the Oracle Cloud user name, password, and identity domain.

credentials

Requesting an Authentication Token

Oracle Storage Cloud Service requires authentication for any operation against the service instance. Authentication is performed by using an authentication token. Authentication tokens are requested from the service by authenticating user credentials with the service. All provisioned authentication tokens are temporary and will expire in 30 minutes. We will include a current authentication token with every request to Oracle Storage Cloud Service.

Request an authentication token by running the following cURL command:

curl -v -s -X GET -H 'X-Storage-User: <my identity domain>:<Oracle Cloud user name>' -H 'X-Storage-Pass: <Oracle Cloud user password>' https://<myIdentityDomain>.storage.oraclecloud.com/auth/v1.0

We ran the above cURL command. The following is the output of this command, with certain key lines highlighted. Note that if the request includes the correct credentials, it returns the HTTP/1.1 200 OK response.

 

OSCS_Auth_token

 

From the output of the command we just ran, note the following:

– The value of the X-Storage-Url header.

This value is the REST endpoint URL of the service. This URL value will be used in the next step to create the container.

– The value of the X-Auth-Token header.

This value is the authentication token, which will be used in the next step to create the container. Note that the authentication token expires after 30 minutes; after it expires, request a fresh token.

Creating A Container

Run the following cURL command to create a new container:

curl -v -s -X PUT -H "X-Auth-Token: <Authentication Token ID>" https://storage.oraclecloud.com/v1/Storage-myIdentityDomain/myFirstContainer

– Replace the value of the X-Auth-Token header with the authentication token that you obtained earlier.
– Change https://storage.oraclecloud.com/v1/Storage-myIdentityDomain to the X-Storage-Url header value that you noted while getting an authentication token.
– And change myFirstContainer to the name of the container that you want to create.

Verifying that A Container is created

 Run the following cURL command:

curl -v -s -X GET -H "X-Auth-Token: <Authentication Token ID>" https://storage.oraclecloud.com/v1/Storage-myIdentityDomain/myFirstContainer

If the request is completed successfully, it returns the HTTP/1.1 204 No Content response. This response indicates that there are no objects yet in the new container.

In this exercise, we are not creating a new container; we will use an existing container to upload the file, so we do not need to verify container creation.

Uploading an Object

Once Oracle GoldenGate completes writing the records to a file in the /u01/ogg-bd/flumeOut directory, the cURL program reads the file present in that directory. It then uploads the file to create an object in the container myFirstContainer. Any user with the Service Administrator role or a role that is specified in the X-Container-Write ACL of the container can create an object.

We ran the following cURL command:

curl -v -X PUT -H "X-Auth-Token: <Authentication Token ID>" -T myfile https://<MyIdentityDomain>.storage.oraclecloud.com/v1/Storage-myIdentityDomain/myFirstContainer/myObject

When running this command we…
– Replaced the value of the X-Auth-Token header with the authentication token that we obtained earlier.
– Changed https://<MyIdentityDomain>.storage.oraclecloud.com/v1/Storage-myIdentityDomain to the X-Storage-Url header value that we noted while getting an authentication token.
– Changed myFirstContainer to the name of the container that we want to use.
– Changed myfile to the full path and name of the file that we want to upload.
– Changed myObject to the name of the object that we want to create in the container.

If the request is completed successfully, it returns the HTTP/1.1 201 Created response, as shown in the following output. We verified the full transfer by comparing the Content-Length values.

 

Upload_to_OSCS
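One way to re-check the stored object afterwards is a HEAD request against the object URL, which returns the object metadata including Content-Length (a sketch; the token and URL placeholders are the same as in the upload command):

curl -s -I -H "X-Auth-Token: <Authentication Token ID>" https://<MyIdentityDomain>.storage.oraclecloud.com/v1/Storage-myIdentityDomain/myFirstContainer/myObject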

 

We also verified the proper transfer of the file to Oracle Storage Cloud Service using Big Data Preparation Cloud Service.

BDPCS_Source

Summary

In this article we demonstrated the functionality of the REST API, which uploads data from the on-premise Big Data Lite VM to Oracle Storage Cloud Service. Combining both articles, we demonstrated moving data in near real-time from an on-premise Oracle database to Oracle Storage Cloud Service using Oracle GoldenGate and the REST API.

Loading Data into Oracle BI Cloud Service using OTBI Analyses and SOAP


Introduction

This post details a method of loading data that has been extracted from Oracle Transactional Business Intelligence (OTBI) using SOAP into the Oracle Business Intelligence Cloud Service (BICS). The OTBI instance may either be Cloud-Based or On-Premise. This method may also be used to load data from Oracle Business Intelligence Enterprise Edition (OBIEE).

It builds upon the A-Team post Using Oracle BI Answers to Extract Data from HCM via Web Services which details the extraction process.

This post uses the PL/SQL language to wrap the SOAP extract, XML parsing commands, and database table operations in a stored procedure in the BICS Schema Service database. It produces a BICS staging table which can then be transformed into star-schema object(s) for use in modeling.  The transformation processes and modeling are not discussed in this post.

The most complex portion of this post details how to convert the analysis XML report results, embedded in a CDATA (Character Data) text attribute, back into standard XML markup notation so the rows and columns of data can be parsed.

Additional detailed information, including the complete text of the procedure described, is included in the References section at the end of the post.

Rationale for using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods e.g. Java, ETL tools, etc. require a platform outside of BICS to run on.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deployed in an enterprise production environment.

For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing – Performance

* Scheduling

* Code re-usability and Maintenance

The steps below depict how to load a BICS table.

About the OTBI Analysis

The analysis used in this post is named Suppliers and is stored in a folder named Shared Folders/custom as shown below:

A

The analysis has three columns and output as shown below:

B

Note: The method used here requires all column values in the analysis to be NOT NULL for two reasons. The XPATH parsing command signals either the end of a row or the end of the data when a null result is returned. All columns being NOT NULL ensures that the result set is dense and not sparse. A dense result set ensures that each column is represented in each row. Additional information regarding dense and sparse result sets may be found in the Oracle document Database PL/SQL Language Reference.

One way to ensure a column is not null is to use the IFNull function in the analysis column definition as shown below:

C

An optional parameter may be sent at run time to filter each column.

Ensuring the Web Services are Available

To ensure that the web services are available in the required environment, type a form of the following URL into a browser:

https://hostname/analytics-ws/saw.dll/wsdl/v9

Note: The version number e.g. v9 may vary from server to server.

If you are not able to reach the website, the services may not be offered.  Discuss this with the server administrator.
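The same availability check can be scripted with cURL, which prints only the HTTP status code returned for the WSDL URL (a sketch; the hostname and version segment are placeholders, and -k skips certificate validation for a quick check):

curl -k -s -o /dev/null -w "%{http_code}\n" https://hostname/analytics-ws/saw.dll/wsdl/v9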

Calling the Analysis

Calling the analysis is a two-step process. The first step initiates a session in OTBI and returns a session ID.  The second step uses that session ID to call the analysis and extract the data.

The SOAP API requests should be constructed and tested using a SOAP API testing tool e.g. SoapUI.

Note: API testing tools such as SoapUI, cURL, Postman, and so on are third-party tools for using SOAP and REST services. Oracle does not provide support for these tools or recommend a particular tool for its APIs. You can select the tool based on your requirements.

The procedure uses the APEX_WEB_SERVICE package to issue the SOAP API requests and store the XML result in an XMLTYPE variable. The key inputs to the package call are:

* The URL for the OTBI Session Web Service

* The URL for the OTBI XML View Web Service

* The Base64 encoded credentials to access the analysis

* The SOAP envelopes expected by the OTBI Web Service.

* Optional Parameters to filter the results

* An optional proxy override

Decoding the Credentials

To avoid hard-coding credentials in the procedure, the credentials are expected to be encoded in a base64 format prior to invoking the procedure. A useful base64 encoding tool may be found at Base64 Decoding and Encoding Testing Tool. The text to encode should be in the format username:password

The APEX_WEB_SERVICE and the DBMS_LOB packages and the INSTR function are used to decode the credentials into username and password variables. The APEX_WEB_SERVICE package decodes the credentials into a BLOB variable. The DBMS_LOB package converts the BLOB to a CLOB variable. The INSTR function then separates the decoded result into the two variables.

Examples are below:

-- Decode the Base 64 Credentials
f_blob := apex_web_service.clobbase642blob(f_base64_creds);
-- Create a temporary CLOB instance
dbms_lob.createtemporary(f_clob, true);
-- Convert the decoded BLOB credentials to a CLOB
dbms_lob.converttoclob(
f_clob,
f_blob,
v_file_size,
v_dest_offset,
v_src_offset,
v_blob_csid,
v_lang_context,
v_warning);
-- Parse the credentials into username and password
f_au := substr ( f_clob, 1, instr(f_clob, ':') -1 ); -- username
f_ap := substr ( f_clob, instr(f_clob, ':') +1 ); -- password

Calling the Session Service

An example Session URL is below:

https://hostname/analytics-ws/saw.dll?SoapImpl=nQSessionService

An example Logon Request envelope is below. The result will be an envelope containing a session ID.

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v9="urn://oracle.bi.webservices/v9">
<soapenv:Header/>
<soapenv:Body>
<v9:logon>
<v9:name>username</v9:name>
<v9:password>password</v9:password>
</v9:logon>
</soapenv:Body>
</soapenv:Envelope>
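Before wrapping the call in PL/SQL, the logon envelope can also be exercised from the command line with cURL, one of the testing options mentioned above (a sketch; the hostname is a placeholder, the envelope is assumed to be saved in a local file named logon.xml, and some deployments may additionally require a SOAPAction header):

curl -k -s -X POST -H "Content-Type: text/xml;charset=UTF-8" --data @logon.xml "https://hostname/analytics-ws/saw.dll?SoapImpl=nQSessionService"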

 An example APEX_WEB_SERVICE call for the login is below:

f_xml := apex_web_service.make_request(p_url => f_session_url
,p_envelope => f_envelope
-- ,p_proxy_override => -- An optional Proxy URL
-- ,p_wallet_path => -- An optional path to an Oracle database wallet file
-- ,p_wallet_pwd => -- The password for the optional Oracle database wallet file
);

The APEX_WEB_SERVICE package is used to parse the XML result from above to obtain the session ID. An example is below:

f_session_id := apex_web_service.parse_xml_clob(p_xml => f_xml
,p_xpath => '//*:sessionid/text()'
);

Troubleshooting the Session Service Call

Three common issues are the need for a proxy, the need for a trusted certificate (if using HTTPS), and the need to use the TLS security protocol.

The need for a proxy may be detected when the following error occurs: ORA-12535: TNS:operation timed out. Adding the optional p_proxy_override  parameter to the call may correct the issue. An example proxy override is:

www-proxy.us.oracle.com

The need for a trusted certificate is detected when the following error occurs: ORA-29024: Certificate validation failure.

A workaround may be to run this procedure from a full Oracle Database Cloud Service or an on-premise Oracle database. Adding the trusted certificate(s) to an Oracle database wallet file and adding the optional p_wallet_path and p_wallet_pwd parameters to the call should correct the issue. For more information on Oracle wallets, refer to Using Oracle Wallet Manager in the References section of this post.

The need to use the TLS protocol may be detected when the following error occurs: ORA-29259: end-of-input reached.

A workaround is to run this procedure from a different Oracle Database Cloud Service or an on-premise Oracle database. Ensure the database version is 11.2.0.4.10 or above.

Additionally: When using an on-premise Oracle database, the SQL Operations described later in this post (Create Table, Truncate Table, Insert) may be modified to use the BICS REST API. For more information refer to the REST APIs for Oracle BI Cloud Service in the References section of this post.

Calling the XML View Service

An example XML View service URL is:

https://hostname/analytics-ws/saw.dll?SoapImpl=xmlViewService

An example Analysis Request envelope is below. This envelope contains the session ID from the logon call, the location of the analysis, a placeholder variable for the VNUM analysis variable, and a filter value for the VTYPE variable.

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v9="urn://oracle.bi.webservices/v9">
<soapenv:Header/>
<soapenv:Body>
<v9:executeXMLQuery>
<v9:report>
<v9:reportPath>/shared/custom/Suppliers</v9:reportPath>
<v9:reportXml></v9:reportXml>
</v9:report>
<v9:outputFormat>xml</v9:outputFormat>
<v9:executionOptions>
<v9:async></v9:async>
<v9:maxRowsPerPage></v9:maxRowsPerPage>
<v9:refresh></v9:refresh>
<v9:presentationInfo></v9:presentationInfo>
<v9:type></v9:type>
</v9:executionOptions>
<v9:reportParams>
<!–Zero or more repetitions:–>
<v9:variables>
<v9:name>VNUM</v9:name>
<v9:value></v9:value>
</v9:variables>
<v9:variables>
<v9:name>VTYPE</v9:name>
<v9:value>Supplier</v9:value>
</v9:variables>
</v9:reportParams>
<v9:sessionID>'||F_SESSION_ID||'</v9:sessionID>
</v9:executeXMLQuery>
</soapenv:Body>
</soapenv:Envelope>

An example APEX_WEB_SERVICE call for the analysis result is below:

f_xml := apex_web_service.make_request(p_url => f_report_url
,p_envelope => f_envelope
-- ,p_proxy_override => -- An optional Proxy URL
-- ,p_wallet_path => -- An optional path to an Oracle database wallet file
-- ,p_wallet_pwd => -- The password for the optional Oracle database wallet file
);

Preparing the XML Result

The XML result from the Analysis call contains the report results in a CDATA text section. In order to parse the results, the XML within the text section is converted into standard XML using the XMLTYPE package and the REPLACE function.

An example of the CDATA section result, as seen in SoapUI, is below:

<sawsoap:rowset xsi:type="xsd:string"><![CDATA[<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<Row>
<Column0>UJ Catering Service AG</Column0>
<Column1>5991</Column1>
<Column2>Supplier</Column2>
</Row>
</rowset>]]>
</sawsoap:rowset>

The same result, as seen in APEX_WEB_SERVICE, is below:

<sawsoap:rowset xsi:type="xsd:string">&lt;rowset xmlns=&quot;urn:schemas-microsoft-com:xml-analysis:rowset&quot;&gt;
&lt;Row&gt;
&lt;Column0&gt;UJ Catering Service AG&lt;/Column0&gt;
&lt;Column1&gt;5991&lt;/Column1&gt;
&lt;Column2&gt;Supplier&lt;/Column2&gt;
&lt;/Row&gt;
&lt;/rowset&gt;
</sawsoap:rowset>

The converted result needed for parsing  is below:

<sawsoap:rowset xsi:type="xsd:string"><bi:rowset xmlns:bi="urn:schemas-microsoft-com:xml-analysis:rowset">
<Row>
<Column0>UJ Catering Service AG</Column0>
<Column1>5991</Column1>
<Column2>Supplier</Column2>
</Row>
</bi:rowset>
</sawsoap:rowset>

The XMLTYPE package and the REPLACE function usage is below. Note: the CHR(38) function returns the '&' character.

F_CLOB := F_XML.GETCLOBVAL(); -- Convert to CLOB
F_CLOB := REPLACE (F_CLOB, CHR(38)||'lt;', '<');
F_CLOB := REPLACE (F_CLOB, CHR(38)||'gt;', '>' );
F_CLOB := REPLACE (F_CLOB, CHR(38)||'quot;', '"');
F_CLOB := REPLACE (F_CLOB, '/rowset', '/bi:rowset'); -- Insert bi namespace
F_CLOB := REPLACE (F_CLOB, '<rowset', '<bi:rowset'); -- Insert bi namespace
F_CLOB := REPLACE (F_CLOB, 'xmlns=', 'xmlns:bi='); -- Insert bi namespace
F_XML := XMLTYPE.createXML( F_CLOB ); -- Convert back to XMLTYPE

Creating a BICS Table

This step uses a SQL command to create a simple staging table that has 20 identical varchar2 columns. These columns may be transformed into number and date data types in a future transformation exercise that is not covered in this post.

A When Others exception block allows the procedure to proceed if an error occurs because the table already exists. An example is below:

EXCEPTION
WHEN OTHERS THEN NULL; — Ignore error if table exists

Note: The table needs to be created once before compiling the procedure the first time. The complete DDL is below:

CREATE TABLE STAGING_TABLE
(
C01 VARCHAR2(2048 BYTE),C02 VARCHAR2(2048 BYTE), C03 VARCHAR2(2048 BYTE), C04 VARCHAR2(2048 BYTE), C05 VARCHAR2(2048 BYTE),
C06 VARCHAR2(2048 BYTE),C07 VARCHAR2(2048 BYTE), C08 VARCHAR2(2048 BYTE), C09 VARCHAR2(2048 BYTE), C10 VARCHAR2(2048 BYTE),
C11 VARCHAR2(2048 BYTE),C12 VARCHAR2(2048 BYTE), C13 VARCHAR2(2048 BYTE), C14 VARCHAR2(2048 BYTE), C15 VARCHAR2(2048 BYTE),
C16 VARCHAR2(2048 BYTE),C17 VARCHAR2(2048 BYTE), C18 VARCHAR2(2048 BYTE), C19 VARCHAR2(2048 BYTE), C20 VARCHAR2(2048 BYTE)
)

A shortened example of the create table statement is below:

execute immediate 'create table staging_table ( c01 varchar2(2048), … , c20 varchar2(2048) )';

Loading the BICS Table

This step uses SQL commands to truncate the staging table and insert rows from the analysis XML content.

The XML content is parsed using an XPATH command inside two LOOP commands.

The first loop processes the rows by incrementing a subscript.  It exits when the first column of a new row returns a null value.  The second loop processes the columns within a row by incrementing a subscript. It exits when a column within the row returns a null value.

The following XPATH examples are for a data set that contains 5 rows and 3 columns per row:

//Row[2]/*[1]/text() -- Returns the value of the first column of the second row
//Row[2]/*[4]/text() -- Returns a null value for the 4th column, signaling the end of the row
//Row[6]/*[1]/text() -- Returns a null value for the first column of a new row, signaling the end of the data set

After each row is parsed, it is inserted into the BICS staging table.

An image of the staging table result is shown below:

D

Summary

This post detailed a method of loading data that has been extracted from Oracle Transactional Business Intelligence (OTBI) using SOAP into the Oracle Business Intelligence Cloud Service.

A BICS staging table was created and populated. This table can then be transformed into star-schema objects for use in modeling.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Complete Text of Procedure Described

Using Oracle BI Answers to Extract Data from HCM via Web Services

Database PL/SQL Language Reference

Reference Guide for the APEX_WEB_SERVICE

Soap API Testing Tool

XPATH Testing Tool

Base64 Decoding and Encoding Testing Tool

Using Oracle Wallet Manager

REST APIs for Oracle BI Cloud Service

 

 

 

Loading Data from Oracle Field Service Cloud into Oracle BI Cloud Service using SOAP


Introduction

This post details a method of extracting and loading data from Oracle Field Service Cloud (OFSC) into the Oracle Business Intelligence Cloud Service (BICS).

A compelling reason to use such a method is when data is required that is not in the standard daily extract. Such data might be planning (future) data or data recently provided in new releases of the application.

This post uses SOAP web services to extract XML-formatted data responses. It also uses the PL/SQL language to wrap the SOAP extract, XML parsing commands, and database table operations in a Stored Procedure. It produces a BICS staging table and a staging view which can then be transformed into star-schema object(s) for use in modeling. The transformation processes and modeling are not discussed in this post.

Finally, an example of a database job is provided that executes the Stored Procedure on a scheduled basis.

The PL/SQL components are for demonstration purposes only and are not intended for enterprise production use. Additional detailed information, including the complete text of the PL/SQL procedure described, is included in the References section at the end of this post.

Rationale for Using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods e.g. Java, ETL tools, etc. require a platform outside of BICS to run on.

PL/SQL may also be used in a DBCS that is connected to BICS.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deploying in an enterprise production environment. For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing and Performance

* Scheduling

* Code Re-usability and Maintenance

Using Oracle Database Cloud Service

Determining Security Protocol Requirements

If the web service requires a security protocol, key exchange or cypher not supported by the default BICS Schema Database Service, another Oracle Database Cloud Service (DBCS) may be used.

An example security protocol is TLS version 1.2 which is used by the OFSC web service accessed in this post.

Note: For TLSv1.2, specify a database version of 11.2.0.4.10 or greater, or any version of 12c. If the database is not at the required version, PL/SQL may throw the following error: ORA-29259: end-of-input reached

To detect what protocol a web service uses, open the SOAP WSDL page in a browser, click the lock icon, and navigate to the relevant security section. A Chrome example from an OFSC WSDL page is below:

1
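The supported protocol can also be checked from a command line with OpenSSL (a sketch; the hostname is a placeholder). If the handshake succeeds when TLS 1.2 is forced, the endpoint supports it:

openssl s_client -connect hostname:443 -tls1_2 < /dev/null 2>/dev/null | grep -E "Protocol|Cipher"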

Preparing the DBCS

If a DBCS other than the default Schema Service is used, the following steps need to be performed.

Create a BICS user in the database. The use of Jobs and the DBMS_CRYPTO package shown in the example below are discussed later in the post. Example SQL statements are below:

-- USER SQL
CREATE USER "BICS_USER" IDENTIFIED BY password
DEFAULT TABLESPACE "USERS"
TEMPORARY TABLESPACE "TEMP"
ACCOUNT UNLOCK;
-- QUOTAS
ALTER USER "BICS_USER" QUOTA UNLIMITED ON USERS;
-- ROLES
ALTER USER "BICS_USER" DEFAULT ROLE "CONNECT","RESOURCE";
-- SYSTEM PRIVILEGES
GRANT CREATE VIEW TO "BICS_USER";
GRANT CREATE ANY JOB TO "BICS_USER";
-- OBJECT PERMISSIONS
GRANT EXECUTE ON SYS.DBMS_CRYPTO TO BICS_USER;

Create an entry in a new or existing Oracle database wallet for the trusted public certificate used to secure connections to the web service via the Internet. A link to the Oracle Wallet Manager documentation is included in the References section. Note the location and password of the wallet as they are used to issue the SOAP request.

The need for a trusted certificate is detected when the following error occurs: ORA-29024: Certificate validation failure.

An example certificate path found using Chrome browser is shown below. Both of these trusted certificates need to be in the Oracle wallet.

2

Preparing the Database Schema

Two objects need to be created prior to compiling the PL/SQL stored procedure.

The first is a staging table comprising a set of identical columns. This post uses a staging table named QUOTA_STAGING_TABLE. The columns are named consecutively as C01 through Cnn. This post uses 50 staging columns. The SQL used to create this table may be viewed here.

The second is a staging view named QUOTA_STAGING_VIEW built over the staging table. The view column names are the attribute names used in the API WSDL. The SQL used to create this view may be viewed here. The purpose of the view is to relate an attribute name found in the SOAP response to a staging table column based on the view column’s COLUMN_ID in the database. For example, if a response attribute name of bucket_id is detected and the COLUMN_ID of the corresponding view column is 3, then the staging table column populated with the attribute value would be C03.

Ensuring the Web Services are Available

To ensure that the web services are available in the required environment, type a form of the following URL into a browser:

https://hostname/soap/capacity/?wsdl

Note: If you are unable to reach the website, the services may not be offered or the URL may have changed. Discuss this with the service administrator.

Using API Testing Tools

The SOAP Request Envelope should be developed in an API testing tool such as SoapUI or Postman. The XPATH expressions for parsing should be developed and tested in an XPATH expression testing tool such as FreeFormatter. Links to these tools are provided in the References section.

Note: API testing tools such as SoapUI, FreeFormatter, Postman, and so on are third-party tools for using SOAP and REST services. Oracle does not provide support for these tools or recommend a particular tool for its APIs. You can select the tool based on your requirements.

Preparing the SOAP Request

This post uses the get_quota_data method of the Oracle Field Service Cloud Capacity Management API. Additional information about the API is included as a link in the References section.

Use a browser to open the WSDL page for this API. An example URL for the page is: https://hostname/soap/capacity/?wsdl. This page provides important information regarding the request and response envelopes used by the API.

The request envelope is comprised of the following sections. Note: To complete the envelope creation, the sections are concatenated together to provide a single request envelope. An example of a complete request envelope may be viewed here.

Opening

The Opening section is static text as shown below:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:toa:capacity">
<soapenv:Header/>
<soapenv:Body>
<urn:get_quota_data>

User

The User section is dynamic and comprises the following components:

Now

The now component is the current time in the UTC time zone. An example is: <now>2016-12-19T09:13:10+00:00</now>. It is populated by the following command:

SELECT TO_CHAR (SYSTIMESTAMP AT TIME ZONE 'UTC', 'YYYY-MM-DD"T"HH24:MI:SS"+00:00"' ) INTO V_NOW FROM DUAL;

Login

The login component is the user name.

Company

The company component is the company for which data is being retrieved.

Authorization String

The auth_string component is the MD5 hash of the concatenation of the now component with the MD5 hash of the user password. In pseudo-code it would be md5 (now + md5 (password)). It is populated by the following command:

SELECT
LOWER (
DBMS_CRYPTO.HASH (
V_NOW||
LOWER( DBMS_CRYPTO.HASH (V_PASSWORD,2) )
,2
)
)
INTO V_AUTH_STRING FROM DUAL;

Note: ‘2’ is the code for MD5.

An example is:

<auth_string>b477d40346ab40f1a1a038843d88e661fa293bec5cc63359895ab4923051002a</auth_string>

Required Parameters

There are two required parameters: date and resource_id. Each may have multiple entries. However the sample procedure in this post allows only one resource id. It also uses just one date to start with and then issues the request multiple times for the number of consecutive dates requested.

In this post, the starting date is the current date in Sydney, Australia. An example is below:

<date>2016-12-21</date> <resource_id>Test_Resource_ID</resource_id>

The starting date and subsequent dates are populated by this command:

CASE WHEN P_DATE IS NULL
THEN SELECT TO_CHAR (SYSTIMESTAMP AT TIME ZONE 'Australia/Sydney', 'YYYY-MM-DD') INTO P_DATE FROM DUAL;
ELSE P_DATE := TO_CHAR (TO_DATE (P_DATE, 'YYYY-MM-DD') + 1, 'YYYY-MM-DD'); -- Increments the day by 1
END CASE;

Aggregation

The aggregation component specifies whether to aggregate the results. Since BI will do this automatically, aggregation and totals are set to 0 (no). An example is:

<aggregate_results>0</aggregate_results> <calculate_totals>0</calculate_totals>

Field Requests

This section may be passed as a parameter and it lists the various data fields to be included in the extract. An example is below:

<day_quota_field>max_available</day_quota_field>
<time_slot_quota_field>max_available</time_slot_quota_field>
<time_slot_quota_field>quota</time_slot_quota_field>
<category_quota_field>used</category_quota_field>
<category_quota_field>used_quota_percent</category_quota_field>
<work_zone_quota_field>status</work_zone_quota_field>

Closing

The Closing section is static text as shown below:

</urn:get_quota_data>
</soapenv:Body>
</soapenv:Envelope>

Calling the SOAP Request

The APEX_WEB_SERVICE package is used to populate a request header and issue the SOAP request. The header requests that the web service return the contents in a non-compressed text format as shown below:

 

APEX_WEB_SERVICE.G_REQUEST_HEADERS(1).NAME := 'Accept-Encoding';
APEX_WEB_SERVICE.G_REQUEST_HEADERS(1).VALUE := 'identity';

For each date to be processed the SOAP request envelope is created and issued as shown below:

F_XML      := APEX_WEB_SERVICE.MAKE_REQUEST(
P_URL         => F_SOAP_URL
,P_ENVELOPE    => F_REQUEST_ENVELOPE
,P_WALLET_PATH => 'file:wallet location'
,P_WALLET_PWD  => 'wallet password' );

Troubleshooting the SOAP Request Call

Common issues are the need for a proxy, the need for a trusted certificate (if using HTTPS), and the need to use the TLS security protocol. Note: This post uses DBCS so the second and third issues have been addressed.

The need for a proxy may be detected when the following error occurs: ORA-12535: TNS:operation timed out. Adding the optional p_proxy_override parameter to the call may correct the issue. An example proxy override is:

www-proxy.us.oracle.com

 

Parsing the SOAP Response

For each date to be processed the SOAP response envelope is parsed to obtain the individual rows and columns.

The hierarchy levels of the capacity API are listed below:

Bucket > Day > Time Slot > Category > Work Zone

Each occurrence of every hierarchical level is parsed to determine attribute names and values. Both the name and the value are then used to populate a column in the staging table.

When a hierarchical level is completed and no occurrences of a lower level exist, a row is inserted into the BICS staging table.

Below is an example XML response element for one bucket.

<bucket>
<bucket_id>TEST Bucket ID</bucket_id>
<name>TEST Bucket Name</name>
<day>
<date>2016-12-21</date>
<time_slot>
<label>7-10</label>
<quota_percent>100</quota_percent>
<quota>2520</quota>
<max_available>2520</max_available>
<used_quota_percent>0</used_quota_percent>
<category>
<label>TEST Category</label>
<quota_percent>100</quota_percent>
<quota>2520</quota>
<max_available>2340</max_available>
<used_quota_percent>0</used_quota_percent>
</category>
</time_slot>
<time_slot>
<label>10-14</label>
<quota_percent>100</quota_percent>
<quota>3600</quota>
<max_available>3600</max_available>
<used_quota_percent>0</used_quota_percent>
<category>
<label>TEST Category</label>
<quota_percent>100</quota_percent>
<quota>3600</quota>
<max_available>3360</max_available>
<used_quota_percent>0</used_quota_percent>
</category>
</time_slot>
<time_slot>
<label>14-17</label>
<quota_percent>100</quota_percent>
<quota>2220</quota>
<max_available>2220</max_available>
<used_quota_percent>0</used_quota_percent>
<category>
<label>TEST Category</label>
<quota_percent>100</quota_percent>
<quota>2220</quota>
<max_available>2040</max_available>
<used_quota_percent>0</used_quota_percent>
</category>
</time_slot>
</day>
</bucket>

The processing of the bucket element is as follows:

Occurrences 1 and 2 of the bucket level are parsed to return attribute names of bucket_id and name. The bucket_id attribute is used as-is and the name attribute is prefixed with “bucket_” to find the corresponding column_ids in the staging view. The corresponding columns in the staging table, C03 and C04, are then populated.

Occurrence 3 of the bucket level returns the day level element tag. Processing then continues at the day level.

Occurrence 1 of the day level returns the attribute name of date. The attribute name is prefixed with “day_” to find the corresponding column_id in the staging view. The corresponding column in the staging table, C05, is then populated with the value ‘2016-12-21’.

Occurrence 2 of the day level returns the first of three time_slot level element tags. Processing for each continues at the time-slot level. Each time_slot element contains 5 attribute occurrences followed by a category level element tag.

Each category level contains 5 attribute occurrences. Note: there is no occurrence of a work_zone level element tag in the category level. Thus after each category level element is processed, a row is written to the staging table.

The end result is that 3 rows are written to the staging table for this bucket. The table below describes the XML to row mapping for the first row.

Attribute Name | Attribute Value | View Column Name | Table Column Name
bucket_id | TEST Bucket ID | BUCKET_ID | C03
name | TEST Bucket Name | BUCKET_NAME | C04
day | 2016-12-21 | DAY_DATE | C05
label | 7-10 | TIME_SLOT_LABEL | C18
quota_percent | 100 | TIME_SLOT_QUOTA_PERCENT | C19
quota | 2520 | TIME_SLOT_QUOTA | C21
max_available | 2520 | TIME_SLOT_MAX_AVAILABLE | C26
used_quota_percent | 0 | TIME_SLOT_USED_QUOTA_PERCENT | C29
label | TEST Category | CAT_LABEL | C32
quota_percent | 100 | CAT_QUOTA_PERCENT | C33
quota | 2520 | CAT_QUOTA | C35
max_available | 2340 | CAT_MAX_AVAILABLE | C42
used_quota_percent | 0 | CAT_USED_QUOTA_PERCENT | C44

 

In PL/SQL, the processing is accomplished using the LOOP command. There is a loop for each hierarchical level. Loops end when no results are returned for a parse statement.

XPATH statements are used for parsing. Additional information regarding XPATH statements may be found in the References section. Examples are below:

Statement Returns
/bucket[5] The entire fifth bucket element in the response. If no results then all buckets have been processed.
/bucket/*[1] The first bucket attribute or element name.
/bucket/*[2]/text() The second bucket attribute value.
/bucket/day/*[6] The sixth day attribute or element name.
/bucket/day[1]/*[6]/text() The sixth day attribute value.
/bucket/day/time_slot[2]/*[4] The fourth attribute or element name of the second time_slot.

 

Scheduling the Procedure

The procedure may be scheduled to run periodically through the use of an Oracle Scheduler job. A link to the Scheduler documentation may be found in the References section.

A job is created using the CREATE_JOB procedure by specifying a job name, type, action and a schedule. Setting the enabled argument to TRUE enables the job to automatically run according to its schedule as soon as you create it.

An example of a SQL statement to create a job is below:

BEGIN
DBMS_SCHEDULER.CREATE_JOB (
JOB_NAME        => 'OFSC_SOAP_QUOTA_EXTRACT',
JOB_TYPE        => 'STORED_PROCEDURE',
ENABLED         => TRUE,
JOB_ACTION      => 'BICS_OFSC_SOAP_INTEGRATION',
START_DATE      => '21-DEC-16 10.00.00 PM Australia/Sydney',
REPEAT_INTERVAL => 'FREQ=HOURLY; INTERVAL=24' -- this will run the job every 24 hours
);
END;
/

Note: If using the BICS Schema Service database, the package name is CLOUD_SCHEDULER rather than DBMS_SCHEDULER.

The job log and status may be queried using the *_SCHEDULER_JOBS views. Examples are below:

SELECT JOB_NAME, STATE, NEXT_RUN_DATE from USER_SCHEDULER_JOBS;
SELECT LOG_DATE, JOB_NAME, STATUS from USER_SCHEDULER_JOB_LOG;

 

Summary

This post detailed a method of extracting and loading data from Oracle Field Service Cloud (OFSC) into the Oracle Business Intelligence Cloud Service (BICS).

The post used SOAP web services to extract the XML-formatted data responses. It used a PL/SQL Stored Procedure to wrap the SOAP extract, XML parsing commands, and database table operations. It loaded a BICS staging table and a staging view which can be transformed into star-schema object(s) for use in modeling.

Finally, an example of a database job was provided that executes the Stored Procedure on a scheduled basis.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Text of Complete Procedure

OFSC Capacity API Document

OFSC Capacity API WSDL

Scheduling Jobs with Oracle Scheduler

Database PL/SQL Language Reference

Reference Guide for the APEX_WEB_SERVICE

Soap API Testing Tool

XPATH Testing Tool

Base64 Decoding and Encoding Testing Tool

Using Oracle Wallet Manager

Oracle Business Intelligence Cloud Service Tasks

 


Loading Data from Oracle Identity Cloud Service into Oracle BI Cloud Service using REST


Introduction

This post details a method of extracting and loading data from Oracle Identity Cloud Service (IDCS) into the Oracle Business Intelligence Cloud Service (BICS). It builds upon the A-team post IDCS Audit Event REST API which details the REST API calls used.

One use case for this method is for analyzing trends regarding audit events.

This post uses REST web services to extract JSON-formatted data responses. It also uses the PL/SQL language to wrap the REST extract, JSON parsing commands, and database table operations in a Stored Procedure. It produces a BICS staging table which can then be transformed into star-schema object(s) for use in modeling. The transformation processes and modeling are not discussed in this post.

Finally, an example of a database job is provided that executes the Stored Procedure on a scheduled basis.

The PL/SQL components are for demonstration purposes only and are not intended for enterprise production use. Additional detailed information, including the complete text of the PL/SQL procedure described, is included in the References section at the end of this post.

Rationale for Using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods e.g. Java, ETL tools, etc. require a platform outside of BICS to run on.

PL/SQL may also be used in a DBaaS (Database as a Service) that is connected to BICS.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deployed in an enterprise production environment. For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing – Performance

* Scheduling

* Code Re-usability and Maintenance

Using Oracle Database as a Service

Determining Security Protocol Requirements

If the web service requires a security protocol, key exchange or cypher not supported by the default BICS Schema Database Service, another Oracle Database Cloud Service (DBaaS) may be used.

Note: For the most consistent response, specify a database version of 11.2.0.4.10 or greater, or any version of 12c. If the database is not at the required version, PL/SQL may throw the following error: ORA-29259: end-of-input reached

To detect what protocol a web service uses, open the IDCS Login page in a browser, click the lock icon, and navigate to the relevant security section. A Chrome example from an IDCS Login page is below:

1

Preparing the DBaaS

If DBaaS is used, the following steps need to be performed.

Creating the BICS User

Create a BICS user in the database. The use of the Job privilege is discussed later in the post. Example SQL statements are below:

-- USER SQL
CREATE USER "BICS_USER" IDENTIFIED BY password
DEFAULT TABLESPACE "USERS"
TEMPORARY TABLESPACE "TEMP"
ACCOUNT UNLOCK;
-- QUOTAS
ALTER USER "BICS_USER" QUOTA UNLIMITED ON USERS;
-- ROLES
ALTER USER "BICS_USER" DEFAULT ROLE "CONNECT","RESOURCE";
-- SYSTEM PRIVILEGES
GRANT CREATE VIEW TO "BICS_USER";
GRANT CREATE ANY JOB TO "BICS_USER";

Managing Trusted Certificates

Create an entry in a new or existing Oracle database wallet for the trusted public certificate used to secure connections to the web service via the Internet. A link to the Oracle Wallet Manager documentation is included in the References section. Note the location and password of the wallet as they are used to issue the REST request.

The need for a trusted certificate is detected when the following error occurs: ORA-29024: Certificate validation failure.

An example certificate path found using Chrome browser is shown below. Both of these trusted certificates need to be in the Oracle wallet.

2

Granting Network Access

This post uses the UTL_HTTP package which requires the user to have permission to access web services via an Access Control List (ACL).

The need for an ACL privilege is detected when the following error occurs: ORA-24247: network access denied by access control list (ACL).

Grant the BICS_USER authority to connect to the network access control list (ACL). To determine your unique network ACL name run the following:

SELECT * FROM DBA_NETWORK_ACLS;

Using the network name from above run the following:

BEGIN
DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(acl => 'NETWORK_ACL_YourUniqueSuffix',
principal => 'BICS_USER',
is_grant  => true,
privilege => 'connect');
END;
/
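
If the DBA_NETWORK_ACLS query above returns no rows, an ACL may first need to be created and assigned to the web service host. A minimal sketch is below; the ACL file name and host value are illustrative assumptions and should be adjusted for your environment:

BEGIN
DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(
acl         => 'bics_user_acl.xml',   -- illustrative ACL name
description => 'Web service access for BICS_USER',
principal   => 'BICS_USER',
is_grant    => TRUE,
privilege   => 'connect');
DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(
acl  => 'bics_user_acl.xml',
host => 'idcs-hostname');             -- illustrative host
COMMIT;
END;
/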

 

Preparing the Database Schema

A staging table needs to be created prior to compiling the PL/SQL stored procedure.

This post uses a staging table named AUDIT_EVENT. The columns are those chosen from the REST API for Oracle Identity Cloud Service. A link to the document may be found in the References section. This post uses the following columns:

ACTOR_DISPLAY_NAME
ACTOR_ID
ACTOR_NAME
ACTOR_TYPE
ADMIN_REF_RESOURCE_NAME
ADMIN_RESOURCE_NAME
EC_ID
EVENT_ID
ID
MESSAGE
SSO_COMMENTS
SSO_PROTECTED_RESOURCE
SSO_USER_AGENT
TIMESTAMP

The SQL used to create this table may be viewed here.
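
For reference, a minimal sketch of such a staging table is shown below. The data types and lengths are assumptions for illustration only; the SQL linked above is the authoritative definition.

CREATE TABLE AUDIT_EVENT (
ACTOR_DISPLAY_NAME      VARCHAR2(256),
ACTOR_ID                VARCHAR2(256),
ACTOR_NAME              VARCHAR2(256),
ACTOR_TYPE              VARCHAR2(64),
ADMIN_REF_RESOURCE_NAME VARCHAR2(256),
ADMIN_RESOURCE_NAME     VARCHAR2(256),
EC_ID                   VARCHAR2(256),
EVENT_ID                VARCHAR2(256),
ID                      VARCHAR2(256),
MESSAGE                 VARCHAR2(4000),
SSO_COMMENTS            VARCHAR2(4000),
SSO_PROTECTED_RESOURCE  VARCHAR2(1024),
SSO_USER_AGENT          VARCHAR2(1024),
TIMESTAMP               VARCHAR2(64)
);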

Using API Testing Tools

The REST requests should be developed in API testing tools such as SoapUI and Postman. The JSON expressions for parsing should be developed and tested in a JSON expression testing tool such as CuriousConcept. Links to these tools are provided in the References section.

Note: API testing tools such as SoapUI, CuriousConcept, Postman, and so on are third-party tools for using SOAP and REST services. Oracle does not provide support for these tools or recommend a particular tool for its APIs. You can select the tool based on your requirements. As a starting point and for some examples refer to the A-Team post IDCS OAuth 2.0 and REST API.

Preparing and Calling the IDCS REST Service

This post uses the AuditEvents and Token methods of the IDCS REST API.

Preparing the Token Request

IDCS uses the OAuth 2.0 framework for authorization. This requires an access token to be requested and provided via the Token method of the API.

Before preparing the REST request, a Web Application needs to be created in IDCS. This administrative function is not covered in this post. You will need the Client ID and the Client Secret generated with the web application.

You must encode the Client ID and Client Secret when you include them in a request for an access token. A Base64 encoding tool, such as the one listed in the References section, may be used to perform this step. Place the Client ID and Client Secret on the same line, insert a colon between them (clientid:clientsecret), and then encode the string. An example encoded result is

Y2xpZW50aWQ6Y2xpZW50c2VjcmV0
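
The encoding may also be performed directly in PL/SQL. A minimal sketch is below; the clientid:clientsecret value is a placeholder, and note that UTL_ENCODE inserts line breaks into long results, which should be removed before use:

SELECT utl_raw.cast_to_varchar2(
         utl_encode.base64_encode(
           utl_raw.cast_to_raw('clientid:clientsecret'))) AS encoded_credentials
FROM dual;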

You will need the wallet path and password discussed in the Preparing the DBaaS section above. An example path from a linux server is:

/u01/app/oracle

You will need the URL for the Token method of the API, such as:

https://idcs-hostname/oauth2/v1/token

The APEX_WEB_SERVICE package is used to set the headers and parameters described below.

Two HTTP request headers are needed. The first is a Content-Type header and the second is an Authorization header. The authorization header value is the concatenation of the string ‘Basic ‘ with the Base64 encoded result of the Client ID and the Client Secret as shown below:

v_authorization_token := 'Y2xpZW50aWQ6Y2xpZW50c2VjcmV0';
apex_web_service.g_request_headers(1).name := 'Content-Type';
apex_web_service.g_request_headers(1).value := 'application/x-www-form-urlencoded; charset=UTF-8';
apex_web_service.g_request_headers(2).name := 'Authorization';
apex_web_service.g_request_headers(2).value := 'Basic '||v_authorization_token;

The parameter method is set to POST and two HTTP request parameters are needed. The first is a grant_type and the second is a scope as shown below:

p_http_method => 'POST',
p_parm_name => apex_util.string_to_table('grant_type:scope'),
p_parm_value => apex_util.string_to_table('client_credentials~urn:opc:idm:__myscopes__','~')

Note: The urn:opc:idm:__myscopes__ in the scope parameter value is used as a tag by Oracle Identity Cloud Service clients requesting access tokens from the OAuth authorization server. Access tokens are returned that contain all applicable Oracle Identity Cloud Service scopes based on the privileges represented by the Oracle Identity Cloud Service administrator roles granted to the requesting client.

Calling the Token Request

The APEX_WEB_SERVICE package is used to call the request and store the result in a CLOB variable as shown below:

l_ws_response_clob := apex_web_service.make_rest_request (
p_url => l_ws_url,
p_http_method => 'POST',
p_parm_name => apex_util.string_to_table('grant_type:scope'),
p_parm_value => apex_util.string_to_table('client_credentials~urn:opc:idm:__myscopes__','~')
,p_wallet_path => 'file:/u01/app/oracle'
,p_wallet_pwd => 'password'
);

The result of the call is shown below with a partial token. The token is actually over 2,000 characters long.

{"access_token":"eyJ4NXQjUzI1NiI6Ijg1a3E1M… ", "token_type":"Bearer","expires_in":3600}

Note: The response includes the expires_in:3600 parameter. This means that your token is no longer valid after one hour from the time that you generate it.

Parsing the Token Response

The APEX_JSON package is used to parse the token response and store the result in a VARCHAR variable as shown below. Additional information about this package is included as a link in the References section.

apex_json.parse(l_ws_response_clob);
f_idcs_token := apex_json.get_varchar2(p_path => 'access_token');

The result of the parse is just the token itself which is used to prepare the Audit Events request.

Preparing the Audit Events Request

The Audit Events request is prepared two or more times: once to get a first response containing one event that has a field holding the total number of events, and then one or more times to retrieve all of the events.

IDCS has a limit of how many events are returned for each request. This post uses 500 as a chunk size value which may be modified. Check with the web services administrator for the maximum number of events per request. Also ensure that the number of events inserted into the BICS table equals the total number found in the initial response.

The number of subsequent requests needed is calculated as the total number of events divided by the chunk size, rounded up to the nearest integer. For example 614 events divided by 500 would result in two subsequent requests needed.
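
A minimal sketch of this calculation is below; the variable names are assumptions based on the snippets later in this post:

-- 614 total events with a chunk size of 500 yields CEIL(614 / 500) = 2 requests
v_num_requests := CEIL(TO_NUMBER(v_resultSet) / v_chunkSize);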

The UTL_HTTP package is used instead of the APEX_WEB_SERVICE package to avoid a limitation of 1,024 characters on the length of a header value. The access token is used in a header value and is over 2,000 characters. The error received with the APEX_WEB_SERVICE call is: ORA-06502: PL/SQL: numeric or value error: character string buffer too small.

Preparing All Requests

All requests need to have the following:

The wallet path and password specified. These are specified globally as shown below:

utl_http.set_wallet('file:/u01/app/oracle', 'password'); -- For Trusted Certificates

Persistent connection support enabled as shown below:

utl_http.set_persistent_conn_support(FALSE, 1); -- Set default persistent connections (1)

Begin the request as shown below:

req := utl_http.begin_request(l_ws_url, 'GET', 'HTTP/1.1');

Note: The result is stored in a variable named req which is of the req type defined in the UTL_HTTP package as shown below:

-- A PL/SQL record type that represents a HTTP request
TYPE req IS RECORD (
url VARCHAR2(32767 byte), -- Requested URL
method VARCHAR2(64), -- Requested method
http_version VARCHAR2(64), -- Requested HTTP version
private_hndl PLS_INTEGER -- For internal use only
);

The three HTTP headers that are set are shown below:

utl_http.set_header(REQ, 'Content-Type', 'application/scim+json');
utl_http.set_header(REQ, 'Cache-Control', 'no-cache');
utl_http.set_header(REQ, 'Authorization', 'Bearer ' || l_idcs_token); -- The received access token

All but the last need persistent connection support as shown below:

utl_http.set_persistent_conn_support(req, TRUE); -- Keep Connection Open

Note: The last request does not have the above setting so will default to FALSE and the connection to the service will be closed.

Preparing Individual Requests

Individual requests need to have the following:

The URL set as shown below:

l_ws_url := 'https://idcs-hostname/admin/v1/AuditEvents?count=1'; -- Get first event for total event count

Subsequent URLs are as shown below:

l_ws_url := 'https://idcs-hostname/admin/v1/AuditEvents?count=500&startIndex=1&sortBy=timestamp';

Note: subsequent requests need the startIndex parameter incremented by the chunk size (500).
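
A minimal sketch of building the URL for each subsequent request is below; the hostname and variable names are illustrative assumptions:

v_start_index := ((v_request_number - 1) * v_chunkSize) + 1;   -- 1, 501, 1001, ...
l_ws_url := 'https://idcs-hostname/admin/v1/AuditEvents?count=' || v_chunkSize
         || '&startIndex=' || v_start_index
         || '&sortBy=timestamp';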

Calling the Audit Events Request

The Audit Events requests are called using the UTL_HTTP package as shown below:

resp := utl_http.get_response(req);

Note: The result is stored in a variable named resp which is of the resp type defined in the UTL_HTTP package as shown below:

— A PL/SQL record type that represents a HTTP response
TYPE resp IS RECORD (
status_code PLS_INTEGER, — Response status code
reason_phrase VARCHAR2(256), — Response reason phrase
http_version VARCHAR2(64), — Response HTTP version
private_hndl PLS_INTEGER — For internal use only
);

Troubleshooting the REST Request Calls

Common issues are the need for a proxy, the need for an ACL, the need for a trusted certificate (if using HTTPS), and the need to use the correct TLS security protocol. Note: This post uses DBaaS, so all but the first issue have been addressed.

The need for a proxy may be detected when the following error occurs: ORA-12535: TNS:operation timed out. Adding the optional p_proxy_override parameter to the call may correct the issue. An example proxy override is:

www-proxy.us.oracle.com
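
A minimal sketch of adding the proxy to the token request, and of setting it for the UTL_HTTP calls, is below; the proxy value shown is only an example:

-- APEX_WEB_SERVICE call (token request) with the optional proxy parameter
l_ws_response_clob := apex_web_service.make_rest_request (
p_url            => l_ws_url,
p_http_method    => 'POST',
p_parm_name      => apex_util.string_to_table('grant_type:scope'),
p_parm_value     => apex_util.string_to_table('client_credentials~urn:opc:idm:__myscopes__','~'),
p_wallet_path    => 'file:/u01/app/oracle',
p_wallet_pwd     => 'password',
p_proxy_override => 'www-proxy.us.oracle.com');

-- UTL_HTTP calls (Audit Events requests) use a globally set proxy instead
utl_http.set_proxy('www-proxy.us.oracle.com');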

Parsing the Audit Event Responses

The APEX_JSON package is used to parse the responses.

Before parsing begins the staging table is truncated as shown below:

execute immediate 'truncate table audit_event';

An example of a response containing just one event is below:

{"schemas":["urn:scim:api:messages:2.0:ListResponse"]
,"totalResults":614
,"Resources":[
{"eventId":"sso.authentication.failure"
,"ssoProtectedResource":"https://idcs-hostname:443/ui/v1/myconsole"
,"actorName":"user.name@oracle.com"
,"ssoIdentityProvider":"localIDP"
,"ssoCSR":"false"
,"ssoUserPostalCode":"null"
,"ssoUserCity":"null"
,"reasonValue":"SSO-1018"
,"ssoUserCountry":"null"
,"rId":"0:1:3:2:4"
,"message":"Authentication failure User not found."
,"timestamp":"2016-10-04T09:38:46.336Z"
,"ssoComments":"Authentication failure User not found."
,"ssoApplicationHostId":"idcs-hostname"
,"ssoUserState":"null"
,"ecId":"q^Unq0s8000000000"
,"ssoRp":"IDCS"
,"ssoLocalIp":"10.196.29.102"
,"serviceName":"SSO"
,"ssoAuthnLevel":0
,"actorType":"User"
,"ssoSessionId":"null"
,"ssoUserAgent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
,"actorId":"IDCS"
,"id":"0a37c7374c494ed080d15c554ae75be8"
,"meta": {"created":"2016-10-04T09:38:46.353Z"
,"lastModified":"2016-10-04T09:38:46.353Z"
,"resourceType":"AuditEvent"
,"location":"https://idcs-hostname/admin/v1/AuditEvents/0a37c7374c494ed080d15c554ae75be8"}
,"schemas":["urn:ietf:params:scim:schemas:oracle:idcs:AuditEvent"]
,"idcsCreatedBy": {"value":"UnAuthenticated"
,"$ref":"https://idcs-hostname/admin/v1/AuditEvents/UnAuthenticated"}
,"idcsLastModifiedBy": {"value":"UnAuthenticated"
,"$ref":"https://idcs-hostname/admin/v1/AuditEvents/UnAuthenticated"}
}],"startIndex":1,"itemsPerPage":1}

 

Parsing the First Response

The first JSON response of one event is read into a varchar variable as shown below:

utl_http.read_text(resp, l_ws_response_varchar, 32766);

The variable is then parsed as shown below:

apex_json.parse(l_ws_response_varchar);

Note: the above result is implicitly stored in a global package array named g_values. This array contains the JSON members and values.

The value of the JSON member named totalResults is retrieved and stored in a variable as shown below:

v_resultSet := apex_json.get_varchar2(p_path => 'totalResults');

This is the total number of events to be retrieved and is all that is wanted from the first response.

Parsing the Subsequent Responses

Subsequent Responses may contain a number of events up to the setting of the chunk size (500 in this post). These responses will need to be stored in a temporary CLOB variable.

The DBMS_LOB package is used to manage the temporary CLOB variable. Additional information about the package may be found in the References section.

This variable is created at the beginning of the parsing and freed at the end of the procedure as shown below:

dbms_lob.createtemporary(l_ws_response_clob, true);
dbms_lob.freetemporary(l_ws_response_clob);

This variable is also trimmed to zero characters at the beginning of each chunk of events using the following:

DBMS_LOB.TRIM (l_ws_response_clob, 0);

The response is read by a LOOP command. Each iteration of the loop reads 32,766 characters of text and appends these to the temporary CLOB variable as shown below:

while not(EOB)
LOOP
BEGIN
utl_http.read_text(resp, l_ws_response_varchar, 32766);
if l_ws_response_varchar is not null and length(l_ws_response_varchar)>0 then
dbms_lob.writeappend(l_ws_response_clob, length(l_ws_response_varchar), l_ws_response_varchar);
end if;
EXCEPTION
WHEN utl_http.end_of_body THEN
EOB := TRUE;
utl_http.end_response(resp);
END;
END LOOP;

The CLOB result is then parsed into the implicit package array of JSON elements and values as shown below. This array contains a number of events equal to or less than the chunk size setting (500).

apex_json.parse(l_ws_response_clob);

Each event in the array is retrieved, has its columns parsed, and is inserted into the BICS staging table as shown below:

for i in 1..v_chunkSize LOOP
v_loadCount := v_loadCount + 1;
IF v_loadCount > v_resultSet THEN NULL;
ELSE
INSERT
INTO AUDIT_EVENT
(
EVENT_ID,
ID,
ACTOR_ID,
ADMIN_REF_RESOURCE_NAME,
ACTOR_NAME,
ACTOR_DISPLAY_NAME,
MESSAGE,
SSO_COMMENTS,
SSO_PROTECTED_RESOURCE,
SSO_USER_AGENT,
TIMESTAMP,
ACTOR_TYPE,
ADMIN_RESOURCE_NAME,
EC_ID
)
VALUES
(
apex_json.get_varchar2(p_path => 'Resources[' || i || '].eventId')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].id')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].actorId')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].adminRefResourceName')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].actorName')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].actorDisplayName')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].message')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].ssoComments')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].ssoProtectedResource')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].ssoUserAgent')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].timestamp')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].actorType')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].adminResourceName')
,apex_json.get_varchar2(p_path => 'Resources[' || i || '].ecId')
);
v_row_count := v_row_count + 1;
END IF;
END LOOP;

After the last chunk of events is processed the procedure terminates.
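
As noted earlier, the number of rows loaded should match the totalResults value from the first response. A simple check such as the one below may be used; in the example above the count should be 614:

SELECT COUNT(*) AS loaded_events FROM audit_event;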

Scheduling the Procedure

The procedure may be scheduled to run periodically through the use of an Oracle Scheduler job. A link to the Scheduler documentation may be found in the References section.

A job is created using the DBMS_SCHEDULER.CREATE_JOB procedure by specifying a job name, type, action and a schedule. Setting the enabled argument to TRUE enables the job to automatically run according to its schedule as soon as you create it.

An example of a SQL statement to create a job is below:

BEGIN
dbms_scheduler.create_job (
job_name => 'IDCS_REST_AUDIT_EXTRACT',
job_type => 'STORED_PROCEDURE',
enabled => TRUE,
job_action => 'BICS_IDCS_REST_INTEGRATION',
start_date => '21-DEC-16 10.00.00 PM Australia/Sydney',
repeat_interval => 'freq=hourly;interval=24' -- this will run once every 24 hours
);
END;
/

Note: If using the BICS Schema Service database, the package name is CLOUD_SCHEDULER rather than DBMS_SCHEDULER.

The job log and status may be queried using the *_SCHEDULER_JOBS views. Examples are below:

SELECT JOB_NAME, STATE, NEXT_RUN_DATE from USER_SCHEDULER_JOBS;
SELECT LOG_DATE, JOB_NAME, STATUS from USER_SCHEDULER_JOB_LOG;

Summary

This post detailed a method of extracting and loading data from Oracle Identity Cloud Service (IDCS) into the Oracle Business Intelligence Cloud Service (BICS).

The post used REST web services to extract the JSON-formatted data responses. It used a PL/SQL Stored Procedure to wrap the REST extract, JSON parsing commands, and database table operations. It loaded a BICS staging table which can be transformed into star-schema object(s) for use in modeling.

Finally, an example of a database job was provided that executes the Stored Procedure on a scheduled basis.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Complete Procedure

REST API for Oracle Identity Cloud Service

Scheduling Jobs with Oracle Scheduler

Database PL/SQL Language Reference

APEX_WEB_SERVICE Reference Guide

APEX_JSON Reference Guide

UTL_HTTP Package Reference Guide

Soap API Testing Tool

Curious Concept JSON Testing Tool

Base64 Decoding and Encoding Testing Tool

Using Oracle Wallet Manager

Oracle Business Intelligence Cloud Service Tasks

DBMS_LOB Reference Guide

 

Oracle Data Integrator Best Practices: Using Reverse-Engineering on the Cloud and on Premises


Introduction

 

This article discusses best practices on using the Reverse Engineering features of Oracle Data Integrator (ODI) on the cloud and on premises.  The first part of this article presents the various options available in ODI to reverse-engineer metadata from a data server.  Then, the article discusses performance considerations when running and executing reverse-engineering tasks.  The last section of this article discusses the ODI reverse-engineering best practices.

 

Oracle Data Integrator Best Practices:  Using Reverse-Engineering on the Cloud and on Premises

 

In ODI, reverse-engineering is the process of selecting metadata from a data server and populating the selected metadata into an ODI model.  An ODI model contains objects or datastores such as tables, views, queues, and synonyms.  ODI models also contain attributes, keys, and constraints for each datastore.   An ODI model is connected to an ODI logical schema of a given ODI technology.

Figure 1, below, shows an example of an ODI model called Staging.  This ODI model is connected to an ODI logical schema called Oracle Warehouse – Staging.  The ODI technology for this logical schema is Oracle.  The ODI reverse-engineering options are located on the menu options of the ODI Model screen, as illustrated on Figure 1, below.

 

Figure 1 – ODI Reverse-Engineering – ODI Model

There are two reverse-engineering options in ODI:  Reverse Engineer, and Selective Reverse-Engineering.  The following section of this article discusses these two options.

 

Using the ODI Reverse Engineer Option

 

Figure 2, below, shows the ODI Reverse Engineer option for the ODI model called Staging.  This option offers two ways of performing a reverse-engineering task: Standard and Customized.  Figure 2, below, shows the Standard option.  The Standard option is the default option; it provides basic reverse-engineering capabilities – users can retrieve a minimum set of attributes with this option.

 

Figure 2 – ODI Reverse-Engineering – Standard Option

Tip

When using the Standard option, the reverse-engineering task can only be executed with a local agent – the default agent of the ODI Studio.

The Standard option can filter the selection of metadata by object type.  In this example, on Figure 2, above, the selected object type is Table; thus, only Oracle tables will be reverse-engineered.  The Mask option provides additional filtering capabilities.  In this example, the reverse-engineer task only brings the Oracle tables starting with a name of STG, followed by any additional characters – the percent wildcard (%) specifies any characters.

In this example, on Figure 2, above, the Standard option retrieves the name of the Oracle tables, the table attributes, and the table constraints.  The table attributes include the column names, the column types, and the column lengths.    The table constraints include primary keys, unique keys, foreign keys, and check constraints.

Figure 3, below, shows a list of datastores for this ODI model: STG_CUSTOMER, STG_ORDERS, STG_PRODUCT, and STG_STATUS.  These datastores have been reverse-engineered with the Standard option.  The attributes of the datastore called STG_CUSTOMER are also illustrated on Figure 3, below:

 

Figure 3 – ODI Reverse-Engineering – ODI Data Stores

The Standard option uses the Java Database Connectivity (JDBC) API to retrieve metadata from a data server.  The JDBC API is the industry standard for database-independent connectivity between Java applications and a wide range of databases – the ODI Studio is a Java application.  The Standard option has an extensive number of features, but it can only retrieve a limited set of metadata due to the limitations of the JDBC API driver.  For instance, if an Oracle table is partitioned, the Standard option cannot reverse-engineer the partitions of a table because the JDBC API driver does not support the selection of table partitions.  On the other hand, the Customized option provides additional features, and it can retrieve additional metadata such as table partitions from a data server.  The Customized option requires a Reverse-Engineering Knowledge Module (RKM), which can be customized to perform additional tasks.  When using the Customized option, the reverse-engineering task can be executed with the local agent of the ODI Studio (default), or with any agent configured in the ODI Topology.  Figure 4, below, shows the Customized option:

 

Figure 4 – ODI Reverse-Engineering – Customized Option

In this example, on Figure 4, above, the Customized option uses a logical agent called OracleDIAgent-JCS.  This ODI agent is a J2EE agent, located on an instance of the Oracle Java Cloud Service (JCS).  In this example, the type of object to reverse-engineer is Table, and the Mask option has been set to STG% – all tables starting with a prefix of STG will be reverse-engineered.  The RKM for this reverse-engineering task is the RKM Oracle.  Also, the options for this RKM are illustrated on Figure 4, above.

Figure 5, below, shows a list of tasks for this RKM.  Some of the RKM tasks include retrieving partitions, foreign keys (FK), index keys, table conditions, and other metadata from the Oracle database.

 

Figure 5 – ODI Reverse-Engineering – RKM Oracle Tasks

When the Customized option is used for a reverse-engineer task, the ODI agent executes the code generated by the RKM, and the ODI model gets populated with metadata from the data server.  Figure 6, below, shows the Partitions screen of an ODI datastore called W_ORDERS_F.  In this example, the partitions for this datastore – an Oracle partitioned table – have been populated using the RKM Oracle.

 

Figure 6 – ODI Reverse-Engineering – Datastore Partitions

 

Tip

The RKM tasks and options depend on the available features of a given technology.  For additional information on RKMs, go to “Introduction to Oracle Data Integrator Knowledge Modules.”

 

Using the ODI Selective Reverse-Engineering Option

 

The Selective Reverse-Engineering option, illustrated on Figure 7, below, offers additional capabilities such as selectively reverse-engineering new datastores and existing datastores.  This option works in conjunction with the Standard option, and it is only available if the Standard option is selected in the Reverse Engineer tab.  This option allows users to select from a list of objects before executing the reverse-engineering task.

 

Figure 7 – ODI Reverse-Engineering – Selective Reverse-Engineering

Figure 7, above, shows a list of objects to be reverse-engineered: STG_CUSTOMER, STG_ORDERS, STG_PRODUCT, and STG_STATUS.  These objects are Oracle tables that the Selective Reverse-Engineering option found when the Objects to Reverse Engineer check-box was selected.  The objects listed on Figure 7, above, are the result of the filters put in place in the Reverse Engineer tab.

If these objects already exist in the ODI model, and the Reverse Engineer Execution button is clicked, the metadata for the existing objects will be updated.  If the objects are new, they will be added to the ODI model.

 

ODI Reverse-Engineering Considerations

 

The execution time of a reverse-engineering task depends on several factors.  For instance, a large number of tables and columns may take longer to reverse-engineer than a small set of tables or columns.  Also, the location of the ODI Studio and the type of agent used for the reverse-engineering task may also have an impact on the overall execution time.  For instance, let’s assume that an ODI user wants to reverse-engineer a set of Oracle tables located on an instance of DBCS.  Also, let’s assume that the ODI repository is located on another instance of DBCS.  Let’s assume that the ODI user selects the Standard option to execute a reverse-engineer task, and the task is executed from the premises of the ODI user.  Under this scenario, executing the reverse-engineer task from the premises of the ODI user is not a recommended strategy.  Figure 8, below, shows an example of this unfavorable practice:

 

Figure 8 – ODI Reverse-Engineering – On-Premise ODI Studio

In this example, on Figure 8, above, the selected metadata must be exported from Instance A – where the source data server is located – to the premises of the ODI user – where the ODI Studio is located.  Then, the local agent must upload the selected metadata from the ODI Studio into Instance B – where the ODI repository is located.  This strategy does not offer the best performance, since the content of the selected metadata must travel from the cloud to the premises of the user, and then back to the cloud.

The best strategy is to execute the reverse-engineering task from an instance of the ODI Studio that is running on the Oracle Cloud such as the Oracle Java Cloud Service (JCS) or the Oracle Compute Cloud Service (CCS).  Figure 9, below, shows an example:

 

Figure 9 – ODI Reverse-Engineering – ODI Studio on JCS

This strategy, shown on Figure 9, above, offers the best performance when performing reverse-engineering tasks between database cloud services.  The same strategy can be applied to other SQL databases that are on other cloud services.

 

ODI Reverse-Engineering Best Practices

 

When using the reverse-engineering features of ODI, follow these rules of thumb:

 

  • Use the reverse-engineering Standard option if a minimum set of attributes – such as table name, table columns, and table constraints – are needed for an ODI model.
  • Use the reverse-engineering Customized option if the Standard option does not populate the required metadata due to the limitations of the JDBC API driver.
  • When using the Customized option, select and import the RKM that supports the technology of the ODI model.  Get familiar with the options and steps of the selected RKM before using it.  If a RKM needs to be modified for additional tasks, rename the RKM and document the changes.
  • When performing the reverse-engineering task, use the Mask option to filter the number of objects to be reverse-engineered.  The Mask option is available on both the Standard option and the Customized option.
  • When the Standard option is used for a reverse-engineering task, run the ODI Studio in a location that is near both data servers: the source data server and the ODI repository data server.  This will reduce the amount of time it may take to populate the ODI model with the desired metadata.
  • Use the best available ODI agent when performing a reverse-engineering task.  The best available ODI agent is the one that is located the closest to both the source data server and the ODI repository data server.
  • When performing a reverse-engineering task, ensure the connection-user executing the task has the necessary privileges to access the metadata from the source data server.  The connection-user is the user configured in the physical data server of the ODI Topology.  If the reverse-engineering task completes successfully but does not populate any metadata into the ODI model, the connection-user may not have the necessary privileges to read and access the metadata from the data server.

Conclusion

 

The ODI reverse-engineering features offer a mechanism to retrieve metadata from a data server and to populate the metadata into an ODI model.  This metadata can then be used in ODI mappings to build data integration tasks.  The ODI reverse-engineering features offer various options – Standard and Customized – to reverse engineer objects from a data server.   The Standard option leverages the JDBC driver to retrieve metadata from a data server.   The Customized option leverages RKMs to retrieve and populate additional metadata that cannot be retrieved with the Standard option.  These RKMs offer additional features and options to reverse-engineer additional metadata from a data server.

For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for Oracle Data Integrator (ODI).

 

ODI Related Articles

Integrating Oracle Data Integrator (ODI) On-Premise with Cloud Services

Connect ODI to Oracle Database Cloud Service (DBCS)

ODI 12c and DBaaS in the Oracle Public Cloud

Oracle Platform as a Service (PaaS)

Infrastructure as a Service (IaaS)

Oracle Storage Cloud Service (SCS)

Applications as a Service (SaaS)

Oracle Database Cloud Service (DBCS)

Using Oracle Database Schema Cloud Service

Oracle Exadata Cloud Service (ExaCS)

Loading Data into the Oracle Database in an Exadata Cloud Service Instance

 

Loading Data from Oracle Field Service Cloud into Oracle BI Cloud Service using REST


Introduction

This post details a method of extracting and loading data from Oracle Field Service Cloud (OFSC) into the Oracle Business Intelligence Cloud Service (BICS) using RESTful services. It is a companion to the A-Team post Loading Data from Oracle Field Service Cloud into Oracle BI Cloud Service using SOAP. Both this post and the SOAP post offer methods to complement the standard OFSC Daily Extract described in Oracle Field Service Cloud Daily Extract Description.

One case for using this method is analyzing trends regarding OFSC events.

This post uses RESTful web services to extract JSON-formatted data responses. It also uses the PL/SQL language to call the web services, parse the JSON responses, and perform database table operations in a Stored Procedure. It produces a BICS staging table which can then be transformed into star-schema object(s) for use in modeling. The transformation processes and modeling are not discussed in this post.

Finally, an example of a database job is provided that executes the Stored Procedure on a scheduled basis.

The PL/SQL components are for demonstration purposes only and are not intended for enterprise production use. Additional detailed information, including the complete text of the PL/SQL procedure described, is included in the References section at the end of this post.

Rationale for Using PL/SQL

PL/SQL is the only procedural tool that runs on the BICS / Database Schema Service platform. Other wrapping methods, such as Java or ETL tools, require a platform outside of BICS to run on.

PL/SQL may also be used in a DBaaS (Database as a Service) that is connected to BICS.

PL/SQL can utilize native SQL commands to operate on the BICS tables. Other methods require the use of the BICS REST API.

Note: PL/SQL is very good at showcasing functionality. However, it tends to become prohibitively resource intensive when deployed in an enterprise production environment. For the best enterprise deployment, an ETL tool such as Oracle Data Integrator (ODI) should be used to meet these requirements and more:

* Security

* Logging and Error Handling

* Parallel Processing – Performance

* Scheduling

* Code Re-usability and Maintenance

About the OFSC REST API

The document REST API for Oracle Field Service Cloud Service should be used extensively, especially the Authentication, Paginating, and Working with Events sections. Terms described there such as subscription, page, and authorization are used in the remainder of this post.

In order to receive events, a subscription is needed listing the specific events desired. The creation of a subscription returns both a subscription ID and a page number to be used in the REST calls to receive events.

At this time, a page contains 0 to 100 items (events) along with the next page number to use in a subsequent call.

The following is a list of supported events types available from the REST API:

Activity Events
Activity Link Events
Inventory Events
Required Inventory Events
User Events
Resource Events
Resource Preference Events

This post uses the following subset of events from the Activity event type:

activityCreated
activityUpdated
activityStarted
activitySuspended
activityCompleted
activityNotDone
activityCanceled
activityDeleted
activityDelayed
activityReopened
activityPreworkCreated
activityMoved

The process described in this post can be modified slightly for each different event type. Note: the columns returned for each event type differ slightly and require modifications to the staging table and parsing section of the procedure.

Using Oracle Database as a Service

This post uses the new native support for JSON offered by the Oracle 12c database. Additional information about these new features may be found in the document JSON in Oracle Database.

These features provide a solution that overcomes a current limitation in the APEX_JSON package. The maximum length of JSON values in that package is limited to 32K characters. Some of the field values in OFSC events exceed this length.

Preparing the DBaaS Wallet

Create an entry in a new or existing Oracle database wallet for the trusted public certificates used to secure connections to the web service via the Internet. A link to the Oracle Wallet Manager documentation is included in the References section. Note the location and password of the wallet as they are used to issue the REST request.

The need for a trusted certificate is detected when the following error occurs: ORA-29024: Certificate validation failure.

An example certificate path found using Chrome browser is shown below. Both of these trusted certificates need to be in the Oracle wallet.

  • 2

Creating a BICS User in the Database

The complete SQL used to prepare the DBaaS may be viewed here.

Example SQL statements are below:

CREATE USER "BICS_USER" IDENTIFIED BY password
DEFAULT TABLESPACE "USERS"
TEMPORARY TABLESPACE "TEMP"
ACCOUNT UNLOCK;
-- QUOTAS
ALTER USER "BICS_USER" QUOTA UNLIMITED ON USERS;
-- ROLES
ALTER USER "BICS_USER" DEFAULT ROLE "CONNECT","RESOURCE";
-- SYSTEM PRIVILEGES
GRANT CREATE VIEW TO "BICS_USER";
GRANT CREATE ANY JOB TO "BICS_USER";

Creating Database Schema Objects

Three tables need to be created prior to compiling the PL/SQL stored procedure. These tables are:

*     A staging table to hold OFSC Event data

*     A subscription table to hold subscription information.

*     A JSON table to hold the JSON responses from the REST calls

The staging table, named OFSC_EVENT_ACTIVITY, has columns described in the OFSC REST API for the Activity event type. These columns are:

PAGE_NUMBER — for the page number the event was extracted from
ITEM_NUMBER — for the item number within the page of the event
EVENT_TYPE
EVENT_TIME
EVENT_USER
ACTIVITY_ID
RESOURCE_ID
SCHEDULE_DATE
APPT_NUMBER
CUSTOMER_NUMBER
ACTIVITY_CHANGES — To store all of the individual changes made to the activity

The subscription table, named OFSC_SUBSCRIPTION_PAGE, has the following columns:

SUBSCRIPTION_ID     — for the supported event types
NEXT_PAGE                — for the next page to be extracted in an incremental load
LAST_UPDATE            — for the date of the last extract
SUPPORTED_EVENT — for the logical name for the subscription event types
FIRST_PAGE               — for the first page to be extracted in a full load

The JSON table, named OFSC_JSON_TMP, has the following columns:

PAGE_NUMBER — for the page number extracted
JSON_CLOB       — for the JSON response received for each page
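
For reference, a minimal sketch of the subscription and JSON tables is shown below. The data types and lengths are assumptions for illustration only and should be adjusted to match your environment.

CREATE TABLE OFSC_SUBSCRIPTION_PAGE (
SUBSCRIPTION_ID  VARCHAR2(256),
NEXT_PAGE        VARCHAR2(64),
LAST_UPDATE      DATE,
SUPPORTED_EVENT  VARCHAR2(64),
FIRST_PAGE       VARCHAR2(64)
);

CREATE TABLE OFSC_JSON_TMP (
PAGE_NUMBER  VARCHAR2(64),
JSON_CLOB    CLOB
);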

Using API Testing Tools

The REST requests should be developed in API testing tools such as cURL and Postman. The JSON expressions for parsing should be developed and tested in a JSON expression testing tool such as CuriousConcept. Links to these tools are provided in the References section.

Note: API testing tools such as SoapUI, CuriousConcept, Postman, and so on are third-party tools for using SOAP and REST services. Oracle does not provide support for these tools or recommend a particular tool for its APIs. You can select the tool based on your requirements.

Subscribing to Receive Events

Create subscriptions prior to receiving events. A subscription specifies the types of events that you want to receive. Multiple subscriptions are recommended. For use with the method in this post, a subscription should only contain events that have the same response fields.

The OFSC REST API document describes how to subscribe using a cURL command. Postman can also easily be used. Either tool will provide a response as shown below:

{
"subscriptionId": "a0fd97e62abca26a79173c974d1e9c19f46a254a",
"nextPage": "160425-457,0",
"links": [ … omitted for brevity ]
}

Note: The default next page is for events after the subscription is created. Ask the system administrator for a starting page number if a past date is required.

Use SQL*Plus or SQL Developer and insert a row for each subscription into the OFSC_SUBSCRIPTION_PAGE table.

Below is an example insert statement for the subscription above:

INSERT INTO OFSC_SUBSCRIPTION_PAGE
(
SUBSCRIPTION_ID,
NEXT_PAGE,
LAST_UPDATE,
SUPPORTED_EVENT,
FIRST_PAGE
)
VALUES
(
'a0fd97e62abca26a79173c974d1e9c19f46a254a',
'160425-457,0',
sysdate,
'Required Inventory',
'160425-457,0'
);

Preparing and Calling the OFSC RESTful Service

This post uses the events method of the OFSC REST API.

This method requires the Basic framework for authorization and mandates a base64 encoded value for the following information: user-login “@” instance-id “:” user-password

An example encoded result is:

dXNlci1sb2dpbkBpbnN0YW5jZS1pZDp1c2VyLXBhc3N3b3Jk

The authorization header value is the concatenation of the string ‘Basic’ with the base64 encoded result discussed above. The APEX_WEB_SERVICE package is used to set the header as shown below:

v_authorization_token := 'dXNlci1sb2dpbkBpbnN0YW5jZS1pZDp1c2VyLXBhc3N3b3Jk';
apex_web_service.g_request_headers(1).name  := 'Authorization';
apex_web_service.g_request_headers(1).value := 'Basic '||v_authorization_token;

The wallet path and password discussed in the Preparing the DBaaS Wallet section are also required. An example path from a Linux server is:

/u01/app/oracle

Calling the Events Request

The events request is called for each page available for each subscription stored in the OFSC_SUBSCRIPTION_PAGE table using a cursor loop as shown below:

For C1_Ofsc_Subscription_Page_Rec In C1_Ofsc_Subscription_Page
Loop
V_Subscription_Id := C1_Ofsc_Subscription_Page_Rec.Subscription_Id;
Case When P_Run_Type = 'Full' Then
V_Next_Page := C1_Ofsc_Subscription_Page_Rec.First_Page;
Else
V_Next_Page := C1_Ofsc_Subscription_Page_Rec.Next_Page;
End Case; … End Loop;

The URL is modified for each call. The subscription_id and the starting page are from the table.

For the first call only, if the parameter / variable p_run_type is equal to ‘Full’, the staging table is truncated and the page value is populated from the FIRST_PAGE column in the OFSC_SUBSCRIPTION_PAGE table. Otherwise, the staging table is not truncated and the page value is populated from the NEXT_PAGE column.

Subsequent page values come from parsing the nextPage value in the responses.

An example command to create the URL from the example subscription above is:

f_ws_url := v_base_url||'/events?subscriptionId=' ||v_subscription_id|| chr(38)||'page=' ||v_next_page;

The example URL result is:

https://ofsc-hostname/rest/ofscCore/v1/events?subscriptionId=a0fd97e62abca26a79173c974d1e9c19f46a254a&page=160425-457,0

An example call using the URL is below:

f_ws_response_clob := apex_web_service.make_rest_request (
p_url => f_ws_url
,p_http_method => 'GET'
,p_wallet_path => 'file:/u01/app/oracle'
,p_wallet_pwd => 'wallet-password' );

Storing the Event Responses

Each response (page) is processed using a while loop as shown below:

While V_More_Pages
Loop
Extract_Page;
End Loop;

Each page is parsed to obtain the event type of the first item. A null (empty) event type signals an empty page and the end of the data available. An example parse to obtain the event type of the first item is below. Note: for usage of the JSON_Value function below see JSON in Oracle Database.

select json_value (f_ws_response_clob, '$.items[0].eventType' ) into f_event_type from dual;

If there is data in the page, the requested page number and the response clob are inserted into the OFSC_JSON_TMP table and the response is parsed to obtain the next page number for the next call as shown below:

f_json_tmp_rec.page_number := v_next_page; -- this is the requested page number
f_json_tmp_rec.json_clob := f_ws_response_clob;
insert into ofsc_json_tmp values f_json_tmp_rec;
select json_value (f_ws_response_clob, '$.nextPage' ) into v_next_page from dual;

Parsing and Loading the Events Responses

Each response row stored in the OFSC_JSON_TMP table is retrieved and processed via a cursor loop statement as shown below:

for c1_ofsc_json_tmp_rec in c1_ofsc_json_tmp
loop
process_ofsc_json_page (c1_ofsc_json_tmp_rec.page_number);
end loop;

An example response is below with only the first item shown:

{
"found": true,
"nextPage": "170110-13,0",
"items": [
{
"eventType": "activityUpdated",
"time": "2017-01-04 12:49:51",
"user": "soap",
"activityDetails": {
"activityId": 1297,
"resourceId": "test-resource-id",
"resourceInternalId": 2505,
"date": "2017-01-25",
"apptNumber": "82994469003",
"customerNumber": "12797495"
},
"activityChanges": {
"A_LastMessageStatus": "SuccessFlag – Fail – General Exception: Failed to update FS WorkOrder details. Reason: no rows updated for: order_id = 82994469003 service_order_id = NULL"
}
}
],
"links": [

]
}

Each item (event) is retrieved and processed via a while loop statement as shown below:

while f_more_items loop
process_item (i);
i := i + 1;
end loop;

For each item, a dynamic SQL statement is prepared and submitted to return the columns needed to insert a row into the OFSC_EVENT_ACTIVITY staging table as shown below (the details of creating the dynamic SQL statement have been omitted for brevity):

An example of a dynamically prepared SQL statement is below. Note: for usage of the JSON_Table function below see JSON in Oracle Database.

DYN_SQL
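
As a hedged illustration only, a statement of this kind could resemble the JSON_TABLE query sketched below. The column list is abbreviated, the bind names are assumptions, and the statement actually generated by the procedure may differ.

SELECT jt.event_type, jt.event_time, jt.event_user, jt.activity_id
FROM ofsc_json_tmp t,
     JSON_TABLE(t.json_clob, '$.items[*]'
       COLUMNS (
         item_no     FOR ORDINALITY,
         event_type  VARCHAR2(64)  PATH '$.eventType',
         event_time  VARCHAR2(32)  PATH '$.time',
         event_user  VARCHAR2(64)  PATH '$.user',
         activity_id VARCHAR2(64)  PATH '$.activityDetails.activityId'
       )) jt
WHERE t.page_number = :page_number
AND   jt.item_no = :item_number;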

The execution of the SQL statement and the insert are shown below:

execute immediate f_sql_stmt into ofsc_event_activity_rec;
insert into ofsc_event_activity values ofsc_event_activity_rec;

Verifying the Loaded Data

Use SQL*Plus, SQL Developer, or a similar tool to display the rows loaded into the staging table.

A sample set of rows is shown below:

tabResults
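
A simple summary query such as the one below may also be used to review the loaded events by type:

SELECT event_type, COUNT(*) AS events
FROM ofsc_event_activity
GROUP BY event_type
ORDER BY event_type;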

Troubleshooting the REST Calls

Common issues are the need for a proxy, the need for an ACL, the need for a trusted certificate (if using HTTPS), and the need to use the correct TLS security protocol. Note: This post uses DBaaS, so all but the first issue have been addressed.

The need for a proxy may be detected when the following error occurs: ORA-12535: TNS:operation timed out. Adding the optional p_proxy_override parameter to the call may correct the issue. An example proxy override is:

www-proxy.us.oracle.com

Scheduling the Procedure

The procedure may be scheduled to run periodically through the use of an Oracle Scheduler job as described in Scheduling Jobs with Oracle Scheduler.

A job is created using the DBMS_SCHEDULER.CREATE_JOB procedure by specifying a job name, type, action and a schedule. Setting the enabled argument to TRUE enables the job to automatically run according to its schedule as soon as you create it.

An example of a SQL statement to create a job is below:

BEGIN
dbms_scheduler.create_job (
job_name => 'OFSC_REST_EVENT_EXTRACT',
job_type => 'STORED_PROCEDURE',
enabled => TRUE,
job_action => 'BICS_OFSC_REST_INTEGRATION',
start_date => '12-JAN-17 11.00.00 PM Australia/Sydney',
repeat_interval => 'freq=hourly;interval=24' -- this will run once every 24 hours
);
END;
/

Note: If using the BICS Schema Service database, the package name is CLOUD_SCHEDULER rather than DBMS_SCHEDULER.

The job log and status may be queried using the *_SCHEDULER_JOBS views. Examples are below:

SELECT JOB_NAME, STATE, NEXT_RUN_DATE from USER_SCHEDULER_JOBS;
SELECT LOG_DATE, JOB_NAME, STATUS from USER_SCHEDULER_JOB_LOG;

Summary

This post detailed a method of extracting and loading data from Oracle Field Service Cloud (OFSC) into the Oracle Business Intelligence Cloud Service (BICS) using RESTful services.

The method extracted JSON-formatted data responses and used the PL/SQL language to call the web services, parse the JSON responses, and perform database table operations in a Stored Procedure. It also produced a BICS staging table which can then be transformed into star-schema object(s) for use in modeling.

Finally, an example of a database job was provided that executes the Stored Procedure on a scheduled basis.

For more BICS and BI best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for BICS.

References

Complete Procedure

JSON in Oracle Database

REST API for Oracle Field Service Cloud Service

Scheduling Jobs with Oracle Scheduler

Database PL/SQL Language Reference

APEX_WEB_SERVICE Reference Guide

APEX_JSON Reference Guide

Curious Concept JSON Testing Tool

Postman Testing Tool

Base64 Decoding and Encoding Testing Tool

Using Oracle Wallet Manager

Oracle Business Intelligence Cloud Service Tasks

 

Oracle Data Integrator Best Practices: Using Loading Knowledge Modules on both On-Premises and Cloud Computing


Introduction

 

This article discusses the best practices for selecting and using the Oracle Data Integrator (ODI) Loading Knowledge Modules (LKMs) on both on-premises and on cloud computing.  LKMs are code templates in ODI that can perform data upload operations from on-premises data servers to cloud services, between cloud services, or between on-premises data servers.  ODI supports a variety of technologies such as SQL databases, Big Data, Files, Java Messaging Systems (JMSs), and many other technologies.  Most of these technologies are now available on both on-premises and on data cloud services.  For each of these technologies, a variety of LKMs are available.  For instance, ODI offers LKMs for SQL databases such as Oracle, Teradata, MySQL, MS SQL Server, among others.  For Big Data, ODI offers LKMs for Spark, Hive, Sqoop, Kafka, and Pig, among others.

LKMs are one of seven categories of Knowledge Modules (KMs) found in ODI.  Other categories of KMs include: Reverse-Engineering, Check, Integration, Extract, Journalizing, and Service.  This article focuses on the selection and use of LKMs.    If you would like to learn more about other categories of KMs, go to “Oracle Data Integrator (ODI) Knowledge Modules (KMs).”

 

When are Loading Knowledge Modules Required?

 

In ODI, LKMs are required in the following use-cases:

 

  • Different Technologies – The source datastore and the target datastore are from different technologies.  For instance, an LKM is required when loading data from a file into an Oracle table or when loading data from a Teradata table into an Oracle table.  This use-case is illustrated on Figure 1, below:

 

Figure 1 – Using a LKM with Different Technologies

 

  • Different Data Servers – The source datastore and the target datastore are from the same technology, but they are not located on the same data server.  For instance, an LKM is required when loading data between two Oracle tables that are located on different Oracle data servers.  This use-case is illustrated on Figure 2, below:

 

Figure 2 – Using a LKM with Different Data Servers

 

  • On-Premises to Cloud – On cloud computing, LKMs are required when uploading data from an on-premises data server into a cloud data service.  This use-case is illustrated on Figure 3, below:

 

 

Figure 3 – Using a LKM from On-Premises to Oracle DBCS

 

 

  • Different Instances of a Database Cloud Service – LKMs are also required when both the source and the target datastores are from the same database cloud service, but each datastore is located in a different instance of the service or the datastores are hosted on separate services.  This use-case is illustrated on Figure 4, below:

 

 

Figure 4 – Using a LKM on Different Database Cloud Service Instances

 

Styles of Knowledge Modules

 

In ODI, there are two KM styles: template-style and component-style.  Template-style KMs are available in both ODI 11g and ODI 12c.  Component-style KMs are available in ODI 12c only.  A LKM is either a template-style KM or a component-style KM.  Template-style KMs must be imported from folder /<Oracle Home>/odi/sdk into an ODI repository.  Component-style KMs are automatically installed in ODI when an ODI repository is created.  By default, ODI 12c uses component-style LKMs when a LKM is required, unless ODI users choose to import and use template-style LKMs.

In ODI 12c, when a mapping is created and a LKM is required, ODI automatically assigns a component-style LKM to the mapping.   If a template-style LKM has been already imported into an ODI project, then the template-style LKM is used instead.  Figure 5 below shows how to identify the LKM that ODI automatically assigns to an ODI mapping.  In this example, the LKM Oracle to Oracle (DB Link).GLOBAL – a component-style LKM – has been assigned to this mapping.

 

Figure 5 – ODI 12c Component-Style LKM

 

A template-style KM can be imported either as a global object or as a local object.  When a template-style KM is imported as a global object, it can be used by any ODI project of an ODI repository.  When a template-style KM is imported as a local object, the KM can only be used in the ODI project where it has been imported.  A component-style KM is a global object.  Other ODI objects such as variables, sequences, user functions, and reusable mappings can also be imported as either global objects or local objects.  Thus, LKMs are either global or local objects in an ODI repository.  Figure 6, below, summarizes the KM styles found in ODI, and the KM object types:

 

Figure 6 – Styles & Types of Knowledge Modules in ODI

 

Figure 7, below, shows an example of how LKMs can be configured either as global KMs or local KMs in an ODI repository:

 

 

Figure 7 – Global vs. Local Loading Knowledge Modules in ODI

 

Figure 7, above, shows a global LKM called LKM Oracle to Oracle (datapump) v1.0.  This LKM has been imported as a global object into this ODI repository; thus, it can be used by any of the following three ODI projects:  ODI Project 1, ODI Project 2, and ODI Project 3.  For ODI Project 1, there are three additional LKMs:  LKM Oracle to Oracle (datapump) v2.0, LKM File to Teradata (TTU), and LKM SQL to SQL.  These three LKMs have been imported as local objects into this ODI project; thus, they can only be used in this ODI project.  Note there are two versions of the LKM Oracle to Oracle (datapump), v1.0 and v2.0.  Both versions will be visible in ODI Project 1, and they can both be used in this project.  For ODI Project 2 and ODI Project 3, no local LKMs have been configured or imported into either project.  For additional information on how to import objects and KMs in ODI, go to “Importing Objects in Oracle Data Integrator.”

 

Loading Knowledge Modules Best Practices

 

For a given technology, ODI supports various ways of performing data upload operations.  For instance, for Big Data, ODI has various LKMs to perform data upload operations between HDFS, Hive, Spark, HBase, and Pig.  Each of these tools offers performance benefits, but some of them can upload data faster than others.  Thus, the selection of a LKM has a significant impact on the overall performance of your data upload operations.  For any of the technologies supported by ODI, follow these rules of thumb when selecting and using LKMs:

 

  • Select LKMs that support the fastest method for uploading data between data servers.  For instance, Oracle offers Data Pump and DBLINK – among other tools – to upload data between Oracle databases.  These two Oracle tools are supported by ODI with two LKMs:  LKM Oracle to Oracle (datapump), and LKM Oracle to Oracle (DBLINK).  Both Data Pump and DBLINK offer great performance benefits, but Data Pump can upload data faster than DBLINK because it uses multiple threads to read, export, and import data in parallel.  Thus, when using ODI to load data between Oracle databases, use the LKM Oracle to Oracle (datapump) – the fastest way of loading data between Oracle databases.  Use this approach with other technologies as well, and select LKMs that support the fastest way of uploading data between data servers.  To learn more about using Oracle Data Pump with ODI, go to “Using Oracle Data Pump with Oracle Data Integrator (ODI).”
  • Select LKMs that perform best in your environment.  Some LKMs offer tuning options such as number of parallel threads, direct path load options, and concurrent upload operations.  Test these options and find the optimum configuration based on the available resources in your environment and the amount of concurrent data upload operations that your environment can support at a given time.
  • Explore additional LKMs as well.  Your technology may offer additional data upload tools and ODI may have additional LKMs to support these tools.  If necessary, discuss with your technology experts – such as database administrators (DBAs) and data architects – which tools are recommended.  Then, select the LKMs that support the recommended tools.
  • If the out-of-the-box LKMs do not support the desired technology tool to perform a data upload operation, build your own LKM.  The ODI framework allows ODI users to build their own KMs – this is one of the biggest benefits of using ODI.  For a complete guide on how to develop knowledge modules in ODI, go to “Developing Knowledge Modules with Oracle Data Integrator.”
  • Select LKMs that support the native tools of your technology.  Typically, these LKMs have a broader number of options, and can easily be customized for additional tasks.  Also, when using LKMs that support native tools, the ODI agent can be located on any physical server, since the upload operation is done by the actual technology and not by the ODI agent.  For instance, when loading data from text files into the Oracle database, use the LKM File to Oracle (EXTERNAL TABLE).  This LKM uses the Oracle External Table technology – a native tool of the Oracle database – to upload data in parallel from text files into the Oracle database.  When using this LKM, the upload operation is done by the Oracle database and not by the ODI agent; thus, the ODI agent can reside on any computer.   This topic is discussed in greater details at “Understanding Where to Install the ODI Standalone Agent.”
  • LKMs can upload data from on-premises data servers to cloud data services.  To see examples of how to upload data from on-premises Oracle databases to Oracle Database Cloud Service (DBCS), go to “Using ODI Loading Knowledge Modules on the Oracle Database Cloud Service (DBCS).”  LKMs can also upload data into other non-Oracle cloud services such as the Amazon big data cloud service, Amazon Elastic Map Reduce (EMR).  To see examples of how to use LKMs with Amazon EMR, go to “Using Oracle Data Integrator (ODI) with Amazon Elastic MapReduce (EMR).”
  • Use the ODI 12c Exchange option to download additional LKMs – these LKMs are free of cost.  This option can be invoked from the ODI Studio, by selecting the Check for Updates option from the Help menu.  The ODI 12c Exchange option allows the ODI user-community to share KMs and other ODI objects through update centers.  The ODI 12c Exchange option offers both Oracle supported KMs and non-supported KMs.  For additional information on the ODI 12c Exchange option, go to “Introducing Oracle Data Integrator (ODI) Exchange.”
  • When using LKMs in ODI, take advantage of all the parallel features available in your technology, and orchestrate the data upload operations in parallel.  Most technologies have options to perform data uploads in parallel, and these parallel options may be available through the LKMs options – get familiar with the KM options and configure them accordingly.  Also, ODI 12c offers In-Session Parallelism – data upload operations can run in parallel if multiple execution units are defined in an ODI mapping.  An example of this strategy can be found in the following blog:  “Importing Data from SQL databases into Hadoop with Sqoop and Oracle Data Integrator (ODI).”  Section “Using ODI 12c In-Session Parallelism with Sqoop” of this blog discusses how to use the ODI In-Session Parallelism option.  The blog also discusses how to use ODI packages to load data in asynchronous (parallel) mode.
  • When configuring the ODI Physical Architecture, create a single ODI data server for all the schemas that are physically located on the same physical data server – this will eliminate the need of using LKMs in mappings.  For instance, if two schemas are located on the same Oracle database, create a single ODI data server to host these two schemas – do not create two ODI data servers, one for each schema.  LKMs are not required when the source schema and the target schema are both located on the same physical data server.  However, if the two schemas are located on different Oracle databases, then two ODI data servers are required, one for each schema, and a LKM will be required when performing data upload operations between these two ODI data servers.
  • Figure 8, below, illustrates two ODI data servers – Staging Area and Warehouse – which have been incorrectly configured for two schemas, MY_STG_AREA and MY_WAREHOUSE, respectively.  These two database schemas are located on the same Oracle database service – the JDBC URLs for these two ODI data servers are identical; they reference the same Oracle database service.

 


Figure 8 – ODI Physical Architecture – Two Data Servers Configuration

 

  • The ODI data server configuration on Figure 8, above, is incorrect because it forces ODI to use a LKM to perform a data upload operation between two schemas that are located on the same data server – there is no need to upload data that is already on the same data server.  Also, in this case, the use of a LKM generates additional code that is not needed.  This unnecessary code will be executed by the data server, and this will result in suboptimal performance of the mapping execution.  The unnecessary use of a LKM can be observed in the physical design of an ODI mapping – Figure 9, below, illustrates an example:

 


Figure 9 – ODI Mapping Physical Design – Loading Access Point

 

  • Figure 9, above, shows an ODI mapping, Dimensions.W_STATUS_D, with two datastores:  STATUS (the source datastore), and W_STATUS_D (the target datastore).  This ODI mapping uses the physical architecture defined on Figure 8, above.  The STATUS table is located on schema MY_STG_AREA, and the W_STATUS_D table is located on schema MY_WAREHOUSE.  In this example, ODI forces the use of a LKM because the schemas have been defined in separate ODI data servers.  This can be observed by exploring the loading access point called STATUS_AP.  This loading access point shows that a LKM – LKM Oracle to Oracle (datapump) – has been selected to upload data from the STATUS table.  To remove the loading access point for this mapping, the two schemas must be reconfigured under a single ODI data server.  Figure 10, below, shows the correct configuration:

 


Figure 10 – ODI Physical Architecture – One Data Server Configuration

 

  • Figure 11, below, shows the physical design of the same ODI mapping, Dimensions.W_STATUS_D.  ODI has removed both the loading access point and the LKM from the physical design; thus, the mapping will perform the data integration task without having to unnecessarily upload or stage the data from the source datastore.

 


Figure 11 – ODI Mapping Physical Design without a Loading Access Point

 

Conclusion

 

ODI Knowledge Modules are code templates that perform data integration tasks in ODI mappings.  Out of the box, ODI offers over 150 knowledge modules – users can select, modify, and create their own knowledge modules as well.  When selecting a LKM, select the LKM that supports the fastest method for uploading data between data servers.  Follow the best practices discussed in this article to optimize the overall performance of your data upload operations in ODI.

For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for Oracle Data Integrator (ODI).

ODI Related Articles

Oracle Data Integrator Best Practices: Using Reverse-Engineering on the Cloud and on Premises

Using ODI Loading Knowledge Modules on the Oracle Database Cloud Service (DBCS)

Using Oracle Data Integrator (ODI) with Amazon Elastic MapReduce (EMR)

Using Oracle Data Pump with Oracle Data Integrator (ODI)

Integrating Oracle Data Integrator (ODI) On-Premise with Cloud Services

Oracle Data Integrator (ODI) Knowledge Modules (KMs)

Developing Knowledge Modules with Oracle Data Integrator

Oracle External Tables

Importing Objects in Oracle Data Integrator

Understanding Where to Install the ODI Standalone Agent

ODI 12c Exchange

Introducing Oracle Data Integrator (ODI) Exchange

Importing Data from SQL databases into Hadoop with Sqoop and Oracle Data Integrator (ODI)

Using Oracle Data Integrator (ODI) to Load BI Cloud Service (BICS)


For other A-Team articles about BICS, click here

Introduction

Oracle Data Integrator (ODI) is a comprehensive data integration platform that covers most data integration scenarios.  It has long been possible to use ODI to load data into BI Cloud Service (BICS) environments that use Database as a Service (DBaaS) as the underlying database.

The recent 12.2.1.2.6 release of ODI added the ability to load data into BICS environments based on a Schema Service Database.  ODI does this by using the BICS REST API.

This article will walk through the following steps to set up ODI to load data into the BICS schema service database through this method:

  • Downloading the latest version of ODI
  • Configuring the physical and logical connection to BICS in ODI
  • Loading the BICS knowledge modules
  • Reverse engineering the BICS model
  • Creating a simple mapping
  • Importing the BICS certificate into the trust store for the standalone agent

This article will not cover the installation and setup of ODI.  The assumption is that a 12.2.1.2.6 environment has been stood up and is working correctly.  For details on how to install and configure ODI, see this document.

 

Main Article

Download The Latest Version of Oracle Data Integrator

Download and install the latest version of ODI from OTN through this link.

 

Configure and Test Connection to BICS

This article will walk through one (of the several) methods to set up the BICS connection with a Physical and Logical connection.  For more details on topology and other approaches, see this document.

1. In ODI studio, select the ‘Topology‘ tab, and expand out ‘Technologies‘ under the Physical Architecture section

Cursor_and_Windows7_x86

2. Scroll down to the ‘Oracle BI Cloud Service’ entry, right click and select ‘New Data Server’

Cursor_and_Windows7_x86

3. Give the Data Server a name, and enter the BICS Service URL, as well as the user credentials and Identity Domain.

The syntax for the URL is:

https://service-identity_domain.analytics.data_center.oraclecloud.com
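For example (hypothetical values used purely for illustration), a service named mybics in the identity domain mycompany, hosted in the us2 data center, would use:

https://mybics-mycompany.analytics.us2.oraclecloud.com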

This URL can be obtained from the BICS instance, by taking the first part of the URL up to ‘oraclecloud.com’

Oracle_BI_Cloud_Service

Note – the Data Loader path will default to /dataload/v1, leave this.
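Before continuing, it can be useful to confirm, from the machine that will run the ODI agent, that the service URL is reachable and that its certificate chain is trusted.  A minimal sketch using curl and the placeholder URL from above (substitute your own service name, identity domain, and data center):

# A 200 or 302 response confirms the host resolves and that the certificate
# presented by the service is trusted by this machine.
curl -sS -o /dev/null -w "HTTP status: %{http_code}\n" \
  https://service-identity_domain.analytics.data_center.oraclecloud.com/analytics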

4. Save the Data Server.  ODI will give you an informational warning about needing to register at least one physical schema.  Click ‘OK‘.

Cursor_and_Windows7_x86

5. Test the connection by selecting ‘Test Connection’

For the time being, use the ‘Local (No Agent)‘ option.

NOTE – Once configuration has been completed, the ODI Agent where the execution will be run should also be tested.  It is likely that additional configuration will need to be carried out – this is covered in the last section of this article ‘Importing the BICS certificate into the trust store for the standalone agent’.

Windows7_x86

If the credentials and URL have been entered correctly, a notification similar to the following should be displayed.  If an error is displayed, troubleshoot and resolve before continuing.

Cursor_and_Windows7_x86

TIP :  

ODI studio’s local agent uses the JDK’s certificate store, whereas the Standalone Agent does not.  It is therefore possible – and quite likely – that while the local agent will provide a successful Test Connection, the Standalone agent will produce an error similar to the following:

oracle.odi.runtime.agent.invocation.InvocationException: oracle.odi.core.exception.OdiRuntimeException: javax.ws.rs.ProcessingException: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

To resolve this, the BICS Certificate needs to be added to the trust store used by the Standalone agent.  These steps are covered later in this article in the section ‘Importing Certificate into Trust Store’.

 

6. Right click on the Data Server created in step 2, and select ‘New Physical Schema’

Cursor_and_Windows7_x86

ODI has the ability to load to both the Database Objects (Tables) in the Schema Service Database, and also Data Sets.

This loading option is chosen in the ‘Target Type’ dropdown.  The selection determines which REST operations ODI uses to connect.  Note – once the target type has been chosen and saved, it cannot be changed.

7. In this example the Target Type of Table is selected.

Windows7_x86

8. Save the Physical Schema.

Because we haven’t associated this with a Logical Architecture yet, the following warning will be shown.  Click OK to complete the save.

Windows7_x86

9. Expand out the Logical Architecture section of Topology, and then right click on ‘Oracle BI Cloud Service’ and create a ‘New Logical Schema’

Windows7_x86

10. In the configuration window, give the Logical Schema a descriptive name, and associate your context(s) with the physical schema that was created in steps 6-8.  Save the changes.

Windows7_x86

11. Repeat steps 6 through 10 if you need to create an additional connection to load Data Sets

 

Load BICS Knowledge Modules

ODI uses two different Knowledge Modules for BICS:

– a reverse knowledge module (RKM) called RKM Oracle BI Cloud Service, and

– an integration knowledge module (IKM) called IKM SQL to Oracle BI Cloud Service.

 

1. In ‘Designer’, expand your project and its Knowledge Modules section to see whether the KMs are already available.

Cursor_and_Windows7_x86

If they are, continue to the ‘Reverse Engineer BICS’ section of this article.

2. If the KMs are not shown, right click on the Knowledge Modules section and select ‘Import Knowledge Modules’

Windows7_x86

Browse to a path similar to this to find the import directory.

/u01/oracle/ODI12c/odi/sdk/xml-reference
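To confirm that the BICS knowledge module files are present before importing, the directory can be listed from the command line.  A minimal sketch, assuming the installation path above and that the KM file names contain the words ‘Cloud Service’ (exact file names can vary between ODI versions):

# List the knowledge module definition files related to Oracle BI Cloud Service
# in the ODI xml-reference directory (adjust the path to your own installation).
ls /u01/oracle/ODI12c/odi/sdk/xml-reference | grep -i "cloud service"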

3. In the import wizard, select the 2 BICS KMs, and then select ‘OK’ to load them.

Cursor_and_Windows7_x86

TIP :  

If you already have used ODI for other integration tasks, you may be tempted to use existing Knowledge Modules.  Please note that the IKM SQL to Oracle BI Cloud Service does not support loading the Oracle SDO_GEOMETRY data type column to the BICS target table.

Oracle BI Cloud Service cannot be used as the staging area, and does not support incremental update or flow/static check.  Therefore, the following KMs will not work with the Oracle BI Cloud Service technology:

  • RKM SQL (JYTHON)
  • LKM File to SQL
  • CKM SQL
  • IKM SQL Incremental Update
  • IKM SQL Control Append
  • LKM SQL to SQL (JYTHON)

More details can be found in this document.

 

Reverse Engineer BICS

Reverse-engineering is the process that populates the model in ODI, by retrieving metadata from the data server containing the data structures.

 

1. Create a new model in Designer, by selecting the ‘New Model‘ option as shown below

Cursor_and_Windows7_x86

2. In the Definition tab, give the model a name, select ‘Oracle BI Cloud Service’ as the technology, and select the Logical Schema created previously.

Cursor_and_Windows7_x86

3. In the Reverse Engineer tab, leave the logical agent set to ‘Local (No Agent)‘, and select the RKM Oracle BI Cloud Service knowledge module.  Then save the changes.

Cursor_and_Windows7_x86

TIP :  

At the time of writing this article, there is a bug in the reverse knowledge module that will present an error if tables in the BICS environment contain non-standard characters.

An error like the following may be generated:

ODI-1590: The execution of the script failed.
Caused By: org.apache.bsf.BSFException: exception from Groovy: oracle.odi.runtime.rest.SnpsRSInvocationException: ODI-30163: REST tool invocation failed with response code : 500. URL : https://businessintelltrialXXXX-usoracletrialXXXXX.analytics.us2.oraclecloud.com/dataload/v1/tables/APEX$TEAM_DEV_FILES

There is at least one Apex-related table within BICS environments whose name contains a non-standard character.  That table, as shown in the error above, is ‘APEX$TEAM_DEV_FILES’.

Until this issue is fixed, a workaround is required.

The simplest workaround is to go into the Apex environment attached to the BICS instance, temporarily rename the APEX$TEAM_DEV_FILES table, run the Reverse Engineer process, and then rename the table back.

Another method is to use the ‘Mask’ import option.  If there are specific tables you need to reverse engineer, enter the name followed by %

For instance, if there were 5 tables all starting ‘FACT….’, then a mask of ‘FACT%’ could be used to reverse engineer those 5 tables.

 

4. Select the ‘Reverse Engineer‘ action, and then ‘OK‘ to run the action.

Cursor_and_Windows7_x86

5. This will start a session that can be viewed in the Operator.

Cursor_and_Windows7_x86

6. Once the session has completed, expand the model to confirm that the database objects have been imported correctly.  As shown below, the tables in the BICS Schema Service database are now available as targets.

Cursor_and_Windows7_x86

7. Expand the BICS individual database objects that you will load, and confirm within the attributes that the Datatypes have been set correctly.  Adjust where necessary and save.

Cursor_and_Windows7_x86

 

Create Mapping

1. Within the ‘Mapping’ sub-menu of the project, select ‘New Mapping’

Windows7_x86

2. Drag in the source table from the source that will be loaded into BICS, and then the BICS target table, and link the two together.  For more information on how to create mappings, see this document.

TIP :  

The BICS API only allows data to be loaded, not ‘read’ or ‘selected’.  Because of this, BICS using the Schema Service Database CAN ONLY BE USED as a TARGET for ODI mappings.  It cannot be used as a SOURCE.

 

3. Make sure the Target is using the IKM SQL to Oracle BI Cloud Service:

Windows7_x86

and that an appropriate loading KM is used:

Cursor_and_Windows7_x86

4. Run the mapping, selecting the Local Agent

Windows7_x86

5. Confirm in the Operator that the mapping was successful.  Troubleshoot any errors you find and re-run.

Cursor_and_Windows7_x86

 

Importing Certificate into Trust Store

To operate, it is likely that the Standalone Agent will require the BICS certificate to be added to its trust store.

These instructions use Microsoft Internet Explorer, although other browsers offer similar functionality.

1. In a browser, open the BICS /analytics portal, then click on the padlock icon.  This will open an information box; select ‘View certificates’.

Cursor_and_Windows7_x86

2. In the ‘Details‘ tab, select the ‘Copy to File‘ option which will open an export wizard.

Windows7_x86

3. Select the ‘DER encoded binary’ format and then ‘Next’

Cursor_and_Windows7_x86

4. Choose a path and file name for the certificate, then ‘Next’, and on the final screen ‘Finish’ to export the certificate.

Cursor_and_Windows7_x86
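If a browser is not convenient, for example on a headless Linux host, the certificate can also be retrieved from the command line.  A minimal sketch using openssl, with the placeholder host name from earlier in this article and the same output path used in the keytool example later (adjust both to your environment):

# Fetch the certificate presented by the BICS service and save it in DER format.
openssl s_client -connect service-identity_domain.analytics.data_center.oraclecloud.com:443 \
  -servername service-identity_domain.analytics.data_center.oraclecloud.com </dev/null 2>/dev/null \
  | openssl x509 -outform DER -out /u01/oracle/Downloads/BICS.cer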

 

TIP :  

This article will go through the steps needed to add this certificate to the DemoTrust.jks key store.  This should *ONLY* be followed for demonstration or test environments.  For production environments, follow best practice guidelines as outlined in this document.

 

5. Copy the certificate file created in the previous steps to a file system accessible by the host running the standalone ODI agent.

6. Set JAVA_HOME to the installation directory of the JDK that was used when installing the standalone agent (the JDK home itself, not its bin subdirectory), for example

export JAVA_HOME=/u01/oracle/jdk1.8.0_111
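Optionally, put that JDK’s bin directory first on the PATH so the keytool invoked below is the one from the same JDK.  A minimal sketch:

# Resolve keytool from the JDK referenced by JAVA_HOME.
export PATH=$JAVA_HOME/bin:$PATH
which keytool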

7. Browse to the bin directory of the ODI Domain Home, in this test environment that path is as follows:

/u01/oracle/ODI12c/user_projects/domains/base_domain/bin

8. Run the ‘setODIDomainEnv‘ script.  In a linux environment this would be:

./setODIDomainEnv.sh
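Depending on the shell, it may be preferable to source the script rather than execute it, so that the environment variables it sets remain available in the current session.  A sketch (the exact variables set can vary by release):

# Source the script, then confirm the value used in the key store path below.
. ./setODIDomainEnv.sh
echo $ORACLE_HOME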

The DemoTrust.jks keystore used by the agent should be located in the following path:

$ORACLE_HOME/wlserver/server/lib
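If there is any doubt about which key store the agent uses (see the tip below), it can help to locate every copy first.  A minimal sketch, assuming the installation root used in this example:

# Locate every DemoTrust.jks under the ODI installation so that the correct
# key store is updated for the standalone agent.
find /u01/oracle/ODI12c -name DemoTrust.jks 2>/dev/null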

 

TIP :  

It is possible that there are a number of DemoTrust.jks key stores on the file system, so make sure the correct one is updated.  If this process fails to resolve the error with the Standalone Agent, search the file system and see if it is using a different trust store.

 

9. Browse to that directory and confirm the DemoTrust.jks file exists.  In that same directory, run the keytool command to import the certificate created earlier.

The syntax for the command is as follows, with $CERTIFICATE referencing the name/path of the certificate file downloaded from the BICS environment through the browser, $ALIAS being a name used to identify the certificate in the key store, and $KEYSTORE the name/path of the key store.

keytool -importcert -file $CERTIFICATE -alias $ALIAS -keystore $KEYSTORE

In this example, the command would be:

keytool -importcert -file /u01/oracle/Downloads/BICS.cer -alias BICS -keystore DemoTrust.jks

The default password is DemoTrustKeyStorePassPhrase.

10. Details of the certificate are displayed, along with a prompt to ‘Trust this certificate?’.  Type ‘yes’ and press Enter.

Cursor_and_Windows7_x86

If the import is successful, a confirmation that the certificate was added to the keystore is given.
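To double-check, the key store can be listed using the alias and demo key store from the example above:

# Verify that the BICS certificate is now present in the key store.
keytool -list -alias BICS -keystore DemoTrust.jks -storepass DemoTrustKeyStorePassPhrase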

11. Return to ODI and run the mapping, this time selecting the Standalone agent, and confirm it runs successfully.

Summary

This article walked through the steps to configure ODI to load data into the BICS schema service database through the BICS REST API.

For other A-Team articles about BICS, click here.
