Wednesday, May 6, 2015

Identifying and Managing the Data Warehouse "I Don’t Knows..."

Last week, in Part 1 of this two-part ODI series, we discussed the vulnerabilities of the data warehouse, particularly when management lacks knowledge of what jobs are in the data warehouse, what those jobs actually do, or how things have been coded.

In Part 2, we will focus on the importance of a project manager when it comes to leveraging ODI for your data warehouse projects.
As consultants, we often come across clients who think of ODI as just another ETL tool. There is a notion that we can jump right in, hook the technology on the left up to the technology on the right, press a button, and walk away. Can you do that with ODI? Absolutely. But let's look more closely at why you do not want to rush an ODI project and why project management matters in ODI.

Employing a Methodology

Assume you already have ODI packages built and you need additional work completed. The prior developer was not particularly well organized in naming the objects built in ODI. You also notice they hardcoded everything, down to the "TO" and "FROM" addresses in the emails that get sent during the jobs…
So, how do we get out of this mess? Simply by coming up with a naming standard that can be used as a prefix on objects in ODI, making them easier to troubleshoot down the road. Adding some variables to the project makes objects reusable and easier to maintain. In the scenario above, a simple variable for the "TO" and "FROM" block of an email notification can save a corporation a lot of money and development time in the long run.
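As a sketch of what that looks like in practice, an email step in an ODI package can use the OdiSendMail tool with project variables in place of hardcoded addresses. The project code and variable names below are hypothetical, purely for illustration:

```
OdiSendMail "-MAILHOST=#MYPROJ.V_SMTP_HOST" "-FROM=#MYPROJ.V_MAIL_FROM"
            "-TO=#MYPROJ.V_MAIL_TO" "-SUBJECT=Nightly load complete"
"The warehouse load finished. See the operator log for details."
```

Change the variable values in one place and every notification in the project picks them up, instead of editing each hardcoded step.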

Here at CheckPoint we have come up with a numbering methodology, which is shared with our project managers to keep ODI projects organized. By taking time to look at the whole picture, not just one job at a time, the packages can be rebuilt so they interact, rather than just exist.
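To make the idea concrete, a numbering methodology of this kind might look like the following. These prefixes are purely illustrative, not CheckPoint's actual standard:

```
100_STG_*   -- staging loads (e.g., 110_STG_CUSTOMER, 120_STG_ORDERS)
200_DIM_*   -- dimension builds, run after all 1xx staging packages
300_FCT_*   -- fact builds, run after all 2xx dimension packages
900_UTIL_*  -- notification and housekeeping packages
```

With a scheme like this, the number alone tells you where a job sits in the overall flow, and sorting the object tree groups related packages together.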

By employing this methodology we can immediately identify areas of disparity as well as opportunities to save on future maintenance costs, helping to keep those "I Don't Knows…" out of your data warehouse, as we discussed in Part 1. We require our developers to document as they go so no details are missed, which also makes future knowledge transfer as smooth as possible.

Data Warehouse Risk

The high rate of turnover among employees and contractors in the data warehouse arena often leaves managers in a troubling position. If the organization is not using ODI, there may be a lot of open-ended questions, leaving management scrambling for answers – suddenly the data warehouse becomes a high-risk unit that compromises the corporation.

Stopping the “I Don’t Knows...”

“I don’t know… what jobs are in the Data Warehouse.” Not only might the manager not know which jobs exist, but also which are actively running.
“I don’t know… what these jobs actually do.” Many jobs in the data warehouse lack documentation that is easily readable or quickly compiled on short notice.
Let’s use the scenario that an employee leaves – now the manager has to play catch-up. They have to hire a replacement and train them on the current environment, and there never seems to be enough time for knowledge transfer from the exiting employee. ODI can help: not only does it have a place to take notes in the job, but with the simple click of a button the documentation appears in PDF format. Something else to consider: data warehouses normally hold hundreds of jobs. How long would it take you to compile all of the documentation for every single one?
“I don’t know… how this was coded.” Sometimes a manager has a great employee who writes advanced code that passes parameters into other parameters, which in turn drive something else. Are all your developers at this level? ODI steers developers toward a consistent coding style and provides much of the functionality needed for data warehousing out of the box.
Any data warehouse can be great if it’s organized; here at CheckPoint we feel the way to do this is through ODI. We have come up with a system to keep the jobs not only organized but also standard across the board. With OWB merging into the new ODI 12c, even more out-of-the-box data warehousing functionality comes to the table.

Next we’ll discuss ODI Project Management in Part 2 of Identifying and Managing the Data Warehouse "I Don’t Knows..."

Automap For ODI12c Not Working?

ODI 10g and 11g came with a nifty feature: when you pulled a source and a target into an Interface, it would ask if you wanted to Automap columns. This was a default setting that came with the client installation.
A client wanted to go with ODI12c and be on the bleeding edge of Oracle technology. The install and upgrade went perfectly, but when we dragged from one bubble to another to trigger an Automap, absolutely nothing happened. Why in the world would Oracle take this awesome and sometimes critical option away?
First things first. We downloaded the prebuilt VM that Oracle so nicely provides on their website (just ask and I will point you to it). After downloading the ODI12c VM, reading the install guide and release notes, and completing the install, we were ready to see what Oracle had done. Lo and behold, Automapping works in the prepackaged VM from Oracle. The next logical step was to go out to OTN and see if anyone else was having this issue. All we were able to find was the following:
Attribute Matching
“The Attribute Matching Dialog is displayed when a connector is drawn to a projector component (see "Projector Components") in the Mapping Editor. The Attribute Matching Dialog gives you an option to automatically create expressions to map attributes from the source to the target component based on a matching mechanism. It also gives the option to create new attributes on the target based on the source, or new attributes on the source based on the target.
This feature allows you to easily define a set of attributes in a component that are derived from another component. For example, you could drag a connection from a new, empty Set component to a downstream target datastore. If you leave checked the Create Attributes On Source option in the Attribute Matching dialog, the Set component will be populated with all of the attributes of the target datastore. When you connect the Set component to upstream components, you will already have the target attributes ready for you to map the upstream attributes to.”
With all research exhausted, we decided it must be an issue with the install and upgrade. After everything was reinstalled and re-upgraded, we discovered Automap still wasn’t working. We then opened an SR with Oracle that simply stated, “Why isn’t Automap working in 12c… Please Help!?” They informed us that Automapping is no longer active by default; it is now a user preference that can be turned on and off at will. There is no mention of this anywhere.
With that said here are the steps that will make your life easier in the future... First, login to an ODI Development environment (make sure it’s not an execution environment).
Follow these steps:
1. Click on Tools>Preferences
[Screenshot: Tools > Preferences menu]
2. Go To ODI and Click on the +
[Screenshot: expanding the ODI node]
3. Under ODI click on the + beside User Interface
[Screenshot: expanding User Interface under ODI]
4. Under User Interface click on Designer; to the right you will see the “Display ‘Attribute Matching’ dialog for mappings” option – check it
[Screenshot: Designer preferences pane]
It’s that simple. Now you have Automapping.
Now let’s cover why you would sometimes want to turn this amazing feature off while developing. Say you have a lot of development that is not a 1:1 column mapping; if you are not careful you could accidentally create columns on your Components (Components are new to 12c’s flow-based processing).
[Screenshot: Attribute Matching dialog]
As you can see from the screenshot above, there are a lot of options to uncheck just because you wanted to draw the data flow path from one object to another. Save yourself time: unless you need it checked, leave it turned off. Odds are you only want it on when you are moving whole tables and their data as-is, say from SQL Server to Oracle.

Seamless iTunes and ODI Integration

Who said two powerhouses can’t work together in perfect harmony? Here we have Apple Inc. and Oracle -- two of the largest in their respective industries -- which can now interface with each other seamlessly. If you are a business that uploads your products to iTunes, wouldn’t it be nice to be able to view reports on sales daily, weekly, or monthly? That is where Apple Connect and ODI12c come in. Apple Connect gives users a way to download reports showing anything from daily to yearly sales stats -- we are even given the option to pull pre-sales reports. Apple Connect does a very good job of documenting the different reporting options in a .pdf document that can be found on Apple’s website: www.itunesconnect.apple.com
Why ODI12c?
Well, let’s say you wanted to take these reports and load them to a target area so you could report on the sales numbers. Without a tool to schedule, a mechanism to download, another tool to unzip, and yet another tool to load the data to a target environment, you would be looking at a customized mess. What if you only had to drag and drop a couple of objects into a GUI and draw a line from one to the other? Then you could use ODI12c’s built-in scheduling -- virtually nothing custom involved.
Below is the Task Hierarchy from a Knowledge Module that I created to make this all happen with one click of the mouse:
[Screenshot: Knowledge Module task hierarchy]
The steps are:
1. Download the file -- this uses built-in ODI tools to call an API for iTunes Connect. There are additional options for the download, so I added flexibility to this step to make it user-friendly to change.
2. Unzip the file -- this step uses built-in ODI tools to call gzip.
3. The next steps come packaged with ODI: they create and drop work tables and load the unzipped file to the target location -- in this case an Oracle table.
4. Archive the zip file -- since we already have it zipped, we simply move the file to an archive folder.
5. Delete the source file -- since we don’t want this process to run with old data, we delete the unzipped file we used to load the target.
[Screenshot: the process in the ODI mapping area]
Here is what the whole process looks like in the mapping area of ODI. All the steps listed above are happening in the diagram above, from unzip to delete source file.
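Outside of ODI, the post-download housekeeping steps of that flow can be sketched in plain Python. This is only an illustration of the logic -- in ODI these are built-in tools, not hand-written code, and the file names and archive layout below are hypothetical:

```python
import gzip
import os
import shutil


def unzip_report(gz_path: str, out_path: str) -> None:
    """Unzip the downloaded .gz report (ODI calls gzip for this step)."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def archive_zip(gz_path: str, archive_dir: str) -> str:
    """Move the original .gz to an archive folder and return its new path."""
    os.makedirs(archive_dir, exist_ok=True)
    dest = os.path.join(archive_dir, os.path.basename(gz_path))
    shutil.move(gz_path, dest)
    return dest


def delete_source(path: str) -> None:
    """Remove the unzipped file so a rerun never loads stale data."""
    os.remove(path)
```

Between unzip and archive, the load-to-target step would run (in ODI, the packaged load steps handle that against the Oracle table).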
What happens, though, if you need to reload a file and you don’t want the download and unzip to happen? Simply create an additional Physical Mapping Design that uses a built-in file-to-target integration; this sounds duplicative, but it’s not. Then, when you run the mapping, you can select which Physical Mapping Design to run.
[Screenshot: selecting a Physical Mapping Design at run time]
In a few weeks, my next post will cover additional case studies which use different Physical Mapping Designs.

Additional Variables Grayed Out

Little Issue with Running a Scenario from a Package
If you have used ODI before, you have created a package with multiple steps where one of those steps was inserted as a scenario. Sometimes we need to pass variables to those scenario steps, but we quickly find there is a roadblock. Normally, you click on the scenario step within a package, and in the properties section there is a tab called “Additional Variables.” Under “Additional Variables” there is a small box you can click to add variables that have been declared in the scenario.
[Screenshot: the “Additional Variables” checkbox, grayed out]

As you can see from the snapshot above, it is grayed out. According to Oracle, this is a known bug.
I am happy to report that I found a workaround and here it is:
Simply go to the “Command” tab and add the variable there. This is just like adding variables when calling the scenario from an OS command line.
The format for that is “-<Project_Code>.<Variable_Name>=<Value_Passed>”
Once that is added, go back to the “Additional Variables” tab and make sure it parses correctly.
Simple enough, right? Once this was completed, the package flawlessly passed the variable to the scenario step. Please let me know if you have any questions on this or if you are facing any issue you need help troubleshooting.

I wanted to create a blog that talked, in simple terms, about how to properly plan for an Oracle Data Integrator (ODI) project. I have seen other blogs out there on how to properly use ODI and on tips and tricks for it, but not on the different way you have to think about ODI, and about organizing ODI, before starting a project.