Having recently attended one of Oracle's coming-out parties for its new Data Mining software, I thought I'd take the opportunity, while it's still fresh in my head, to talk about how Data Mining and traditional BI work together in Oracle's vision of BI.
What is Data Mining?
First off, what is Data Mining and how is it different from traditional Business Intelligence? Without getting into too detailed a discussion, Data Mining is the practice of extracting hidden knowledge from a dataset to aid the decision-making process. By hidden knowledge, think of predicting which products a customer might buy based on a profile developed from a review of thousands of other customers. Or how about which attributes of a customer affect their buying patterns the most? Think statistics, probabilities, clusters, correlations and predictions instead of trends and summaries. Sometimes we don't even know what to look for – Data Mining helps us out there, oftentimes with startling revelations.
Typical Query & Reporting, OLAP or other analysis tools tend to be good at getting at the details of a type of decision you already know you want to make. We are able to define metrics, dimensions, drill-downs, reports and dashboards because we usually have an idea of what we are after; we just don't know the numbers, products, regions and customers involved. A user might know how to define what a good customer is, and the BI system can apply that rule to show who the best customers are. But what about a relatively new customer – will they be a good one or a bad one? Can you make them into a better customer than they already are?
With those few points about the differences stated (I don't want to get into the whole topic in this post), can the two get along? The answer is absolutely, and it is a particular strength of Oracle's offerings compared to others.
In keeping with Oracle's philosophy of pushing functionality into the database engine, ODM is included in the Oracle Database 10g Release 2 engine. Having it in the database, accessible via PL/SQL or Java APIs, lets all of the other capabilities built into the Oracle platform come along for free – parallelism, indexing, materialized views, security, high availability and so on. The other great benefit of this approach is that it eliminates moving data out of the database and into a dedicated Data Mining tool such as SAS – a time-consuming effort in many environments.
The Data Mining process is very different from traditional BI. With Data Mining, your first goal is to build a model, which the software creates from a sample dataset that you first have to clean. This model (think of it as a super-advanced function you can call on a record) can then be applied to a single record or a whole dataset.
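To make that concrete, here is a minimal sketch of building a classification model through the PL/SQL API. DBMS_DATA_MINING.CREATE_MODEL is the real ODM entry point; the training table and column names (CUSTOMERS_TRAIN, CUST_ID, BUYER_FLAG) are hypothetical, and I'm assuming the cleaned sample table is already in place.

  BEGIN
    -- Train a classification model on a cleaned sample table.
    -- Omitting settings_table_name lets ODM fall back to a default algorithm.
    DBMS_DATA_MINING.CREATE_MODEL(
      model_name          => 'BUYER_MODEL',      -- hypothetical names throughout
      mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
      data_table_name     => 'CUSTOMERS_TRAIN',
      case_id_column_name => 'CUST_ID',
      target_column_name  => 'BUYER_FLAG');
  END;
  /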
Scoring a single record is useful in front-line applications, where real-time scoring can be performed and a resulting probability or outcome predicted. The most commonly used example is a Call Center environment, where, based on the incoming call and customer, a prediction can be made as to which offers the customer would be most likely to respond to. (If this sounds a lot like Oracle Real-Time Decisions (RTD, acquired via Siebel via Sigma Dynamics), you are correct. In this class of BI tools, Oracle plans to sell two products with some solid overlap in functionality and capability.) This scenario, where records are scored in real time, does not lend itself well to the batch-oriented processing of a typical BI system such as OBI EE. Although the data mining functions are accessible directly from SQL, OBI EE presently cannot generate these SQL extensions. One can of course get around that by using an Opaque View or a database view and simply mapping a column, but I would consider that something you wouldn't want to do except in a few rare cases.
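For illustration, here is what single-record scoring looks like using the SQL mining functions, along with the view workaround just mentioned. PREDICTION and PREDICTION_PROBABILITY are the actual 10gR2 SQL functions; the model name and attribute names carry over from the hypothetical sketch above.

  -- Score one caller on the fly (attribute names are assumptions)
  SELECT PREDICTION(BUYER_MODEL USING 35 AS age, 'M' AS gender, 3 AS prior_orders) AS likely_buyer,
         PREDICTION_PROBABILITY(BUYER_MODEL USING 35 AS age, 'M' AS gender, 3 AS prior_orders) AS prob
  FROM dual;

  -- The view workaround: expose the score as an ordinary column OBI EE can map
  CREATE OR REPLACE VIEW customer_scores_v AS
  SELECT cust_id,
         PREDICTION(BUYER_MODEL USING *) AS likely_buyer
  FROM customers;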
The second manner is scoring a whole recordset – basically executing an UPDATE statement in a nightly or weekly ETL load, setting a column to a value computed by the Data Mining model. In this scenario, the result is simply a column in a table. As such, any of our normal techniques can be applied to it – it can be a dimensional attribute useful for filtering or campaign segmentation, or it can be used as a metric, such as an average expected value. Sorry for the letdown – there really isn't anything flashy about using it.
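In ETL terms, that batch scoring step might look like the following sketch – again with the hypothetical model and column names from above, dropped into whatever nightly load you already run.

  -- Nightly batch scoring: persist the model's output as a plain column
  -- (buyer_probability, age, gender, prior_orders are assumed columns)
  UPDATE customers c
     SET c.buyer_probability = PREDICTION_PROBABILITY(BUYER_MODEL, 'Y'
                                 USING c.age, c.gender, c.prior_orders);
  COMMIT;

From there the column behaves like any other – OBI EE sees a number, not a model.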