ACE Director Alum Daniel Morgan, founder of Morgan's Library, is scheduling
complimentary technical Workshops on Database Security for the first 30
Oracle Database customers located anywhere in North America, EMEA, LATAM, or
APAC that send an email to
asra_us@oracle.com. Request a Workshop for
your organization today.
Purpose
The following is extracted from dbmsdmxf.sql.
The main principle behind the design of this package is a fact that SQL has enough power to efficiently perform most of the common mining xforms.
For example, binning can be done using CASE expression or DECODE function, and linear normalization is just a simple algebraic expression of the form (x - shift)/scale.
However, the queries that perform the xforms can be rather lengthy, thus it seems quite desirable to have some convenience routines that will help in query generation.
Thus, the goal of this package is to provide query generation services for the most common mining xforms, as well as to provide a framework that can be easily extended for implementing other xforms.
AUTHID
CURRENT_USER
Constants
Name
Data Type
Value
Nested Column
nest_num_col_type
NUMBER;
100001
nest_cat_col_type
NUMBER;
100002
nest_bd_col_type
NUMBER;
100003
nest_bf_col_type
NUMBER;
100004
nest_nums_col_name
VARCHAR2(30);
'DM_NESTED_NUMERICALS'
nest_cats_col_name
VARCHAR2(30);
'DM_NESTED_CATEGORICALS'
nest_bds_col_name
VARCHAR2(30);
'DM_NESTED_BINARY_DOUBLES'
nest_bfs_col_name
VARCHAR2(30);
'DM_NESTED_BINARY_FLOATS'
Data Types
TYPE column_list IS VARRAY(1000) OF dbms_quoted_id;
TYPE expression_rec IS RECORD (
lstmt dbms_sql.VARCHAR2A,
lb BINARY_INTEGER DEFAULT 1,
ub BINARY_INTEGER DEFAULT 0);
TYPE transform_list IS TABLE OF TRANSFORM_REC;
TYPE transform_rec IS RECORD (
attribute_name dbms_quoted_id,
attribute_subname VARCHAR2(4000),
expression EXPRESSION_REC,
reverse_expression EXPRESSION_REC,
attribute_spec VARCHAR2(4000));
SQL> desc c##safedba.dmtxcbc Name Null? Type
------------------ -------- --------------
COL VARCHAR2(128)
ATT VARCHAR2(4000)
VAL VARCHAR2(4000)
BIN VARCHAR2(4000)
SQL> desc c##safedba.dmtxcbn Name Null? Type
------------------ -------- --------------
COL VARCHAR2(128)
ATT VARCHAR2(4000)
VAL NUMBER
BIN VARCHAR2(4000)
SQL> desc c##safedba.dmtxcc Name Null? Type
------------------ -------- --------------
COL VARCHAR2(128)
ATT VARCHAR2(4000)
LCUT NUMBER
LVAL NUMBER
RCUT NUMBER
RVAL NUMBER
SQL> desc c##safedba.dmtxcnl Name Null? Type
------------------ -------- --------------
COL VARCHAR2(128)
ATT VARCHAR2(4000)
SHIFT NUMBER;
SCALE NUMBER;
Describes transformation of the data table specified by the stack definition
Overload 1
dbms_data_mining_transform.describe_stack(
xform_list IN sys.dbm_data_mining_transform.transform_list,
data_table_name IN VARCHAR2,
describe_list OUT sys.dbms_sql.desc_tab2,
data_schema_name IN VARCHAR2 DEFAULT NULL);
TBD
Overload 2
dbms_data_mining_transform.describe_stack(
xform_list IN sys.dbm_data_mining_transform.transform_list,
data_table_name IN VARCHAR2,
describe_list OUT sys.dbms_sql.desc_tab3,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds numerical binning definition and inserts it into the definition table
dbms_data_mining_transform.insert_autobin_num_eqwidth(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
bin_num IN BINARY_INTEGER DEFAULT 3,
max_bin_num IN BINARY_INTEGER DEFAULT 100,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
sample_size IN BINARY_INTEGER DEFAULT 5000,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
rem_table_name IN VARCHAR2 DEFAULT NULL,
rem_schema_name IN VARCHAR2 DEFAULT NULL);
For every VARCHAR2, CHAR column in the data table that is not in the exclusion list finds categorical binning definition and inserts it into the definition table
dbms_data_mining_transform.insert_bin_cat_freq(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
bin_num IN BINARY_INTEGER DEFAULT 9,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
default_num IN BINARY_INTEGER DEFAULT 2,
bin_support IN NUMBER DEFAULT NULL,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds numerical binning definition and inserts it into the definition table
dbms_data_mining_transform.insert_bin_num_eqwidth(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
bin_num IN BINARY_INTEGER DEFAULT 10,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds numerical binning definition and inserts it into the definition table
dbms_data_mining_transform.insert_bin_num_qtile(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
bin_num IN BINARY_INTEGER DEFAULT 10,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER, VARCHAR2, or CHAR, as well as DM_NESTED_NUMERICAL, or DM_NESTED_CATEGORICAL column in the data table that is not in the exclusion list
finds numerical and categorical binning definition and inserts them into the corresponding definition tables
dbms_data_mining_transform.insert_bin_super(
num_table_name IN VARCHAR2,
cat_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
target_column_name IN VARCHAR2,
max_bin_num IN BINARY_INTEGER DEFAULT 1000,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
num_schema_name IN VARCHAR2 DEFAULT NULL,
cat_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
rem_table_name IN VARCHAR2 DEFAULT NULL,
rem_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds clipping definition and inserts it into the definition table
dbms_data_mining_transform.insert_clip_trim_tail(
clip_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
tail_frac IN NUMBER DEFAULT 0.025,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
clip_schema_name IN VARCHAR2 DEFAULT NULL,,
data_schema_name IN VARCHAR2 DEFAULT NULL,);
For every NUMBER column in the data table that is not in the exclusion list finds clipping definition and inserts it into the definition table
dbms_data_mining_transform.insert_clip_winsor_tail(
clip_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
tail_frac IN NUMBER DEFAULT 0.025,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
clip_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every VARCHAR2, CHAR column in the data table that is not in the exclusion list finds missing value treatment definition and inserts it into the definition table
dbms_data_mining_transform.insert_miss_cat_mode(
miss_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
miss_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds missing value treatment definition and inserts it into the definition table
dbms_data_mining_transform.insert_miss_num_mean(
miss_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
miss_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds normalization definition and inserts it into the definition table
dbms_data_mining_transform.insert_norm_lin_minmax(
norm_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
norm_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds normalization definition and inserts it into the definition table
dbms_data_mining_transform.insert_norm_lin_scale(
norm_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
norm_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
For every NUMBER column in the data table that is not in the exclusion list finds normalization definition and inserts it into the definition table
dbms_data_mining_transform.insert_norm_lin_zscore(
norm_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
round_num IN BINARY_INTEGER DEFAULT 6,
norm_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL);
dbms_data_mining_transform.set_transform(
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
attribute_name IN VARCHAR2,
attribute_subname IN VARCHAR2,
expression IN CLOB,
reverse_expression IN VARCHAR2,
attribute_spec IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks categorical binning expression on top of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_bin_cat(
bin_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
literal_flag IN BOOLEAN DEFAULT FALSE,
bin_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks numerical binning expression on top of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_bin_num(
bin_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
literal_flag IN BOOLEAN DEFAULT FALSE,
bin_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks clipping expression on top of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_clip(
clip_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
clip_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks column removal expression instead of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_col_rem(
rem_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
rem_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks categorical missing value treatment expression on top of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_miss_cat(
miss_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
miss_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks numerical missing value treatment expression on top of the existing expression and updates the stack definition
dbms_data_mining_transform.stack_miss_num(
miss_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
miss_schema_name IN VARCHAR2 DEFAULT NULL);
For every column in the stack definition that has a matching entry in the transformation definition,
stacks linear normalization expression on top of the existing expression
and updates the stack definition
dbms_data_mining_transform.stack_norm_lin(
norm_table_name IN VARCHAR2,
xform_list IN OUT NOCOPY sys.dbms_data_mining_transform.transform_list,
norm_schema_name IN VARCHAR2 DEFAULT NULL);
Creates a view that performs categorical binning of the data table
dbms_data_mining_transform.xform_bin_cat(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
literal_flag IN BOOLEAN DEFAULT FALSE,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_bin_cat, AUTO_WITH_COMMIT);
Creates a view that performs numerical binning of the data table.
Only the columns that are specified in the xform definition are binned, the remaining columns do not change
dbms_data_mining_transform.xform_bin_num(
bin_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
literal_flag IN BOOLEAN DEFAULT FALSE,
bin_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_bin_num, AUTO_WITH_COMMIT);
Creates a view that performs clipping of the data table.
Only the columns that are specified in the xform definition are clipped, the remaining columns do not change
dbms_data_mining_transform.xform_clip(
clip_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
clip_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_clip, AUTO_WITH_COMMIT);
Creates a view that performs column removal from the data table.
Only the columns that are specified in the xform definition are removed, the remaining columns do not change
dbms_data_mining_transform.xform_col_rem(
rem_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
rem_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_col_rem, AUTO_WITH_COMMIT);
Creates a view that applies a given expression for every NUMBER column in the data table that is not in the exclusion list and in the inclusion list
dbms_data_mining_transform.xform_expr_num(
expr_pattern IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
exclude_list IN
sys.dbms_data_mining_transform.column_list DEFAULT NULL,
include_list IN
sys.dbms_data_mining_transform.column_list DEFAULT NULL,
col_pattern IN VARCHAR2 DEFAULT ':col',
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
Similar to xform_expr_num, except that it applies to CHAR and VARCHAR2 columns instead of NUMBER
dbms_data_mining_transform.xform_expr_str(
expr_pattern IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
exclude_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
include_list IN sys.dbms_data_mining_transform.column_list DEFAULT NULL,
col_pattern IN VARCHAR2 DEFAULT ':col',
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
Creates a view that performs categorical missing value treatment of the data table
dbms_data_mining_transform.xform_miss_cat(
miss_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
miss_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_miss_cat, AUTO_WITH_COMMIT);
Creates a view that performs numerical missing value treatment of the data table
dbms_data_mining_transform.xform_miss_num(
miss_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
miss_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_miss_num, AUTO_WITH_COMMIT);
Creates a view that performs linear normalization of the data table.
Only the columns that are specified in the xform definition are normalized, the remaining columns do not change
dbms_data_mining_transform.xform_norm_lin(
norm_table_name IN VARCHAR2,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
norm_schema_name IN VARCHAR2 DEFAULT NULL,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);
PRAGMA SUPPLEMENTAL_LOG_DATA(xform_norm_lin, AUTO_WITH_COMMIT);
Creates a view that performs transformation of the data table specified by the stack definition
dbms_data_mining_transform.xform_stack(
xform_list IN sys.dbms_data_mining_transform.transform_list,
data_table_name IN VARCHAR2,
xform_view_name IN VARCHAR2,
data_schema_name IN VARCHAR2 DEFAULT NULL,
xform_schema_name IN VARCHAR2 DEFAULT NULL);