Perform a z-score normalization on continuous values in accordance with the PMML element NormContinuous.

xform_z_score(wrap_object, xform_info = NA, map_missing_to = NA, ...)

Arguments

wrap_object

Output of xform_wrap or another transformation function.

xform_info

Specification of details of the transformation.

map_missing_to

Value to be given to the transformed variable if the value of the input variable is missing.

...

Further arguments passed to or from other methods.

Value

R object containing the raw data, the transformed data and data statistics.

Details

Perform a z-score normalization on data given in xform_wrap format.
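
As a rough illustration of the arithmetic involved, the following base R sketch computes z-scores by hand, without using the package. It assumes the scale is the sample standard deviation returned by sd(); the helper name z_score is hypothetical and is not part of this package.

```r
# Hypothetical helper, not part of the package: z-score of a numeric vector.
# Assumption: the scale is the sample standard deviation returned by sd().
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

data(iris)
z <- z_score(iris$Sepal.Length)

# The result is centred at 0 with unit standard deviation
# (up to floating-point error).
mean(z)
sd(z)
```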

Given an input variable named InputVar, a name OutputVar for the transformed variable, and a value missingVal to assign to the transformed variable when the input value is missing, the xform_z_score arguments, with all optional parameters included, are:

xform_info="InputVar -> OutputVar", map_missing_to="missingVal"

Variables can be referred to in two ways. The first is by column number: given the data attribute of the wrap_object, this is the position at which the variable appears, written in the format "column#". The second is by variable name.

The name of the transformed variable is optional; if the name is not provided, the transformed variable is given the name: "derived_" + original_variable_name

missingVal, supplied through map_missing_to, is an optional parameter giving the value assigned to the output variable when the input value is missing. If no input variable names are provided, all numeric variables are transformed by default. Note that in this case a replacement value for missing input values cannot be specified.
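
The default naming and the missing-value replacement described above can be sketched in base R as follows. This is only an illustration of the intended behaviour, not the package's implementation; it assumes the mean and standard deviation are computed over the non-missing values, and the variable names used here are hypothetical.

```r
# Hypothetical illustration in base R, not the package's implementation.
x <- c(5.1, NA, 4.7, 6.3)   # input values, one of them missing
map_missing_to <- 1.0       # replacement for missing inputs

# Assumption: statistics are computed over the non-missing values only.
z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Wherever the input was missing, use the replacement value instead.
z[is.na(x)] <- map_missing_to

# Default name of the derived variable when no output name is given.
default_name <- paste0("derived_", "Sepal.Width")
```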

Author

Tridivesh Jena

Examples


# Load the standard iris dataset
data(iris)

# First wrap the data
iris_box <- xform_wrap(iris)

# Perform a z-transform on all numeric variables of the loaded
# iris dataset. These would be Sepal.Length, Sepal.Width,
# Petal.Length, and Petal.Width. The 4 new derived variables
# will be named derived_Sepal.Length, derived_Sepal.Width,
# derived_Petal.Length, and derived_Petal.Width
iris_box_1 <- xform_z_score(iris_box)

# Perform a z-transform on the 1st column of the dataset (Sepal.Length)
# and give the derived variable the name "dsl"
iris_box_2 <- xform_z_score(iris_box, xform_info = "column1 -> dsl")

# Repeat the above operation; adding the new transformed variable
# to the iris_box object
iris_box <- xform_z_score(iris_box, xform_info = "column1 -> dsl")

# Transform Sepal.Width (the 2nd column)
# The new transformed variable will be given the default name
# "derived_Sepal.Width"
iris_box_3 <- xform_z_score(iris_box, xform_info = "column2")

# Repeat the same operation as above, this time using the variable
# name
iris_box_4 <- xform_z_score(iris_box, xform_info = "Sepal.Width")

# Repeat the same operation as above, naming the transformed variable
# "derived_Sepal.Width" and assigning it the value 1.0 if the input
# value of the "Sepal.Width" variable is missing. Add the new
# information to the iris_box object.
iris_box <- xform_z_score(iris_box,
  xform_info = "Sepal.Width",
  map_missing_to = "1.0"
)