Pre-Processing Exporter Module
-
def any_in(seq_a, seq_b)
Checks whether two sequences have any elements in common.
Parameters: - seq_a (list) – A list of items
- seq_b (list) – A list of items
Returns: True if any item of seq_a is also in seq_b (or vice versa), otherwise False.
Return type: bool
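The check described above can be illustrated with a minimal sketch. The name `any_in_sketch` is hypothetical, and the module's own implementation may differ.

```python
def any_in_sketch(seq_a, seq_b):
    """Return True if the two sequences share at least one item."""
    # Membership is symmetric: an item of seq_a found in seq_b is also an
    # item of seq_b found in seq_a, so a single pass covers both directions.
    return any(item in seq_b for item in seq_a)


print(any_in_sketch(["age", "income"], ["income", "city"]))  # True
print(any_in_sketch(["age"], ["city"]))                      # False
```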
-
def binarizer(trfm, col_names)
Generates pre-processing elements for scikit-learn’s Binarizer.
Parameters: - trfm – The scikit-learn Binarizer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to Binarizer preprocessing.
Return type: dictionary
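For orientation, scikit-learn's Binarizer maps values strictly greater than its threshold to 1 and all other values to 0. The sketch below reproduces that rule in plain Python; `binarize_sketch` is a hypothetical name, and the keys stored in `pp_dict` are not shown here because the source does not list them.

```python
def binarize_sketch(values, threshold=0.0):
    """Map each value to 1 if it is strictly greater than threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


print(binarize_sketch([-1.0, 0.0, 0.5, 2.0]))  # [0, 0, 1, 1]
```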
-
def cat_imputer(trfm, col_names)
Generates pre-processing elements for sklearn-pandas’ CategoricalImputer.
Parameters: - trfm – The sklearn-pandas CategoricalImputer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to CategoricalImputer preprocessing.
Return type: dictionary
-
def count_vectorizer(trfm, col_names)
Generates pre-processing elements for scikit-learn’s CountVectorizer.
Parameters: - trfm – The scikit-learn CountVectorizer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to CountVectorizer preprocessing.
Return type: dictionary
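As background on what a CountVectorizer step computes, the following is a simplified sketch assuming the scikit-learn defaults of lowercasing and tokens of two or more word characters. `count_vectorize_sketch` is a hypothetical name, and options such as stop words or n-grams are not modeled.

```python
import re
from collections import Counter

# Tokens of two or more word characters, mirroring the default token pattern.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def count_vectorize_sketch(docs):
    """Return (sorted vocabulary, per-document term-count rows)."""
    tokenized = [TOKEN_PATTERN.findall(doc.lower()) for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


vocab, matrix = count_vectorize_sketch(["the cat sat", "the cat saw the dog"])
print(vocab)   # ['cat', 'dog', 'sat', 'saw', 'the']
print(matrix)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```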
-
def get_class_name(cls)
Provides the class name for the given instance.
Parameters: cls – The scikit-learn preprocessing instance.
Returns: The class name of the preprocessing object.
Return type: str
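In Python, the class name of an instance is available through its type. A minimal sketch follows; `class_name_sketch` and `DemoScaler` are hypothetical names used only for illustration.

```python
def class_name_sketch(obj):
    """Return the name of the class that obj is an instance of."""
    return type(obj).__name__


class DemoScaler:  # stand-in class so the example runs without scikit-learn
    pass


print(class_name_sketch(DemoScaler()))  # DemoScaler
```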
-
def get_derived_colnames(trfm_name, col_names, *args)
Generates derived column names for a given transformer.
Parameters: - trfm_name (str) – Name of the derived field to be assigned after preprocessing.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pml_pp – Returns a list that contains names of the preprocessed features.
Return type: list
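The source does not state the format of the derived names. The sketch below assumes one plausible scheme, `name(column)`, purely for illustration; `derived_colnames_sketch` is a hypothetical name, and the role of `*args` is not modeled.

```python
def derived_colnames_sketch(trfm_name, col_names):
    """Build one derived name per input column (assumed 'name(column)' scheme)."""
    return [f"{trfm_name}({col})" for col in col_names]


print(derived_colnames_sketch("standardScaler", ["age", "income"]))
# ['standardScaler(age)', 'standardScaler(income)']
```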
-
def get_pml_derived_flds(trfm, col_names, **kwargs)
Generates elements related to pre-processing for a given transformer object.
Parameters: - trfm – The scikit-learn preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pml_pp – Returns a dictionary that contains attributes related to any preprocessing function.
Return type: dictionary
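A function that accepts any transformer typically routes it to a type-specific handler. The sketch below shows one way to do that with a dictionary keyed by class name. It is an assumption about the general pattern, not the module's actual code; every name in it is hypothetical.

```python
def binarizer_handler(trfm, col_names):
    # Hypothetical handler: a real one would read the transformer's settings.
    return {"handler": "binarizer", "columns": list(col_names)}


def std_scaler_handler(trfm, col_names):
    return {"handler": "std_scaler", "columns": list(col_names)}


# Class name -> handler. The keys mirror scikit-learn class names.
HANDLERS = {
    "Binarizer": binarizer_handler,
    "StandardScaler": std_scaler_handler,
}


def derived_fields_sketch(trfm, col_names):
    """Look up the handler for the transformer's class and call it."""
    name = type(trfm).__name__
    if name not in HANDLERS:
        raise NotImplementedError(f"no handler registered for {name}")
    return HANDLERS[name](trfm, col_names)


class Binarizer:  # stand-in class so the example runs without scikit-learn
    pass


print(derived_fields_sketch(Binarizer(), ["age"]))
# {'handler': 'binarizer', 'columns': ['age']}
```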
-
def get_preprocess_val(ppln_sans_predictor, initial_colnames, model)
Generates elements related to pre-processing.
Parameters: - ppln_sans_predictor – A scikit-learn Pipeline instance.
- initial_colnames (list) – Contains list of feature/column names.
- model – A scikit-learn model instance.
Returns: pml_pp – Returns a dictionary that contains data related to pre-processing
Return type: dictionary
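Because the column names passed to each transformer may already be the names of preprocessed attributes, a pipeline walk has to thread the output names of one step into the next. The sketch below illustrates that idea only. `walk_steps_sketch` and the renaming functions are hypothetical, and the real function's return structure is not documented here.

```python
def walk_steps_sketch(steps, initial_colnames):
    """Walk (step_name, rename) pairs, feeding each step's output names forward.

    rename is a hypothetical callable mapping input column names to the
    names that the step produces.
    """
    col_names = list(initial_colnames)
    history = []
    for step_name, rename in steps:
        derived = rename(col_names)
        history.append({"step": step_name, "inputs": col_names, "outputs": derived})
        col_names = derived  # the next step sees the derived names
    return history


steps = [
    ("impute", lambda cols: [f"imputer({c})" for c in cols]),
    ("scale", lambda cols: [f"scaler({c})" for c in cols]),
]
for record in walk_steps_sketch(steps, ["age"]):
    print(record)
```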
-
def imputer(trfm, col_names, **kwargs)
Generates pre-processing elements for scikit-learn’s Imputer.
Parameters: - trfm – The scikit-learn Imputer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to Imputer preprocessing.
Return type: dictionary
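As a reminder of what an imputation step does, the sketch below fills missing values with the column mean, which is the default strategy of scikit-learn's Imputer. `mean_impute_sketch` is a hypothetical name, and the other strategies are not covered.

```python
import math


def mean_impute_sketch(values):
    """Replace NaN entries with the mean of the observed entries."""
    observed = [v for v in values if not math.isnan(v)]
    fill = sum(observed) / len(observed)  # assumes at least one observed value
    return fill, [fill if math.isnan(v) else v for v in values]


fill, filled = mean_impute_sketch([1.0, float("nan"), 3.0])
print(fill, filled)  # 2.0 [1.0, 2.0, 3.0]
```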
-
def lag(trfm, col_names)
Generates pre-processing elements for Nyoka’s Lag.
Parameters: - trfm – The Nyoka Lag instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to Lag preprocessing.
Return type: dictionary
-
def lbl_binarizer(trfm, col_names, **kwargs)
Generates pre-processing elements for scikit-learn’s LabelBinarizer.
Parameters: - trfm – The scikit-learn LabelBinarizer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to LabelBinarizer preprocessing.
Return type: dictionary
-
def lbl_encoder(trfm, col_names)
Generates pre-processing elements for scikit-learn’s LabelEncoder.
Parameters: - trfm – The scikit-learn LabelEncoder preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to LabelEncoder preprocessing.
Return type: dictionary
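LabelEncoder assigns each distinct label an integer according to the sorted order of the labels. A minimal plain-Python sketch of that mapping follows; `label_encode_sketch` is a hypothetical name.

```python
def label_encode_sketch(labels):
    """Return (sorted classes, integer code for each label)."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return classes, [index[label] for label in labels]


classes, codes = label_encode_sketch(["paris", "tokyo", "amsterdam", "tokyo"])
print(classes)  # ['amsterdam', 'paris', 'tokyo']
print(codes)    # [1, 2, 0, 2]
```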
-
def max_abs_scaler(trfm, col_names)
Generates pre-processing elements for scikit-learn’s MaxAbsScaler.
Parameters: - trfm – The scikit-learn MaxAbsScaler preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to MaxAbsScaler preprocessing.
Return type: dictionary
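MaxAbsScaler divides each feature by its largest absolute value, so the result lies in [-1, 1]. The sketch below shows that rule for a single column; `max_abs_scale_sketch` is a hypothetical name.

```python
def max_abs_scale_sketch(values):
    """Divide every value by the column's largest absolute value."""
    max_abs = max(abs(v) for v in values)  # assumes at least one non-zero value
    return [v / max_abs for v in values]


print(max_abs_scale_sketch([-4.0, 2.0, 1.0]))  # [-1.0, 0.5, 0.25]
```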
-
def min_max_scaler(trfm, col_names)
Generates pre-processing elements for scikit-learn’s MinMaxScaler.
Parameters: - trfm – The scikit-learn MinMaxScaler preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to MinMaxScaler preprocessing.
Return type: dictionary
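MinMaxScaler is a per-feature linear rescaling into a target range. The sketch below computes the same scale and offset for one column in plain Python. `min_max_scale_sketch` is a hypothetical name, and a non-constant column is assumed.

```python
def min_max_scale_sketch(values, feature_range=(0.0, 1.0)):
    """Linearly rescale values so min maps to lo and max maps to hi."""
    lo, hi = feature_range
    data_min, data_max = min(values), max(values)
    scale = (hi - lo) / (data_max - data_min)  # assumes data_max != data_min
    offset = lo - data_min * scale
    return [v * scale + offset for v in values]


print(min_max_scale_sketch([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```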
-
def one_hot_encoder(trfm, col_names, **kwargs)
Generates pre-processing elements for scikit-learn’s OneHotEncoder.
Parameters: - trfm – The scikit-learn OneHotEncoder preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to OneHotEncoder preprocessing.
Return type: dictionary
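One-hot encoding replaces a categorical column with one indicator column per category. A minimal plain-Python sketch for a single column follows. `one_hot_sketch` is a hypothetical name, and categories are taken in sorted order.

```python
def one_hot_sketch(values):
    """Return (sorted categories, one indicator row per input value)."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return categories, rows


cats, rows = one_hot_sketch(["red", "green", "red"])
print(cats)  # ['green', 'red']
print(rows)  # [[0, 1], [1, 0], [0, 1]]
```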
-
def pca(trfm, col_names)
Generates pre-processing elements for scikit-learn’s PCA.
Parameters: - trfm – The scikit-learn PCA preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to PCA preprocessing.
Return type: dictionary
-
def polynomial_features(trfm, col_names)
Generates pre-processing elements for scikit-learn’s PolynomialFeatures.
Parameters: - trfm – The scikit-learn PolynomialFeatures preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to PolynomialFeatures preprocessing.
Return type: dictionary
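PolynomialFeatures expands a row into all monomials up to a given degree, which is why a single input column can give rise to several derived columns. The sketch below enumerates those terms in plain Python (Python 3.8+ for `math.prod`). `polynomial_terms_sketch` is a hypothetical name, and the interaction-only option is not modeled.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_terms_sketch(row, degree=2, include_bias=True):
    """Return all monomials of the row's features up to the given degree."""
    start = 0 if include_bias else 1
    terms = []
    for d in range(start, degree + 1):
        # Each combination of feature indices (with repeats) is one monomial;
        # the empty combination at d == 0 yields the bias term 1.
        for combo in combinations_with_replacement(range(len(row)), d):
            terms.append(prod(row[i] for i in combo))
    return terms


print(polynomial_terms_sketch([2, 3]))  # [1, 2, 3, 4, 6, 9]
```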
-
def rbst_scaler(trfm, col_names)
Generates pre-processing elements for scikit-learn’s RobustScaler.
Parameters: - trfm – The scikit-learn RobustScaler preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to RobustScaler preprocessing.
Return type: dictionary
-
def std_scaler(trfm, col_names, **kwargs)
Generates pre-processing elements for scikit-learn’s StandardScaler.
Parameters: - trfm – The scikit-learn StandardScaler preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to StandardScaler preprocessing.
Return type: dictionary
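StandardScaler centers each feature on its mean and divides by its population standard deviation. The sketch below shows the computation for one column; `standardize_sketch` is a hypothetical name, and a non-constant column is assumed.

```python
from statistics import fmean, pstdev


def standardize_sketch(values):
    """Return (mean, population std, standardized values)."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (ddof = 0)
    return mean, std, [(v - mean) / std for v in values]


mean, std, z = standardize_sketch([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, std)  # 5.0 2.0
print(z)          # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```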
-
def tfidf_vectorizer(trfm, col_names)
Generates pre-processing elements for scikit-learn’s TfidfVectorizer.
Parameters: - trfm – The scikit-learn TfidfVectorizer preprocessing instance.
- col_names (list) – Contains list of feature/column names. The column names may represent the names of preprocessed attributes.
Returns: pp_dict – Returns a dictionary that contains attributes related to TfidfVectorizer preprocessing.
Return type: dictionary
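As background, TfidfVectorizer with its default settings weights raw term counts by a smoothed inverse document frequency, idf(t) = ln((1 + n) / (1 + df(t))) + 1, and then L2-normalizes each row. The sketch below follows that recipe in plain Python. `tfidf_sketch` is a hypothetical name, and non-default options are not modeled.

```python
import math
import re
from collections import Counter

# Tokens of two or more word characters, mirroring the default token pattern.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tfidf_sketch(docs):
    """Return (vocabulary, idf per term, L2-normalized tf-idf rows)."""
    tokenized = [TOKEN_PATTERN.findall(doc.lower()) for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocabulary}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        weights = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0  # guard empty docs
        rows.append([w / norm for w in weights])
    return vocabulary, idf, rows


vocab, idf, rows = tfidf_sketch(["cat sat", "cat ran"])
print(vocab)       # ['cat', 'ran', 'sat']
print(idf["cat"])  # 1.0, because "cat" appears in every document
```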