auto_code

tdprepview.auto_code

auto_code(DF, input_schema='', input_table='', non_feature_cols=[])

Generate Python code for a suggested preprocessing pipeline based on a DataFrame or database schema.

This function analyzes the input data or the database schema to automatically create a tdprepview preprocessing pipeline. The output is Python code that can serve as a starting point for building machine learning workflows in ClearScape.

Parameters:

DF (DataFrame, required)
    A teradataml.DataFrame containing the input data. If None, the DataFrame is constructed from input_schema and input_table.

input_schema (str, default '')
    Database schema name, used to construct the DataFrame if DF is None.

input_table (str, default '')
    Database table or view name, used to construct the DataFrame if DF is None.

non_feature_cols (list, default [])
    List of column names to exclude from preprocessing, e.g., IDs or target variables.

Returns:

str
    A string containing Python code that defines the suggested preprocessing pipeline.
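
Because the return value is a plain string that the examples on this page pass to eval, it can help to check first that it parses as a single Python expression. The following is a minimal sketch using only the standard library's ast module. The helper name is_single_expression and the sample strings are hypothetical stand-ins, not actual auto_code output, and the check covers syntax only:

```python
import ast


def is_single_expression(code_str: str) -> bool:
    """Return True if code_str parses as exactly one Python expression.

    Hypothetical helper: it only checks syntax. It does not execute the
    code or verify that it describes a valid tdprepview pipeline.
    """
    try:
        ast.parse(code_str, mode="eval")
    except SyntaxError:
        return False
    return True


# Stand-in strings for illustration only (not real auto_code output).
print(is_single_expression("[('col_a', 'step_1'), ('col_b', 'step_2')]"))
print(is_single_expression("import os; os.getcwd()"))
```

A syntax check like this does not make eval safe on untrusted input; it only catches strings that eval would reject outright.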

Examples:

Generate code from a DataFrame:

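import tdprepview
import teradataml as tdml

# Load the source table as a teradataml DataFrame
DF = tdml.DataFrame(tdml.in_schema("my_schema", "my_table"))

# Generate the suggested pipeline code, excluding the ID and target columns
code_str = tdprepview.auto_code(DF, non_feature_cols=["ID", "target"])
print(code_str)

# eval executes the generated string; review the printed code first
my_pipeline = tdprepview.Pipeline(steps=eval(code_str))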

Generate code directly from a database table:

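import tdprepview

# With DF=None, the DataFrame is built from input_schema and input_table
code_str = tdprepview.auto_code(
    DF=None,
    input_schema="my_schema",
    input_table="my_table",
    non_feature_cols=["ID", "target"],
)
print(code_str)

# eval executes the generated string; review the printed code first
my_pipeline = tdprepview.Pipeline(steps=eval(code_str))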
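
Since the generated code is meant as a starting point, one option is to write it to a file so it can be reviewed and edited before it is used. The sketch below assumes nothing about tdprepview itself: save_generated_code is a hypothetical helper, and the "[]" placeholder stands in for whatever string auto_code returns.

```python
import tempfile
from pathlib import Path


def save_generated_code(code_str: str, path: Path) -> Path:
    """Write generated code to a file and return the path (hypothetical helper)."""
    path.write_text(code_str, encoding="utf-8")
    return path


# Placeholder standing in for the string returned by auto_code.
code_str = "[]"

with tempfile.TemporaryDirectory() as tmp_dir:
    saved = save_generated_code(code_str, Path(tmp_dir) / "suggested_steps.py")
    # Read the file back to confirm the round trip preserved the text.
    round_trip = saved.read_text(encoding="utf-8")

print(round_trip == code_str)
```

In practice the target would be a path inside the project rather than a temporary directory, so the edited steps can be kept under version control.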