Python Column Transformation

Executes a user-provided Python function on one or more columns of the DataFrame connected to its input port and returns the modified DataFrame.

Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.

The function that will be executed has to comply with the signature shown in the example code below.

The function is applied to the input DataFrame in parallel for better performance.
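
As a rough illustration of what parallel application implies, the sketch below applies a function to two chunks of a column concurrently using Python's standard concurrent.futures module. The partitioning, the helper transform_partition, and the doubling logic are assumptions made for illustration and do not describe how Seahorse distributes work internally. The point is that a function whose result depends only on its arguments gives the same output regardless of how the rows are split.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_value(value, column_name):
    # Stateless: the result depends only on the arguments.
    return value * 2


def transform_partition(partition):
    # Apply the function to every value in one chunk of the column.
    return [transform_value(v, "Weight") for v in partition]


# Hypothetical split of a "Weight" column into two partitions.
partitions = [[300.0, 0.5], [5.0, 0.5]]

with ThreadPoolExecutor(max_workers=2) as pool:
    # map() returns results in input order even though chunks run concurrently.
    parallel_result = [
        v for part in pool.map(transform_partition, partitions) for v in part
    ]

# Reference result computed one value at a time.
sequential_result = [
    transform_value(v, "Weight") for part in partitions for v in part
]
```

Both lists hold the same values in the same order, which is what makes a function like this safe to run in parallel.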

The variables and functions available in the operation's global scope:

Example Python code:

def transform_value(value, column_name):
    return value
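
The identity function above can be extended to act differently depending on the column it receives. The following is a minimal sketch, assuming that column_name carries the name of the column currently being transformed and that missing values arrive as None; the CAPS mapping and its contents are hypothetical.

```python
# Hypothetical per-column upper bounds; not part of Seahorse.
CAPS = {"Weight": 2.0}


def transform_value(value, column_name):
    # Pass missing values through unchanged (assumes nulls arrive as None).
    if value is None:
        return None
    cap = CAPS.get(column_name)
    # Columns without a configured cap are returned unchanged.
    if cap is None:
        return value
    return min(value, cap)
```

Keeping the per-column settings in one mapping lets the same function be used when the operation runs on several columns.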

Since: Seahorse 1.0.0


Port   Type Qualifier   Description
0      DataFrame        The DataFrame to be transformed.


Port   Type Qualifier   Description
0      DataFrame        The output DataFrame.
1      Transformer      A Transformer that allows the operation to be applied to other DataFrames using a Transform operation.


Name                    Type                        Description
column operation code   Code Snippet                The Python code to be executed. It has to contain a Python function complying with the signature presented in the operation's description.
target type             Choice                      The target type of the conversion. Possible values are: [String, Boolean, Timestamp, Double, Float, Long, Integer, Vector].
operate on              InputOutputColumnSelector   Input and output columns for the operation.



Name                    Value
column operation code   def transform_value(value, column_name):
                            return min(value, 2.0)
target type             double
operate on              one column
input column            "Weight"
output                  append new column
output column           "WeightCutoff"


Animal Kind Weight
Cow Mammal 300.0
Ostrich Bird 0.5
Dog Mammal 5.0
Sparrow Bird 0.5
Thing null


Animal Kind Weight WeightCutoff
Cow Mammal 300.0 2.0
Ostrich Bird 0.5 0.5
Dog Mammal 5.0 2.0
Sparrow Bird 0.5 0.5
Thing null null
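
The example can be reproduced outside Seahorse with plain Python, which may help when testing a function before pasting it into the operation. This is a sketch under assumptions: rows are modeled as dictionaries, null is modeled as None, the helper append_transformed_column is hypothetical, and only the Animal and Weight columns are included. The explicit None check is added because min(None, 2.0) raises a TypeError on Python 3.

```python
def transform_value(value, column_name):
    # Explicit None check: min(None, 2.0) raises TypeError on Python 3.
    if value is None:
        return None
    return min(value, 2.0)


def append_transformed_column(rows, input_column, output_column):
    # Hypothetical helper: return new row dicts with the transformed
    # value stored under output_column; the input rows are not modified.
    return [
        dict(row, **{output_column: transform_value(row[input_column], input_column)})
        for row in rows
    ]


# Sample data from the example; None stands in for null (an assumption).
rows = [
    {"Animal": "Cow", "Weight": 300.0},
    {"Animal": "Ostrich", "Weight": 0.5},
    {"Animal": "Dog", "Weight": 5.0},
    {"Animal": "Sparrow", "Weight": 0.5},
    {"Animal": "Thing", "Weight": None},
]

result = append_transformed_column(rows, "Weight", "WeightCutoff")
cutoffs = [row["WeightCutoff"] for row in result]
```

Here cutoffs holds 2.0, 0.5, 2.0, 0.5 and None, matching the WeightCutoff column in the output table above.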