Executes a user-provided Python function on one or more columns of the DataFrame connected to its input port.
Returns the modified DataFrame.
Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.
The function to be executed must:
- be named transform_value,
- take exactly two arguments: the value to be transformed and the name of the column currently being transformed,
- return the transformed value, which must conform to the selected target type (a parameter of the operation).
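The requirements above can be illustrated with a minimal sketch that uses both arguments. The column names and the unit conversion are hypothetical, and the sketch assumes that a Double target type is satisfied by a Python float and that missing values arrive as None:

```python
def transform_value(value, column_name):
    # Assumption: a null cell is passed in as None; leave it null.
    if value is None:
        return None
    # Hypothetical rule: convert the "Weight" column from kilograms to grams.
    if column_name == "Weight":
        return float(value) * 1000.0
    # Any other selected column is only cast to float.
    return float(value)
```

Because the column name is passed in, a single function can treat each selected column differently when the operation runs on several columns.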
The function is applied to the input DataFrame in parallel for better performance.
The following variables and functions are available in the operation's global scope:
- dataframe(): a function that returns the input DataFrame for this operation. Every time the input DataFrame changes, dataframe() returns the updated DataFrame.
- sc: the Spark Context
- spark: the Spark Session
- sqlContext: the SQL Context
def transform_value(value, column_name):
    return value
Since: Seahorse 1.0.0
Port | Type Qualifier | Description |
---|---|---|
0 | DataFrame | The DataFrame to be transformed. |
Port | Type Qualifier | Description |
---|---|---|
0 | DataFrame | The output DataFrame. |
1 | Transformer | A Transformer that allows the operation to be applied to other DataFrames using a Transform operation. |
Name | Type | Description |
---|---|---|
column operation code | Code Snippet | The Python code to be executed. It must contain a Python function complying with the signature presented in the operation's description. |
target type | Choice | The target type of the conversion. Possible values are: [String, Boolean, Timestamp, Double, Float, Long, Integer, Vector]. |
operate on | InputOutputColumnSelector | Input and output columns for the operation. |
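As an illustration of how the returned value relates to the target type parameter, the following sketch is meant for a String target type. It assumes that a Python str satisfies that target type and that null cells are passed in as None; the threshold and the labels are hypothetical:

```python
def transform_value(value, column_name):
    # Assumption: a null cell is passed in as None; keep it null.
    if value is None:
        return None
    # Return a Python str so the result matches a String target type.
    # The 100.0 threshold and the labels are illustrative only.
    return "heavy" if value > 100.0 else "light"
```

Choosing a different target type, such as Boolean or Double, would require the function to return a matching Python value instead.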
Name | Value |
---|---|
column operation code | def transform_value(value, column_name): return min(value, 2.0) |
target type | double |
operate on | one column |
input column | "Weight" |
output | append new column |
output column | "WeightCutoff" |
Animal | Kind | Weight |
---|---|---|
Cow | Mammal | 300.0 |
Ostrich | Bird | 0.5 |
Dog | Mammal | 5.0 |
Sparrow | Bird | 0.5 |
Thing | | null |
Animal | Kind | Weight | WeightCutoff |
---|---|---|---|
Cow | Mammal | 300.0 | 2.0 |
Ostrich | Bird | 0.5 | 0.5 |
Dog | Mammal | 5.0 | 2.0 |
Sparrow | Bird | 0.5 | 0.5 |
Thing | | null | null |
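The last row of the example has a null Weight, which maps to a null WeightCutoff. The sketch below is a null-safe variant of the example function, checked in plain Python against the Weight values above. It assumes that a null cell is passed to the function as None. In Python 3, calling min(None, 2.0) raises a TypeError, so the explicit check may be needed there:

```python
def transform_value(value, column_name):
    # Assumption: a null cell is passed in as None; propagate it unchanged.
    if value is None:
        return None
    # Cap the value at 2.0, as in the example above.
    return min(value, 2.0)


# Plain-Python check using the Weight column of the example input.
weights = [300.0, 0.5, 5.0, 0.5, None]
cutoffs = [transform_value(w, "Weight") for w in weights]
print(cutoffs)  # [2.0, 0.5, 2.0, 0.5, None]
```

The check only exercises the function itself; inside the operation the function is applied to the DataFrame column rather than to a Python list.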