SQL Column Transformation

Applies a user-provided Spark SQL formula (an expression of the kind used in a SELECT statement, extended with some User Defined Functions) to one or more columns of the DataFrame connected to its input port. Returns the modified DataFrame.

Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.

Since: Seahorse 1.1.0

Input

Port Type Qualifier Description
0 DataFrame The input DataFrame.

Output

Port Type Qualifier Description
0 DataFrame The results of the transformation.
1 Transformer The Transformer that allows the operation to be applied to another DataFrame using Transform.
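
The relationship between the two output ports can be pictured with a small plain-Python sketch. This is not Seahorse or Spark code: the function name `sql_column_transformation`, the `cutoff` helper, and the list-of-dicts representation of a DataFrame are all assumptions made for illustration only.

```python
def sql_column_transformation(rows, input_column, output_column, fn):
    """Return (transformed rows, reusable transformer), mirroring ports 0 and 1.

    Hypothetical stand-in: rows are dicts, fn plays the role of the formula.
    """
    def transformer(other_rows):
        # Append output_column to a copy of each row; inputs stay untouched.
        return [dict(row, **{output_column: fn(row[input_column])})
                for row in other_rows]

    return transformer(rows), transformer


def cutoff(weight):
    # Illustrative formula: cap the value at 2.0 and pass nulls through.
    return None if weight is None else min(weight, 2.0)


first = [{"Animal": "Cow", "Weight": 300.0}]
result, transformer = sql_column_transformation(
    first, "Weight", "WeightCutoff", cutoff)

# The returned transformer applies the same formula to another dataset.
other = [{"Animal": "Dog", "Weight": 5.0}, {"Animal": "Ostrich", "Weight": 0.5}]
reused = transformer(other)
```

Here `result` corresponds to output port 0 and `transformer` to output port 1, with `transformer(other)` playing the role of a later Transform operation.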

Parameters

Name Type Description
input column alias String The identifier that can be used in the Spark SQL formula (as used in SELECT statement) to refer to the input column.
formula String The Spark SQL formula (as used in SELECT statement).
operate on InputOutputColumnSelector The input and output columns for the operation.

Example

Parameters

Name Value
input column alias "myAlias"
formula "MINIMUM(myAlias, 2.0)"
operate on one column
input column "Weight"
output append new column
output column "WeightCutoff"

Input

Animal Kind Weight
Cow Mammal 300.0
Ostrich Bird 0.5
Dog Mammal 5.0
Sparrow Bird 0.5
Thing null

Output

Animal Kind Weight WeightCutoff
Cow Mammal 300.0 2.0
Ostrich Bird 0.5 0.5
Dog Mammal 5.0 2.0
Sparrow Bird 0.5 0.5
Thing null null
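
The example above can be approximated outside Seahorse to see how the alias and the formula fit into a SELECT statement. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Spark SQL, and SQLite's two-argument `min` as a stand-in for the `MINIMUM` function. The shape of the generated query, the `transform` helper, and the table name `df` are assumptions, not the actual Seahorse implementation. The "Thing" row is assumed to have an empty Kind and a null Weight.

```python
import sqlite3

# Example input. Assumption: "Thing" has no Kind and a null Weight.
ROWS = [
    ("Cow", "Mammal", 300.0),
    ("Ostrich", "Bird", 0.5),
    ("Dog", "Mammal", 5.0),
    ("Sparrow", "Bird", 0.5),
    ("Thing", None, None),
]


def transform(rows, input_column, alias, formula, output_column):
    """Append output_column, computed by formula, which sees input_column as alias.

    Hypothetical helper: identifiers are interpolated without escaping,
    which is acceptable for a sketch but not for untrusted input.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE df (Animal TEXT, Kind TEXT, Weight REAL)")
        conn.executemany("INSERT INTO df VALUES (?, ?, ?)", rows)
        # The inner SELECT exposes the input column under the alias;
        # the outer SELECT evaluates the formula into the new column.
        query = (
            f'SELECT Animal, Kind, Weight, {formula} AS "{output_column}" '
            f'FROM (SELECT *, "{input_column}" AS "{alias}" FROM df)'
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = transform(ROWS, "Weight", "myAlias", "min(myAlias, 2.0)", "WeightCutoff")
for row in result:
    print(row)
```

Under these assumptions each returned tuple carries the original three values plus the cutoff, and the row with a null Weight gets a null WeightCutoff, matching the Output table.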