SQL Transformation

Executes a user-provided Spark SQL expression on the DataFrame connected to its input port and returns the result as a DataFrame. The expression may also use some additional User Defined Functions on top of standard Spark SQL.

Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.

Since: Seahorse 0.4.0

Input

Port  Type       Qualifier  Description
0     DataFrame             The DataFrame that the Spark SQL expression will be executed on.

Output

Port  Type         Qualifier  Description
0     DataFrame               The results of the Spark SQL expression.
1     Transformer             A Transformer that applies the same operation to other DataFrames via a Transform operation.

Parameters

Name          Type          Description
dataframe id  String        The identifier used in the Spark SQL expression to refer to the input DataFrame.
expression    Code Snippet  The Spark SQL expression to execute. It must be valid Spark SQL.
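
The coupling between the two parameters can be illustrated outside Seahorse. The sketch below is not Seahorse or Spark code: it uses SQLite from Python's standard library as a stand-in SQL engine, and `make_transformer` is a hypothetical helper introduced only for this illustration. It shows, under those assumptions, that the expression must name its input by the chosen dataframe id, and that the same pair of parameters can be reapplied to different data, which is roughly the role of the returned Transformer.

```python
import sqlite3


def make_transformer(dataframe_id, expression):
    """Return a callable that runs `expression` against any table of rows.

    Assumption: SQLite stands in for Spark SQL. The rows are registered
    under the name `dataframe_id`, so `expression` must use that name.
    """
    def transform(columns, rows):
        conn = sqlite3.connect(":memory:")
        try:
            col_list = ", ".join(f'"{c}"' for c in columns)
            placeholders = ", ".join("?" for _ in columns)
            # Register the input data under the chosen identifier.
            conn.execute(f'CREATE TABLE "{dataframe_id}" ({col_list})')
            conn.executemany(
                f'INSERT INTO "{dataframe_id}" VALUES ({placeholders})', rows
            )
            # Run the user-provided expression and collect the result.
            cursor = conn.execute(expression)
            names = [d[0] for d in cursor.description]
            return names, cursor.fetchall()
        finally:
            conn.close()

    return transform


# The same (dataframe id, expression) pair applied to two different inputs.
count_rows = make_transformer("inputDF", "select count(*) as n from inputDF")
first = count_rows(["temp"], [(0.2,), (0.18,)])
second = count_rows(["temp"], [(0.1,), (0.42,), (0.44,)])
print(first)   # (['n'], [(2,)])
print(second)  # (['n'], [(3,)])
```

If the expression referred to a name other than the dataframe id, the stand-in engine would fail with a "no such table" error. How Seahorse reports that case is not covered here.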

Example

Parameters

Name          Value
dataframe id  "inputDF"
expression    select avg(temp) as avg_temp, max(windspeed) as max_windspeed from inputDF

Input

datetime               windspeed  hum   temp
2011-01-03 21:00:00.0  0.1045     0.47  0.2
2011-01-03 22:00:00.0  0.1343     0.64  0.18
2011-01-03 23:00:00.0  0.1343     0.69  0.14
2011-02-11 07:00:00.0  0.0        0.68  0.1
2011-02-13 18:00:00.0  0.3284     0.28  0.42
2011-02-18 12:00:00.0  0.1642     0.72  0.44
2011-02-19 03:00:00.0  0.3881     0.13  0.44
2011-02-19 04:00:00.0  0.2985     0.14  0.42
2013-01-01 00:00:00.0  0.1343     0.65  0.26

Output

avg_temp            max_windspeed
0.2888888888888889  0.3881
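
The figures in this example can be checked without a Spark cluster. The following is a minimal sketch that assumes SQLite (Python's standard `sqlite3` module) as a stand-in for Spark SQL. The column types in the `CREATE TABLE` statement are an assumption, and only the query text and the data come from the example above. Because floating-point summation can differ between engines in the last digit, the average is rounded before printing.

```python
import sqlite3

# Rows copied from the example input: datetime, windspeed, hum, temp.
rows = [
    ("2011-01-03 21:00:00.0", 0.1045, 0.47, 0.2),
    ("2011-01-03 22:00:00.0", 0.1343, 0.64, 0.18),
    ("2011-01-03 23:00:00.0", 0.1343, 0.69, 0.14),
    ("2011-02-11 07:00:00.0", 0.0, 0.68, 0.1),
    ("2011-02-13 18:00:00.0", 0.3284, 0.28, 0.42),
    ("2011-02-18 12:00:00.0", 0.1642, 0.72, 0.44),
    ("2011-02-19 03:00:00.0", 0.3881, 0.13, 0.44),
    ("2011-02-19 04:00:00.0", 0.2985, 0.14, 0.42),
    ("2013-01-01 00:00:00.0", 0.1343, 0.65, 0.26),
]

conn = sqlite3.connect(":memory:")
# The table name plays the role of the "dataframe id" parameter.
conn.execute(
    'CREATE TABLE inputDF ("datetime" TEXT, windspeed REAL, hum REAL, temp REAL)'
)
conn.executemany("INSERT INTO inputDF VALUES (?, ?, ?, ?)", rows)

# The expression from the example, unchanged.
avg_temp, max_windspeed = conn.execute(
    "select avg(temp) as avg_temp, max(windspeed) as max_windspeed from inputDF"
).fetchone()
conn.close()

print(round(avg_temp, 6), max_windspeed)  # 0.288889 0.3881
```

The nine `temp` values sum to 2.6, so the average is 2.6 / 9 ≈ 0.2889, and the largest `windspeed` value is 0.3881, which matches the output table.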