SQL Transformation
Executes a Spark SQL expression (enriched with some User Defined Functions) provided by the user on the DataFrame connected to its input port.
Returns the results of the execution as a DataFrame.
Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.
Since: Seahorse 0.4.0
Input
Port | Type Qualifier | Description |
0 | DataFrame | The DataFrame that the Spark SQL expression will be executed on. |
Output
Port | Type Qualifier | Description |
0 | DataFrame | The results of the Spark SQL expression. |
1 | Transformer | A Transformer that allows the operation to be applied to other DataFrames using a Transform. |
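The idea behind the Transformer output can be sketched as a reusable function that captures the dataframe id and the expression once and can then be run against different inputs. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module instead of Spark SQL, and `make_sql_transformer` is a hypothetical helper, not part of Seahorse.

```python
import sqlite3


def make_sql_transformer(dataframe_id, expression):
    """Hypothetical helper: capture an SQL expression and the name under
    which the input table is exposed, and return a reusable function."""

    def transform(columns, rows):
        # Each call gets a fresh in-memory database (SQLite stands in for
        # Spark here; this is an assumption made for illustration).
        conn = sqlite3.connect(":memory:")
        try:
            column_list = ", ".join(f'"{c}"' for c in columns)
            placeholders = ", ".join("?" for _ in columns)
            # Expose the rows under the configured dataframe id.
            conn.execute(f'CREATE TABLE "{dataframe_id}" ({column_list})')
            conn.executemany(
                f'INSERT INTO "{dataframe_id}" VALUES ({placeholders})', rows
            )
            return conn.execute(expression).fetchall()
        finally:
            conn.close()

    return transform


# The same captured expression is applied to two different inputs.
max_temp = make_sql_transformer(
    "inputDF", "select max(temp) as max_temp from inputDF"
)
first = max_temp(["temp"], [(0.2,), (0.18,)])
second = max_temp(["temp"], [(0.42,), (0.44,)])
print(first, second)
```

The point of the design is that the expression and the dataframe id are fixed when the Transformer is created, while the data it runs on is supplied later.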
Parameters
Name | Type | Description |
dataframe id | String | The identifier that can be used in the Spark SQL expression to refer to the input DataFrame. |
expression | Code Snippet | The Spark SQL expression to be executed. The expression must be a valid Spark SQL expression. |
Example
Parameters
Name | Value |
dataframe id | "inputDF" |
expression | select avg(temp) as avg_temp, max(windspeed) as max_windspeed from inputDF |
Input
datetime | windspeed | hum | temp |
2011-01-03 21:00:00.0 | 0.1045 | 0.47 | 0.2 |
2011-01-03 22:00:00.0 | 0.1343 | 0.64 | 0.18 |
2011-01-03 23:00:00.0 | 0.1343 | 0.69 | 0.14 |
2011-02-11 07:00:00.0 | 0.0 | 0.68 | 0.1 |
2011-02-13 18:00:00.0 | 0.3284 | 0.28 | 0.42 |
2011-02-18 12:00:00.0 | 0.1642 | 0.72 | 0.44 |
2011-02-19 03:00:00.0 | 0.3881 | 0.13 | 0.44 |
2011-02-19 04:00:00.0 | 0.2985 | 0.14 | 0.42 |
2013-01-01 00:00:00.0 | 0.1343 | 0.65 | 0.26 |
Output
avg_temp | max_windspeed |
0.2888888888888889 | 0.3881 |
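The example can be checked outside Seahorse with a short script. The sketch below is an assumption-laden stand-in: it loads the same rows into an in-memory SQLite table named `inputDF` (matching the dataframe id) and runs the same expression with Python's built-in `sqlite3` module rather than Spark SQL. The aggregate functions used here behave the same way in both dialects.

```python
import sqlite3

# Rows from the example input: (datetime, windspeed, hum, temp).
rows = [
    ("2011-01-03 21:00:00.0", 0.1045, 0.47, 0.2),
    ("2011-01-03 22:00:00.0", 0.1343, 0.64, 0.18),
    ("2011-01-03 23:00:00.0", 0.1343, 0.69, 0.14),
    ("2011-02-11 07:00:00.0", 0.0, 0.68, 0.1),
    ("2011-02-13 18:00:00.0", 0.3284, 0.28, 0.42),
    ("2011-02-18 12:00:00.0", 0.1642, 0.72, 0.44),
    ("2011-02-19 03:00:00.0", 0.3881, 0.13, 0.44),
    ("2011-02-19 04:00:00.0", 0.2985, 0.14, 0.42),
    ("2013-01-01 00:00:00.0", 0.1343, 0.65, 0.26),
]

# In-memory SQLite database standing in for Spark (an assumption made for
# illustration; Seahorse itself executes the expression with Spark SQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inputDF (datetime TEXT, windspeed REAL, hum REAL, temp REAL)"
)
conn.executemany("INSERT INTO inputDF VALUES (?, ?, ?, ?)", rows)

# The expression from the example, referring to the table by its dataframe id.
expression = (
    "select avg(temp) as avg_temp, max(windspeed) as max_windspeed from inputDF"
)
avg_temp, max_windspeed = conn.execute(expression).fetchone()
print(avg_temp, max_windspeed)
conn.close()
```

The printed values should agree with the Output table above, up to floating-point rounding in the average.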