Filter Rows
Creates a DataFrame containing only the rows that satisfy the given condition.
The condition should be a Spark SQL condition (enriched with some User Defined Functions), as used in a WHERE clause.
The order of the columns is preserved.
Also returns a Transformer that can later be applied to another DataFrame using a Transform operation.
Since: Seahorse 1.0.0
Input
| Port | Type Qualifier | Description |
|------|----------------|-------------|
| 0 | DataFrame | A DataFrame to filter rows on. |
Output
| Port | Type Qualifier | Description |
|------|----------------|-------------|
| 0 | DataFrame | The DataFrame containing only the rows that satisfy the given condition. |
| 1 | Transformer | A Transformer that allows the operation to be applied to other DataFrames using a Transform. |
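The relationship between the two output ports can be pictured with a small plain-Python sketch. This is not Seahorse or Spark code: `filter_rows`, `is_warm_and_calm`, and the dictionary-based rows are hypothetical stand-ins, and a Python function takes the place of the Spark SQL condition. It only illustrates the idea that one operation yields both a filtered result and a reusable filter.

```python
def filter_rows(rows, predicate):
    """Return (filtered_rows, transformer), loosely mirroring output ports 0 and 1."""
    def transformer(other_rows):
        # Keep only the rows for which the predicate holds; row order is unchanged.
        return [row for row in other_rows if predicate(row)]
    return transformer(rows), transformer


def is_warm_and_calm(row):
    # Hypothetical Python equivalent of: 0.4 < temp AND windspeed < 0.3
    return 0.4 < row["temp"] and row["windspeed"] < 0.3


first = [
    {"windspeed": 0.1642, "temp": 0.44},
    {"windspeed": 0.3284, "temp": 0.42},
]
other = [
    {"windspeed": 0.2985, "temp": 0.42},
    {"windspeed": 0.1045, "temp": 0.2},
]

# Port 0: the filtered data. Port 1: the reusable filter.
filtered, transformer = filter_rows(first, is_warm_and_calm)

# Applying the returned transformer to a different dataset, as a Transform would.
reused = transformer(other)
```

Here `filtered` keeps only the first row of `first`, and `reused` keeps only the first row of `other`, because the same condition is applied in both cases.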
Parameters
| Name | Type | Description |
|------|------|-------------|
| condition | Code Snippet | The filtering condition. Rows that do not satisfy the condition are excluded from the output DataFrame. It should be a Spark SQL condition (as used in a WHERE clause). |
Example
Parameters
| Name | Value |
|------|-------|
| condition | 0.4 < temp AND windspeed < 0.3 |
Input
| datetime | windspeed | hum | temp |
|----------|-----------|-----|------|
| 2011-01-03 21:00:00.0 | 0.1045 | 0.47 | 0.2 |
| 2011-01-03 22:00:00.0 | 0.1343 | 0.64 | 0.18 |
| 2011-01-03 23:00:00.0 | 0.1343 | 0.69 | 0.14 |
| 2011-02-11 07:00:00.0 | 0.0 | 0.68 | 0.1 |
| 2011-02-13 18:00:00.0 | 0.3284 | 0.28 | 0.42 |
| 2011-02-18 12:00:00.0 | 0.1642 | 0.72 | 0.44 |
| 2011-02-19 03:00:00.0 | 0.3881 | 0.13 | 0.44 |
| 2011-02-19 04:00:00.0 | 0.2985 | 0.14 | 0.42 |
| 2013-01-01 00:00:00.0 | 0.1343 | 0.65 | 0.26 |
Output
| datetime | windspeed | hum | temp |
|----------|-----------|-----|------|
| 2011-02-18 12:00:00.0 | 0.1642 | 0.72 | 0.44 |
| 2011-02-19 04:00:00.0 | 0.2985 | 0.14 | 0.42 |
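The example can be checked outside Seahorse with a short script. The sketch below uses SQLite from Python's standard library as a stand-in for Spark SQL, on the assumption that this particular condition behaves the same way in both engines; the table name `df` and the variable names are illustrative only.

```python
import sqlite3

COLUMNS = ("datetime", "windspeed", "hum", "temp")
ROWS = [
    ("2011-01-03 21:00:00.0", 0.1045, 0.47, 0.2),
    ("2011-01-03 22:00:00.0", 0.1343, 0.64, 0.18),
    ("2011-01-03 23:00:00.0", 0.1343, 0.69, 0.14),
    ("2011-02-11 07:00:00.0", 0.0, 0.68, 0.1),
    ("2011-02-13 18:00:00.0", 0.3284, 0.28, 0.42),
    ("2011-02-18 12:00:00.0", 0.1642, 0.72, 0.44),
    ("2011-02-19 03:00:00.0", 0.3881, 0.13, 0.44),
    ("2011-02-19 04:00:00.0", 0.2985, 0.14, 0.42),
    ("2013-01-01 00:00:00.0", 0.1343, 0.65, 0.26),
]
CONDITION = "0.4 < temp AND windspeed < 0.3"

# SQLite (not Spark) holds the example input in an in-memory table named df.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE df ("datetime" TEXT, windspeed REAL, hum REAL, temp REAL)'
)
conn.executemany("INSERT INTO df VALUES (?, ?, ?, ?)", ROWS)

# The condition is used exactly as it would appear in a WHERE clause.
# ORDER BY rowid keeps the rows in insertion order.
cursor = conn.execute(f"SELECT * FROM df WHERE ({CONDITION}) ORDER BY rowid")
result_columns = tuple(description[0] for description in cursor.description)
result = cursor.fetchall()
conn.close()

for row in result:
    print(row)
```

Only the rows from 2011-02-18 12:00 and 2011-02-19 04:00 pass both comparisons. The rows from 2011-02-13 18:00 and 2011-02-19 03:00 are warm enough but fail `windspeed < 0.3`.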