Filter Rows

Creates a DataFrame containing only the rows that satisfy the given condition. The condition should be a Spark SQL condition (extended with some User Defined Functions), as used in a WHERE clause. The order of the columns is preserved.

Also returns a Transformer that can later be applied to another DataFrame using a Transform operation.

Since: Seahorse 1.0.0
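
The two outputs described above (a filtered DataFrame plus a reusable Transformer) can be pictured with a minimal plain-Python sketch. This is not Seahorse or Spark code: `filter_rows`, `Row`, and `Table` are hypothetical names introduced only for illustration, rows are modelled as dictionaries, and the condition is a Python predicate rather than a Spark SQL string.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for a DataFrame row and a DataFrame.
Row = Dict[str, float]
Table = List[Row]


def filter_rows(
    table: Table, condition: Callable[[Row], bool]
) -> Tuple[Table, Callable[[Table], Table]]:
    """Return the filtered table and a reusable transformer (illustrative only)."""

    def transformer(other: Table) -> Table:
        # Keep only the rows for which the condition holds. Rows are passed
        # through unchanged, so their columns and column order are preserved.
        return [row for row in other if condition(row)]

    return transformer(table), transformer


def condition(row: Row) -> bool:
    # Python equivalent of the SQL condition "0.4 < temp AND windspeed < 0.3".
    return 0.4 < row["temp"] and row["windspeed"] < 0.3


first = [{"windspeed": 0.1642, "temp": 0.44}, {"windspeed": 0.3881, "temp": 0.44}]
second = [{"windspeed": 0.2985, "temp": 0.42}, {"windspeed": 0.1045, "temp": 0.2}]

# Output port 0: the filtered data. Output port 1: the reusable transformer.
filtered, transformer = filter_rows(first, condition)

# The same transformer applied to a different table, as a Transform would do.
reused = transformer(second)
```

The point of returning the transformer separately is that the condition is fixed once and can then be replayed on other inputs without being restated.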

Input

Port Type Qualifier Description
0 DataFrame The DataFrame whose rows are to be filtered.

Output

Port Type Qualifier Description
0 DataFrame The DataFrame containing only the rows that satisfy the given condition.
1 Transformer A Transformer that allows the operation to be applied to other DataFrames using a Transform.

Parameters

Name Type Description
condition Code Snippet The filtering condition. Rows that do not satisfy the condition are excluded from the output DataFrame. It should be a Spark SQL condition (as used in a WHERE clause).

Example

Parameters

Name Value
condition 0.4 < temp AND windspeed < 0.3

Input

datetime windspeed hum temp
2011-01-03 21:00:00.0 0.1045 0.47 0.2
2011-01-03 22:00:00.0 0.1343 0.64 0.18
2011-01-03 23:00:00.0 0.1343 0.69 0.14
2011-02-11 07:00:00.0 0.0 0.68 0.1
2011-02-13 18:00:00.0 0.3284 0.28 0.42
2011-02-18 12:00:00.0 0.1642 0.72 0.44
2011-02-19 03:00:00.0 0.3881 0.13 0.44
2011-02-19 04:00:00.0 0.2985 0.14 0.42
2013-01-01 00:00:00.0 0.1343 0.65 0.26

Output

datetime windspeed hum temp
2011-02-18 12:00:00.0 0.1642 0.72 0.44
2011-02-19 04:00:00.0 0.2985 0.14 0.42
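
The example can be checked with a short script. The sketch below uses SQLite from the Python standard library as a stand-in for Spark SQL, on the assumption that this simple comparison-and-AND condition behaves the same way in both; the table name `weather` is hypothetical, and conditions that rely on Spark-specific functions or the Seahorse User Defined Functions would not carry over.

```python
import sqlite3

# The condition from the example, pasted verbatim into a WHERE clause.
CONDITION = "0.4 < temp AND windspeed < 0.3"

# Input rows from the example: (datetime, windspeed, hum, temp).
ROWS = [
    ("2011-01-03 21:00:00.0", 0.1045, 0.47, 0.2),
    ("2011-01-03 22:00:00.0", 0.1343, 0.64, 0.18),
    ("2011-01-03 23:00:00.0", 0.1343, 0.69, 0.14),
    ("2011-02-11 07:00:00.0", 0.0, 0.68, 0.1),
    ("2011-02-13 18:00:00.0", 0.3284, 0.28, 0.42),
    ("2011-02-18 12:00:00.0", 0.1642, 0.72, 0.44),
    ("2011-02-19 03:00:00.0", 0.3881, 0.13, 0.44),
    ("2011-02-19 04:00:00.0", 0.2985, 0.14, 0.42),
    ("2013-01-01 00:00:00.0", 0.1343, 0.65, 0.26),
]

conn = sqlite3.connect(":memory:")
# "weather" is a hypothetical table name; the columns mirror the example.
conn.execute(
    'CREATE TABLE weather ("datetime" TEXT, windspeed REAL, hum REAL, temp REAL)'
)
conn.executemany("INSERT INTO weather VALUES (?, ?, ?, ?)", ROWS)

# Select every column so the column order is preserved; ORDER BY rowid keeps
# the rows in insertion order.
result = conn.execute(
    f"SELECT * FROM weather WHERE {CONDITION} ORDER BY rowid"
).fetchall()
conn.close()

for row in result:
    print(row)
```

Two rows pass the condition. The rows at 2011-02-13 18:00 and 2011-02-19 03:00 have a temp above 0.4 but are dropped because their windspeed is not below 0.3.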