Filter Columns

Creates a new DataFrame that contains only the selected columns. The order of the columns is preserved. A column that is selected more than once appears only once in the output, so the resulting DataFrame never contains duplicated columns.

Also returns a Transformer that can later be applied to another DataFrame with a Transform operation.

Since: Seahorse 0.4.0

Input

Port  Type       Qualifier  Description
0     DataFrame             The DataFrame to select columns from.

Output

Port  Type         Qualifier  Description
0     DataFrame               The DataFrame containing only the selected columns.
1     Transformer             A Transformer that applies this operation to other DataFrames through the Transform operation.

Parameters

Name     Type                    Description
columns  MultipleColumnSelector  The columns to include in the output DataFrame. A column that is selected more than once (e.g. by name and by type) is included only once. An empty selection is supported. If a column selected by name or by index does not exist, the operation fails at runtime with ColumnsDoNotExistException.
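
The selection rules described for the columns parameter can be sketched in plain Python. This is only an illustration, not Seahorse code: resolve_selection and ColumnsDoNotExistError are hypothetical names, selection by type is left out, and the sketch assumes that "order is preserved" refers to the column order of the input DataFrame.

```python
class ColumnsDoNotExistError(Exception):
    """Hypothetical stand-in for the ColumnsDoNotExistException named above."""


def resolve_selection(schema, names=(), indexes=()):
    """Return the selected column names in input order, each at most once.

    schema  -- list of column names of the input table
    names   -- columns selected by name
    indexes -- columns selected by zero-based position
    """
    # Fail if any name or index refers to a column that is not in the schema.
    missing_names = [n for n in names if n not in schema]
    missing_indexes = [i for i in indexes if not 0 <= i < len(schema)]
    if missing_names or missing_indexes:
        raise ColumnsDoNotExistError(
            f"names {missing_names}, indexes {missing_indexes} not found in {schema}"
        )

    # A set merges overlapping selections, so a column picked twice
    # (here: by name and by index) is still emitted only once.
    wanted = set(names) | {schema[i] for i in indexes}

    # Walking the schema keeps the input column order (an assumption, see above).
    return [column for column in schema if column in wanted]


schema = ["city", "beds", "price"]
print(resolve_selection(schema, names=["price", "city"], indexes=[0]))  # ['city', 'price']
print(resolve_selection(schema))  # [] (an empty selection is allowed)
```

Calling resolve_selection(schema, names=["zip"]) raises ColumnsDoNotExistError, mirroring the runtime failure described for missing columns.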

Example

Parameters

Name Value
selected columns Selected columns: by name: ["city", "price"].

Input

city beds price
CityA 4.0 695611.0
CityC 2.0 294691.0
CityB 3.0 430784.0
CityB 2.0 336677.0
CityA 3.0 584639.0
CityA 4.0 579560.0

Output

city price
CityA 695611.0
CityC 294691.0
CityB 430784.0
CityB 336677.0
CityA 584639.0
CityA 579560.0
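
The example above, together with the idea of reusing the operation on another DataFrame, can be approximated in plain Python. This is a rough sketch under stated assumptions rather than the Seahorse implementation: make_column_filter is a hypothetical helper, tables are modelled as a header list plus row lists, and the second table (with the made-up "baths" column and "CityD" row) exists only to illustrate reuse.

```python
def make_column_filter(names):
    """Build a reusable function that keeps only the named columns."""
    wanted = set(names)

    def apply(header, rows):
        # Refuse to run when a requested column is absent from this table.
        missing = wanted - set(header)
        if missing:
            raise KeyError(f"columns do not exist: {sorted(missing)}")
        # Keep positions in the table's own column order.
        keep = [i for i, column in enumerate(header) if column in wanted]
        return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]

    return apply


header = ["city", "beds", "price"]
rows = [
    ["CityA", 4.0, 695611.0],
    ["CityC", 2.0, 294691.0],
    ["CityB", 3.0, 430784.0],
    ["CityB", 2.0, 336677.0],
    ["CityA", 3.0, 584639.0],
    ["CityA", 4.0, 579560.0],
]

# Plays the role of the operation configured with by name: ["city", "price"].
keep_city_price = make_column_filter(["city", "price"])
out_header, out_rows = keep_city_price(header, rows)
print(out_header)   # ['city', 'price']
print(out_rows[0])  # ['CityA', 695611.0]

# Reusing the same selection on a different (hypothetical) table,
# loosely analogous to applying the returned Transformer.
other_header, other_rows = keep_city_price(
    ["price", "baths", "city"], [[120000.0, 1.0, "CityD"]]
)
print(other_header)  # ['price', 'city']
```

The selection is captured once and applied to each table separately, so the output column order follows whichever table the filter is applied to.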