transform_select_columns
TransformSelectColumnsAction
Bases: PipelineAction
Selects specified columns from the given DataFrame.
This action includes or excludes specific columns from the DataFrame.
Exactly one of include_columns or exclude_columns must be provided: with
include_columns, only those columns are selected; with exclude_columns, all
columns except those are selected. By default, the action verifies that the
specified columns exist in the DataFrame before performing the selection.
Example
Example Input Data:
| id | name | coordinates | attributes | 
|---|---|---|---|
| 1 | Alice | [10.0, 20.0] | {"age": 30, "city": "NY"} | 
| 2 | Bob | [30.0, 40.0] | {"age": 25, "city": "LA"} | 
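To make the two selection modes concrete for the example input above, the following is a minimal sketch that works on column names only. The helper `resolve_columns` is hypothetical and not part of cloe_nessy; the real action operates on the DataFrame held in the pipeline context, and its column ordering may differ from what is assumed here.

```python
# Hypothetical helper (not part of cloe_nessy): models the selection on
# plain column names instead of a real DataFrame.
def resolve_columns(available, include_columns=None, exclude_columns=None):
    """Return the column names that remain after the selection."""
    if include_columns is not None:
        # Include mode: keep only the requested columns (assumed: in the requested order).
        return [c for c in include_columns if c in available]
    # Exclude mode: keep every column except the listed ones, in original order.
    excluded = set(exclude_columns or [])
    return [c for c in available if c not in excluded]


columns = ["id", "name", "coordinates", "attributes"]
print(resolve_columns(columns, include_columns=["id", "name"]))   # ['id', 'name']
print(resolve_columns(columns, exclude_columns=["coordinates"]))  # ['id', 'name', 'attributes']
```

With include_columns the result is limited to the listed names; with exclude_columns everything else is kept.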
Source code in src/cloe_nessy/pipeline/actions/transform_select_columns.py
run(context, *, include_columns=None, exclude_columns=None, raise_on_non_existing_columns=True, **_)
Selects specified columns from the given DataFrame.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| context | PipelineContext | Context in which this Action is executed. | required | 
| include_columns | list[str] \| None | A list of column names that should be included. If provided, only these columns will be selected. | None | 
| exclude_columns | list[str] \| None | A list of column names that should be excluded. If provided, all columns except these will be selected. | None | 
| raise_on_non_existing_columns | bool | If True, raise an error if a specified column is not found in the DataFrame. If False, ignore the column and continue with the selection. | True | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If a specified column is not found in the DataFrame. | 
| ValueError | If neither include_columns nor exclude_columns is provided, or if both are provided. | 
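The two error conditions listed above can be sketched as a small validation step. This is an illustration under stated assumptions, not the library's implementation: `check_selection_args` is a hypothetical name, the error messages are invented, and treating only `None` (not an empty list) as "not provided" is an assumption.

```python
# Hypothetical validation sketch (not the cloe_nessy source).
def check_selection_args(
    available,
    include_columns=None,
    exclude_columns=None,
    raise_on_non_existing_columns=True,
):
    """Validate the arguments and return the requested names that exist."""
    # Exactly one of the two lists must be given (assumption: None means "not provided").
    if (include_columns is None) == (exclude_columns is None):
        raise ValueError("Provide exactly one of include_columns or exclude_columns.")

    requested = include_columns if include_columns is not None else exclude_columns
    missing = [c for c in requested if c not in available]
    if missing and raise_on_non_existing_columns:
        raise ValueError(f"Columns not found in DataFrame: {missing}")

    # With raising disabled, unknown names are ignored.
    return [c for c in requested if c in available]


available = ["id", "name", "coordinates", "attributes"]
print(check_selection_args(
    available,
    include_columns=["id", "ghost"],
    raise_on_non_existing_columns=False,
))  # ['id']
```

Setting raise_on_non_existing_columns to False trades strictness for tolerance: a misspelled column name is skipped without an error, so the default of True is the safer choice when the schema is expected to be stable.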
Returns:
| Type | Description | 
|---|---|
| PipelineContext | Context after the execution of this Action. |