# transform_concat_columns

## TransformConcatColumnsAction

Bases: `PipelineAction`
Concatenates the specified columns in the given DataFrame.
### Example

Source code in `src/cloe_nessy/pipeline/actions/transform_concat_columns.py`
### `run(context, *, name='', columns=None, separator=None, **_)`
    Concatenates the specified columns in the given DataFrame.
**Warning: Null Handling Behavior**

The null handling behavior differs depending on whether a separator is provided:

- When `separator` is specified: the function uses Spark's `concat_ws`, which ignores `NULL` values. `NULL` inputs are skipped and excluded from the final concatenated result (no extra separator is inserted for them).
- When `separator` is not specified: the function defaults to Spark's `concat`, which returns `NULL` if any of the concatenated values is `NULL`. A `NULL` in any input therefore makes the entire output `NULL`.
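The two behaviors can be mimicked in plain Python. This is a sketch of the semantics only; Spark's actual `concat` and `concat_ws` operate on columns, not Python values:

```python
def concat(*values):
    """Mimics Spark's concat: any None (NULL) input makes the result None."""
    if any(v is None for v in values):
        return None
    return "".join(values)

def concat_ws(sep, *values):
    """Mimics Spark's concat_ws: None (NULL) inputs are skipped entirely."""
    return sep.join(v for v in values if v is not None)

concat("a", "b", None)          # None: one NULL poisons the whole result
concat_ws("-", "a", "b", None)  # 'a-b': the NULL is dropped, no dangling separator
```

This is why choosing whether to pass a `separator` is not purely cosmetic: it also decides how `NULL` values propagate into the new column.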
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `context` | `PipelineContext` | The context in which this Action is executed. | *required* |
| `name` | `str` | The name of the new concatenated column. | `''` |
| `columns` | `list[str] \| None` | A list of columns to be concatenated. | `None` |
| `separator` | `str \| None` | The separator used between concatenated column values. | `None` |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If no name is provided. |
| `ValueError` | If no columns are provided. |
| `ValueError` | If the data from the context is `None`. |
| `ValueError` | If `columns` is not a list. |
Returns:

| Type | Description |
|---|---|
| `PipelineContext` | The context after the execution of this Action, containing the DataFrame with the concatenated column. |