delta_writer_base
            BaseDeltaWriter
              Bases: BaseWriter, ABC
A class for writing DataFrames to Delta tables.
Source code in src/cloe_nessy/integration/writer/delta_writer/delta_writer_base.py
            _delta_operation_log(table_identifier, operation_type)
    Returns a dictionary containing the most recent Delta log entry of a Delta table for the given operation type.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| table_identifier | str | The identifier of the Delta table in the format 'catalog.schema.table'. | required | 
| operation_type | DeltaTableOperationType | A DeltaTableOperationType object specifying the type of operation for which metrics should be retrieved (UPDATE, DELETE, MERGE or WRITE). | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| dict | dict | A dictionary containing the operation log. | 
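As a rough illustration of the lookup described above, the sketch below picks the newest history entry for one operation from rows shaped like the output of Delta's `DESCRIBE HISTORY` (which includes `version`, `operation` and `operationMetrics` columns). The function name `latest_operation_log`, the plain-dict input and the empty-dict fallback are assumptions made for this example; the actual method presumably reads the table history through Spark.

```python
from typing import Any


def latest_operation_log(history: list[dict[str, Any]], operation: str) -> dict[str, Any]:
    """Return the newest history entry for `operation`, or {} if there is none.

    Hypothetical helper: `history` mimics rows of Delta's DESCRIBE HISTORY,
    each carrying at least a `version` and an `operation` key.
    """
    matching = [row for row in history if row["operation"] == operation]
    if not matching:
        return {}
    # The highest version number belongs to the most recent commit.
    return max(matching, key=lambda row: row["version"])


# Made-up history rows, for illustration only.
history = [
    {"version": 0, "operation": "WRITE", "operationMetrics": {"numOutputRows": "10"}},
    {"version": 1, "operation": "MERGE", "operationMetrics": {"numTargetRowsInserted": "3"}},
    {"version": 2, "operation": "WRITE", "operationMetrics": {"numOutputRows": "5"}},
]
print(latest_operation_log(history, "WRITE")["version"])  # 2
```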
            _empty_dataframe_check(df, ignore_empty_df)
    Checks whether a DataFrame is empty and raises an exception unless empty DataFrames are explicitly allowed.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | The DataFrame to check for emptiness. | required | 
| ignore_empty_df | bool | If True, the function will return without raising an exception if the DataFrame is empty. If False, an EmptyDataframeException will be raised. | required | 
Raises:
| Type | Description | 
|---|---|
| EmptyDataframeException | If the DataFrame is empty and ignore_empty_df is False. | 
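The behaviour described above can be sketched as follows. This is not the library's implementation: the exception class is a local stand-in with the same name, and `FakeDataFrame` is a hypothetical substitute that exposes only `isEmpty()`, the method PySpark DataFrames provide from version 3.3 on, so the example runs without a Spark session.

```python
class EmptyDataframeException(Exception):
    """Local stand-in for the library's exception of the same name."""


class FakeDataFrame:
    """Hypothetical stand-in exposing only `isEmpty()`, like a PySpark DataFrame."""

    def __init__(self, rows):
        self._rows = list(rows)

    def isEmpty(self) -> bool:
        return not self._rows


def empty_dataframe_check(df, ignore_empty_df: bool) -> None:
    """Raise EmptyDataframeException for an empty `df` unless it is ignored."""
    if df.isEmpty() and not ignore_empty_df:
        raise EmptyDataframeException("DataFrame is empty.")


# An empty DataFrame passes silently when ignore_empty_df is True ...
empty_dataframe_check(FakeDataFrame([]), ignore_empty_df=True)

# ... and raises when it is False.
try:
    empty_dataframe_check(FakeDataFrame([]), ignore_empty_df=False)
except EmptyDataframeException as exc:
    print(f"raised: {exc}")
```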
            _merge_match_conditions(columns)
  
      staticmethod
  
    Merges match conditions of the given columns into a single string.
This function generates the SQL condition used to match rows between two tables based on the specified columns.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| columns | list[str] | A list of strings representing the names of the columns to match. | required | 
Returns:
| Type | Description | 
|---|---|
| str | A string containing the match conditions, separated by " AND ". | 
Example
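A minimal sketch of how such a condition string could be assembled. The `target` and `source` aliases, the backtick quoting and the plain `=` comparison are assumptions chosen for illustration; this page does not show the exact format the library produces.

```python
def merge_match_conditions(columns: list[str]) -> str:
    """Join one equality condition per column with " AND ".

    Hypothetical re-implementation: the aliases `target` and `source`
    are assumed, not taken from the library.
    """
    return " AND ".join(f"target.`{col}` = source.`{col}`" for col in columns)


print(merge_match_conditions(["id", "region"]))
# target.`id` = source.`id` AND target.`region` = source.`region`
```

A string of this shape would typically serve as the `ON` clause of a `MERGE INTO` statement.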
            _partition_pruning_conditions(df, partition_cols)
  
      staticmethod
  
    Generates partition pruning conditions for an SQL query.
This function is used to optimize the performance of an SQL query by only scanning the necessary partitions in a table, based on the specified partition columns and the data in a Spark dataframe.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | | A Spark dataframe containing the data to generate the partition pruning conditions from. | required | 
| partition_cols | list[str] \| None | A list of strings representing the names of the partition columns. | required | 
Returns:
| Type | Description | 
|---|---|
| str | A string representing the partition pruning conditions. | 
Example
            _report_delta_table_operation_metrics(table_identifier, operation_type)
    Logs the most recent metrics of a Delta table for the given operation type.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| table_identifier | str | The identifier of the Delta table in the format 'catalog.schema.table'. | required | 
| operation_type | DeltaTableOperationType | A DeltaTableOperationType object specifying the type of operation for which metrics should be retrieved (UPDATE, DELETE, MERGE or WRITE). | required | 
            DeltaWriterLogs
  
      dataclass
  
    Dataclass defining the delta writer logs table.
            TableOperationMetricsLogs
  
      dataclass
  
    Dataclass defining the table operation metrics logs table.