Logging

The logging module provides a simple interface for sending log messages to various targets. All components of the Framework use this module, and it can also be used from application or notebook code.

Installation

Basic installation:

pip install cloe-logging --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

Installation with extras (for Databricks and/or Snowflake):

# for Snowflake
pip install cloe-logging[Snowflake] --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

# for Databricks
pip install cloe-logging[Databricks] --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

How to get a logger

This package provides a LoggerFactory that returns pre-configured loggers. The supported handler types are: console, file, unity_catalog, snowflake, log_analytics.

Here is an example for getting a logger that logs to the console and to a file:

from cloe_logging import LoggerFactory

logger = LoggerFactory.get_logger(
    handler_types = ["console", "file"],
    filename="test.log",
)
logger.info("Hello World!")

Here is another example for a logger that logs to the console and to a Unity Catalog table:

from cloe_logging import LoggerFactory

logger = LoggerFactory.get_logger(
    handler_types = ["console", "unity_catalog"],
    catalog="test_catalog",
    schema="test_schema",
    table="test_table",
    columns={"column1": "STRING", "column2": "STRING"},
    workspace_url="test_workspace_url",
    warehouse_id="test_warehouse_id",
)
logger.info("message:Hello World")

The above code will create the following log record in test_catalog.test_schema.test_table:

| timestamp | log_level | message |
| --- | --- | --- |
| 2022-01-17 00:00:00 | INFO | Hello World |

Notice that the log entry contains the timestamp and the log level. These columns are added automatically.
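
Additional columns defined via the columns argument can be filled from a single log message. This is a minimal sketch, assuming the default split characters described in the handler sections below (| between key-value pairs, : between key and value); the values are placeholders:

# Hypothetical message: each key maps to a configured column,
# assuming the default column and key-value split characters.
logger.info("column1:foo|column2:bar")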

Handlers

Handlers can be added to Python loggers to send logs to different targets.

Log Analytics Handler

This handler sends the logs to an Azure Log Analytics workspace. It is best suited for production workloads.

| Parameter | Description | Example Value |
| --- | --- | --- |
| workspace_id | The workspace ID for Azure Log Analytics. | 12345678-1234-1234-1234-123456789012 |
| shared_key | The shared key for Azure Log Analytics. | abcdef1234567890abcdef1234567890 |
| log_type | The log type for Azure Log Analytics. | CustomLog |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
| test_connectivity | Whether to test connectivity to Azure Log Analytics when initializing the handler. Defaults to True. | True |
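
A minimal sketch of getting a Log Analytics logger through the LoggerFactory, assuming the factory forwards these handler parameters in the same way as in the unity_catalog example above; the workspace ID and shared key are placeholders:

from cloe_logging import LoggerFactory

# Assumption: the log_analytics handler type accepts the parameters listed in the table above.
logger = LoggerFactory.get_logger(
    handler_types=["log_analytics"],
    workspace_id="12345678-1234-1234-1234-123456789012",  # placeholder workspace ID
    shared_key="abcdef1234567890abcdef1234567890",  # placeholder shared key
    log_type="CustomLog",
)
logger.info("message:Hello World")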

Databricks Unity Catalog Handler

This handler sends the logs to a Unity Catalog table. It is not recommended for large workloads because it increases runtime: the logs are inserted asynchronously, but they still add to the overall runtime of a workload. It can, however, be helpful for smaller scripts or demo purposes.

The handler uses the Databricks Python SDK and thereby reuses existing authentication, e.g. from the Azure CLI, the Databricks CLI, or any other Spark connection that is already established.

| Parameter | Description | Example Value |
| --- | --- | --- |
| catalog | The name of the catalog to send logs to. | my_catalog |
| schema | The name of the schema to send logs to. | default |
| table | The name of the table to send logs to. | logging_tests |
| columns | A dictionary of column names and their corresponding data types defining the structure of the logs. | {"col1": "string", "col2": "int"} |
| workspace_url | The URL of the Azure Databricks workspace. | https://adb-4907596146531234.5.azuredatabricks.net/ |
| warehouse_id | The ID of the Databricks warehouse. | 123abc |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
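
The split characters can be customized. The sketch below is an assumption-based variation of the Unity Catalog example above; it assumes that the LoggerFactory forwards column_split_char and key_value_split_char to the handler and that the message is parsed with those characters:

from cloe_logging import LoggerFactory

# Assumption: LoggerFactory forwards the split characters to the Unity Catalog handler.
logger = LoggerFactory.get_logger(
    handler_types=["unity_catalog"],
    catalog="test_catalog",
    schema="test_schema",
    table="test_table",
    columns={"column1": "STRING", "column2": "STRING"},
    workspace_url="test_workspace_url",
    warehouse_id="test_warehouse_id",
    column_split_char=";",  # use ";" instead of the default "|"
    key_value_split_char="=",  # use "=" instead of the default ":"
)
logger.info("column1=foo;column2=bar")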

Snowflake Handler

This handler sends the logs to a Snowflake table.

The handler is configured to use environment variables for some values:

| Environment Variable | Description |
| --- | --- |
| CLOE_SNOWFLAKE_USER | The Snowflake user. |
| CLOE_SNOWFLAKE_PASSWORD | The Snowflake password. |
| CLOE_SNOWFLAKE_ACCOUNT | The Snowflake account. |
| CLOE_SNOWFLAKE_WAREHOUSE | The Snowflake warehouse. |
| CLOE_SNOWFLAKE_ROLE | The Snowflake role. |

Besides this, each logger requires the following parameters upon initialization:

| Parameter | Description | Example Value |
| --- | --- | --- |
| target_db | The name of the database to send logs to. | my_database |
| target_schema | The name of the schema to send logs to. | public |
| target_table | The name of the table to send logs to. | logs |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
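
A minimal sketch of getting a Snowflake logger, assuming the connection details are provided via the environment variables above and that the LoggerFactory forwards the target parameters to the handler; all values are placeholders:

import os

from cloe_logging import LoggerFactory

# The Snowflake connection is configured via environment variables (placeholder values).
os.environ["CLOE_SNOWFLAKE_USER"] = "my_user"
os.environ["CLOE_SNOWFLAKE_PASSWORD"] = "my_password"
os.environ["CLOE_SNOWFLAKE_ACCOUNT"] = "my_account"
os.environ["CLOE_SNOWFLAKE_WAREHOUSE"] = "my_warehouse"
os.environ["CLOE_SNOWFLAKE_ROLE"] = "my_role"

# Assumption: the snowflake handler type accepts the parameters listed in the table above.
logger = LoggerFactory.get_logger(
    handler_types=["snowflake"],
    target_db="my_database",
    target_schema="public",
    target_table="logs",
)
logger.info("message:Hello World")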

Azure Pipeline Logger

Using the Formatter

The DevOpsFormatter can be added to a handler and formats log records so that they render nicely in Azure Pipelines.

Set up the logger like this, e.g.:

import logging

# Note: the exact import path of DevOpsFormatter is assumed here; adjust it to where
# your version of cloe-logging exposes the formatter.
from cloe_logging import DevOpsFormatter

def init_logging() -> logging.Logger:
    logger = logging.getLogger("azure-pipeline-logger")
    logger.setLevel(logging.INFO)
    section_formatter = DevOpsFormatter(section_info=True)
    section_handler = logging.StreamHandler()
    section_handler.setFormatter(section_formatter)
    logger.addHandler(section_handler)
    return logger

my_logger = init_logging()
my_logger.info("Hello World!")

Using the Decorator

The build_logger decorator prints helpful information when running in Azure Pipelines.

import time

from cloe_logging.decorators import build_logger

@build_logger()
def my_test_function(arg1, arg2=None):
    time.sleep(1)

my_test_function("test1", arg2="test2")

The above code will output the logging statements below, which are formatted nicely in Azure Pipelines.

##[group]my_test_function
##### START my_test_function WITH args [ 'test1', arg2='test2' ] #####

##### END my_test_function DURATION [ '1's ] #####
##[endgroup]