Logging

The logging module provides a simple interface for sending log messages to various targets. All components of the Framework use this module, and it can also be used from application or notebook code.

Installation

Basic installation:

pip install cloe-logging --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

Installation with extras (for Databricks and/or Snowflake):

# for Snowflake
pip install cloe-logging[Snowflake] --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

# for Databricks
pip install cloe-logging[Databricks] --extra-index-url https://pkgs.dev.azure.com/initions-consulting/Public/_packaging/ics-py/pypi/simple/

How to get a logger

This package provides a LoggerFactory that returns pre-configured loggers. The supported handler types are: console, file, unity_catalog, snowflake, log_analytics.

Here is an example for getting a logger that logs to the console and to a file:

from cloe_logging import LoggerFactory

logger = LoggerFactory.get_logger(
    handler_types = ["console", "file"],
    filename="test.log",
)
logger.info("Hello World!")

Here is another example for a logger that logs to the console and to a Unity Catalog table:

from cloe_logging import LoggerFactory

logger = LoggerFactory.get_logger(
    handler_types = ["console", "unity_catalog"],
    catalog="test_catalog",
    schema="test_schema",
    table="test_table",
    columns={"column1": "STRING", "column2": "STRING"},
    workspace_url="test_workspace_url",
    warehouse_id="test_warehouse_id",
)
logger.info("message:Hello World")

The above code will create the following log record in test_catalog.test_schema.test_table:

| timestamp | log_level | message |
| --- | --- | --- |
| 2022-01-17 00:00:00 | INFO | Hello World |

Notice that the log entry contains the timestamp and the log level. These columns are added automatically.
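
Additional columns defined via the columns argument can be filled from a single log message. This is a minimal sketch, assuming the default split characters described in the handler sections below (| between key-value pairs, : between key and value); the values are placeholders:

# Hypothetical message: each key maps to a configured column,
# assuming the default column and key-value split characters.
logger.info("column1:foo|column2:bar")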

Handlers

Handlers can be added to Python loggers to send logs to different targets.

Log Analytics Handler

This handler sends the logs to an Azure Log Analytics workspace. It is best suited for production workloads.

| Parameter | Description | Example Value |
| --- | --- | --- |
| workspace_id | The workspace ID for Azure Log Analytics. | 12345678-1234-1234-1234-123456789012 |
| shared_key | The shared key for Azure Log Analytics. | abcdef1234567890abcdef1234567890 |
| log_type | The log type for Azure Log Analytics. | CustomLog |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
| test_connectivity | Whether to test connectivity to Azure Log Analytics when initializing the handler. Defaults to True. | True |
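
A minimal sketch of getting a Log Analytics logger through the LoggerFactory, assuming the factory forwards these handler parameters in the same way as in the unity_catalog example above; the workspace ID and shared key are placeholders:

from cloe_logging import LoggerFactory

# Assumption: the log_analytics handler type accepts the parameters listed in the table above.
logger = LoggerFactory.get_logger(
    handler_types=["log_analytics"],
    workspace_id="12345678-1234-1234-1234-123456789012",  # placeholder workspace ID
    shared_key="abcdef1234567890abcdef1234567890",  # placeholder shared key
    log_type="CustomLog",
)
logger.info("message:Hello World")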

Databricks Unity Catalog Handler

This handler sends the logs to a Unity Catalog table. It is not recommended for large workloads because it increases runtime: the logs are inserted asynchronously, but they still add to the overall runtime of a workload. It can, however, be helpful for smaller scripts or demo purposes.

The handler uses the Databricks Python SDK and thereby reuses existing authentication, e.g. from the Azure CLI, the Databricks CLI, or any other Spark connection that is already established.

| Parameter | Description | Example Value |
| --- | --- | --- |
| catalog | The name of the catalog to send logs to. | my_catalog |
| schema | The name of the schema to send logs to. | default |
| table | The name of the table to send logs to. | logging_tests |
| columns | A dictionary of column names and their corresponding data types defining the structure of the logs. | {"col1": "string", "col2": "int"} |
| workspace_url | The URL of the Azure Databricks workspace. | https://adb-4907596146531234.5.azuredatabricks.net/ |
| warehouse_id | The ID of the Databricks warehouse. | 123abc |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
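
The split characters can be customized. The sketch below is an assumption-based variation of the Unity Catalog example above; it assumes that the LoggerFactory forwards column_split_char and key_value_split_char to the handler and that the message is parsed with those characters:

from cloe_logging import LoggerFactory

# Assumption: LoggerFactory forwards the split characters to the Unity Catalog handler.
logger = LoggerFactory.get_logger(
    handler_types=["unity_catalog"],
    catalog="test_catalog",
    schema="test_schema",
    table="test_table",
    columns={"column1": "STRING", "column2": "STRING"},
    workspace_url="test_workspace_url",
    warehouse_id="test_warehouse_id",
    column_split_char=";",  # use ";" instead of the default "|"
    key_value_split_char="=",  # use "=" instead of the default ":"
)
logger.info("column1=foo;column2=bar")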

Snowflake Handler

This handler sends the logs to a Snowflake table.

The handler is configured to use environment variables for some values:

| Environment Variable | Description |
| --- | --- |
| CLOE_SNOWFLAKE_USER | The Snowflake user. |
| CLOE_SNOWFLAKE_PASSWORD | The Snowflake password. |
| CLOE_SNOWFLAKE_ACCOUNT | The Snowflake account. |
| CLOE_SNOWFLAKE_WAREHOUSE | The Snowflake warehouse. |
| CLOE_SNOWFLAKE_ROLE | The Snowflake role. |

Besides this, each logger requires the following parameters upon initialization:

| Parameter | Description | Example Value |
| --- | --- | --- |
| target_db | The name of the database to send logs to. | my_database |
| target_schema | The name of the schema to send logs to. | public |
| target_table | The name of the table to send logs to. | logs |
| column_split_char | The character used to split columns in the log message. Defaults to \|. | \| |
| key_value_split_char | The character used to split keys and values in the log message. Defaults to :. | : |
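
A minimal sketch of getting a Snowflake logger, assuming the connection details are provided via the environment variables above and that the LoggerFactory forwards the target parameters to the handler; all values are placeholders:

import os

from cloe_logging import LoggerFactory

# The Snowflake connection is configured via environment variables (placeholder values).
os.environ["CLOE_SNOWFLAKE_USER"] = "my_user"
os.environ["CLOE_SNOWFLAKE_PASSWORD"] = "my_password"
os.environ["CLOE_SNOWFLAKE_ACCOUNT"] = "my_account"
os.environ["CLOE_SNOWFLAKE_WAREHOUSE"] = "my_warehouse"
os.environ["CLOE_SNOWFLAKE_ROLE"] = "my_role"

# Assumption: the snowflake handler type accepts the parameters listed in the table above.
logger = LoggerFactory.get_logger(
    handler_types=["snowflake"],
    target_db="my_database",
    target_schema="public",
    target_table="logs",
)
logger.info("message:Hello World")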

Azure Pipeline Logger

Using the Formatter

The DevOpsFormatter can be added to a handler and formats log records so that they render nicely in Azure Pipelines.

Set up the logger like this, e.g.:

import logging

# Note: the exact import path of DevOpsFormatter is assumed here; adjust it to where
# your version of cloe-logging exposes the formatter.
from cloe_logging import DevOpsFormatter

def init_logging() -> logging.Logger:
    logger = logging.getLogger("azure-pipeline-logger")
    logger.setLevel(logging.INFO)
    section_formatter = DevOpsFormatter(section_info=True)
    section_handler = logging.StreamHandler()
    section_handler.setFormatter(section_formatter)
    logger.addHandler(section_handler)
    return logger

my_logger = init_logging()
my_logger.info("Hello World!")

Using the Decorator

The build_logger decorator prints helpful information when running in Azure Pipelines.

import time

from cloe_logging.decorators import build_logger

@build_logger()
def my_test_function(arg1, arg2=None):
    time.sleep(1)

my_test_function("test1", arg2="test2")

The above code will output the logging statements below, which are formatted nicely in Azure Pipelines.

##[group]my_test_function
##### START my_test_function WITH args [ 'test1', arg2='test2' ] #####

##### END my_test_function DURATION [ '1's ] #####
##[endgroup]