How-to: Instantiate Nessy models from Metadata

This document helps you get started with the Nessy library.

Prerequisite: Define Metadata

The first step is to create Metadata matching your tables and schemas.

Each schema and each table requires its own metadata file in .yaml format. A detailed guide on how these .yaml files are structured can be found in the metadata documentation.

Each table metadata file must be placed in a folder that sits at the same level as the schema metadata file. This way, each table's metadata is clearly associated with exactly one schema.

.
└── metadata
    ├── bronze
    │   ├── schema.yaml
    │   └── tables
    │       ├── table.1.yaml
    │       └── table.2.yaml
    ├── silver
    │   ├── schema.yaml
    │   └── tables
    │       ├── table.1.yaml
    │       └── table.2.yaml
    └── gold
        ├── schema.yaml
        └── tables
            ├── table.1.yaml
            └── table.2.yaml
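
Before loading anything, it can help to check that the folders on disk match the tree above. The following is a minimal sketch using only the Python standard library. The helper name find_layout_problems and the file and folder names schema.yaml and tables are assumptions taken from the example tree; none of this is part of Nessy.

```python
from pathlib import Path


def find_layout_problems(metadata_root: Path, table_dir_name: str = "tables") -> list[str]:
    """Return human-readable problems found in the metadata folder layout.

    Hypothetical helper, not part of Nessy. It assumes every schema folder
    holds a schema.yaml next to a folder of table .yaml files.
    """
    problems = []
    # Every direct subfolder of the metadata root is treated as one schema.
    for schema_dir in sorted(p for p in metadata_root.iterdir() if p.is_dir()):
        if not (schema_dir / "schema.yaml").is_file():
            problems.append(f"{schema_dir.name}: missing schema.yaml")
        table_dir = schema_dir / table_dir_name
        if not table_dir.is_dir():
            problems.append(f"{schema_dir.name}: missing {table_dir_name}/ folder")
        elif not any(table_dir.glob("*.yaml")):
            problems.append(f"{schema_dir.name}: no table .yaml files")
    return problems
```

An empty list means the layout matches the expected structure. Adjust the assumed file names if your schema files are named differently.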

Unity Catalog Adapter

In cases where it is not feasible to define all metadata in YAML format, the objects can also be derived from existing objects, e.g. using the UnityCatalogAdapter.

from cloe_nessy.models import UnityCatalogAdapter
table = UnityCatalogAdapter().get_table_by_name("catalog.schema.table_name")

Instantiating Table objects

Tables can be created by providing the path to a single table-metadata file:

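from pathlib import Path

from cloe_nessy.models import Table

metadata_path = Path("/my_path/metadata/schema_bronze/tables/my_table.yaml")

# Returns the Table object together with any validation errors.
table, error = Table.read_instance_from_file(
    instance_path=metadata_path,
    catalog_name="my_catalog",
    schema_name="schema_bronze",
)

if error:
    print(error)
    # A bare `raise` fails outside an except block, so raise explicitly.
    raise ValueError(f"Invalid table metadata in {metadata_path}")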

Alternatively, provide the path to a directory of table-metadata files:

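from pathlib import Path

from cloe_nessy.models import Table

metadata_path = Path("/my_path/metadata/schema_bronze/tables")

# Returns all Table objects in the directory and the collected validation errors.
tables, errors = Table.read_instances_from_directory(
    instance_path=metadata_path,
    catalog_name="my_catalog",
    schema_name="schema_bronze",
)

if errors:
    print(errors)
    # A bare `raise` fails outside an except block, so raise explicitly.
    raise ValueError(f"Invalid table metadata in {metadata_path}")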

To ensure that your metadata is in the correct format, it is validated when the Table object is created. Always check the errors returned after instantiation and resolve any remaining validation errors before using the objects.

Instantiating Schema objects

Schema objects can be created in the following way:

from pathlib import Path

from cloe_nessy.models import Schema

metadata_path = Path("/my_path/metadata/schema_bronze/schema_bronze.yaml")

schema, errors = Schema.read_instance_from_file(
    instance_path=metadata_path,
    table_dir_name="tables",
)

if errors:
    print(errors)
    # A bare `raise` fails outside an except block, so raise explicitly.
    raise ValueError(f"Invalid schema metadata in {metadata_path}")

As explained in the prerequisites, table_dir_name must be the name of the folder that contains the table metadata associated with the schema. This folder has to sit in the same folder as the schema .yaml file (in this example, schema_bronze/schema_bronze.yaml and schema_bronze/tables).
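
The relationship between the schema file and table_dir_name can be written out as a path computation. The sketch below illustrates only the folder convention described here; resolve_table_dir is a hypothetical name, and this is not a claim about how Nessy resolves the folder internally.

```python
from pathlib import Path


def resolve_table_dir(schema_file: Path, table_dir_name: str = "tables") -> Path:
    """Return the folder expected to hold the table metadata of a schema.

    Hypothetical helper: the table folder is a sibling of the schema .yaml
    file, so it is resolved relative to the schema file's parent folder.
    """
    return schema_file.parent / table_dir_name


schema_file = Path("/my_path/metadata/schema_bronze/schema_bronze.yaml")
print(resolve_table_dir(schema_file))
```

If the resulting folder does not exist on disk, the table_dir_name value or the folder layout most likely needs to be corrected.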