astro.files.locations.azure.wasb

Module Contents

Classes

WASBLocation

Handler for WASB object store operations.

exception astro.files.locations.azure.wasb.WASBLocationException

Bases: Exception

Common base class for all non-exit exceptions.

class astro.files.locations.azure.wasb.WASBLocation(path, conn_id=None, load_options=None)

Bases: astro.files.locations.base.BaseFileLocation

Handler for WASB object store operations.

Parameters:
property hook: airflow.providers.microsoft.azure.hooks.wasb.WasbHook
Return type:

airflow.providers.microsoft.azure.hooks.wasb.WasbHook

property transport_params: dict

Get WASB credentials for storage.

Return type:

dict

property paths: list[str]

Resolve WASB file paths with prefix

Return type:

list[str]

property smartopen_uri: str

SmartOpen does not support URIs prefixed with wasb, so we need to change them to azure.

Returns:

URI compatible with SmartOpen for Azure BlobStorage.

Return type:

str
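
The scheme rewrite described above can be illustrated with a minimal standalone sketch. The helper name `to_smartopen_uri` is hypothetical and is not part of the Astro SDK; the sketch only assumes that the `wasb` or `wasbs` scheme is swapped for `azure` while the rest of the URI is left unchanged, which may differ from what the property actually does.

```python
from urllib.parse import urlsplit, urlunsplit


def to_smartopen_uri(wasb_uri: str) -> str:
    """Hypothetical helper: swap a wasb/wasbs scheme for azure."""
    parts = urlsplit(wasb_uri)
    if parts.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"Expected a wasb:// or wasbs:// URI, got: {wasb_uri}")
    # Assumption: only the scheme changes; container and blob path are kept.
    return urlunsplit(parts._replace(scheme="azure"))


print(to_smartopen_uri("wasb://my-container/data/file.csv"))
```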

property size: int

Return file size for WASB location

Return type:

int

property openlineage_dataset_namespace: str

Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str

property openlineage_dataset_name: str

Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str

property snowflake_stage_path: str

Get the path, altered if needed, for Snowflake stage creation. The path must be modified because Snowflake only accepts stage paths of the format “azure://<storage_account>.blob.core.windows.net/<container_name>/load/files/”, whereas the SDK accepts paths of the form “wasb://<container_name>/<filename>” or “wasbs://<container_name>/<filename>”. This property bridges the gap.

Return type:

str
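
As a rough illustration of the path translation described above, the following sketch builds a Snowflake-style `azure://` path from a `wasb://` path. The function name `to_snowflake_stage_path` and the explicit `storage_account` argument are assumptions made for this example; how the real property obtains the storage account is not shown here.

```python
from urllib.parse import urlsplit

AZURE_HOST = "blob.core.windows.net"


def to_snowflake_stage_path(wasb_uri: str, storage_account: str) -> str:
    """Hypothetical helper: map wasb://<container>/<path> to the azure:// form."""
    parts = urlsplit(wasb_uri)
    if parts.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"Expected a wasb:// or wasbs:// URI, got: {wasb_uri}")
    container = parts.netloc  # in the SDK-style URI, the host slot holds the container
    return f"azure://{storage_account}.{AZURE_HOST}/{container}{parts.path}"


print(to_snowflake_stage_path("wasb://my-container/load/files/", "myaccount"))
```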

property databricks_uri: str

Return a Databricks compatible WASB URI, including the Azure storage account host. Example: wasb://astro-sdk@astrosdk.blob.core.windows.net/homes.csv

Returns:

self.path, including the Azure storage account host

Return type:

str
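
The example URI above can be reproduced with a small standalone sketch. The helper `to_databricks_uri` is hypothetical, and passing the storage account as an argument is an assumption made for illustration; the sketch only shows how the storage account host might be spliced into the path.

```python
from urllib.parse import urlsplit

AZURE_HOST = "blob.core.windows.net"


def to_databricks_uri(wasb_uri: str, storage_account: str) -> str:
    """Hypothetical helper: add the storage account host to a wasb(s) URI."""
    parts = urlsplit(wasb_uri)
    if parts.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"Expected a wasb:// or wasbs:// URI, got: {wasb_uri}")
    # Resulting layout: <scheme>://<container>@<account>.<host>/<blob path>
    return f"{parts.scheme}://{parts.netloc}@{storage_account}.{AZURE_HOST}{parts.path}"


print(to_databricks_uri("wasb://astro-sdk/homes.csv", "astrosdk"))
```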

location_type
supported_conn_type
LOAD_OPTIONS_CLASS_NAME = ('WASBLocationLoadOptions',)
AZURE_HOST = 'blob.core.windows.net'
exists()

Check whether the file exists.

Return type:

bool

databricks_auth_settings()

Required settings to transfer files in/to Databricks. Currently relies on storage account access key, as described in: https://docs.databricks.com/storage/azure-storage.html

Returns:

A dictionary of settings keys to settings values

Return type:

dict
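
A settings dictionary based on a storage account access key might look like the sketch below. This is an assumption, not the confirmed return value of `databricks_auth_settings()`: the key name follows the Hadoop Azure convention `fs.azure.account.key.<storage-account>.blob.core.windows.net`, and the function name and arguments are hypothetical.

```python
AZURE_HOST = "blob.core.windows.net"


def build_databricks_auth_settings(storage_account: str, access_key: str) -> dict:
    """Hypothetical helper: map the account-key setting name to the access key."""
    # Assumed key format (Hadoop Azure convention); the SDK may use other keys.
    setting_name = f"fs.azure.account.key.{storage_account}.{AZURE_HOST}"
    return {setting_name: access_key}


settings = build_databricks_auth_settings("astrosdk", "<access-key>")
for name in settings:
    print(name)
```

In practice the access key would come from the Airflow connection rather than being passed in as a literal.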