check_table

Module Contents

Classes

SQLCheckOperator

Performs one or more of the checks provided in the checks dictionary.

Functions

check_table(dataset, checks[, partition_clause, task_id])

Performs one or more of the checks provided in the checks dictionary.

class check_table.SQLCheckOperator(*, dataset, checks, partition_clause=None, task_id=None, **kwargs)

Bases: airflow.providers.common.sql.operators.sql.SQLTableCheckOperator

Performs one or more of the checks provided in the checks dictionary. Checks should be written to return a boolean result.

Parameters:
  • dataset (astro.table.BaseTable) – the table to run checks on

  • checks (Dict[str, Dict[str, Any]]) – the dictionary of checks, e.g.:

    {
        "row_count_check": {"check_statement": "COUNT(*) = 1000"},
        "column_sum_check": {"check_statement": "col_a + col_b < col_c"},
    }

  • partition_clause (Optional[str]) – a partial SQL statement added to the WHERE clause of the query built by the operator, restricting the rows that the checks run on, e.g.:

    "date = '1970-01-01'"

  • task_id (Optional[str]) –

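
As a rough illustration of how a checks dictionary and a partition_clause interact, the sketch below evaluates each check_statement against an in-memory SQLite table. It is not the operator's implementation: the helper run_checks_sketch, the table t, its sample rows, and the shape of the generated query are all assumptions made for this example. The real operator builds its own SQL and runs it through the database hook.

```python
import sqlite3
from typing import Any, Dict, Optional


def run_checks_sketch(
    conn: sqlite3.Connection,
    table: str,
    checks: Dict[str, Dict[str, Any]],
    partition_clause: Optional[str] = None,
) -> Dict[str, bool]:
    """Hypothetical helper: evaluate each check_statement and return pass/fail."""
    # The partition clause, if given, becomes the WHERE clause of every check query.
    where = f" WHERE {partition_clause}" if partition_clause else ""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        # Inner query: turn the boolean check into 1 or 0 (a single row for
        # aggregate checks, one row per table row for column-level checks).
        # Outer query: MIN is 1 only if every inner row passed.
        query = (
            "SELECT MIN(passed) FROM ("
            f"SELECT CASE WHEN {check['check_statement']} THEN 1 ELSE 0 END AS passed "
            f"FROM {table}{where})"
        )
        (value,) = conn.execute(query).fetchone()
        results[name] = bool(value)
    return results


# Assumed sample data: two rows in the 1970-01-01 partition, one outside it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col_a INTEGER, col_b INTEGER, col_c INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 2, 4, "1970-01-01"), (2, 2, 5, "1970-01-01"), (5, 5, 1, "1970-01-02")],
)

checks = {
    "row_count_check": {"check_statement": "COUNT(*) = 2"},
    "column_sum_check": {"check_statement": "col_a + col_b < col_c"},
}

print(run_checks_sketch(conn, "t", checks, partition_clause="date = '1970-01-01'"))
print(run_checks_sketch(conn, "t", checks))
```

With these sample rows, both checks pass when restricted to the partition and both fail when run over the whole table, because the third row changes the row count and violates the column comparison.
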
template_fields = ('partition_clause', 'dataset')
execute(context)

This is the main method to override when deriving an operator.

The context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

Parameters:

context (astro.utils.compat.typing.Context) –

get_db_hook()

Get the database hook for the connection.

Returns:

the database hook object.

Return type:

Any

check_table.check_table(dataset, checks, partition_clause=None, task_id=None, **kwargs)

Performs one or more of the checks provided in the checks dictionary. Checks should be written to return a boolean result.

Parameters:
  • dataset (astro.table.BaseTable) – the table to run checks on

  • checks (Dict[str, Dict[str, Any]]) – the dictionary of checks, e.g.:

    {
        "row_count_check": {"check_statement": "COUNT(*) = 1000"},
        "column_sum_check": {"check_statement": "col_a + col_b < col_c"},
    }

  • partition_clause (Optional[str]) – a partial SQL statement added to the WHERE clause of the query built by the operator, restricting the rows that the checks run on, e.g.:

    "date = '1970-01-01'"

  • task_id (Optional[str]) –

Return type:

airflow.models.xcom_arg.XComArg
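
Because checks is typed loosely as Dict[str, Dict[str, Any]], a malformed entry may only surface when the task runs. The sketch below shows one way to sanity-check the dictionary before passing it to check_table. The helper find_check_problems is hypothetical and not part of this module, and the rule it enforces (each entry is a dict with a non-empty string under "check_statement") is an assumption inferred from the example dictionary above.

```python
from typing import Any, Dict, List


def find_check_problems(checks: Dict[str, Dict[str, Any]]) -> List[str]:
    """Hypothetical pre-flight validation; returns human-readable problems."""
    problems: List[str] = []
    for name, check in checks.items():
        # Each check is expected to be a dictionary of options.
        if not isinstance(check, dict):
            problems.append(f"{name}: expected a dict, got {type(check).__name__}")
            continue
        # Assumed rule: every check carries a non-empty SQL boolean expression.
        statement = check.get("check_statement")
        if not isinstance(statement, str) or not statement.strip():
            problems.append(f"{name}: missing or empty 'check_statement'")
    return problems


good = {
    "row_count_check": {"check_statement": "COUNT(*) = 1000"},
    "column_sum_check": {"check_statement": "col_a + col_b < col_c"},
}
bad = {
    "row_count_check": {"statement": "COUNT(*) = 1000"},  # wrong key
    "column_sum_check": "col_a + col_b < col_c",  # not a dict
}

print(find_check_problems(good))
print(find_check_problems(bad))
```

Returning a list of problems rather than raising on the first one lets a caller report every malformed check at once.
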