astro.sql.operators.raw_sql

Module Contents

Classes

RawSQLOperator

Given a SQL statement, (optional) tables and a (optional) function, execute the SQL statement

SdkLegacyRow

A subclass of Row that delivers 1.x SQLAlchemy behaviors

Functions

run_raw_sql([python_callable, conn_id, parameters, ...])

Given a python function that returns a SQL statement and (optional) tables, execute the SQL statement and output

class astro.sql.operators.raw_sql.RawSQLOperator(results_format=None, fail_on_empty=False, **kwargs)

Bases: astro.sql.operators.base_decorator.BaseSQLDecoratedOperator

Given a SQL statement, (optional) tables and a (optional) function, execute the SQL statement and apply the function to the results, returning the result of the function.

Disclaimer: this could potentially trash the XCom Database, depending on the XCom backend used and on the SQL statement/function declared by the user.

Parameters:
  • results_format (astro.constants.RunRawSQLResultFormat | None) –

  • fail_on_empty (bool) –

  • kwargs (Any) –

execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

Parameters:

context (astro.utils.compat.typing.Context) –

Return type:

Any

static make_row_serializable(rows)

Convert rows to a serializable format

Parameters:

rows (Any) –

Return type:

Any

static results_as_list(results)

Convert the result of a SQL query to a list

Parameters:

results (sqlalchemy.engine.ResultProxy) –

Return type:

list

static results_as_pandas_dataframe(result)

Convert the result of a SQL query to a pandas dataframe

Parameters:

result (sqlalchemy.engine.ResultProxy) –

Return type:

pandas.DataFrame

get_results_format_handler(results_format)
Parameters:

results_format (str) –

get_handler()
static get_wrapped_handler(fail_on_empty, conversion_func)
Parameters:
  • fail_on_empty (bool) –

  • conversion_func (Callable) –

class astro.sql.operators.raw_sql.SdkLegacyRow(parent, processors, keymap, key_style, data)

Bases: sqlalchemy.engine.row.LegacyRow

A subclass of Row that delivers 1.x SQLAlchemy behaviors for Core.

The LegacyRow class is where most of the Python mapping (i.e. dictionary-like) behaviors are implemented for the row object. The mapping behavior of Row going forward is accessible via the Row._mapping attribute.

New in version 1.4: - added LegacyRow which encapsulates most of the deprecated behaviors of Row.

version: int = 1
serialize()
static deserialize(data, version)
Parameters:
  • data (dict) –

  • version (int) –

static from_legacy_row(obj)
astro.sql.operators.raw_sql.run_raw_sql(python_callable=None, conn_id='', parameters=None, database=None, schema=None, handler=None, results_format=None, fail_on_empty=False, response_size=settings.RAW_SQL_MAX_RESPONSE_SIZE, **kwargs)

Given a python function that returns a SQL statement and (optional) tables, execute the SQL statement and output the result into a SQL table.

Use this function as a decorator like so:

@run_raw_sql
def my_sql_statement(table1: Table) -> Table:
    return "DROP TABLE {{table1}}"

In this example, by identifying parameters as Table objects, astro knows to automatically convert those objects into tables (if they are, for example, a dataframe). Any type besides table will lead astro to assume you do not want the parameter converted.

Please note that the run_raw_sql function will not create a temporary table. It will either return the result of a provided handler function or it will not return anything at all.

Parameters:
  • python_callable (Callable | None) – This parameter is filled in automatically when you use the transform function as a decorator. This is where the python function gets passed to the wrapping function

  • conn_id (str) – Connection ID for the database you want to connect to. If you do not pass in a value for this object we can infer the connection ID from the first table passed into the python_callable function. (required if there are no table arguments)

  • parameters (collections.abc.Mapping | collections.abc.Iterable | None) – parameters to pass into the SQL query

  • database (str | None) – Database within the SQL instance you want to access. If left blank we will default to the table.metadata.database in the first Table passed to the function (required if there are no table arguments)

  • schema (str | None) – Schema within the SQL instance you want to access. If left blank we will default to the table.metadata.schema in the first Table passed to the function (required if there are no table arguments)

  • handler (Callable | None) – Handler function to process the result of the SQL query. For more information please consult https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Result

  • results_format (astro.constants.RunRawSQLResultFormat | None) – Format of the results returned by the SQL query. Overrides the handler, if provided.

  • fail_on_empty (bool) – If True and a results_format is provided, raises an exception if the query returns no results.

  • response_size (int) – Used to trim the responses returned to avoid trashing the Airflow DB. The default value is -1, which means the response is not changed. Otherwise, if the response is a list, returns up to the desired amount of items. If the response is a string, trims it to the desired size.

  • kwargs (Any) –

Returns:

By default returns None unless there is a handler function, in which case returns the result of the handler

Return type:

airflow.decorators.base.TaskDecorator