Changelog
1.0.0
Features
Improved the performance of aql.load_file by supporting database-specific (native) load methods. This is now the default behaviour. Previously, the Astro SDK Python would always use Pandas to load files to SQL databases, which passed the data through the worker node and slowed the load. #557, #481
Introduced new arguments to aql.load_file:
use_native_support enables native data transfer if it is available on the destination (defaults to use_native_support=True).
native_support_kwargs is a keyword argument to be used by the method involved in the native support flow.
enable_native_fallback can be used to fall back to the default transfer (defaults to enable_native_fallback=True).
Now, there are three modes:
Native: Default. Uses a BigQuery Load Job in the case of BigQuery, and Snowflake COPY INTO using an external stage in the case of Snowflake.
Pandas: This is how datasets were previously loaded. To enable this mode, use the argument use_native_support=False in aql.load_file.
Hybrid: This attempts to use the native strategy to load a file to the database and, if the native strategy fails, falls back to Pandas with relevant log warnings. #557
Allow users to specify the schema (column types) of the table into which a file is being loaded by using table.columns. If this table attribute is not set, the Astro SDK still tries to infer the schema by using Pandas (which is the previous behaviour). #532
Add Example DAG for Dynamic Map Task with Astro-SDK. #377, airflow-2.3.0
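The three load modes described above (Native, Pandas, Hybrid) can be pictured with a minimal sketch. It is not the Astro SDK implementation: load_with_modes, native_load and pandas_load are hypothetical names introduced only for illustration, and the real native paths (BigQuery Load Job, Snowflake COPY INTO) are replaced by plain callables. Only the two flag names, use_native_support and enable_native_fallback, come from the changelog.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def load_with_modes(
    native_load: Callable[[], str],
    pandas_load: Callable[[], str],
    use_native_support: bool = True,
    enable_native_fallback: bool = True,
) -> str:
    """Choose a load strategy following the three modes in the changelog.

    native_load and pandas_load are hypothetical stand-ins for the real
    load paths; they are not part of the Astro SDK API.
    """
    if not use_native_support:
        # Pandas mode: skip the native path entirely.
        return pandas_load()
    try:
        # Native mode: try the database-specific load first.
        return native_load()
    except Exception as exc:
        if not enable_native_fallback:
            # Native only: surface the failure to the caller.
            raise
        # Hybrid mode: log a warning, then fall back to the Pandas path.
        logger.warning("Native load failed (%s); falling back to Pandas", exc)
        return pandas_load()
```

With both flags left at their defaults, a failing native load ends up on the Pandas path with a warning, which is the Hybrid behaviour; setting enable_native_fallback=False in this sketch lets the native error propagate instead.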
Breaking Change
The aql.dataframe argument identifiers_as_lower (which was boolean, with default set to False) was replaced by the argument columns_names_capitalization (string within possible values ["upper", "lower", "original"], default is lower). #564
Previously, aql.load_file would change the capitalization of all column titles to uppercase by default; now it makes them lowercase by default. The old behaviour can be achieved by using the argument columns_names_capitalization="upper". #564
aql.load_file attempts to load files to BigQuery and Snowflake by using native methods, which may have prerequisites to work. To disable this mode, use the argument use_native_support=False in aql.load_file. #557, #481
aql.dataframe will raise an exception if the default Airflow XCom backend is being used. To solve this, either use an external XCom backend, such as S3 or GCS, or set the configuration AIRFLOW__ASTRO_SDK__DATAFRAME_ALLOW_UNSAFE_STORAGE=True. #444
Change the declaration for the default Astro SDK temporary schema from using AIRFLOW__ASTRO__SQL_SCHEMA to AIRFLOW__ASTRO_SDK__SQL_SCHEMA. #503
Renamed aql.truncate to aql.drop_table. #554
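To make the columns_names_capitalization change concrete, here is a small sketch of what the three accepted values mean for a list of column names. The helper capitalize_columns is hypothetical and is not the SDK's code; only the argument name, its three values and the lowercase default are taken from the entries above.

```python
from typing import List


def capitalize_columns(
    columns: List[str], columns_names_capitalization: str = "lower"
) -> List[str]:
    """Return column names under the requested capitalization.

    Hypothetical helper for illustration; it mirrors the documented
    values "upper", "lower" and "original", with "lower" as default.
    """
    if columns_names_capitalization == "upper":
        return [name.upper() for name in columns]
    if columns_names_capitalization == "lower":
        return [name.lower() for name in columns]
    if columns_names_capitalization == "original":
        # Leave the names exactly as they appear in the source.
        return list(columns)
    raise ValueError(
        f"Unsupported capitalization: {columns_names_capitalization!r}"
    )
```

Code that relied on the earlier uppercase column names from aql.load_file would correspond to the "upper" branch here.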
Bug fixes
Enhancements
Improved the performance of aql.load_file for the files below:
Get configurations via Airflow Configuration manager. #503
Change catching ValueError and AttributeError to DatabaseCustomError #595
Unpin pandas upperbound dependency #620
Remove markupsafe from dependencies #623
Added extend_existing to Sqla Table object #626
Move config to store DF in XCom to settings file #537
Make the operator names consistent #634
Use exc_info for exception logging #643
Use lazy evaluated Type Annotations from PEP 563 #650
Provide Google Cloud Credentials env var for bigquery #679
Handle breaking changes for Snowflake provider versions 3.2.0 and 3.1.0 #686
Misc
0.11.0
Feature:
Internals:
Enhancement:
0.10.0
Feature:
Breaking Change:
aql.merge interface changed. Argument merge_table changed to target_table, target_columns and merge_column combined to column argument, merge_keys is changed to target_conflict_columns, conflict_strategy is changed to if_conflicts. More details can be found at #422, #466
Enhancement:
0.9.2
Bug fix:
Change export_file to return File object #454.
0.9.1
Bug fix:
Table unable to have Airflow templated names #413
0.9.0
Enhancements:
Introduction of the user-facing Table, Metadata and File classes
Breaking changes:
The operator save_file became export_file
The tasks load_file, export_file (previously save_file) and run_raw_sql should be used with Table, Metadata and File instances
The decorators dataframe, run_raw_sql and transform should be used with Table and Metadata instances
The operators aggregate_check, boolean_check, render and stats_check were temporarily removed
The class TempTable was removed. It is possible to declare temporary tables by using Table(temp=True). All temporary table names are prefixed with _tmp_. If the user decides to name a Table, it is no longer temporary, unless the user enforces it to be.
The only mandatory property of a Table instance is conn_id. If no metadata is given, the library will try to extract schema and other information from the connection object. If it is missing, it will default to the AIRFLOW__ASTRO__SQL_SCHEMA environment variable.
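The temporary-table and schema rules above can be read as two small decisions, sketched here under stated assumptions. The functions is_temporary, table_name and resolve_schema are hypothetical and not part of the library; the random suffix after _tmp_ is an assumption, since the changelog only specifies the prefix. The lookup order (metadata, then connection, then the AIRFLOW__ASTRO__SQL_SCHEMA environment variable) follows the last entry.

```python
import os
import uuid
from typing import Mapping, Optional

TEMP_PREFIX = "_tmp_"
SCHEMA_ENV_VAR = "AIRFLOW__ASTRO__SQL_SCHEMA"


def is_temporary(name: Optional[str] = None, temp: bool = False) -> bool:
    """A table is temporary if unnamed, or if the user enforces temp=True."""
    return temp or not name


def table_name(name: Optional[str] = None) -> str:
    """Keep a user-given name; otherwise generate a _tmp_-prefixed one.

    The uuid-based suffix is an assumption made for this sketch.
    """
    if name:
        return name
    return TEMP_PREFIX + uuid.uuid4().hex


def resolve_schema(
    metadata_schema: Optional[str],
    connection_schema: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer explicit metadata, then the connection, then the env var."""
    return metadata_schema or connection_schema or environ.get(SCHEMA_ENV_VAR)
```

Passing environ explicitly keeps the sketch easy to test without touching the real process environment.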
Internals:
Major refactor introducing Database, File, FileType and FileLocation concepts.
0.8.4
Enhancements:
Add support for Airflow 2.3 #367.
Breaking change:
We have renamed the artifacts we released to astro-sdk-python from astro-projects. 0.8.4 is the last version for which we have published both astro-sdk-python and astro-projects.
0.8.3
Bug fix:
Do not attempt to create a schema if it already exists #329.
0.8.2
Bug fix:
Support dataframes from different databases in dataframe operator #325
Enhancements:
Add integration testcase for SqlDecoratedOperator to test execution of Raw SQL #316
0.8.1
Bug fix:
Snowflake transform without input_table #319
0.8.0
Feature:
load_file support for nested NDJSON files #257
Breaking change:
aql.dataframe switches the capitalization to lowercase by default. This behaviour can be changed by using identifiers_as_lower #154
Documentation:
Fix commands in README.md #242
Add scripts to auto-generate Sphinx documentation
Enhancements:
Improve type hints coverage
Improve Amazon S3 example DAG, so it does not rely on pre-populated data #293
Add example DAG to load/export from BigQuery #265
Fix usages of mutable default args #267
Enable DeepSource validation #299
Improve code quality and coverage
Bug fixes:
Support gcpbigquery connections #294
Support params argument in aql.render to override SQL Jinja template values #254
Fix aql.dataframe when table arg is absent #259
Others:
0.7.0
Feature:
load_file to a Pandas dataframe, without SQL database dependencies #77
Documentation:
Simplify README #101
Add Release Guidelines #160
Add Code of Conduct #101
Add Contribution Guidelines #101
Enhancements:
Add SQLite example #149
Allow customization of task_id when using dataframe #126
Use standard AWS environment variables, as opposed to AIRFLOW__ASTRO__CONN_AWS_DEFAULT #175
Bug fixes:
Fix merge XComArg support #183
Fixes to load_file:
Fixes to render:
Fix transform, so it works with SQLite #159
Others:
0.6.0
Features:
Support SQLite #86
Support users who can’t create schemas #121
Ability to install optional dependencies (amazon, google, snowflake) #82
Enhancements:
Change render so it creates a DAG as opposed to a TaskGroup #143
Allow users to specify a custom version of snowflake_sqlalchemy #127
Bug fixes:
Others: