astro.files
Subpackages
Submodules
Package Contents
Functions
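get_files(path_pattern, conn_id=None, filetype=None, normalize_config=None) | Get file objects by resolving path_pattern from local/object stores.
check_if_connection_exists(conn_id) | Given an Airflow connection ID, identify if it exists.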
Classes
File(path, conn_id=None, filetype=None, normalize_config=None) | Handle all file operations, and abstract away the details related to location and file types.
- astro.files.get_files(path_pattern, conn_id=None, filetype=None, normalize_config=None)
Get file objects by resolving path_pattern from local/object stores. path_pattern can be:
1. a local location: glob pattern
2. an S3/GCS location: prefix
- Parameters
path_pattern (str) – path or pattern of a file in the filesystem or an object store; supports glob patterns for local locations and prefix patterns for object stores
conn_id (Optional[str]) – Airflow connection ID
filetype (Union[astro.constants.FileType, None]) – constant to provide an explicit file type
normalize_config (Optional[dict]) – parameters for the pandas json_normalize() function, given as a dict
- Return type
List[File]
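The two resolution modes described above (glob patterns for local paths, prefixes for object stores) behave differently. The following is a minimal standard-library sketch of that difference, not the astro implementation: `resolve_local`, `resolve_by_prefix`, and the sample object keys are hypothetical names introduced only for illustration.

```python
import glob
import os
import tempfile
from typing import List


def resolve_local(path_pattern: str) -> List[str]:
    """Local files: expand a glob pattern into the sorted matching paths."""
    return sorted(glob.glob(path_pattern))


def resolve_by_prefix(keys: List[str], prefix: str) -> List[str]:
    """Object stores: keep every key that starts with the given prefix."""
    return sorted(key for key in keys if key.startswith(prefix))


# Local case: only names matching the glob are returned.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matches = resolve_local(os.path.join(tmp, "*.csv"))
    local_matches = [os.path.basename(p) for p in matches]

# Object-store case: hypothetical keys, as a bucket listing might return them.
object_keys = ["data/2023/a.csv", "data/2023/b.json", "data/2024/c.csv", "logs/x.csv"]
prefix_matches = resolve_by_prefix(object_keys, "data/2023/")
```

Note that a prefix match does not filter on extension: in this sketch `data/2023/b.json` is returned alongside the CSV file, which is one situation where passing an explicit `filetype` may be worth considering.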
- astro.files.check_if_connection_exists(conn_id)
Given an Airflow connection ID, identify if it exists. Return True if it does, or raise AirflowNotFoundException if it does not.
- Parameters
conn_id (str) – Airflow connection ID
- Returns
True if the connection exists
- Return type
bool
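Because this function either returns True or raises, it never returns False, so callers that want a plain boolean need a try/except. The sketch below illustrates that contract with standard-library stand-ins only: `ConnectionNotFound`, `KNOWN_CONNECTIONS`, `check_exists`, and `safe_check` are hypothetical and are not part of astro or Airflow.

```python
class ConnectionNotFound(Exception):
    """Hypothetical stand-in for Airflow's AirflowNotFoundException."""


# Hypothetical set of connection IDs known to the environment.
KNOWN_CONNECTIONS = {"aws_default", "google_cloud_default"}


def check_exists(conn_id: str) -> bool:
    """Return True for a known ID and raise otherwise; never return False."""
    if conn_id not in KNOWN_CONNECTIONS:
        raise ConnectionNotFound(f"The conn_id '{conn_id}' is not defined")
    return True


def safe_check(conn_id: str) -> bool:
    """Caller-side wrapper that converts the exception into False."""
    try:
        return check_exists(conn_id)
    except ConnectionNotFound:
        return False
```

With the real function, the same pattern would presumably catch `AirflowNotFoundException` in place of the stand-in exception.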
- class astro.files.File(path, conn_id=None, filetype=None, normalize_config=None)
Handle all file operations, and abstract away the details related to location and file types. Intended to be used within the library.
- Parameters
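path (str) – path to the file
conn_id (Optional[str]) – Airflow connection ID
filetype (Union[astro.constants.FileType, None]) – constant to provide an explicit file type
normalize_config (Optional[dict]) – parameters for the pandas json_normalize() function, given as a dict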
- template_fields = ['location']
- property path
- Return type
str
- property conn_id
- Return type
Optional[str]
- property size
Return the size in bytes of the given file.
- Returns
File size in bytes
- Return type
int
- is_binary()
Return whether the file is of a binary type. Uses a naive strategy, using the file extension.
- Returns
True or False
- Return type
bool
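An extension-based strategy like the one mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `looks_binary` and `BINARY_EXTENSIONS` are hypothetical names, and treating `.parquet` as the binary extension is an assumption made for the example.

```python
import pathlib

# Assumption for illustration: which extensions count as binary.
BINARY_EXTENSIONS = {".parquet"}


def looks_binary(path: str) -> bool:
    """Classify by the final extension only; the file content is never read."""
    suffix = pathlib.PurePosixPath(path).suffix.lower()
    return suffix in BINARY_EXTENSIONS
```

The strategy is naive in that only the last suffix is inspected: in this sketch a path such as `data.parquet.gz` is classified by `.gz`, and a file without an extension is always treated as non-binary.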
- create_from_dataframe(df)
Create a file in the desired location using the values of a dataframe.
- Parameters
df (pandas.DataFrame) – pandas dataframe
- Return type
None
- export_to_dataframe(**kwargs)
Read the file from any supported location and convert it into a dataframe.
- Return type
pandas.DataFrame
- exists()
Check whether the file exists.
- Return type
bool
- __repr__()
Return repr(self).
- Return type
str
- __str__()
Return str(self).
- Return type
str