astro.files.types

Package Contents

Functions

create_file_type(path[, filetype, normalize_config])

Factory method that creates a FileType object based on the file extension in path or on the specified filetype.

get_filetype(filepath)

Return a FileType given the filepath. Uses a naive strategy based on the file extension.

Classes

FileTypeConstants

Enumeration of the supported file types.

FileType

Abstract file type class that serves as the interface between client code and all supported file types.

CSVFileType

Concrete implementation that handles the CSV file type.

JSONFileType

Concrete implementation that handles the JSON file type.

NDJSONFileType

Concrete implementation that handles the NDJSON file type.

ParquetFileType

Concrete implementation that handles the Parquet file type.

astro.files.types.create_file_type(path, filetype=None, normalize_config=None)

Factory method that creates a FileType object based on the file extension in path or on the specified filetype.

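Parameters
  • path – path to the file; its extension is used to choose the file type

  • filetype – explicitly specified file type

  • normalize_config –

Return type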

base.FileType
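The factory behaviour described above can be sketched without the library. This is a minimal, self-contained illustration and not the astro implementation: the stand-in classes SketchFileType, SketchCSV and SketchNDJSON, the registry _REGISTRY, and the function sketch_create_file_type are hypothetical names, and giving an explicit filetype precedence over the extension is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional


class SketchFileType:
    """Hypothetical stand-in for the FileType base class."""

    def __init__(self, path: str, normalize_config: Optional[dict] = None):
        self.path = path
        self.normalize_config = normalize_config


class SketchCSV(SketchFileType):
    """Hypothetical stand-in for CSVFileType."""


class SketchNDJSON(SketchFileType):
    """Hypothetical stand-in for NDJSONFileType."""


# Hypothetical registry keyed by the lower-case type name.
_REGISTRY = {"csv": SketchCSV, "ndjson": SketchNDJSON}


def sketch_create_file_type(path, filetype=None, normalize_config=None):
    """Pick a class from `filetype` if given, else from the extension of `path`."""
    # Assumption: an explicit filetype takes precedence over the extension.
    if filetype is not None:
        key = filetype
    else:
        key = PurePosixPath(path).suffix.lstrip(".")
    try:
        cls = _REGISTRY[key.lower()]
    except KeyError:
        raise ValueError(f"Unsupported file type: {key!r}") from None
    return cls(path, normalize_config=normalize_config)


obj = sketch_create_file_type("/tmp/data.csv")
forced = sketch_create_file_type("/tmp/export.txt", filetype="ndjson")
```

In this sketch obj is a SketchCSV because of the .csv extension, while forced is a SketchNDJSON even though its path ends in .txt.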

astro.files.types.get_filetype(filepath)

Return a FileType given the filepath. Uses a naive strategy based on the file extension.

Parameters

filepath (str or pathlib.PosixPath) – URI or Path to a file

Returns

The filetype (e.g. csv, ndjson, json, parquet)

Return type

astro.constants.FileType
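An extension-based lookup of this kind might look like the sketch below. It is not the library's code: SketchFileTypeName and sketch_get_filetype are hypothetical names, the enum values are copied from FileTypeConstants, and lower-casing the extension and raising ValueError for unknown extensions are assumptions.

```python
from enum import Enum
from pathlib import PurePosixPath


class SketchFileTypeName(Enum):
    """Hypothetical stand-in mirroring the values of FileTypeConstants."""

    CSV = "csv"
    JSON = "json"
    NDJSON = "ndjson"
    PARQUET = "parquet"


def sketch_get_filetype(filepath):
    """Return the enum member whose value matches the file extension."""
    # str() accepts both plain strings (including URIs) and pathlib paths.
    extension = PurePosixPath(str(filepath)).suffix.lstrip(".").lower()
    try:
        return SketchFileTypeName(extension)
    except ValueError:
        raise ValueError(f"Unsupported file type: {extension!r}") from None


from_uri = sketch_get_filetype("s3://bucket/data/events.ndjson")
from_path = sketch_get_filetype(PurePosixPath("/tmp/table.parquet"))
```

The strategy is naive in the sense that only the final suffix is inspected: in this sketch a file without an extension, or a compressed file such as data.csv.gz, is rejected rather than having its content examined.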

class astro.files.types.FileTypeConstants

Bases: enum.Enum

Enumeration of the supported file types.

CSV = csv
JSON = json
NDJSON = ndjson
PARQUET = parquet

class astro.files.types.FileType(path, normalize_config=None)

Bases: abc.ABC

Abstract file type class that serves as the interface between client code and all supported file types.

Parameters
  • path (str) –

  • normalize_config (Optional[dict]) –

abstract export_to_dataframe(stream, **kwargs)

Read a file from one of the supported locations and return it as a DataFrame.

Parameters

stream – file stream object

Return type

pandas.DataFrame

abstract create_from_dataframe(df, stream)

Write a file to one of the supported locations.

Parameters
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type

None

property name

Get the file type.

__str__()

String representation of type

__repr__()

Return repr(self).

Return type

str
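The contract of such an abstract base class can be illustrated with a dependency-free sketch. All names here (SketchFileType, SketchLinesFileType) are hypothetical; to avoid requiring pandas, a list of lines stands in for pandas.DataFrame, and declaring name as an abstract property is an assumption.

```python
import io
from abc import ABC, abstractmethod
from typing import Optional


class SketchFileType(ABC):
    """Hypothetical stand-in for the abstract base class described above."""

    def __init__(self, path: str, normalize_config: Optional[dict] = None):
        self.path = path
        self.normalize_config = normalize_config

    @abstractmethod
    def export_to_dataframe(self, stream, **kwargs):
        """Read from `stream` and return the loaded data."""

    @abstractmethod
    def create_from_dataframe(self, df, stream) -> None:
        """Write `df` to `stream`."""

    @property
    @abstractmethod
    def name(self):
        """Return the file type (assumed abstract in this sketch)."""


class SketchLinesFileType(SketchFileType):
    """Toy subclass: a list of lines stands in for pandas.DataFrame."""

    def export_to_dataframe(self, stream, **kwargs):
        return stream.read().splitlines()

    def create_from_dataframe(self, df, stream) -> None:
        stream.write("\n".join(df))

    @property
    def name(self):
        return "lines"


try:
    SketchFileType("data.txt")  # abstract methods are not implemented
except TypeError:
    abstract_instantiable = False
else:
    abstract_instantiable = True

handler = SketchLinesFileType("data.txt")
buffer = io.StringIO()
handler.create_from_dataframe(["a", "b"], buffer)
buffer.seek(0)
round_trip = handler.export_to_dataframe(buffer)
```

Client code written against the base class only calls export_to_dataframe, create_from_dataframe and name, so it does not need to know which concrete subclass it received.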

class astro.files.types.CSVFileType(path, normalize_config=None)

Bases: astro.files.types.base.FileType

Concrete implementation that handles the CSV file type.

Parameters
  • path (str) –

  • normalize_config (Optional[dict]) –

export_to_dataframe(stream, **kwargs)

Read a CSV file from one of the supported locations and return it as a DataFrame.

Parameters

stream – file stream object

Return type

pandas.DataFrame

create_from_dataframe(df, stream)

Write a CSV file to one of the supported locations.

Parameters
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type

None
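What these two methods do for CSV can be approximated with plain pandas calls on an in-memory stream. This is a sketch under assumptions, not the class's implementation: omitting the index on write and forwarding keyword arguments such as dtype to pandas.read_csv are both assumed.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "city": ["Lima", "Oslo"]})

# Writing: the counterpart of create_from_dataframe(df, stream).
# Assumption: the index is not written to the file.
stream = io.StringIO()
df.to_csv(stream, index=False)

# Reading: the counterpart of export_to_dataframe(stream, **kwargs).
# Assumption: extra keyword arguments are forwarded to pandas.read_csv.
stream.seek(0)
loaded = pd.read_csv(stream, dtype={"id": "int64"})
```

The stream has to be rewound with seek(0) before reading because writing leaves the position at the end of the buffer.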

property name

Get the file type.

class astro.files.types.JSONFileType(path, normalize_config=None)

Bases: astro.files.types.base.FileType

Concrete implementation that handles the JSON file type.

Parameters
  • path (str) –

  • normalize_config (Optional[dict]) –

export_to_dataframe(stream, **kwargs)

Read a JSON file from one of the supported locations and return it as a DataFrame.

Parameters

stream (io.TextIOWrapper) – file stream object

create_from_dataframe(df, stream)

Write a JSON file to one of the supported locations.

Parameters
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type

None
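A JSON round trip of this kind can be sketched with pandas alone. The record-oriented layout (orient="records") is an assumption chosen for illustration; the class may serialize differently.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "city": ["Lima", "Oslo"]})

# Writing: the counterpart of create_from_dataframe(df, stream).
# Assumption: rows are serialized as a JSON array of objects.
stream = io.StringIO()
df.to_json(stream, orient="records")

# Reading: the counterpart of export_to_dataframe(stream, **kwargs).
stream.seek(0)
loaded = pd.read_json(stream)
```

With this layout the whole file is a single JSON array, in contrast to NDJSON, where every line is an independent JSON object.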

property name

Get the file type.

class astro.files.types.NDJSONFileType(path, normalize_config=None)

Bases: astro.files.types.base.FileType

Concrete implementation that handles the NDJSON file type.

Parameters
  • path (str) –

  • normalize_config (Optional[dict]) –

export_to_dataframe(stream, **kwargs)

Read an NDJSON file from one of the supported locations and return it as a DataFrame.

Parameters

stream – file stream object

create_from_dataframe(df, stream)

Write an NDJSON file to one of the supported locations.

Parameters
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type

None

property name

Get the file type.

static flatten(normalize_config, stream)

Flatten the nested ndjson/json.

Parameters
  • normalize_config –

  • stream – file stream object

Returns

DataFrame containing the loaded data

Return type

pandas.DataFrame
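Flattening nested NDJSON into columns can be sketched with pandas.json_normalize. This is an illustration only: sketch_flatten is a hypothetical function, and treating normalize_config as a dictionary of keyword arguments for json_normalize is an assumption about its meaning.

```python
import io
import json

import pandas as pd

ndjson_text = (
    '{"id": 1, "user": {"name": "ana", "geo": {"city": "Lima"}}}\n'
    '{"id": 2, "user": {"name": "bo", "geo": {"city": "Oslo"}}}\n'
)


def sketch_flatten(normalize_config, stream):
    """Parse one JSON object per line and flatten nested keys into columns."""
    rows = [json.loads(line) for line in stream if line.strip()]
    # Assumption: normalize_config holds keyword arguments for json_normalize.
    return pd.json_normalize(rows, **(normalize_config or {}))


flat = sketch_flatten({"sep": "_"}, io.StringIO(ndjson_text))
```

In this sketch each nested key becomes its own column, named by joining the key path with the separator, so the city ends up in a column called user_geo_city.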

class astro.files.types.ParquetFileType(path, normalize_config=None)

Bases: astro.files.types.base.FileType

Concrete implementation that handles the Parquet file type.

Parameters
  • path (str) –

  • normalize_config (Optional[dict]) –

export_to_dataframe(stream, **kwargs)

Read a Parquet file from one of the supported locations and return it as a DataFrame.

Parameters

stream – file stream object

create_from_dataframe(df, stream)

Write a Parquet file to one of the supported locations.

Parameters
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type

None

property name

Get the file type.