spark_auto_mapper.data_types.first_valid_column

Module Contents

Classes

AutoMapperFirstValidColumnType

Accepts any number of column definitions and will return the first valid column definition, similar to how coalesce works, but based on the existence of columns rather than null values inside them.

class spark_auto_mapper.data_types.first_valid_column.AutoMapperFirstValidColumnType(*columns)

Bases: spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase, Generic[_TAutoMapperDataType]

Accepts any number of column definitions and will return the first valid column definition, similar to how coalesce works, but with the existence of columns rather than null values inside the columns.

Useful for data sources where a column has been renamed at some point and you need to process files from both before and after the change, or where a column was added later and is missing from earlier files.

Parameters

columns (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column definitions to check, in order; the first one that exists in the source data frame is used
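For illustration, a minimal sketch of how this type might be used inside a mapper. The view names, key, and the "member_id" / "patient_id" column names are hypothetical assumptions; adjust them to your own schema and spark_auto_mapper version.

```python
from spark_auto_mapper.automappers.automapper import AutoMapper
from spark_auto_mapper.helpers.automapper_helpers import AutoMapperHelpers as A
from spark_auto_mapper.data_types.first_valid_column import (
    AutoMapperFirstValidColumnType,
)

# Map to whichever source column exists: newer files use "member_id",
# older files still carry "patient_id".
mapper = AutoMapper(
    view="members",          # hypothetical destination view
    source_view="patients",  # hypothetical source view
    keys=["id"],
).columns(
    member_id=AutoMapperFirstValidColumnType(
        A.column("member_id"),   # preferred, present in newer files
        A.column("patient_id"),  # fallback, present in older files
    )
)
```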

get_column_spec(self, source_df, current_column)

Gets the column spec for this automapper data type

Parameters
  • source_df (Optional[pyspark.sql.DataFrame]) – source data frame in case the automapper type needs that data to decide what to do

  • current_column (Optional[pyspark.sql.Column]) – (Optional) set when the mapper is operating inside an array

Return type

pyspark.sql.Column
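To make the "existence of columns" behaviour concrete, here is a plain PySpark sketch of the underlying idea, not the library's actual implementation: check the source data frame's columns in order and return the first candidate that exists.

```python
from typing import List, Optional

from pyspark.sql import Column, DataFrame
from pyspark.sql import functions as F


def first_existing_column(
    source_df: Optional[DataFrame], candidate_names: List[str]
) -> Column:
    """Return a Column for the first candidate name present in source_df."""
    if source_df is not None:
        for name in candidate_names:
            if name in source_df.columns:
                return F.col(name)
    # None of the candidates exist (or no source data frame was given):
    # fall back to a null literal.
    return F.lit(None)
```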