spark_auto_mapper.data_types.first_valid_column
Module Contents¶
Classes¶
Accepts any number of column definitions and will return the first valid column definition, similar to how coalesce works, but with the existence of columns rather than null values inside the columns.
- class spark_auto_mapper.data_types.first_valid_column.AutoMapperFirstValidColumnType(*columns)¶
Bases: spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase, Generic[_TAutoMapperDataType]
Accepts any number of column definitions and will return the first valid column definition, similar to how coalesce works, but with the existence of columns rather than null values inside the columns.
Useful for data sources in which columns may be renamed at some point, letting you process files from both before and after the name change, or in which columns are added at some point and are missing from earlier files.
- Parameters
columns (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) –
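The selection behaviour described above can be sketched without Spark. This is a minimal, hypothetical helper (not the library's API) that mirrors what the class does when it inspects the source DataFrame's schema: fall through on the *absence of a column*, where coalesce falls through on null *values*. The column names used are illustrative only.

```python
def first_valid_column(available_columns, *candidates):
    """Return the first candidate column name present in the schema.

    Unlike SQL's coalesce, which skips NULL values inside a column,
    this skips candidates whose column does not exist at all.
    """
    for name in candidates:
        if name in available_columns:
            return name
    raise ValueError(f"none of {candidates} found in {available_columns}")


# A file from before a rename has `member_id`; a newer file has `patient_id`.
# The same candidate list handles both schemas.
old_schema = ["member_id", "dob"]
new_schema = ["patient_id", "dob"]

print(first_valid_column(old_schema, "patient_id", "member_id"))  # member_id
print(first_valid_column(new_schema, "patient_id", "member_id"))  # patient_id
```

In the real class, each candidate is an AutoMapperColumnOrColumnLikeType rather than a plain string, and the result is a pyspark.sql.Column rather than a name, but the fall-through order is the same.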
- get_column_spec(self, source_df, current_column)¶
Gets the column spec for this AutoMapper data type.
- Parameters
source_df (Optional[pyspark.sql.DataFrame]) – source data frame in case the automapper type needs that data to decide what to do
current_column (Optional[pyspark.sql.Column]) – (Optional) this is set when we are inside an array
- Return type
pyspark.sql.Column