spark_auto_mapper.data_types.list

Module Contents

Classes

AutoMapperList

Base class for lists

class spark_auto_mapper.data_types.list.AutoMapperList(value, remove_nulls=True, include_null_properties=True)

Bases: spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase, Generic[_T]

Base class for lists.

Generics: https://mypy.readthedocs.io/en/stable/generics.html

Multiple Inheritance: https://stackoverflow.com/questions/52754339/how-to-express-multiple-inheritance-in-python-type-hint

Generates a list (array) in Spark

Parameters

  • value –

  • remove_nulls (bool) –

  • include_null_properties (bool) –
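
A brief usage sketch (hedged: the view, key, and column names are placeholders, and the surrounding AutoMapper call comes from the project's top-level API rather than from this module):

    from spark_auto_mapper.automappers.automapper import AutoMapper
    from spark_auto_mapper.data_types.list import AutoMapperList

    # Map two source columns into a single destination array column.
    # "members", "patients", "member_id", "dst2", "address1" and "address2"
    # are placeholder names used only for this sketch.
    mapper = AutoMapper(
        view="members", source_view="patients", keys=["member_id"]
    ).columns(
        dst2=AutoMapperList(["address1", "address2"])
    )
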
include_null_properties(self, include_null_properties)
Parameters

include_null_properties (bool) –

Return type

None

get_column_spec(self, source_df, current_column)

Gets the column spec for this automapper data type

Parameters
  • source_df (Optional[pyspark.sql.DataFrame]) – source data frame in case the automapper type needs that data to decide what to do

  • current_column (Optional[pyspark.sql.Column]) – (Optional) this is set when we are inside an array

Return type

pyspark.sql.Column
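
A minimal sketch of calling get_column_spec directly (an active SparkSession is assumed, since pyspark Column expressions are built through the JVM; both arguments are passed as None, which the Optional types above allow when the list holds plain literals):

    from pyspark.sql import SparkSession
    from spark_auto_mapper.data_types.list import AutoMapperList

    spark = SparkSession.builder.master("local[1]").getOrCreate()

    # For a list of literal values no source DataFrame is required,
    # so both optional arguments can be left as None.
    column_spec = AutoMapperList(["a", "b"]).get_column_spec(
        source_df=None, current_column=None
    )
    print(column_spec)  # a pyspark.sql.Column for the generated array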

get_schema(self, include_extension)
Parameters

include_extension (bool) –

Return type

Optional[Union[pyspark.sql.types.StructType, pyspark.sql.types.DataType]]
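
A hedged call sketch; per the return type above the result is Optional, so callers should handle None:

    from spark_auto_mapper.data_types.list import AutoMapperList

    schema = AutoMapperList(["a", "b"]).get_schema(include_extension=False)
    if schema is not None:
        print(schema.simpleString())  # StructType/DataType describing the items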

__add__(self, other)

Allows adding items to an array using the + operation

Parameters
  • self – Set by Python. No need to pass.

  • other (AutoMapperList[_T]) – array to add to the current array

Example

A.column("array1") + [ "foo" ]

Return type

AutoMapperList[_T]
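
A hedged sketch of combining two lists with +, following the signature above (the literal item values are placeholders):

    from spark_auto_mapper.data_types.list import AutoMapperList

    first = AutoMapperList(["a", "b"])
    second = AutoMapperList(["c"])

    # __add__ returns an AutoMapperList combining the items of both operands.
    combined = first + second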