spark_auto_mapper.data_types.substring_by_delimiter

Module Contents

Classes

AutoMapperSubstringByDelimiterDataType

Returns the substring from string str before count occurrences of the delimiter.

class spark_auto_mapper.data_types.substring_by_delimiter.AutoMapperSubstringByDelimiterDataType(column, delimiter, delimiter_count)

Bases: spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase

Returns the substring from string str before count occurrences of the delimiter, where str corresponds to column and count to delimiter_count. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for the delimiter.
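The positive/negative count rule can be easier to follow on plain Python strings. The sketch below is only an illustration of the behaviour described here, not this class's implementation; the function name substring_by_delimiter_sketch is hypothetical, and edge cases such as overlapping delimiters are not modelled.

```python
def substring_by_delimiter_sketch(text: str, delimiter: str, count: int) -> str:
    """Hypothetical pure-Python illustration of the count rule (not library code)."""
    # Assumption: a zero count or an empty delimiter yields an empty string.
    if count == 0 or delimiter == "":
        return ""
    # str.split is case-sensitive, matching the documented behaviour.
    parts = text.split(delimiter)
    if count > 0:
        # Keep everything to the left of the count-th delimiter from the left.
        return delimiter.join(parts[:count])
    # Keep everything to the right of the |count|-th delimiter from the right.
    return delimiter.join(parts[count:])


print(substring_by_delimiter_sketch("a.b.c.d", ".", 2))   # a.b
print(substring_by_delimiter_sketch("a.b.c.d", ".", -2))  # c.d
```

When the string contains fewer delimiters than the absolute value of count, the sketch returns the whole string unchanged.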

Parameters
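  • column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – the column or column-like value to take the substring from

  • delimiter (str) – the delimiter to search for (matched case-sensitively)

  • delimiter_count (int) – the number of delimiter occurrences to count; a positive value counts from the left, a negative value from the right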

get_column_spec(self, source_df, current_column)

Gets the column spec for this automapper data type.

Parameters
  • source_df (Optional[pyspark.sql.DataFrame]) – source data frame in case the automapper type needs that data to decide what to do

  • current_column (Optional[pyspark.sql.Column]) – (Optional) this is set when we are inside an array

Return type

pyspark.sql.Column