spark_auto_mapper.helpers.automapper_helpers
Module Contents¶
Classes¶
- class spark_auto_mapper.helpers.automapper_helpers.AutoMapperHelpers¶
- static struct(value)¶
Creates a struct
- Parameters
value (Dict[str, Any]) – A dictionary to be converted to a struct
- Returns
A struct automapper type
- Return type
spark_auto_mapper.data_types.complex.struct_type.AutoMapperDataTypeStruct
- static complex(**kwargs)¶
Creates a complex type.
- Parameters
kwargs (spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType) – parameters to be used to create the complex type
- Returns
A complex automapper type
- Return type
spark_auto_mapper.data_types.complex.complex.AutoMapperDataTypeComplex
- static column(value)¶
Specifies that the value parameter should be used as a column name
- Parameters
value (str) – name of column
- Returns
A column automapper type
- Return type
spark_auto_mapper.data_types.array_base.AutoMapperArrayLikeBase
- static text(value)¶
Specifies that the value parameter should be used as literal text
- Parameters
value (Union[spark_auto_mapper.type_definitions.native_types.AutoMapperNativeSimpleType, spark_auto_mapper.type_definitions.defined_types.AutoMapperTextInputType]) – text value
- Returns
a text automapper type
- Return type
spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase
- static expression(value)¶
Specifies that the value parameter should be executed as a SQL expression in Spark
- Parameters
value (str) – SQL expression
- Returns
an expression automapper type
- Example
A.expression(
    """
    CASE
        WHEN `Member Sex` = 'F' THEN 'female'
        WHEN `Member Sex` = 'M' THEN 'male'
        ELSE 'other'
    END
    """
)
- Return type
spark_auto_mapper.data_types.array_base.AutoMapperArrayLikeBase
- static date(value, formats=None)¶
Converts a value to a date only. For datetime use the datetime mapper type.
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperDateInputType) – value
formats (Optional[List[str]]) – (Optional) formats to use when trying to parse the value; otherwise uses: y-M-d, yyyyMMdd, M/d/y
- Return type
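The fallback parsing described above can be sketched in plain Python. This is illustrative only: the `parse_date` helper and the strptime translations of the documented default formats are assumptions, not this library's implementation.

```python
from datetime import date, datetime
from typing import List, Optional

# strptime equivalents of the documented defaults: y-M-d, yyyyMMdd, M/d/y
DEFAULT_FORMATS = ["%Y-%m-%d", "%Y%m%d", "%m/%d/%Y"]

def parse_date(value: str, formats: Optional[List[str]] = None) -> date:
    """Try each format in order and return the first successful parse."""
    for fmt in formats or DEFAULT_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Could not parse date: {value!r}")
```

The helper stops at the first format that matches, so more specific formats should come first when passing a custom list.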
- static datetime(value, formats=None)¶
Converts the value to a timestamp type in Spark
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperDateInputType) – value
formats (Optional[List[str]]) – (Optional) formats to use when trying to parse the value; otherwise uses Spark defaults
- Return type
spark_auto_mapper.data_types.datetime.AutoMapperDateTimeDataType
- static decimal(value, precision, scale)¶
Specifies the value should be used as a decimal
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperAmountInputType) –
precision (int) – the maximum total number of digits (on both sides of the decimal point)
scale (int) – the number of digits to the right of the decimal point
- Returns
a decimal automapper type
- Return type
spark_auto_mapper.data_types.decimal.AutoMapperDecimalDataType
- static amount(value)¶
Specifies the value should be used as an amount
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperAmountInputType) –
- Returns
an amount automapper type
- Return type
spark_auto_mapper.data_types.amount.AutoMapperAmountDataType
- static boolean(value)¶
Specifies the value should be used as a boolean
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperBooleanInputType) –
- Returns
a boolean automapper type
- Return type
spark_auto_mapper.data_types.boolean.AutoMapperBooleanDataType
- static number(value)¶
Specifies value should be used as a number
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperNumberInputType) –
- Returns
a number automapper type
- Return type
spark_auto_mapper.data_types.number.AutoMapperNumberDataType
- static concat(*args)¶
Concatenates a list of values. Each value can be a string or a column.
- Parameters
args (Union[spark_auto_mapper.type_definitions.native_types.AutoMapperNativeTextType, spark_auto_mapper.type_definitions.wrapper_types.AutoMapperWrapperType, spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase, spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase]) – string or column
- Returns
a concat automapper type
- Return type
spark_auto_mapper.data_types.concat.AutoMapperConcatDataType
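For reference, Spark's underlying concat function returns null if any input is null. A minimal plain-Python sketch of that semantics (the `concat_values` name is hypothetical, not part of this API):

```python
def concat_values(*args):
    """Join string arguments; propagate null like Spark's concat."""
    # Spark's concat yields null when any argument is null
    if any(a is None for a in args):
        return None
    return "".join(str(a) for a in args)
```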
- static if_(column, check, value, else_=None)¶
Checks whether column matches check. Returns value if it matches, otherwise else_.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check
check (Union[spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType, List[spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType]]) – value to compare the column to
value (_TAutoMapperDataType) – what to return if the value matches
else_ (Optional[_TAutoMapperDataType]) – what value to assign if the check fails
- Returns
an if automapper type
- Return type
_TAutoMapperDataType
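Since check may be a single value or a list of values, the comparison behaves roughly like the following plain-Python sketch (illustrative only; the `if_` function below is not the library's implementation):

```python
def if_(column_value, check, value, else_=None):
    """Return `value` when column_value matches check, otherwise else_."""
    # check may be a single value or a list of candidate values
    checks = check if isinstance(check, list) else [check]
    return value if column_value in checks else else_
```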
- static if_not(column, check, value)¶
Checks whether column matches check. Returns value if it does not match.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check
check (Union[spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType, List[spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType]]) – value to compare the column to
value (_TAutoMapperDataType) – what to return if the column does not match
- Returns
an if automapper type
- Return type
_TAutoMapperDataType
- static if_not_null(check, value, when_null=None)¶
Checks if check is null
- Parameters
check (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check for null
value (_TAutoMapperDataType) – what to return if the value is not null
when_null (Optional[_TAutoMapperDataType]) – what value to assign if check is null
- Returns
an if_not_null automapper type
- Return type
_TAutoMapperDataType
- static if_not_null_or_empty(check, value, when_null_or_empty=None)¶
Checks if check is null or empty.
- Parameters
check (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check for null or empty
value (_TAutoMapperDataType) – what to return if the value is not null or empty
when_null_or_empty (Optional[_TAutoMapperDataType]) – what value to assign if check is null or empty
- Returns
an if_not_null_or_empty automapper type
- Return type
_TAutoMapperDataType
- static map(column, mapping, default=None)¶
Maps the contents of a column to values.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column
mapping (Dict[Optional[spark_auto_mapper.type_definitions.defined_types.AutoMapperTextInputType], spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType]) – A dictionary mapping the contents of the column to other values e.g., {"Y": "Yes", "N": "No"}
default (Optional[spark_auto_mapper.type_definitions.defined_types.AutoMapperAnyDataType]) – the value to assign if no value matches
- Returns
a map automapper type
- Return type
spark_auto_mapper.data_types.expression.AutoMapperDataTypeExpression
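The lookup-with-default behavior can be sketched in plain Python (illustrative only; `map_value` is a hypothetical helper, not this library's implementation):

```python
def map_value(column_value, mapping, default=None):
    """Return the mapped value for column_value, or `default` when no key matches."""
    return mapping.get(column_value, default)
```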
- static left(column, length)¶
Take the specified number of first characters in a string
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
length (int) – number of characters to take from left
- Returns
a substring automapper type
- Return type
spark_auto_mapper.data_types.substring.AutoMapperSubstringDataType
- static right(column, length)¶
Take the specified number of last characters in a string
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
length (int) – number of characters to take from right
- Returns
a substring automapper type
- Return type
spark_auto_mapper.data_types.substring.AutoMapperSubstringDataType
- static substring(column, start, length)¶
Returns a substring of the specified string.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
start (int) – position to start
length (int) – number of characters to take
- Returns
a substring automapper type
- Return type
spark_auto_mapper.data_types.substring.AutoMapperSubstringDataType
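Note that Spark's substring positions are 1-based, not 0-based. A plain-Python sketch of the extraction for positive start positions (illustrative only, not the library's implementation):

```python
def substring(s: str, start: int, length: int) -> str:
    """Extract `length` characters starting at 1-based position `start`."""
    # convert Spark's 1-based position to Python's 0-based slice
    return s[start - 1 : start - 1 + length]
```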
- static string_before_delimiter(column, delimiter)¶
Takes the part of the string before the specified delimiter
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
delimiter (str) – string to use as delimiter
- Returns
a substring_by_delimiter automapper type
- Return type
spark_auto_mapper.data_types.substring_by_delimiter.AutoMapperSubstringByDelimiterDataType
- static string_after_delimiter(column, delimiter)¶
Takes the part of the string after the specified delimiter
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
delimiter (str) – string to use as delimiter
- Returns
a substring_by_delimiter automapper type
- Return type
spark_auto_mapper.data_types.substring_by_delimiter.AutoMapperSubstringByDelimiterDataType
- static substring_by_delimiter(column, delimiter, delimiter_count)¶
Returns the substring from the string before count occurrences of the delimiter. substring_by_delimiter performs a case-sensitive match when searching for the delimiter.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
delimiter (str) – string to use as delimiter. can be a regex.
delimiter_count (int) –
- If delimiter_count is positive, everything to the left of the final delimiter
(counting from the left) is returned.
- If delimiter_count is negative, everything to the right of the final delimiter
(counting from the right) is returned.
- Returns
a substring_by_delimiter automapper type
- Return type
spark_auto_mapper.data_types.substring_by_delimiter.AutoMapperSubstringByDelimiterDataType
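The positive/negative count behavior matches Spark's substring_index function and can be sketched in plain Python. This sketch splits on a literal delimiter only (the docstring above says a regex is also allowed) and is illustrative, not the library's implementation:

```python
def substring_by_delimiter(s: str, delimiter: str, count: int) -> str:
    """Mimic Spark's substring_index for a literal (non-regex) delimiter."""
    parts = s.split(delimiter)
    if count > 0:
        # keep everything left of the count-th delimiter
        return delimiter.join(parts[:count])
    # negative count: keep everything right of the count-th delimiter from the end
    return delimiter.join(parts[count:])
```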
- static regex_replace(column, pattern, replacement)¶
Replaces all substrings of the specified string value that match pattern with replacement.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to replace
pattern (str) – pattern to search for
replacement (str) – string to replace with
- Returns
a regex_replace automapper type
- Return type
spark_auto_mapper.data_types.regex_replace.AutoMapperRegExReplaceDataType
- static regex_extract(column, pattern, index)¶
Extracts a specific group matched by a regex from a specified column. If there was no match or the requested group does not exist, an empty string is returned.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to match
pattern (str) – pattern containing groups to match
index (int) – index of the group to return (1-indexed, use 0 to return the whole matched string)
- Returns
a regex_extract automapper type
- Return type
spark_auto_mapper.data_types.regex_extract.AutoMapperRegExExtractDataType
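The group-indexing and empty-string-on-no-match behavior described above can be sketched with Python's re module (illustrative only; `regex_extract` below is not the library's implementation):

```python
import re

def regex_extract(s: str, pattern: str, index: int) -> str:
    """Return the matched group at `index`; groups are 1-indexed, 0 is the whole match."""
    m = re.search(pattern, s)
    # mirror the documented behavior: empty string when there is no match
    return m.group(index) if m else ""
```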
- static trim(column)¶
Trim the spaces from both ends for the specified string column.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to trim
- Returns
a trim automapper type
- Return type
- static lpad(column, length, pad)¶
Returns column value, left-padded with pad to a length of length. If column value is longer than length, the return value is shortened to length characters.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to left pad
length (int) – the desired length of the final string
pad (str) – the character to use to pad the string to the desired length
- Return type
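Both the padding and the truncation behavior described above can be sketched in plain Python (illustrative only; `lpad` below is a hypothetical helper, not the library's implementation):

```python
def lpad(s: str, length: int, pad: str) -> str:
    """Left-pad `s` with `pad` to `length`; truncate if `s` is already longer."""
    if len(s) >= length:
        # value longer than the target length: shorten to `length` characters
        return s[:length]
    needed = length - len(s)
    # repeat the pad string and cut it to exactly the missing width
    return (pad * needed)[:needed] + s
```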
- static hash(*args)¶
Calculates the hash code of given columns, and returns the result as an int column.
- Parameters
args (Union[spark_auto_mapper.type_definitions.native_types.AutoMapperNativeTextType, spark_auto_mapper.type_definitions.wrapper_types.AutoMapperWrapperType, spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase]) – string or column
- Returns
a hash automapper type
- Return type
- static coalesce(*args)¶
Returns the first value that is not null.
- Returns
a coalesce automapper type
- Parameters
args (_TAutoMapperDataType) –
- Return type
_TAutoMapperDataType
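The first-non-null semantics can be sketched in plain Python (illustrative only, not the library's implementation):

```python
def coalesce(*args):
    """Return the first argument that is not None, or None if all are."""
    for a in args:
        if a is not None:
            return a
    return None
```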
- static array_max(*args)¶
Returns the maximum value in the array.
- Returns
an array_max automapper type
- Parameters
args (_TAutoMapperDataType) –
- Return type
_TAutoMapperDataType
- static array_distinct(*args)¶
Returns the distinct items in the array.
- Returns
an array_distinct automapper type
- Parameters
args (_TAutoMapperDataType) –
- Return type
_TAutoMapperDataType
- static if_regex(column, check, value, else_=None)¶
Checks whether column matches check. Returns value if it matches, otherwise else_.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check
check (Union[str, List[str]]) – value to compare the column to. Has to be a string or list of strings
value (_TAutoMapperDataType) – what to return if the value matches
else_ (Optional[_TAutoMapperDataType]) – what value to assign if the check fails
- Returns
an if automapper type
- Return type
_TAutoMapperDataType
- static filter(column, func)¶
Filters a column by a function
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check
func (Callable[[pyspark.sql.Column], pyspark.sql.Column]) – func to filter by
- Returns
a filter automapper type
- Return type
spark_auto_mapper.data_types.filter.AutoMapperFilterDataType
- static transform(column, value)¶
Transforms a column into another type or struct.
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column to check
value (_TAutoMapperDataType) – func to create type or struct
- Returns
a transform automapper type
- Return type
List[_TAutoMapperDataType]
- static field(value)¶
Specifies that the value parameter should be used as a field name
- Parameters
value (str) – name of the field
- Returns
A field automapper type
- Return type
spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase
- static current()¶
Specifies to use the current item
- Returns
A column automapper type
- Return type
spark_auto_mapper.data_types.text_like_base.AutoMapperTextLikeBase
- static split_by_delimiter(column, delimiter)¶
Split a string into an array using the delimiter
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
delimiter (str) – string to use as delimiter
- Returns
a split_by_delimiter automapper type
- Return type
spark_auto_mapper.data_types.split_by_delimiter.AutoMapperSplitByDelimiterDataType
- static float(value)¶
Converts the value to a float.
- Returns
- Return type
- Parameters
value (spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase) –
- static flatten(column)¶
Creates a single array from an array of arrays. If a structure of nested arrays is deeper than two levels, only one level of nesting is removed. Source: http://spark.apache.org/docs/latest/api/python/_modules/pyspark/sql/functions.html#flatten
- Returns
a flatten automapper type
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) –
- Return type
spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase
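The remove-one-level-of-nesting behavior can be sketched in plain Python (illustrative only; `flatten_once` is a hypothetical helper, not the library's implementation):

```python
def flatten_once(arrays):
    """Remove exactly one level of nesting, like Spark's flatten."""
    return [item for arr in arrays for item in arr]
```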
- static first_valid_column(*columns)¶
Allows a column definition to fall back when a source column may not exist. If the optional source column does not exist, the "default" column definition is used instead.
- Returns
an optional automapper type
- Parameters
columns (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) –
- Return type
spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase
- static if_column_exists(column, if_exists, if_not_exists)¶
Checks whether the column exists; returns if_exists if it does, otherwise if_not_exists.
- Returns
an optional automapper type
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) –
if_exists (Optional[_TAutoMapperDataType]) –
if_not_exists (Optional[_TAutoMapperDataType]) –
- Return type
spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase
- static array(value)¶
Creates an array from a single item. Source: http://spark.apache.org/docs/latest/api/python/_modules/pyspark/sql/functions.html#array
- Returns
an array automapper type
- Parameters
value (spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase) –
- Return type
spark_auto_mapper.data_types.data_type_base.AutoMapperDataTypeBase
- static join_using_delimiter(column, delimiter)¶
Joins an array and forms a string using the delimiter
- Parameters
column (spark_auto_mapper.type_definitions.wrapper_types.AutoMapperColumnOrColumnLikeType) – column whose contents to use
delimiter (str) – string to use as delimiter
- Returns
a join automapper type
- Return type
spark_auto_mapper.data_types.join_using_delimiter.AutoMapperJoinUsingDelimiterDataType
- static unix_timestamp(value)¶
Converts the value to a unix timestamp.
- Parameters
value (spark_auto_mapper.type_definitions.defined_types.AutoMapperNumberInputType) – value to convert to unix timestamp
- Returns
a unix_timestamp automapper type
- Return type
spark_auto_mapper.data_types.unix_timestamp.AutoMapperUnixTimestampType