spark_pipeline_framework.utilities.spark_data_frame_helpers
Module Contents
Functions
| Function | Summary |
| --- | --- |
| convert_to_row | |
| create_view_from_dictionary | Parses the dictionary, converts it to a data frame, and creates a view |
| create_dataframe_from_dictionary | Creates a data frame from a dictionary |
| create_empty_dataframe | |
| create_dataframe_from_json | |
| spark_is_data_frame_empty | Efficient way to check if the data frame is empty without getting the count of the whole data frame |
| spark_get_execution_plan | |
| spark_table_exists | |
| sc | |
| add_metadata_to_column | |
| get_metadata_of_column | |
| to_dicts | Converts a data frame into a list of dictionaries |
- spark_pipeline_framework.utilities.spark_data_frame_helpers.convert_to_row(d: Dict[Any, Any]) → pyspark.Row
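Example (a minimal sketch; it assumes convert_to_row maps the dictionary's keys to Row fields):

```python
from spark_pipeline_framework.utilities.spark_data_frame_helpers import convert_to_row

# Build a pyspark Row from a plain dictionary (assumed: keys become field names)
row = convert_to_row({"member_id": 1, "name": "Alice"})
print(row)  # e.g. Row(member_id=1, name='Alice')
```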
- spark_pipeline_framework.utilities.spark_data_frame_helpers.create_view_from_dictionary(view: str, data: List[Dict[str, Any]], spark_session: pyspark.sql.SparkSession, schema: Optional[pyspark.sql.types.StructType] = None) → pyspark.sql.DataFrame
Parses the dictionary, converts it to a data frame, and creates a view.
:param view: name of the view to create
:param data: list of dictionaries to load
:param spark_session: Spark session used to create the data frame
:param schema: optional schema for the data frame
:return: the new data frame
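Example (a minimal usage sketch; the view and column names are illustrative):

```python
from pyspark.sql import SparkSession

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    create_view_from_dictionary,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()

# Load a list of dictionaries and register the result as the view "members"
df = create_view_from_dictionary(
    view="members",
    data=[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    spark_session=spark,
)

# The view can now be queried with Spark SQL
spark.sql("SELECT name FROM members WHERE id = 1").show()
```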
- spark_pipeline_framework.utilities.spark_data_frame_helpers.create_dataframe_from_dictionary(data: List[Dict[str, Any]], spark_session: pyspark.sql.SparkSession, schema: Optional[pyspark.sql.types.StructType] = None) → pyspark.sql.DataFrame
Creates a data frame from a list of dictionaries.
:param data: list of dictionaries to load
:param spark_session: Spark session used to create the data frame
:param schema: optional schema for the data frame
:return: the data frame
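Example (a minimal sketch; the schema and sample rows are illustrative):

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    create_dataframe_from_dictionary,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()

# Passing a schema pins the column types instead of relying on inference
schema = StructType(
    [
        StructField("id", IntegerType(), nullable=False),
        StructField("name", StringType(), nullable=True),
    ]
)

df = create_dataframe_from_dictionary(
    data=[{"id": 1, "name": "Alice"}],
    spark_session=spark,
    schema=schema,
)
df.printSchema()
```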
- spark_pipeline_framework.utilities.spark_data_frame_helpers.create_empty_dataframe(spark_session: pyspark.sql.SparkSession) → pyspark.sql.DataFrame
- spark_pipeline_framework.utilities.spark_data_frame_helpers.create_dataframe_from_json(spark_session: pyspark.sql.SparkSession, schema: pyspark.sql.types.StructType, json: str) → pyspark.sql.DataFrame
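A sketch of the two constructors above (assumption: create_dataframe_from_json accepts a JSON array string; the schema shown is illustrative):

```python
import json

from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    create_dataframe_from_json,
    create_empty_dataframe,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()

# A data frame with no rows, useful as a neutral starting point
empty_df = create_empty_dataframe(spark_session=spark)

# Build a data frame from a JSON string using an explicit schema
# (assumption: a JSON array of objects is accepted here)
schema = StructType(
    [
        StructField("id", IntegerType(), nullable=False),
        StructField("name", StringType(), nullable=True),
    ]
)
payload = json.dumps([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
df = create_dataframe_from_json(spark_session=spark, schema=schema, json=payload)
df.show()
```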
- spark_pipeline_framework.utilities.spark_data_frame_helpers.spark_is_data_frame_empty(df: pyspark.sql.DataFrame) → bool
Efficient way to check if the data frame is empty without getting the count of the whole data frame.
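For example, as a cheaper alternative to comparing df.count() with 0:

```python
from pyspark.sql import SparkSession

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    spark_is_data_frame_empty,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()
df = spark.createDataFrame([(1, "Alice")], ["id", "name"])

# Avoids df.count(), which would scan the whole data frame just to compare with 0
if not spark_is_data_frame_empty(df=df):
    df.show()
```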
- spark_pipeline_framework.utilities.spark_data_frame_helpers.spark_get_execution_plan(df: pyspark.sql.DataFrame, extended: bool = False) → Any
- spark_pipeline_framework.utilities.spark_data_frame_helpers.spark_table_exists(sql_ctx: pyspark.SQLContext, view: str) → bool
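A sketch of the two inspection helpers above (assumptions: spark_get_execution_plan returns the plan, much like df.explain(), and spark_table_exists looks up a view by name through the SQLContext):

```python
from pyspark.sql import SparkSession, SQLContext

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    spark_get_execution_plan,
    spark_table_exists,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()
df = spark.createDataFrame([(1, "Alice")], ["id", "name"])
df.createOrReplaceTempView("members")

# extended=True is assumed to include the analyzed/optimized/physical plans,
# mirroring df.explain(extended=True)
plan = spark_get_execution_plan(df=df, extended=True)
print(plan)

# SQLContext is deprecated in recent PySpark, but it is what this signature expects
sql_ctx = SQLContext(spark.sparkContext, spark)
if spark_table_exists(sql_ctx=sql_ctx, view="members"):
    print("view 'members' is registered")
```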
- spark_pipeline_framework.utilities.spark_data_frame_helpers.sc(df: pyspark.sql.DataFrame) → pyspark.SparkContext
- spark_pipeline_framework.utilities.spark_data_frame_helpers.add_metadata_to_column(df: pyspark.sql.DataFrame, column: str, metadata: Any) → pyspark.sql.DataFrame
- spark_pipeline_framework.utilities.spark_data_frame_helpers.get_metadata_of_column(df: pyspark.sql.DataFrame, column: str) → Any
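A sketch of the three helpers above (the metadata payload is illustrative; per the signature, add_metadata_to_column returns a new data frame):

```python
from pyspark.sql import SparkSession

from spark_pipeline_framework.utilities.spark_data_frame_helpers import (
    add_metadata_to_column,
    get_metadata_of_column,
    sc,
)

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()
df = spark.createDataFrame([(1, "Alice")], ["id", "name"])

# The SparkContext behind a data frame
spark_context = sc(df=df)
print(spark_context.applicationId)

# Attach metadata to a column and read it back
df_with_meta = add_metadata_to_column(
    df=df, column="name", metadata={"source": "registration_form"}
)
print(get_metadata_of_column(df=df_with_meta, column="name"))
```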
- spark_pipeline_framework.utilities.spark_data_frame_helpers.to_dicts(df: pyspark.sql.DataFrame, limit: int) → List[Dict[str, Any]]
Converts a data frame into a list of dictionaries.
:param df: data frame to convert
:param limit: maximum number of rows to convert
:return: one dictionary per row
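For example (assuming limit caps the number of rows collected to the driver):

```python
from pyspark.sql import SparkSession

from spark_pipeline_framework.utilities.spark_data_frame_helpers import to_dicts

spark = SparkSession.builder.master("local[1]").appName("example").getOrCreate()
df = spark.createDataFrame([(1, "Alice"), (2, "Bob")], ["id", "name"])

# Collects at most `limit` rows as plain dictionaries,
# e.g. [{"id": 1, "name": "Alice"}]
rows = to_dicts(df=df, limit=1)
print(rows)
```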