spark_pipeline_framework.pipelines.framework_pipeline

Module Contents

Classes

FrameworkPipeline

Abstract class for transformers that transform one dataset into another.

class spark_pipeline_framework.pipelines.framework_pipeline.FrameworkPipeline(parameters: Dict[str, Any], progress_logger: spark_pipeline_framework.progress_logger.progress_logger.ProgressLogger)

Bases: pyspark.ml.base.Transformer

Abstract class for transformers that transform one dataset into another.

New in version 1.3.0.

property parameters(self) -> Dict[str, Any]
fit(self, df: pyspark.sql.dataframe.DataFrame) -> FrameworkPipeline
_transform(self, df: pyspark.sql.dataframe.DataFrame) -> pyspark.sql.dataframe.DataFrame

Transforms the input dataset.

Parameters
df (pyspark.sql.DataFrame): input dataset.

Returns
pyspark.sql.DataFrame: transformed dataset.

create_steps(self, my_list: Union[List[pyspark.ml.base.Transformer], List[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer], List[Union[pyspark.ml.base.Transformer, List[pyspark.ml.base.Transformer]]], List[Union[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer, List[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer]]]]) -> List[pyspark.ml.base.Transformer]
finalize(self) -> None
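
The create_steps signature accepts a list whose items are either single transformers or nested lists of transformers, and returns a flat list of transformers. The reference does not describe the method's behavior, so the following is only a minimal sketch of what that signature suggests, assuming one level of nesting is spliced into the result in order. The function name flatten_steps is hypothetical and not part of the library, and plain strings stand in for Transformer instances so the sketch runs without Spark.

```python
from typing import Any, List


def flatten_steps(steps: List[Any]) -> List[Any]:
    """Hypothetical stand-in: flatten one level of nested step lists.

    This is an assumption about what create_steps' signature implies,
    not the library's actual implementation.
    """
    flat: List[Any] = []
    for step in steps:
        if isinstance(step, list):
            # A nested list of steps is spliced into the result in order.
            flat.extend(step)
        else:
            # A single step is appended unchanged.
            flat.append(step)
    return flat


# Strings stand in for Transformer instances (assumption for illustration).
steps = flatten_steps(["load", ["clean", "dedupe"], "export"])
print(steps)  # ['load', 'clean', 'dedupe', 'export']
```

Under this reading, a pipeline could group related steps into sub-lists for readability while still handing downstream code a single ordered list.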