spark_pipeline_framework.pipelines.v2.framework_pipeline
Module Contents
Classes
FrameworkPipeline | Abstract class for transformers that transform one dataset into another.
- class spark_pipeline_framework.pipelines.v2.framework_pipeline.FrameworkPipeline(parameters: Dict[str, Any], progress_logger: spark_pipeline_framework.progress_logger.progress_logger.ProgressLogger, run_id: Optional[str], client_name: Optional[str] = None, vendor_name: Optional[str] = None, data_lake_path: Optional[str] = None, validation_output_path: Optional[str] = None)
Bases: pyspark.ml.base.Transformer
Abstract class for transformers that transform one dataset into another.
New in version 1.3.0.
- property parameters(self) -> Dict[str, Any]
- property run_id(self) -> str
- fit(self, df: pyspark.sql.dataframe.DataFrame) -> FrameworkPipeline
- _transform(self, df: pyspark.sql.dataframe.DataFrame) -> pyspark.sql.dataframe.DataFrame
Transforms the input dataset.
Parameters:
- dataset: pyspark.sql.DataFrame, input dataset.
Returns:
pyspark.sql.DataFrame, transformed dataset.
- _check_validation(self, df: pyspark.sql.dataframe.DataFrame) -> None
- create_steps(self, my_list: Union[List[pyspark.ml.base.Transformer], List[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer], List[Union[pyspark.ml.base.Transformer, List[pyspark.ml.base.Transformer]]], List[Union[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer, List[spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer]]]]) -> List[pyspark.ml.base.Transformer]
- finalize(self) -> None
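The create_steps signature above accepts a list whose items are either single transformers or nested lists of transformers, and returns a flat list. The sketch below illustrates one plausible reading of that contract: flattening a single level of nesting while preserving order. This is an assumption inferred from the type hints only, not the library's confirmed implementation. StubTransformer and flatten_steps are hypothetical names used so the example runs without Spark installed.

```python
from typing import List, Union


class StubTransformer:
    """Hypothetical stand-in for pyspark.ml.base.Transformer (not the real class)."""

    def __init__(self, name: str) -> None:
        self.name = name


def flatten_steps(
    my_list: List[Union[StubTransformer, List[StubTransformer]]]
) -> List[StubTransformer]:
    """Flatten one level of nesting, keeping the original order.

    Assumed behavior, modeled on the create_steps type hints only.
    """
    steps: List[StubTransformer] = []
    for item in my_list:
        if isinstance(item, list):
            # A nested list contributes each of its transformers in order.
            steps.extend(item)
        else:
            # A single transformer is appended as-is.
            steps.append(item)
    return steps


load = StubTransformer("load")
clean = StubTransformer("clean")
save = StubTransformer("save")

# Mixed input: two single transformers and one nested list.
flat = flatten_steps([load, [clean], save])
print([step.name for step in flat])
```

If the real method behaves this way, a pipeline subclass could build its steps from groups of transformers returned by helper functions without flattening them by hand.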