spark_pipeline_framework.proxy_generator.proxy_base

Module Contents

Classes

ProxyBase

Abstract class for transformers that transform one dataset into another.

class spark_pipeline_framework.proxy_generator.proxy_base.ProxyBase(parameters: Dict[str, Any], location: Union[str, pathlib.Path], progress_logger: Optional[spark_pipeline_framework.progress_logger.progress_logger.ProgressLogger] = None, verify_count_remains_same: bool = False)

Bases: spark_pipeline_framework.transformers.framework_transformer.v1.framework_transformer.FrameworkTransformer

New in version 1.3.0.

static read_file_as_string(file_path: str) -> str
property transformers(self) -> List[pyspark.ml.base.Transformer]
_transform(self, df: pyspark.sql.DataFrame) -> pyspark.sql.DataFrame

Transforms the input dataset.

Parameters

df : pyspark.sql.DataFrame

input dataset.

Returns

pyspark.sql.DataFrame

transformed dataset
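
The relationship between the transformers property and _transform is not spelled out on this page. One plausible reading is that _transform applies each transformer in the list in order, and that verify_count_remains_same guards against a step changing the row count. The sketch below illustrates that pattern only; it is not the framework's implementation. ChainedSteps, add_full_name, and drop_missing_last are hypothetical names, and a list of dictionaries stands in for a DataFrame so the example runs without Spark.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a DataFrame: a list of row dictionaries.
Rows = List[Dict[str, str]]
Step = Callable[[Rows], Rows]


class ChainedSteps:
    """Illustrative only: applies each step in order, like a list of transformers."""

    def __init__(self, steps: List[Step], verify_count_remains_same: bool = False) -> None:
        self.steps = steps
        # Assumed meaning: fail if any step changes the number of rows.
        self.verify_count_remains_same = verify_count_remains_same

    def transform(self, rows: Rows) -> Rows:
        for step in self.steps:
            count_before = len(rows)
            rows = step(rows)
            if self.verify_count_remains_same and len(rows) != count_before:
                raise ValueError(
                    f"row count changed from {count_before} to {len(rows)}"
                )
        return rows


def add_full_name(rows: Rows) -> Rows:
    # Adds a derived column; the row count is unchanged.
    return [{**row, "full_name": f"{row['first']} {row['last']}"} for row in rows]


def drop_missing_last(rows: Rows) -> Rows:
    # Filters rows; the row count may shrink.
    return [row for row in rows if row["last"]]


people = [{"first": "Ada", "last": "Lovelace"}, {"first": "Cher", "last": ""}]
result = ChainedSteps([add_full_name, drop_missing_last]).transform(people)
```

With the check left off, the filtering step is allowed to drop the second row. Constructing ChainedSteps with verify_count_remains_same=True and the same steps raises ValueError at the filtering step instead.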

fit(self, df: pyspark.sql.DataFrame) -> pyspark.ml.base.Transformer
get_python_transformer(self, import_module_name: str, mapping_file_name: Optional[str] = None) -> pyspark.ml.base.Transformer
get_python_mapping_transformer(self, import_module_name: str, mapping_file_name: Optional[str]) -> pyspark.ml.base.Transformer
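
Both get_python_transformer and get_python_mapping_transformer take an import_module_name, which suggests the transformer is resolved by importing a module by name at run time. How the framework locates the object inside that module is not documented here, so the following is only a sketch of the general technique using the standard library. load_and_build and factory_name are hypothetical, and fractions.Fraction stands in for a transformer class so the example runs without Spark.

```python
import importlib
from typing import Any, Dict


def load_and_build(
    import_module_name: str, factory_name: str, parameters: Dict[str, Any]
) -> Any:
    """Import a module by its dotted name and call a factory found inside it."""
    module = importlib.import_module(import_module_name)
    # factory_name is an assumption of this sketch, not a framework parameter.
    factory = getattr(module, factory_name)
    return factory(**parameters)


# Stand-in usage: build a Fraction from a module name and a parameter dictionary.
value = load_and_build("fractions", "Fraction", {"numerator": 3, "denominator": 4})
```

Passing the parameters as a dictionary mirrors the parameters: Dict[str, Any] argument of the ProxyBase constructor, though whether the framework forwards them this way is an assumption.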