Transform
Transform module
Transform aims to provide an easy interface for calling transformations on a Spark DataFrame, where the transformation is a function that accepts a DataFrame (df) and any number of keyword args.
koheesio.spark.transformations.transform.Transform
The implementation is inspired by and based upon: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.transform.html
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable | The function to be called on the DataFrame. | required |
params | Dict | The keyword arguments to be passed to the function. Defaults to None. Alternatively, keyword arguments can be passed directly as keyword arguments - they will be merged with the | None |
Example
a function compatible with Transform:
verbose style input in Transform
shortened style notation (easier to read)
when too much input is given, Transform will ignore extra input
Transform(
    some_func,
    a="foo",
    # ignored input
    c="baz",
    title=42,
    author="Adams",
    # order of params input should not matter
    b="bar",
)
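The behaviour shown above (extra keyword arguments are ignored and their order does not matter) can be illustrated without Spark. The following is a minimal sketch under stated assumptions, not koheesio's actual implementation: `select_kwargs` is a hypothetical helper, the signature of `some_func` is assumed, directly passed keyword arguments are assumed to be merged into `params`, and a plain list of dicts stands in for a DataFrame.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]  # stand-in for a DataFrame row; not a Spark type


def some_func(df: List[Row], a: str, b: str) -> List[Row]:
    """Hypothetical transformation: add a column named `a` holding the value `b`."""
    return [{**row, a: b} for row in df]


def select_kwargs(
    func: Callable, params: Optional[Dict[str, Any]] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Merge `params` with direct keyword arguments, keeping only names `func` accepts."""
    merged = {**(params or {}), **kwargs}
    # Parameter names declared by func, minus the DataFrame argument itself
    accepted = set(inspect.signature(func).parameters) - {"df"}
    return {name: value for name, value in merged.items() if name in accepted}


# Same input as the call above: c, title and author are dropped
used = select_kwargs(some_func, a="foo", c="baz", title=42, author="Adams", b="bar")
result = some_func([{"id": 1}], **used)
```

Filtering on the function's signature is one possible way to let callers pass a shared bag of parameters to several transformations without each function having to accept `**kwargs`.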
using the from_func classmethod
Source code in src/koheesio/spark/transformations/transform.py
func (class-attribute, instance-attribute)
func: Callable = Field(
    default=None,
    description="The function to be called on the DataFrame.",
)
execute
Call the function on the DataFrame with the given keyword arguments.
from_func (classmethod)
Create a Transform class from a function. Useful for creating a new class with a different name.
This method uses the functools.partial function to create a new class with the given function and keyword arguments. This way you can pre-define some of the keyword arguments for the function that might be needed for the specific use case.
Example
In this example, CustomTransform is a Transform class with the function some_func and the keyword argument a set to "foo". When calling CustomTransform(b="bar"), the function some_func will be called with the keyword arguments a="foo" and b="bar".
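The pre-binding described above can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the koheesio source: `MiniTransform` is a hypothetical stand-in for Transform, the signature of `some_func` is assumed, and a list of dicts plays the role of the DataFrame.

```python
from functools import partial
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]  # stand-in for a DataFrame row; not a Spark type


def some_func(df: List[Row], a: str, b: str) -> List[Row]:
    """Hypothetical transformation: add a column named `a` holding the value `b`."""
    return [{**row, a: b} for row in df]


class MiniTransform:
    """Hypothetical stand-in for Transform; it only stores func and its kwargs."""

    def __init__(self, func: Callable, **kwargs: Any) -> None:
        self.func = func
        self.kwargs = kwargs

    def execute(self, df: List[Row]) -> List[Row]:
        # Call the stored function on the data with the collected keyword arguments
        return self.func(df, **self.kwargs)

    @classmethod
    def from_func(cls, func: Callable, **kwargs: Any) -> partial:
        # partial pre-binds func and the given keyword arguments to the constructor
        return partial(cls, func=func, **kwargs)


CustomTransform = MiniTransform.from_func(some_func, a="foo")
step = CustomTransform(b="bar")  # same as MiniTransform(func=some_func, a="foo", b="bar")
output = step.execute([{"id": 1}])
```

Note that `functools.partial` returns a callable partial object rather than a true subclass, so in this sketch `CustomTransform` can be called like a class but cannot itself be subclassed.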