Transform module
Transform aims to provide an easy interface for calling transformations on a Spark DataFrame, where the transformation is a function that accepts a DataFrame (df) and any number of keyword args.
koheesio.spark.transformations.transform.Transform
The implementation is inspired by and based upon: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.transform.html
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| func | Callable | The function to be called on the DataFrame. | required |
| params | Dict | The keyword arguments to be passed to the function. Defaults to None. Alternatively, keyword arguments can be passed directly as keyword arguments - they will be merged with the | None |
Example
a function compatible with Transform:
verbose style input in Transform
shortened style notation (easier to read)
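The three cases above can be illustrated together with a minimal sketch. It runs without Spark, so it makes several assumptions: a list of dicts stands in for a Spark DataFrame, `MiniTransform` and its `apply_to` method are hypothetical stand-ins for koheesio's `Transform`, and the merge order (direct keyword arguments override `params`) is assumed rather than taken from the source.

```python
from typing import Any, Callable, Dict, List, Optional

Rows = List[Dict[str, Any]]  # stand-in for a Spark DataFrame (assumption)


def some_func(df: Rows, a: str, b: str) -> Rows:
    """Compatible shape: the DataFrame comes first, then keyword arguments."""
    # Add a column named `a` holding the constant value `b` to every row.
    return [{**row, a: b} for row in df]


class MiniTransform:
    """Hypothetical stand-in for Transform; not koheesio's implementation."""

    def __init__(
        self,
        func: Callable,
        params: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ):
        self.func = func
        # Assumption: direct keyword arguments are merged over the params dict.
        self.params = {**(params or {}), **kwargs}

    def apply_to(self, df: Rows) -> Rows:
        # Call the wrapped function with the DataFrame and the merged kwargs.
        return self.func(df, **self.params)


df = [{"id": 1}, {"id": 2}]

# verbose style: function and keyword arguments are passed explicitly
verbose = MiniTransform(func=some_func, params={"a": "foo", "b": "bar"})

# shortened style: the same keyword arguments are passed directly
short = MiniTransform(some_func, a="foo", b="bar")

verbose_result = verbose.apply_to(df)
short_result = short.apply_to(df)
```

In this sketch both styles end up with the same keyword arguments, so both calls produce the same rows.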
when too much input is given, Transform will ignore extra input
```python
Transform(
    some_func,
    a="foo",
    # ignored input
    c="baz",
    title=42,
    author="Adams",
    # order of params input should not matter
    b="bar",
)
```
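One way such filtering of extra input could work is to compare the given keyword arguments against the function's signature. The sketch below is an assumption about the general technique, not koheesio's actual code; `filter_kwargs` is a hypothetical helper, and a plain string stands in for the DataFrame.

```python
import inspect
from typing import Any, Callable, Dict


def filter_kwargs(func: Callable, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keyword arguments that `func` declares as parameters."""
    accepted = inspect.signature(func).parameters
    return {name: value for name, value in kwargs.items() if name in accepted}


def some_func(df: str, a: str, b: str) -> str:
    # A string stands in for the DataFrame so the sketch runs without Spark.
    return f"{df}:{a}:{b}"


given = {"a": "foo", "c": "baz", "title": 42, "author": "Adams", "b": "bar"}

# Only `a` and `b` survive; the order in which they were given does not matter.
usable = filter_kwargs(some_func, given)
result = some_func("df", **usable)
```

Filtering by name, rather than by position, is what makes the order of the inputs irrelevant.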
using the from_func classmethod
Source code in src/koheesio/spark/transformations/transform.py
                    
                  
func (class-attribute, instance-attribute)
func: Callable = Field(default=None, description='The function to be called on the DataFrame.')
execute
Call the function on the DataFrame with the given keyword arguments.
from_func (classmethod)
Create a Transform class from a function. Useful for creating a new class with a different name.
This method uses the functools.partial function to create a new class with the given function and keyword arguments. This way you can pre-define some of the keyword arguments for the function that might be needed for the specific use case.
Example
In this example, CustomTransform is a Transform class with the function some_func and the keyword argument a set to "foo". When CustomTransform is then instantiated with b="bar" and executed, the function some_func is called with the keyword arguments a="foo" and b="bar".
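The pre-binding described above can be sketched with `functools.partial`. This is a minimal illustration under stated assumptions: `MiniTransform` and `apply_to` are hypothetical stand-ins for koheesio's `Transform`, and a plain string stands in for the DataFrame so the sketch runs without Spark.

```python
from functools import partial
from typing import Any, Callable


def some_func(df: str, a: str, b: str) -> str:
    # A string stands in for the DataFrame (assumption for this sketch).
    return f"{df}|a={a}|b={b}"


class MiniTransform:
    """Hypothetical stand-in for Transform; not koheesio's implementation."""

    def __init__(self, func: Callable, **kwargs: Any):
        self.func = func
        self.kwargs = kwargs

    @classmethod
    def from_func(cls, func: Callable, **kwargs: Any) -> Callable[..., "MiniTransform"]:
        # partial pre-binds func and some keyword arguments to the constructor.
        return partial(cls, func, **kwargs)

    def apply_to(self, df: str) -> str:
        # Call the wrapped function with the pre-bound and later kwargs.
        return self.func(df, **self.kwargs)


# a="foo" is fixed up front; b is supplied later at instantiation time.
CustomTransform = MiniTransform.from_func(some_func, a="foo")
custom = CustomTransform(b="bar")
output = custom.apply_to("df")
```

Note that `functools.partial` returns a callable that builds instances, not a true subclass, so `CustomTransform` behaves like a class when called but is not a `type`.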