Replace
Transformation to replace a particular value in a column with another one
koheesio.spark.transformations.replace.Replace
Replace a particular value in a column with another one
Can handle empty strings ("") as well as NULL / None values.
Unsupported datatypes:
The following casts are not supported and will raise an error in Spark:
- binary
- boolean
- array<...>
- map<...,...>
Supported datatypes:
The following casts are supported:
- byte
- short
- integer
- long
- float
- double
- decimal
- timestamp
- date
- string
- void (skipped by default)
Any supported non-string datatype will be cast to string before the replacement is done.
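The cast-to-string step can be illustrated without Spark. The following is a plain-Python sketch of the behaviour described above, not Koheesio's implementation: the helper `replace_value` is hypothetical, `str()` stands in for Spark's cast to string, and the sketch assumes that replacement matches whole values by equality.

```python
from typing import Any, List, Optional


def replace_value(values: List[Any], from_value: str, to_value: str) -> List[Optional[str]]:
    """Hypothetical model: cast each non-null value to str, then replace exact matches."""
    result: List[Optional[str]] = []
    for value in values:
        if value is None:
            # Nulls stay null in this sketch; they never equal from_value.
            result.append(None)
            continue
        as_text = str(value)  # stands in for Spark's cast to string
        result.append(to_value if as_text == from_value else as_text)
    return result


# An integer column comes back as strings, with 2 replaced by "two".
print(replace_value([1, 2, None], from_value="2", to_value="two"))
```

Note that in this model the whole column ends up as strings, including the values that were not replaced.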
Example
input_df:
| id | string |
|---|---|
| 1 | hello |
| 2 | world |
| 3 | |
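from koheesio.spark.transformations.replace import Replace

# Replace every "hello" in the "string" column with "programmer"
output_df = Replace(
    column="string",
    from_value="hello",
    to_value="programmer",
).transform(input_df)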
output_df:
| id | string |
|---|---|
| 1 | programmer |
| 2 | world |
| 3 | |
In this example, the value "hello" in the column "string" is replaced with "programmer".
from_value
from_value: Optional[str] = Field(
default=None,
alias="from",
description="The original value that needs to be replaced. If no value is given, all 'null' values will be replaced with the to_value",
)
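The effect of leaving `from_value` unset can be modelled in plain Python. This is a sketch under assumptions, not the library's code: `replace_or_fill` is a hypothetical helper, Python's `None` stands in for a Spark null, and matching is assumed to be exact equality on the whole value.

```python
from typing import List, Optional


def replace_or_fill(
    values: List[Optional[str]],
    to_value: str,
    from_value: Optional[str] = None,
) -> List[Optional[str]]:
    """Hypothetical model of the from_value default described above."""
    if from_value is None:
        # No from_value given: every null is replaced with to_value.
        return [to_value if v is None else v for v in values]
    # Otherwise only values equal to from_value are replaced (assumed exact match).
    return [to_value if v == from_value else v for v in values]


# Without from_value, only the null entry changes.
print(replace_or_fill(["hello", None, "world"], to_value="unknown"))
# With from_value, only the matching entry changes and the null is left alone.
print(replace_or_fill(["hello", None, "world"], to_value="programmer", from_value="hello"))
```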
to_value
to_value: str = Field(
default=...,
alias="to",
description="The new value to replace this with",
)
ColumnConfig
Column type configurations for the column to be replaced
koheesio.spark.transformations.replace.replace
Function to replace a particular value in a column with another one