Strings
Adds a number of Transformations intended for use with StringType column input. Some also work with other input types, but they output StringType or an array of StringType.
These Transformations take full advantage of Koheesio's ColumnsTransformationWithTarget class, allowing a user to apply column transformations to multiple columns at once. See the class docstrings for more information.
The following Transformations are included:
Lower: Converts a string column to lower case.
Upper: Converts a string column to upper case.
TitleCase or InitCap: Converts a string column to title case, where each word starts with a capital letter.
Concat: Concatenates multiple input columns together into a single column, optionally using the given separator.
pad:
    Pad: Pads the values of source_column with the character until they reach length characters.
    LPad: Pads with a character on the left side of the string.
    RPad: Pads with a character on the right side of the string.
RegexpExtract: Extracts a specific group matched by a Java regexp from the specified string column.
RegexpReplace: Searches for the given regexp and replaces all instances with what is in 'replacement'.
Replace: Replaces all instances of a string in a column with another string.
SplitAll: Splits the contents of a column on the basis of a split_pattern.
SplitAtFirstMatch: Like SplitAll, but only splits the string once. You can specify whether you want the first or second part.
Substring: Extracts a substring from a string column starting at the given position.
trim:
    Trim: Trims whitespace from the beginning and/or end of a string.
    LTrim: Trims whitespace from the beginning of a string.
    RTrim: Trims whitespace from the end of a string.
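
To make the descriptions above concrete, here is a rough sketch of the intended semantics of the pad, regexp, split, and trim transformations, written in plain Python on single string values. It is not Koheesio's API and does not operate on DataFrame columns: every function name and parameter below (including first_part and direction) is a hypothetical stand-in chosen for illustration. It also assumes Python's re module, whose syntax differs in places from the Java regexps the real transformations use, and it assumes a single padding character.

```python
import re


def lpad(value: str, length: int, character: str = " ") -> str:
    # Pad on the left until the value is `length` characters long.
    # Assumption: values already longer than `length` are returned unchanged.
    return value.rjust(length, character)


def rpad(value: str, length: int, character: str = " ") -> str:
    # Pad on the right until the value is `length` characters long.
    return value.ljust(length, character)


def regexp_extract(value: str, regexp: str, index: int = 0) -> str:
    # Return the group at `index` from the first match.
    # Assumption: an empty string is returned when nothing matches.
    match = re.search(regexp, value)
    if match is None:
        return ""
    return match.group(index) or ""


def regexp_replace(value: str, regexp: str, replacement: str) -> str:
    # Replace every match of `regexp` with `replacement`.
    return re.sub(regexp, replacement, value)


def split_all(value: str, split_pattern: str) -> list[str]:
    # Split on every match of the pattern (pattern without capture groups).
    return re.split(split_pattern, value)


def split_at_first_match(value: str, split_pattern: str, first_part: bool = True) -> str:
    # Split once, at the first match, and return the requested part.
    # Assumption: the second part is an empty string when there is no match.
    parts = re.split(split_pattern, value, maxsplit=1)
    if first_part:
        return parts[0]
    return parts[1] if len(parts) > 1 else ""


def trim(value: str, direction: str = "both") -> str:
    # direction is "left", "right", or "both" (hypothetical parameter values).
    if direction == "left":
        return value.lstrip()
    if direction == "right":
        return value.rstrip()
    return value.strip()


if __name__ == "__main__":
    print(lpad("42", 5, "0"))
    print(regexp_extract("2024-05-17", r"(\d{4})-(\d{2})", 1))
    print(split_all("a-b-c", "-"))
    print(split_at_first_match("user@example.com", "@", first_part=False))
    print(repr(trim("  hi  ", "left")))
```

How the real transformations handle edge cases (values longer than the pad length, strings with no match, multi-character padding) is not covered here; the class docstrings are the reference for that.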