This module defines the DeltaTableStreamWriter class, which is used to write streaming dataframes to Delta tables.
koheesio.spark.writers.delta.stream.DeltaTableStreamWriter
Delta table stream writer
Options
Options for DeltaTableStreamWriter
allow_population_by_field_name
class-attribute
instance-attribute
allow_population_by_field_name: bool = Field(
default=True,
description=" To do convert to Field and pass as .options(**config)",
)
maxBytesPerTrigger
class-attribute
instance-attribute
maxBytesPerTrigger: Optional[str] = Field(
default=None,
description="How much data to be processed per trigger. The default is 1GB",
)
maxFilesPerTrigger
class-attribute
instance-attribute
maxFilesPerTrigger: int = Field(
default == 1000,
description="The maximum number of new files to be considered in every trigger (default: 1000).",
)
execute
Source code in src/koheesio/spark/writers/delta/stream.py
| def execute(self) -> DeltaTableWriter.Output:
if self.batch_function:
self.streaming_query = self.writer.start()
else:
self.streaming_query = self.writer.toTable(tableName=self.table.table_name)
|