Skip to content

Stream

This module defines the DeltaTableStreamWriter class, which is used to write streaming dataframes to Delta tables.

koheesio.spark.writers.delta.stream.DeltaTableStreamWriter #

Delta table stream writer

Options #

Options for DeltaTableStreamWriter

allow_population_by_field_name class-attribute instance-attribute #

allow_population_by_field_name: bool = Field(default=True, description=' To do convert to Field and pass as .options(**config)')

maxBytesPerTrigger class-attribute instance-attribute #

maxBytesPerTrigger: Optional[str] = Field(default=None, description='How much data to be processed per trigger. The default is 1GB')

maxFilesPerTrigger class-attribute instance-attribute #

maxFilesPerTrigger: int = Field(default == 1000, description='The maximum number of new files to be considered in every trigger (default: 1000).')

execute #

execute()
Source code in src/koheesio/spark/writers/delta/stream.py
def execute(self):
    if self.batch_function:
        self.streaming_query = self.writer.start()
    else:
        self.streaming_query = self.writer.toTable(tableName=self.table.table_name)