Attributes
spark_expectations.core.current_dir = os.path.dirname(os.path.abspath(__file__))
module-attribute
Functions
spark_expectations.core.get_config_dict(spark: SparkSession, user_conf: Dict[str, Union[str, int, bool, Dict[str, str]]] = None) -> tuple[Dict[str, Union[str, int, bool, Dict[str, str], None]], Dict[str, Union[bool, str]]]
Retrieve the notification and streaming configuration dictionaries from the user configuration, the Spark session, or the default configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spark` | `SparkSession` | The Spark session to retrieve the configuration from. | required |
| `user_conf` | `[Dict[str, Any]]` | User configuration to merge with default configuration. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[Dict[str, Union[str, int, bool, Dict[str, str], None]], Dict[str, Union[bool, str]]]` | A tuple containing `(notification_dict, streaming_dict)`. |
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If there are errors parsing or retrieving the configuration. |
Source code in spark_expectations/core/__init__.py
spark_expectations.core.get_spark_session() -> SparkSession
Source code in spark_expectations/core/__init__.py
spark_expectations.core.infer_safe_cast(input_value: Any) -> Union[int, float, bool, dict, str, None]
Infers and safely casts the input value to int, float, bool, dict, str, or None.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_value` | `Any` | The value to analyze (can be any type). | required |
Returns:
| Type | Description |
|---|---|
| `Union[int, float, bool, dict, str, None]` | The inferred and converted value. |
Source code in spark_expectations/core/__init__.py
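
To make the idea of "infer and safely cast" concrete, here is a minimal sketch of how such a helper could behave. It is not the library's implementation: the function name `infer_safe_cast_sketch`, the order in which types are tried, and the treatment of the strings `"none"` and `"null"` are all assumptions made for illustration.

```python
import ast
import json
from typing import Any, Union


def infer_safe_cast_sketch(input_value: Any) -> Union[int, float, bool, dict, str, None]:
    """Hypothetical illustration of type inference; not the library's code."""
    if input_value is None:
        return None
    # Values that already have a supported non-string type pass through unchanged.
    if isinstance(input_value, (bool, int, float, dict)):
        return input_value

    text = str(input_value).strip()
    lowered = text.lower()

    # Assumption: these spellings stand for a missing value.
    if lowered in ("none", "null"):
        return None
    if lowered in ("true", "false"):
        return lowered == "true"

    # Try the numeric types from most to least specific.
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            pass

    # Try dict-like strings: JSON first, then a Python literal.
    if text.startswith("{"):
        for parser in (json.loads, ast.literal_eval):
            try:
                parsed = parser(text)
            except (ValueError, SyntaxError, TypeError):
                continue
            if isinstance(parsed, dict):
                return parsed

    # Nothing matched, so fall back to the (stripped) string itself.
    return text
```

The cast is "safe" in the sense that every failed conversion is caught and the function falls through to the next candidate type, so an unparseable input comes back as a string rather than raising.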
spark_expectations.core.load_configurations(spark: SparkSession) -> None
Load Spark configuration settings from a YAML file and apply them to the provided SparkSession.
This function:
- Reads the configuration file located at ../config/spark-expectations-default-config.yaml.
- Separates streaming (se.streaming.*) and notification (spark.expectations.*) configurations into dictionaries.
- Sets other configuration values directly in the Spark session.
- Stores streaming and notification configs as JSON strings in Spark session configs.
- Raises RuntimeError for file not found, YAML parsing errors, permission issues, or other exceptions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spark` | `SparkSession` | The SparkSession to apply configurations to. | required |
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If the configuration file is not found, cannot be parsed, or other errors occur. |
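
The prefix-based separation described for `load_configurations` can be sketched with plain dictionaries, without a Spark session. This is an illustration under stated assumptions, not the library's code: the helper `split_config` and the two sample keys under `se.streaming.` and `spark.expectations.` are hypothetical, and only the two prefixes come from the description above.

```python
import json
from typing import Any, Dict, Tuple

STREAMING_PREFIX = "se.streaming."
NOTIFICATION_PREFIX = "spark.expectations."


def split_config(
    config: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: bucket flat config keys by their prefix."""
    streaming: Dict[str, Any] = {}
    notification: Dict[str, Any] = {}
    other: Dict[str, Any] = {}
    for key, value in config.items():
        if key.startswith(STREAMING_PREFIX):
            streaming[key] = value
        elif key.startswith(NOTIFICATION_PREFIX):
            notification[key] = value
        else:
            other[key] = value
    return streaming, notification, other


# Sample input; the first two keys are made up for illustration.
sample = {
    "se.streaming.example.enabled": True,
    "spark.expectations.example.enabled": False,
    "spark.sql.shuffle.partitions": 200,
}

streaming, notification, other = split_config(sample)

# The two grouped dictionaries are serialized to JSON strings, which is the
# form in which the description says they are stored in the session config.
streaming_json = json.dumps(streaming)
notification_json = json.dumps(notification)
```

With a real session, the entries in `other` would be applied one by one with `spark.conf.set(key, value)`, and the two JSON strings would be stored the same way. The names of the session keys that hold them are not given in this section.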