Normally, Apache Spark Structured Streaming writes continuously to a sink through calls such as .writeStream.format("delta").start().
The built-in streaming sinks do not support MERGE operations (upserts) directly. That is where foreachBatch comes in: it hands each micro-batch to your function as a regular DataFrame together with a batch ID, so you can run arbitrary batch logic on it, including MERGE INTO, joins, and custom transformations.
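
As a minimal sketch of the upsert case, the function below merges each micro-batch into a Delta table. It assumes PySpark 3.3 or later (for `DataFrame.sparkSession`) and the delta-spark package; `TARGET_PATH`, `KEY_COLUMNS`, `merge_condition`, and `start_upsert_stream` are hypothetical names introduced only for illustration.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for type hints, so this module imports without Spark
    from pyspark.sql import DataFrame

TARGET_PATH = "/tmp/delta/customers"  # hypothetical location of the target Delta table
KEY_COLUMNS = ["customer_id"]         # hypothetical business key


def merge_condition(keys, target="t", source="s"):
    """Build the ON clause for MERGE, e.g. 't.customer_id = s.customer_id'."""
    return " AND ".join(f"{target}.{k} = {source}.{k}" for k in keys)


def upsert_to_delta(micro_batch_df: "DataFrame", batch_id: int) -> None:
    """Called once per micro-batch by foreachBatch; runs a batch MERGE."""
    from delta.tables import DeltaTable  # imported lazily; requires delta-spark

    # Keep one (arbitrary) row per key: Delta's MERGE raises an error when
    # several source rows match the same target row.
    deduped = micro_batch_df.dropDuplicates(KEY_COLUMNS)

    target = DeltaTable.forPath(micro_batch_df.sparkSession, TARGET_PATH)
    (
        target.alias("t")
        .merge(deduped.alias("s"), merge_condition(KEY_COLUMNS))
        .whenMatchedUpdateAll()       # update rows whose key already exists
        .whenNotMatchedInsertAll()    # insert rows with a new key
        .execute()
    )


def start_upsert_stream(source_df: "DataFrame", checkpoint_path: str):
    """Attach the upsert function to a streaming DataFrame and start the query."""
    return (
        source_df.writeStream
        .foreachBatch(upsert_to_delta)
        .option("checkpointLocation", checkpoint_path)
        .outputMode("update")
        .start()
    )
```

The MERGE runs as an ordinary batch operation on the DataFrame for that micro-batch, which is why it works here even though the streaming sink itself cannot perform it.
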
| What you want to do | Needs foreachBatch? |
|---|---|
| Simple append to Delta | ❌ No |
| Merge/upsert into Delta | ✅ Yes |
| Write to an external system that has no built-in streaming sink (for example, JDBC) | ✅ Yes |
| Run custom Python logic on each micro-batch as a whole | ✅ Yes |
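
For the external-system row, one possible sketch is a function that appends each micro-batch to a relational table through Spark's JDBC data source. The URL, table name, environment variables, and the `jdbc_options` helper are all hypothetical, and a matching JDBC driver is assumed to be on the Spark classpath.

```python
import os

JDBC_URL = "jdbc:postgresql://localhost:5432/analytics"  # hypothetical database
JDBC_TABLE = "public.events"                             # hypothetical table


def jdbc_options(url, table, user, password):
    """Collect the options Spark's JDBC data source reads for a write."""
    return {"url": url, "dbtable": table, "user": user, "password": password}


def write_to_jdbc(micro_batch_df, batch_id):
    """Called once per micro-batch by foreachBatch; appends the rows over JDBC."""
    from pyspark.sql.functions import lit  # imported lazily; requires pyspark

    options = jdbc_options(
        JDBC_URL,
        JDBC_TABLE,
        os.environ.get("DB_USER", ""),      # hypothetical environment variables
        os.environ.get("DB_PASSWORD", ""),
    )
    (
        micro_batch_df
        # Tag each row with the batch ID so rows from a replayed batch can be identified.
        .withColumn("batch_id", lit(batch_id))
        .write.format("jdbc")
        .options(**options)
        .mode("append")
        .save()
    )
```

It would be attached the same way as any other batch function, for example `stream_df.writeStream.foreachBatch(write_to_jdbc).start()` with a checkpoint location set. Because a micro-batch can be retried after a failure, foreachBatch gives at-least-once delivery by default; storing the batch ID is one way to let the target side detect duplicates.
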