What are Watermarks?
Watermarks in Apache Flink are a mechanism for tracking event time and handling out-of-order events in stream processing. A watermark carries a timestamp t and declares that the stream should not be expected to deliver any more events with a timestamp earlier than or equal to t. This tells Flink how far event time has progressed in the stream, so it can trigger event-time computations such as window operations once the data they depend on is assumed to be complete.
Flink distinguishes three notions of time:
- Event Time: the time at which an event actually occurred, as recorded in the event data itself. For more detail, see Understanding Event Time in Apache Flink.
- Ingestion Time: the time at which an event enters the Flink pipeline.
- Processing Time: the time at which an event is processed by a Flink operator.
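To make the role of event time concrete, the sketch below simulates a stream whose arrival order differs from its event-time order and derives a watermark from the event timestamps alone. It is plain Python, not Flink code: the `Event` and `BoundedOutOfOrdernessGenerator` names and the 2-second bound are illustrative assumptions. The formula (largest timestamp seen so far, minus the tolerated delay, minus 1 ms) follows the idea behind Flink's built-in bounded-out-of-orderness strategy.

```python
from dataclasses import dataclass


@dataclass
class Event:
    key: str
    event_time_ms: int  # when the event occurred, carried in the record itself


class BoundedOutOfOrdernessGenerator:
    """Hypothetical helper: tracks the largest event time seen so far."""

    def __init__(self, max_delay_ms: int):
        self.max_delay_ms = max_delay_ms
        self.max_seen_ms = None

    def on_event(self, event: Event) -> None:
        # Only a newer event time can advance the generator.
        if self.max_seen_ms is None or event.event_time_ms > self.max_seen_ms:
            self.max_seen_ms = event.event_time_ms

    def current_watermark(self):
        # No watermark can be stated before the first event is seen.
        if self.max_seen_ms is None:
            return None
        return self.max_seen_ms - self.max_delay_ms - 1


def watermarks_for(events, max_delay_ms):
    """Return the watermark in effect after each event, in arrival order."""
    gen = BoundedOutOfOrdernessGenerator(max_delay_ms)
    result = []
    for e in events:
        gen.on_event(e)
        result.append(gen.current_watermark())
    return result


# Arrival order is not event-time order: 3000 arrives after 4000.
arrivals = [Event("a", 1000), Event("b", 4000), Event("c", 3000), Event("d", 7000)]
watermarks = watermarks_for(arrivals, max_delay_ms=2000)
print(watermarks)
```

The out-of-order event with timestamp 3000 does not move the watermark backwards: the watermark only advances when a larger event time is observed.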
Watermarks
- Definition: A watermark is a timestamp that flows as part of the data stream and denotes the progress of event time.
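- Purpose: Watermarks tell Flink when it can trigger event-time operations such as windows, and they give it a basis for identifying late events, that is, events that arrive after the watermark has already passed their timestamp.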
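The definition and purpose above can be sketched as a small simulation of a tumbling event-time window. This is plain Python rather than Flink's API, and the `run` function, the 5-second window size, and the 2-second delay bound are all assumptions chosen for illustration. A window covering [start, start + size) fires once the watermark reaches its last timestamp (start + size - 1), and an event whose window has already fired is treated as late and dropped, which corresponds to an allowed lateness of zero.

```python
from collections import defaultdict

WINDOW_MS = 5000     # tumbling window size (assumed for this sketch)
MAX_DELAY_MS = 2000  # tolerated out-of-orderness (assumed for this sketch)


def run(event_times_ms):
    """Return (fired, late): fired windows with their events, and dropped late events."""
    open_windows = defaultdict(list)  # window start -> buffered event times
    fired, late = [], []
    watermark = float("-inf")
    max_seen = float("-inf")

    for ts in event_times_ms:
        start = ts - ts % WINDOW_MS
        if start + WINDOW_MS - 1 <= watermark:
            # The watermark already passed the end of this event's window.
            late.append(ts)
        else:
            open_windows[start].append(ts)

        # Advance the watermark from the largest event time seen so far.
        max_seen = max(max_seen, ts)
        watermark = max_seen - MAX_DELAY_MS - 1

        # Fire every window whose last timestamp the watermark has reached.
        for w_start in sorted(open_windows):
            if w_start + WINDOW_MS - 1 <= watermark:
                fired.append((w_start, sorted(open_windows.pop(w_start))))
    return fired, late


fired, late = run([1000, 4000, 3000, 7000, 4500, 12000])
print(fired)
print(late)
```

In this run the event at 7000 pushes the watermark to 4999, which fires the window starting at 0. The event at 4500 arrives afterwards, belongs to that already-fired window, and is therefore late. The window starting at 10000 stays open because no later event advances the watermark far enough.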
Source: Apache Flink