Key Default Type Description
execution.checkpointing.externalized-checkpoint-retention
(none)

Enum

Possible values: [DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION]
Externalized checkpoints write their metadata out to persistent storage and are not automatically cleaned up when the owning job fails or is suspended (terminating with job status `JobStatus#FAILED` or `JobStatus#SUSPENDED`). In this case, you have to manually clean up the checkpoint state, both the metadata and the actual program state.

The mode defines how an externalized checkpoint should be cleaned up on job cancellation. If you choose to retain externalized checkpoints on cancellation, you also have to clean them up manually when you cancel the job (terminating with job status `JobStatus#CANCELED`).

The target directory for externalized checkpoints is configured via `state.checkpoints.dir`.
execution.checkpointing.interval
(none) Duration The interval at which checkpoints are periodically scheduled.

This setting defines the base interval. Checkpoint triggering may be delayed by the settings `execution.checkpointing.max-concurrent-checkpoints` and `execution.checkpointing.min-pause`.
execution.checkpointing.max-concurrent-checkpoints
1 Integer The maximum number of checkpoint attempts that may be in progress at the same time. If this value is n, then no checkpoints will be triggered while n checkpoint attempts are currently in flight. For the next checkpoint to be triggered, one checkpoint attempt would need to finish or expire.
execution.checkpointing.min-pause
0 ms Duration The minimal pause between checkpointing attempts. This setting defines how soon the checkpoint coordinator may trigger another checkpoint after it becomes possible to trigger another checkpoint with respect to the maximum number of concurrent checkpoints (see `execution.checkpointing.max-concurrent-checkpoints`).

If the maximum number of concurrent checkpoints is set to one, this setting effectively ensures that a minimum amount of time passes during which no checkpoint is in progress at all.
execution.checkpointing.mode
EXACTLY_ONCE

Enum

Possible values: [EXACTLY_ONCE, AT_LEAST_ONCE]
The checkpointing mode (exactly-once vs. at-least-once).
execution.checkpointing.prefer-checkpoint-for-recovery
false Boolean If enabled, job recovery falls back to the latest checkpoint even when a more recent savepoint is available.
execution.checkpointing.timeout
10 min Duration The maximum time that a checkpoint may take before being discarded.
execution.checkpointing.tolerable-failed-checkpoints
(none) Integer The number of checkpoint failures that is tolerated. If set to 0, no checkpoint failure is tolerated.
execution.checkpointing.unaligned
false Boolean Enables unaligned checkpoints, which greatly reduce checkpointing times under backpressure.

Unaligned checkpoints contain data stored in buffers as part of the checkpoint state, which allows checkpoint barriers to overtake these buffers. Thus, the checkpoint duration becomes independent of the current throughput as checkpoint barriers are effectively not embedded into the stream of data anymore.

Unaligned checkpoints can only be enabled if `execution.checkpointing.mode` is `EXACTLY_ONCE` and if `execution.checkpointing.max-concurrent-checkpoints` is 1.
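
As a rough illustration of how the options above fit together, the fragment below sketches one possible configuration. It assumes the keys are set as `key: value` pairs in a Flink configuration file such as `flink-conf.yaml` (the file name is an assumption, not stated in this table). The interval, pause, and failure-count values and the checkpoint directory are hypothetical placeholders chosen only for the example.

```yaml
# Hypothetical example values; adjust to your own job and storage.

# Base interval between periodically scheduled checkpoints.
execution.checkpointing.interval: 1 min

# Require at least 30 s after a checkpoint attempt finishes or expires
# before the next one may be triggered.
execution.checkpointing.min-pause: 30 s

# Only one checkpoint attempt in flight at a time
# (also a precondition for unaligned checkpoints).
execution.checkpointing.max-concurrent-checkpoints: 1

# Discard a checkpoint that takes longer than this (the listed default).
execution.checkpointing.timeout: 10 min

# Tolerate up to 3 checkpoint failures; 0 would tolerate none.
execution.checkpointing.tolerable-failed-checkpoints: 3

# EXACTLY_ONCE is the default and is required for unaligned checkpoints.
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.unaligned: true

# Keep externalized checkpoints after cancellation; they must then be
# cleaned up manually.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION

# Target directory for externalized checkpoints (placeholder path).
state.checkpoints.dir: hdfs:///flink/checkpoints
```

With `max-concurrent-checkpoints` set to 1 and a non-zero `min-pause`, each checkpoint attempt in this sketch is followed by at least 30 seconds in which no checkpoint is in progress, so the effective spacing between checkpoints can exceed the 1 minute base interval when attempts run long.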