Keeping operator .status accurate under failures
WHAT IT TESTS: status reliability under faults. OUTLINE: status can lag or go stale during partitions and crashes; make reconcile idempotent, observe true state each loop, use conditions and observedGeneration, handle conflicts.
WHAT IT TESTS: whether you can reason about consistency between a custom resource's status and an external system when failures occur.
ANSWER OUTLINE: status is a best-effort observation. It is not written transactionally with the external action, so a crash between acting and writing status leaves the two inconsistent. A network partition makes observation impossible or leaves the last observation stale. Mitigate by making reconcile idempotent, re-observing the real external state on every loop rather than trusting a cache or the previously written status, reporting state through conditions and observedGeneration, and retrying status updates that fail with a conflict after re-reading the latest object.
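The outline can be sketched as a small simulation. This is a minimal illustration and not real operator code: FakeAPI, FakeExternal, Conflict, the `size` field, and the `conflicts_to_inject` knob are all hypothetical stand-ins for the API server, the external system, and a stale-resourceVersion error. A real operator would use a Kubernetes client library instead.

```python
import copy


class Conflict(Exception):
    """Hypothetical error: the write carried a stale resourceVersion."""


class FakeAPI:
    """In-memory stand-in for the API server (assumption, not a real client)."""

    def __init__(self, obj):
        self._obj = copy.deepcopy(obj)
        self._obj["resourceVersion"] = 1
        self.conflicts_to_inject = 0  # test knob: force this many conflicts

    def get(self):
        return copy.deepcopy(self._obj)

    def update_status(self, obj):
        # Optimistic concurrency: reject writes based on an outdated read.
        if self.conflicts_to_inject > 0:
            self.conflicts_to_inject -= 1
            raise Conflict()
        if obj["resourceVersion"] != self._obj["resourceVersion"]:
            raise Conflict()
        self._obj["status"] = copy.deepcopy(obj["status"])
        self._obj["resourceVersion"] += 1


class FakeExternal:
    """Hypothetical external system whose ensure() is safe to repeat."""

    def __init__(self):
        self.size = None
        self.calls = 0

    def observe(self):
        return self.size

    def ensure(self, size):
        self.calls += 1
        self.size = size


def reconcile(api, ext, max_retries=3):
    """One reconcile pass. Returns True if status was written."""
    obj = api.get()
    gen, desired = obj["generation"], obj["spec"]["size"]

    # Re-observe the real external state; never trust the stored status.
    if ext.observe() != desired:
        ext.ensure(desired)  # idempotent, so a repeat after a crash is safe
    ready = ext.observe() == desired

    status = {
        "observedGeneration": gen,  # the spec generation this pass acted on
        "conditions": [{"type": "Ready", "status": "True" if ready else "False"}],
    }

    # On conflict, re-read to get a fresh resourceVersion and try again.
    for _ in range(max_retries):
        latest = api.get()
        latest["status"] = status
        try:
            api.update_status(latest)
            return True
        except Conflict:
            continue
    return False  # give up for now; the next reconcile pass will retry
```

If the process crashes after `ext.ensure(...)` but before the status write, the next `reconcile` call sees the external state already matches the spec, skips the action, and only repairs status. Writing `observedGeneration` lets a reader tell whether the conditions describe the current spec or an older one.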
- #kubernetes
- #operators
- #status
- #consistency
- #fault-tolerance