Retained Evidence
Product Definition
kube-insight is a historical evidence store for Kubernetes clusters.
It continuously records Kubernetes resource versions, historical topology, and
high-value troubleshooting facts. Its job is to answer operational questions
that live kubectl output, expired Kubernetes Events, metrics, logs, and GitOps
history cannot answer on their own.
Primary Use Case
A service had a small latency or error-rate spike several hours ago.
Current state is healthy. No alert fired. The affected Pods may have restarted, moved, or been replaced. Kubernetes Events may already be expired.
The engineer wants to know:
- Did any related Pod get OOMKilled, evicted, restarted, or become unready?
- Did the Pod move to another Node?
- Did a Deployment, ReplicaSet, image, probe, env var, or resource limit change?
- Did the Node enter memory, disk, PID, or readiness pressure?
- Were other high-resource Pods colocated on the same Node?
- What exact resource versions prove the answer?
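The questions above all reduce to the same shape: which facts about related objects fall inside a time window, and which retained version proves each one. A minimal sketch follows, assuming a hypothetical `Fact` record; the field names, fact labels, and sample data are illustrative and are not kube-insight's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Fact:
    # Hypothetical fact record; field names are illustrative only.
    at: datetime
    kind: str        # e.g. "Pod", "Node", "Deployment"
    name: str
    fact: str        # e.g. "OOMKilled", "MemoryPressure", "ImageChanged"
    version_id: int  # retained resource version that proves the fact

def facts_in_window(facts, start, end, names):
    """Return facts for the named objects inside [start, end], oldest first."""
    hits = [f for f in facts if start <= f.at <= end and f.name in names]
    return sorted(hits, key=lambda f: f.at)

def utc(hour, minute):
    return datetime(2026, 5, 11, hour, minute, tzinfo=timezone.utc)

# Invented sample data for the checkout-api example used elsewhere on this page.
FACTS = [
    Fact(utc(9, 40), "Deployment", "checkout-api", "ImageChanged", 124),
    Fact(utc(10, 7), "Node", "node-a", "MemoryPressure", 5120),
    Fact(utc(10, 9), "Pod", "checkout-api-7d9f", "OOMKilled", 881),
    Fact(utc(10, 12), "Pod", "billing-5c2a", "Restarted", 402),
]

# Objects assumed to be related to the service through historical topology.
related = {"checkout-api", "checkout-api-7d9f", "node-a"}
timeline = facts_in_window(FACTS, utc(10, 5), utc(10, 20), related)
for f in timeline:
    print(f.at.isoformat(), f.kind, f.name, f.fact, f"version={f.version_id}")
```

Each returned fact carries a `version_id`, so every answer can be traced back to a specific retained resource version.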
Platform Engineers
They debug workloads and cluster behavior. They need historical Kubernetes state, topology, and evidence around incidents.
SRE / Observability Engineers
They start from telemetry symptoms and need Kubernetes context around a time window.
Compliance / Governance Engineers
They need to reconstruct historical resource state, ownership, placement, and configuration.
Product Principles
Full History Is The Source Of Truth
Every retained resource version should be reconstructable. Derived facts and topology can be rebuilt; raw history cannot.
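To illustrate why derived facts are disposable while raw history is not, the sketch below recomputes restart facts purely from an ordered list of retained Pod versions. The `status.containerStatuses[].restartCount` and `lastState.terminated.reason` fields are standard Pod status fields; the `version_id`/`object` wrapper and the output fact format are assumptions made for this example.

```python
def container_states(pod):
    """Map container name to its status entry in a Pod document."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return {c["name"]: c for c in statuses}

def derive_facts(versions):
    """Recompute restart facts from consecutive raw Pod versions."""
    facts = []
    for prev, curr in zip(versions, versions[1:]):
        old = container_states(prev["object"])
        for name, status in container_states(curr["object"]).items():
            before = old.get(name, {}).get("restartCount", 0)
            if status.get("restartCount", 0) > before:
                terminated = status.get("lastState", {}).get("terminated", {})
                facts.append({
                    "version_id": curr["version_id"],
                    "container": name,
                    "fact": "Restarted",
                    "reason": terminated.get("reason", "Unknown"),
                })
    return facts

def pod(restarts, last_reason=None):
    """Build a minimal Pod document for the example."""
    status = {"name": "app", "restartCount": restarts}
    if last_reason:
        status["lastState"] = {"terminated": {"reason": last_reason}}
    return {"kind": "Pod", "status": {"containerStatuses": [status]}}

HISTORY = [
    {"version_id": 1, "object": pod(0)},
    {"version_id": 2, "object": pod(1, "OOMKilled")},
    {"version_id": 3, "object": pod(1, "OOMKilled")},
]

facts = derive_facts(HISTORY)
print(facts)
```

Because `derive_facts` is a pure function of the retained versions, a fact index can be dropped and rebuilt at any time. The reverse is not possible: the facts alone cannot recreate the original documents.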
Incident Queries Should Not Scan All JSON
Common troubleshooting paths must use compact relationship and fact indexes. Historical JSON reconstruction is reserved for evidence drill-down and diff.
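One way to picture this split is a narrow, indexed fact table that answers the time-window question, with the full JSON document loaded only for the version a fact points at. The sketch below uses an in-memory SQLite database; the table names, column names, and object ID format are assumptions for illustration, not kube-insight's real schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE versions (version_id INTEGER PRIMARY KEY, object_id TEXT, doc TEXT);
    CREATE TABLE facts (object_id TEXT, at TEXT, fact TEXT, version_id INTEGER);
    CREATE INDEX facts_by_object_time ON facts (object_id, at);
""")

pod_doc = {"kind": "Pod", "metadata": {"name": "checkout-api-7d9f"}}
conn.execute(
    "INSERT INTO versions VALUES (?, ?, ?)",
    (881, "pod/checkout-api-7d9f", json.dumps(pod_doc)),
)
conn.execute(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    ("pod/checkout-api-7d9f", "2026-05-11T10:09:00Z", "OOMKilled", 881),
)

def facts_between(object_id, start, end):
    """Incident path: touches only the compact fact index, never the JSON."""
    return conn.execute(
        "SELECT fact, version_id FROM facts "
        "WHERE object_id = ? AND at BETWEEN ? AND ? ORDER BY at",
        (object_id, start, end),
    ).fetchall()

def load_version(version_id):
    """Drill-down path: parse the stored JSON for one specific version."""
    row = conn.execute(
        "SELECT doc FROM versions WHERE version_id = ?", (version_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

rows = facts_between(
    "pod/checkout-api-7d9f", "2026-05-11T10:05:00Z", "2026-05-11T10:20:00Z"
)
evidence = load_version(rows[0][1])
print(rows, evidence["kind"])
```

Timestamps are stored as same-format UTC ISO 8601 strings here, so plain string comparison orders them correctly.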
Kubernetes-Aware, Backend-Flexible
The product is Kubernetes-specific at the domain layer, but storage backends should remain replaceable.
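A rough sketch of that layering: domain code understands Kubernetes object structure but depends only on a small storage interface, so a backend can be swapped without touching it. The `VersionStore` protocol, `MemoryStore`, and `image_at` names are hypothetical and exist only for this example.

```python
from typing import Optional, Protocol

class VersionStore(Protocol):
    """Hypothetical storage interface; any backend implementing it is usable."""
    def put_version(self, object_id: str, version_id: int, doc: dict) -> None: ...
    def get_version(self, object_id: str, version_id: int) -> Optional[dict]: ...

class MemoryStore:
    """Trivial in-memory backend, standing in for a real database."""
    def __init__(self):
        self._docs = {}

    def put_version(self, object_id, version_id, doc):
        self._docs[(object_id, version_id)] = doc

    def get_version(self, object_id, version_id):
        return self._docs.get((object_id, version_id))

def image_at(store: VersionStore, object_id: str, version_id: int) -> Optional[str]:
    """Domain-layer helper: first container image of a Deployment version."""
    doc = store.get_version(object_id, version_id)
    if doc is None:
        return None
    return doc["spec"]["template"]["spec"]["containers"][0]["image"]

store = MemoryStore()
store.put_version(
    "deployment/checkout-api",
    125,
    {"spec": {"template": {"spec": {"containers": [
        {"name": "app", "image": "checkout-api:1.4.2"},
    ]}}}},
)
print(image_at(store, "deployment/checkout-api", 125))
```

`image_at` knows the Deployment layout (`spec.template.spec.containers`) but nothing about how versions are persisted.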
Evidence First, Not Automatic Blame
The product should rank and explain evidence. It should avoid claiming a root cause without enough signal.
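As a sketch of what "rank, but do not blame" could look like, the function below always returns the evidence in ranked order and only labels a likely cause when the top item clears a threshold. The scores, the `0.8` cutoff, and the verdict strings are invented for illustration; the source does not specify how ranking works.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    summary: str
    score: float  # hypothetical relevance score in [0, 1]

ROOT_CAUSE_THRESHOLD = 0.8  # assumed cutoff, not a documented value

def report(evidence, threshold=ROOT_CAUSE_THRESHOLD):
    """Rank evidence; name a likely cause only when the signal is strong."""
    ranked = sorted(evidence, key=lambda e: e.score, reverse=True)
    strong = bool(ranked) and ranked[0].score >= threshold
    return {
        "verdict": "likely cause" if strong else "inconclusive",
        "evidence": [e.summary for e in ranked],
    }

result = report([
    Evidence("Node node-a reported MemoryPressure", 0.55),
    Evidence("Pod checkout-api-7d9f was OOMKilled", 0.70),
    Evidence("Deployment image changed 25 minutes earlier", 0.30),
])
print(result["verdict"])
for line in result["evidence"]:
    print("-", line)
```

The ranked list is returned in both cases; only the verdict changes, so the engineer always sees the evidence even when no conclusion is drawn.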
Non-Goals
- Replace metrics, logs, traces, or alerting.
- Become a generic schemaless JSON warehouse.
- Index every scalar field in every historical JSON document by default.
- Store Secret payloads by default.
- Require PostgreSQL as the only backend during the early PoC.
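The non-goal of storing Secret payloads by default implies some redaction step before a version is retained. A minimal sketch, assuming the values under `data` and `stringData` are replaced while key names and metadata are kept; whether kube-insight keeps the key names is not stated in the source.

```python
import copy

REDACTED = "<redacted>"  # placeholder chosen for this example

def redact_secret(obj):
    """Return a copy that is safe to retain: Secret values are dropped."""
    if obj.get("kind") != "Secret":
        return obj
    safe = copy.deepcopy(obj)
    for field in ("data", "stringData"):
        if field in safe:
            safe[field] = {key: REDACTED for key in safe[field]}
    return safe

secret = {
    "kind": "Secret",
    "metadata": {"name": "db-credentials", "namespace": "production"},
    "data": {"password": "cGFzc3dvcmQ="},
}
redacted = redact_secret(secret)
print(redacted["data"])
```

A real implementation would also need to consider copies of the payload outside `data`, such as the `kubectl.kubernetes.io/last-applied-configuration` annotation that `kubectl apply` writes.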
Product Surface
CLI:
    kube-insight dev collect samples --context staging --output data/kube-samples
    kube-insight dev ingest --dir data/kube-samples --db kube-insight.db
    kube-insight query service checkout-api \
      --namespace production \
      --from 2026-05-11T10:05:00Z \
      --to 2026-05-11T10:20:00Z
    kube-insight query topology --db kube-insight.db --kind Service --name checkout-api --namespace production
    kube-insight diff deployment checkout-api --from-version 120 --to-version 125
API:
    GET /clusters
    GET /objects
    GET /objects/{id}/latest
    GET /objects/{id}/versions
    GET /objects/{id}/versions/{version_id}
    GET /objects/{id}/diff?from=&to=
    POST /investigations
    GET /investigations/{id}
    GET /topology?root=&at=
    GET /facts?object_id=&from=&to=
UI:
- Cluster timeline.
- Namespace/service/workload explorer.
- Investigation result view.
- Topology-at-time graph.
- Evidence timeline.
- Resource version diff.
- Storage and index strategy report.
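The resource version diff that appears in the CLI, API, and UI can be pictured as a field-level comparison of two retained documents. The sketch below is one simple way to compute such a diff and is not kube-insight's actual algorithm or output format; the two version documents are simplified stand-ins rather than full Deployment manifests.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into {"a.b.c": value}; lists are kept as whole values."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

def diff_versions(old, new):
    """Return {path: (old_value, new_value)} for every field that differs."""
    a, b = flatten(old), flatten(new)
    changes = {}
    for path in sorted(a.keys() | b.keys()):
        if a.get(path) != b.get(path):
            changes[path] = (a.get(path), b.get(path))
    return changes

# Simplified stand-ins for two retained versions of the same object.
v120 = {"spec": {"replicas": 3, "resources": {"limits": {"memory": "512Mi"}}}}
v125 = {"spec": {"replicas": 3, "resources": {"limits": {"memory": "256Mi"}}}}

changes = diff_versions(v120, v125)
for path, (before, after) in changes.items():
    print(f"{path}: {before} -> {after}")
```

This sketch treats a missing field and a field set to `None` as the same thing, and it compares lists wholesale; a production diff would likely need to handle both cases more carefully.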