Initial service load tests after development (baseline)
Grafana AI summary of the load tests run using WireMock:
Overall service health: solid, with one notable pressure point
The service maintained ~99.998% availability and a <0.01% error rate across all tests, including at 1000 req/s. CPU showed no saturation (max 0.25 cores/pod, zero throttling), memory was stable at ~48–56 MiB per pod, JVM threads held steady at 10, and load was evenly distributed across pods with clean scale-up/scale-down.
Per-test observations
03/06 15:34 – Reduced wiremock latency
Good baseline. Istio shows this phase handled ~1003 rps peak cleanly. Connection pools within range.
03/06 16:20 – PWM/wiremock at 10–100ms
Gateway connection pool briefly touched its 2,000-connection ceiling (~16:40). Still functional, but the first sign of connection pressure.
03/06 16:58 – Wiremock at 400ms (real CDS latency)
Biggest stress signal in the dataset. The HTTP client connection pool hit its maximum of 13,000 connections at ~17:46. Higher latency means each connection is held longer, so the pool is exhausted sooner at the same request rate. At real CDS latency (400ms), the connection pool is at risk of saturation. This is the configuration that most closely resembles production.
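The link between latency and connections held can be estimated with Little's Law (average in-flight requests ≈ arrival rate × time each request is held). The sketch below is a rough illustration, not a measurement: it assumes each incoming request makes exactly one downstream call and holds one connection for the full downstream latency, and the function name is hypothetical.

```python
def in_flight_requests(rate_per_s: float, latency_ms: float) -> float:
    """Little's Law: average concurrency = arrival rate x time held.

    Assumes one downstream call per request (an assumption, not a
    confirmed property of the service).
    """
    return rate_per_s * latency_ms / 1000


# Request rate and latencies taken from the test configurations in this report.
for latency_ms in (10, 100, 400):
    concurrent = in_flight_requests(1000, latency_ms)
    print(f"1000 req/s at {latency_ms}ms -> ~{concurrent:.0f} connections in use")
```

Under these assumptions, moving from 100ms to 400ms quadruples the number of connections held at the same request rate. The estimate covers only connections actively serving requests, so it does not by itself explain the pool totals reported by the dashboards.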
03/06 18:00 – 300 req/s, 90% cache hits, 10-100ms wiremock latency
Lighter downstream pressure due to high cache hit rate. No pressure signals observed.
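The effect of the cache hit rate on downstream load can be approximated as follows. This is a simplified sketch that assumes every cache miss triggers exactly one downstream call and every hit triggers none; the function name is hypothetical.

```python
def downstream_rate(request_rate: float, miss_pct: float) -> float:
    """Requests per second that reach the downstream service.

    Assumes one downstream call per cache miss (an illustrative
    assumption, not confirmed behaviour of the service).
    """
    return request_rate * miss_pct / 100


# 300 req/s with 90% cache hits means 10% misses.
print(f"{downstream_rate(300, 10):.0f} req/s downstream")
# For comparison: 1000 req/s with 75% cache misses.
print(f"{downstream_rate(1000, 75):.0f} req/s downstream")
```

With these assumptions, the 300 req/s run sends about 30 req/s downstream, while the 1000 req/s run at 75% misses sends about 750 req/s, which is consistent with pressure signals appearing only in the high-miss runs.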
04/06 08:00–09:25 – 100→700 req/s, 90% cache misses, 10-100ms wiremock latency
The request rate peaked at ~1060 rps. P99 latency reached 210ms around 08:50, climbing noticeably at higher miss rates. The HTTP pool remained at 12,000+ connections as tests progressed: it doesn't fully drain between runs, which compounds pool pressure.
04/06 09:25 – 1000 req/s, 75% cache misses, 10-100ms wiremock latency
Peak load sustained cleanly from an availability standpoint (99.992%). P99 at 210ms. HTTP pool still elevated. The service handles this volume, but with degraded latency.
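To put the availability percentages into absolute terms, they can be converted into failed requests per second. The sketch below assumes availability is computed as successful requests divided by total requests over the test window; the function name is hypothetical.

```python
def failed_requests_per_s(rate_per_s: float, availability_pct: float) -> float:
    """Failed requests per second implied by an availability percentage.

    Assumes availability = successful requests / total requests.
    """
    return rate_per_s * (100 - availability_pct) / 100


# 1000 req/s at 99.992% availability, as in the final run above.
per_second = failed_requests_per_s(1000, 99.992)
print(f"~{per_second:.2f} failed req/s, ~{per_second * 3600:.0f} per hour if sustained")
```

At 1000 req/s, 99.992% availability corresponds to roughly 0.08 failed requests per second, or about 288 per hour if the rate were sustained.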
Results measured against the story's acceptance criteria:
| Date/Time | Config | Recommender Avail. | Router Avail. | P99 Latency | Notes |
|---|---|---|---|---|---|
| 03/06 16:20 | 1000 req/s, ~10% miss, wiremock latency 10-100ms | 100% | 99.9977% | 100ms recommender, 100ms router | No errors |
| 03/06 18:00 | 300 req/s, ~10% miss, wiremock latency 10-100ms | 100% | 99.9977% | 100ms each | No errors |
| 04/06 08:00 | 100 req/s, 90% miss, wiremock latency 10-100ms | 100% | 99.9977% | 110ms recommender, 115ms router | Minimal errors |
| 04/06 08:27 | 300 req/s, 90% miss, wiremock latency 10-100ms | 99.9984% | 99.9984% | 200ms | Some CDS client failures |
| 02/07 12:40 | 1300 req/s, 90% miss, wiremock latency 10-100ms | 99.999% | 99.999% | <200ms router, 100ms recommender | Some CDS client failures |
More test runs can be found in a comment in: https://jira.services.flutteruki.com/browse/CALLST-1481