Initial service load tests after development (baseline)

Summary from Grafana AI regarding load tests using wiremock:

Overall service health: solid, with one notable pressure point

The service maintained ~99.998% availability and an error rate below 0.01% across all tests, including at 1000 req/s. CPU showed no saturation (max 0.25 cores/pod, zero throttling), memory was stable at ~48–56 MiB per pod, JVM threads held steady at 10, and load was evenly distributed across pods with clean scale-up/scale-down.

Per-test observations

03/06 15:34 – Reduced wiremock latency

Good baseline. Istio shows this phase handled ~1003 rps peak cleanly. Connection pools within range.

03/06 16:20 – PWM/wiremock at 10–100ms

Gateway connection pool briefly touched its 2,000-connection ceiling (~16:40). Still functional, but the first sign of connection pressure.

03/06 16:58 – Wiremock at 400ms (real CDS latency)

Biggest stress signal in the dataset. The HTTP client connection pool hit its maximum of 13,000 connections at ~17:46. Higher latency means each connection is held longer, so the pool is exhausted sooner at the same request rate. At real CDS latency (400ms), the connection pool is at risk of saturation, and this is the configuration that most closely resembles production.
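
The link between latency and pool occupancy can be estimated with Little's law: connections in flight ≈ downstream request rate × time each connection is held. The following is a rough sketch only. It assumes each downstream call holds exactly one connection for the full response time, and that 1000 downstream calls/s applies to this run; neither is a measurement from the tests, and the function name is illustrative.

```python
def in_flight_connections(downstream_rps: float, latency_ms: float) -> float:
    """Little's law estimate of connections held on average at steady state."""
    return downstream_rps * latency_ms / 1000


# Assumed rate: 1000 downstream calls/s. Latencies come from the test configs.
for latency_ms in (10, 100, 400):
    estimate = in_flight_connections(1000, latency_ms)
    print(f"{latency_ms} ms -> {estimate:.0f} connections in flight")
```

This is a steady-state average. It ignores idle keep-alive connections, retries, and connections that are not released promptly, so the pool gauge in Grafana can sit well above the estimate.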

03/06 18:00 – 300 req/s, 90% cache hits, 10-100ms wiremock latency

Lighter downstream pressure due to high cache hit rate. No pressure signals observed.
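
The effect of the cache on downstream load is simple arithmetic: only misses reach the downstream dependency. A minimal sketch, assuming every cache miss triggers exactly one downstream call (the function name is illustrative, not taken from the service):

```python
def downstream_rps(request_rps: float, cache_hit_pct: float) -> float:
    """Requests per second that miss the cache and go downstream."""
    return request_rps * (100 - cache_hit_pct) / 100


print(downstream_rps(300, 90))   # this run: 300 req/s at 90% hits -> 30.0
print(downstream_rps(1000, 25))  # 1000 req/s at 75% misses -> 750.0
```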

04/06 08:00–09:25 – 100→700 req/s, 90% cache misses, 10-100ms wiremock latency

Request rate climbed to ~1060 rps peak. P99 latency reached 210ms around 08:50, climbing noticeably at higher miss rates. The HTTP pool remained at 12,000+ connections as tests progressed: it does not fully drain between runs, which compounds pool pressure.

04/06 09:25 – 1000 req/s, 75% cache misses, 10-100ms wiremock latency

Peak load sustained cleanly from an availability standpoint (99.992%). P99 at 210ms. HTTP pool still elevated. The service handles this volume, but with degraded latency.
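
To put the availability figure in absolute terms, it can be converted to an expected count of failed requests. This is a sketch only: it assumes a constant 1000 req/s and a hypothetical one-hour window, since the actual run duration is not stated here.

```python
def failed_requests(availability_pct: float, request_rps: float, duration_s: float) -> float:
    """Expected failed requests for a given availability, rate and duration."""
    return request_rps * duration_s * (100 - availability_pct) / 100


# 99.992% availability at 1000 req/s over an assumed 3600 s window.
print(round(failed_requests(99.992, 1000, 3600)))  # about 288
```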

Table taking into consideration story acceptance criteria:

| Date/Time | Config | Recommender Avail. | Router Avail. | P99 Latency | Notes |
|---|---|---|---|---|---|
| 03/06 16:20 | 1000 req/s, ~10% miss, wiremock latency 10-100ms | 100% | 99.9977% | 100ms recommender, 100ms router | No errors |
| 03/06 18:00 | 300 req/s, ~10% miss, wiremock latency 10-100ms | 100% | 99.9977% | 100ms each | No errors |
| 04/06 08:00 | 100 req/s, 90% miss, wiremock latency 10-100ms | 100% | 99.9977% | 110ms recommender, 115ms router | Minimal errors |
| 04/06 08:27 | 300 req/s, 90% miss, wiremock latency 10-100ms | 99.9984% | 99.9984% | 200ms | Some CDS client failures |
| 02/07 12:40 | 1300 req/s, 90% miss, wiremock latency 10-100ms | 99.999% | 99.999% | <200ms router, 100ms recommender | Some CDS client failures |

More test runs can be found in a comment in: https://jira.services.flutteruki.com/browse/CALLST-1481