Description
What: The runtime should be aware of its own memory consumption and gracefully pause or refuse additional requests when it approaches its limit.
Why: This prevents the sidecar container from being killed for running out of memory (OOM).
Who benefits from this? This is especially important when a memory limit has been defined on the Dapr sidecar, or when Dapr runs on systems with little available memory. That could be an IoT / embedded scenario, a Raspberry Pi Kubernetes cluster, or simply a small Kubernetes cluster with very small VM sizes.
How can this be done?
Conceptually, I recommend that the Dapr HTTP and gRPC servers check memory utilization immediately upon each incoming request. If the utilization is above the currently allowable limit (say, the Kubernetes container resource limit minus 50 MB):
- Pause incoming connections for a certain period of time, frequently re-checking whether memory utilization has dropped enough for the request to be processed.
- If memory utilization is still outside the allowable range after that period, deny the request with HTTP 503 (Service Unavailable). This is far better than having the sidecar container killed.
Here is an example of how this could work conceptually:
Imagine this were my sidecar definition (the memory limit is exposed to the sidecar through the Kubernetes Downward API):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fasthttp-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fasthttp-example
  template:
    metadata:
      labels:
        app: fasthttp-example
    spec:
      containers:
        - name: daprd
          image: daprio/daprd:latest
          env:
            - name: MAX_MEMORY
              valueFrom:
                resourceFieldRef:
                  containerName: daprd
                  resource: limits.memory
                  divisor: "1Mi"
The HTTP request handler of the Dapr API server could then work like this:
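package main

import (
	"fmt"
	"log"
	"os"
	"runtime"
	"strconv"
	"time"

	"github.com/valyala/fasthttp"
)

const (
	holdDuration  = 3 * time.Second        // maximum time a request is held
	checkInterval = 100 * time.Millisecond // delay between memory checks
)

func main() {
	// MAX_MEMORY is the container memory limit in MiB (Downward API, divisor "1Mi").
	maxMemoryStr := os.Getenv("MAX_MEMORY")
	if maxMemoryStr == "" {
		log.Fatal("MAX_MEMORY environment variable not set")
	}
	// Parse as unsigned so it can be compared with runtime.MemStats.Alloc (uint64).
	maxMemory, err := strconv.ParseUint(maxMemoryStr, 10, 64)
	if err != nil {
		log.Fatalf("Invalid MAX_MEMORY value: %v", err)
	}
	maxMemory *= 1024 * 1024 // MiB to bytes

	server := fasthttp.Server{
		Handler: func(ctx *fasthttp.RequestCtx) {
			start := time.Now()
			// Hold the request until memory drops below the limit or the hold period ends.
			for time.Since(start) < holdDuration {
				var m runtime.MemStats
				runtime.ReadMemStats(&m)
				// m.Alloc counts bytes of allocated Go heap objects only.
				if m.Alloc <= maxMemory {
					fmt.Fprintf(ctx, "Hello, world!\n")
					return
				}
				time.Sleep(checkInterval)
			}
			// Still over the limit: refuse with 503 Service Unavailable.
			ctx.SetStatusCode(fasthttp.StatusServiceUnavailable)
			fmt.Fprintf(ctx, "Memory usage exceeded %d bytes\n", maxMemory)
		},
	}
	log.Fatal(server.ListenAndServe(":8080"))
}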