PoolManager pod specialization doesn't work as expected during a peak of simultaneous requests. #2452

@nunojca

Fission/Kubernetes version

$ fission --version
client:
  fission/core:
    BuildDate: "2021-12-29T03:57:30Z"
    GitCommit: 327275d1
    Version: v1.15.1
server:
  fission/core:
    BuildDate: "2021-12-29T03:57:30Z"
    GitCommit: 327275d1
    Version: v1.15.1

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-eks-6b7464", GitCommit:"6b746440c04cb81db4426842b4ae65c3f7035e53", GitTreeState:"clean", BuildDate:"2021-03-19T19:35:50Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:32:32Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes platform: Docker Desktop

Describe the bug
The problem occurs when many requests hit the same function simultaneously and the function has not been called before. Fission then specializes several pods for that function in parallel, instead of serving multiple requests from the same function pod as the rpp (requests per pod) configuration allows.

To Reproduce

  • Create an environment (e.g. go).
  • Create a function (e.g. go hello world) with PoolManager executor and rpp=5.
  • Let’s consider the following basic script:
cat test.sh
time fission fn test --name f2 &
time fission fn test --name f2 &
time fission fn test --name f2 &
time fission fn test --name f2 &
time fission fn test --name f2 &
  • Initially the warm pool has 3 pods:
kubectl get pods -n fission-function | grep Running
poolmgr-go-default-407982-6b66d6bb45-4lrfz   2/2     Running       0          6m41s
poolmgr-go-default-407982-6b66d6bb45-9vwfk   2/2     Running       0          6m39s
poolmgr-go-default-407982-6b66d6bb45-c59fr   2/2     Running       0          6m37s
  • The script runs 5 calls to the function in the background:
./test.sh
Hello, world!

real    0m0.319s
user    0m0.037s
sys     0m0.036s
Hello, world!

real    0m0.324s
user    0m0.028s
sys     0m0.041s
Hello, world!

real    0m0.330s
user    0m0.072s
sys     0m0.000s
Hello, world!

real    0m2.185s
user    0m0.059s
sys     0m0.017s
Hello, world!

real    0m3.479s
user    0m0.031s
sys     0m0.046s
  • The 3 pods in the warm pool were specialized, so the first 3 requests avoided a cold start (about 0.3 s each). The warm pool was then empty, and the remaining 2 requests had to wait for new pods to start, so they suffered a cold start (about 2.2 s and 3.5 s).
kubectl get pods -n fission-function | grep Running
poolmgr-go-default-407982-6b66d6bb45-4lrfz   2/2     Running       0          8m14s
poolmgr-go-default-407982-6b66d6bb45-4tckz   2/2     Running       0          41s
poolmgr-go-default-407982-6b66d6bb45-9vwfk   2/2     Running       0          8m12s
poolmgr-go-default-407982-6b66d6bb45-c59fr   2/2     Running       0          8m10s
poolmgr-go-default-407982-6b66d6bb45-djbph   2/2     Running       0          41s
poolmgr-go-default-407982-6b66d6bb45-k67xk   2/2     Running       0          38s
poolmgr-go-default-407982-6b66d6bb45-nwdhk   2/2     Running       0          41s
poolmgr-go-default-407982-6b66d6bb45-qndrn   2/2     Running       0          39s
  • A total of 8 pods were used: one for each request to the function, plus 3 pods started to bring the pool back to its size of 3. This behavior was not expected. It happens when multiple requests for the same function arrive at the same time and no pod is specialized for that function yet: Fission then specializes pods for the function in parallel and does not honor the rpp parameter.
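To make the expected behavior concrete, here is a minimal Go sketch of an allocator that honors rpp under a simultaneous burst. It is not Fission's implementation: the `pod` and `allocator` types and the `acquire` and `simulate` functions are hypothetical names invented for illustration. The sketch only assumes that looking up a pod with spare capacity and reserving a slot on it happen under one lock, so concurrent first-time callers cannot all conclude that no pod exists yet. It also ignores the time a real pod takes to specialize.

```go
package main

import (
	"fmt"
	"sync"
)

// pod is a hypothetical stand-in for a specialized function pod.
type pod struct {
	active int // requests currently being served by this pod
}

// allocator is a hypothetical per-function allocator, not Fission's code.
type allocator struct {
	mu   sync.Mutex
	rpp  int    // maximum concurrent requests per pod
	pods []*pod // pods specialized for this function so far
}

// acquire returns a pod with spare capacity and specializes a new one only
// when every existing pod already serves rpp requests. The lookup and the
// reservation share one lock, so concurrent callers cannot each decide to
// specialize their own pod.
func (a *allocator) acquire() *pod {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, p := range a.pods {
		if p.active < a.rpp {
			p.active++
			return p
		}
	}
	p := &pod{active: 1} // stands in for specializing a pod from the warm pool
	a.pods = append(a.pods, p)
	return p
}

// simulate fires n concurrent requests and reports how many pods were
// specialized. Requests are never released, which models a burst in which
// all requests are in flight at the same time.
func simulate(n, rpp int) int {
	a := &allocator{rpp: rpp}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			a.acquire()
		}()
	}
	wg.Wait()
	return len(a.pods)
}

func main() {
	fmt.Println(simulate(5, 5)) // prints 1: all 5 requests share one pod
	fmt.Println(simulate(6, 5)) // prints 2: the 6th request needs a second pod
}
```

With rpp=1 the same sketch specializes one pod per request, which matches what was observed here even though the function was created with rpp=5.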

Expected result
Requests are handled in a way that honors rpp: multiple simultaneous requests are served by the same function pod, up to the rpp limit.

Actual result
The pool of pods is exhausted, with just one request per pod.
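The gap between the two results can be worked out with a little arithmetic. The Go sketch below uses the numbers from this report (5 requests, rpp=5, pool size 3). The helper names `podsNeeded` and `runningPods` are hypothetical, and the total assumes the warm pool is refilled to its full size after pods are taken from it, as seen in the pod listing above.

```go
package main

import "fmt"

// podsNeeded returns how many specialized pods a burst of simultaneous
// requests needs when each pod may serve up to rpp requests at once.
func podsNeeded(requests, rpp int) int {
	return (requests + rpp - 1) / rpp // integer ceiling of requests/rpp
}

// runningPods adds the warm pool to the specialized pods, assuming the pool
// is refilled to poolSize once pods have been taken from it.
func runningPods(specialized, poolSize int) int {
	return specialized + poolSize
}

func main() {
	const requests, rpp, poolSize = 5, 5, 3

	expected := podsNeeded(requests, rpp)
	fmt.Println(runningPods(expected, poolSize)) // prints 4: 1 specialized + 3 warm

	// Observed behavior: one specialized pod per request.
	fmt.Println(runningPods(requests, poolSize)) // prints 8: 5 specialized + 3 warm
}
```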
