Add `--condition` #276

felipecrs · 2023-08-25T00:03:12Z

Example:

stern . --condition=ready=false --tail=0

TODOs:

Only start tailing logs for pods matching --condition
Stop tailing logs for pods that no longer match --condition
Add tests

Fixes #213

felipecrs · 2023-08-25T00:04:56Z

I'm giving #213 a go because I really need it. But I'm struggling with the second TODO. Is there a chance someone can help me?

PS: code is not yet ready for final review, but it's working. Also, I have almost zero experience with Golang.

superbrothers · 2023-09-01T11:53:01Z

I will look into it this weekend.

felipecrs · 2023-09-01T11:54:51Z

Thanks a lot @superbrothers!

superbrothers · 2023-09-07T12:18:46Z

It is on the way, but I have a working patch. I won't have much time this weekend but will be able to work on it after that. 816c896

You can try it out as soon as you build it.

felipecrs · 2023-09-08T12:33:26Z

@superbrothers, thanks a lot, you are the best!

However, it's not working quite as I expected. Let me help build a testing scenario too:

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  serviceName: test
  template:
    metadata:
      labels:
        app: test
    spec:
      terminationGracePeriodSeconds: 1
      containers:
      - name: test
        image: bash
        command:
          - bash
          - -c
          - |
            count=0
            while true; do
              echo "Hello World: $count"
              count=$((count+1))
              if [[ $count -eq 10 ]]; then
                touch /tmp/healthy
              fi
              sleep 1
            done
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 1
EOF

This pod becomes ready in about 11 seconds.

Start the stern process with:

./dist/stern '.*' --condition ready=false --tail 0

Stern correctly filters out the pod if the pod is already ready, but Stern does not stop tailing the logs when the pod becomes ready:

(Video attachment: WindowsTerminal_lMhjxZZGe5.mp4)

felipecrs · 2023-09-08T12:46:26Z

Ok, I think I know what's going on: K8s does not seem to emit an event for when the pod becomes ready. I'm investigating what can be done about it.

felipecrs · 2023-09-08T14:26:57Z

Checking kubectl wait's source code, it looks like they invoke an additional watcher for each resource until the condition is met:

https://github.com/kubernetes/kubectl/blob/0266cec8bce880387a33c4d948673749defeb0bc/pkg/cmd/wait/wait.go#L530

I wonder if and how we could do something similar:

If the pod initially matches the condition and gets added, we start this additional watcher for that pod and keep it running until the opposite condition is met. When the opposite condition is met, we add the pod to deleted.

(I already tried to tinker with it myself without much success)
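
A minimal sketch of that add-then-delete idea, not stern's or kubectl's actual code. It assumes a hypothetical `podSnapshot` type standing in for one watch event, and a hypothetical `transitions` helper that replays the observed Ready states and reports when a pod should start or stop being tailed:

```go
package main

import "fmt"

// podSnapshot is a hypothetical stand-in for one watch event of a pod.
type podSnapshot struct {
	UID   string
	Ready bool
}

// transitions replays events in order and returns the actions a tailer
// would take: "add <uid>" when a pod starts matching wantReady, and
// "delete <uid>" when a pod that is being tailed stops matching.
func transitions(events []podSnapshot, wantReady bool) []string {
	tailing := map[string]bool{}
	var actions []string
	for _, e := range events {
		matches := e.Ready == wantReady
		switch {
		case matches && !tailing[e.UID]:
			tailing[e.UID] = true
			actions = append(actions, "add "+e.UID)
		case !matches && tailing[e.UID]:
			delete(tailing, e.UID)
			actions = append(actions, "delete "+e.UID)
		}
	}
	return actions
}

func main() {
	// Pod "a" is not ready twice, then becomes ready.
	events := []podSnapshot{{"a", false}, {"a", false}, {"a", true}}
	fmt.Println(transitions(events, false))
}
```

With `--condition ready=false`, the pod is added on the first event and deleted on the third, which is the behavior the testing scenario above expects.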

tksm · 2023-09-09T03:47:21Z

Hi,

It appears that Stern has detected the events, but they could be filtered out by targetFilter.shouldAdd().

The following patch may resolve the issue, but I've noticed the behavior of --condition ready=false can be somewhat confusing. When we delete pods, the pod's Ready status reverts to False, causing the logs to restart from the beginning. Additionally, the Ready status can change due to other reasons.

diff --git a/stern/stern.go b/stern/stern.go
index c492519..dcece50 100644
--- a/stern/stern.go
+++ b/stern/stern.go
@@ -222,7 +222,7 @@ func Run(ctx context.Context, config *Config) error {
 					if !ok {
 						continue
 					}
-					cancel.(func())()
+					cancel.(context.CancelFunc)()
 				case <-nctx.Done():
 					return nil
 				}
diff --git a/stern/target.go b/stern/target.go
index 553e8c2..cbc6788 100644
--- a/stern/target.go
+++ b/stern/target.go
@@ -136,8 +136,14 @@ OUTER:
 			Container: c.Name,
 		}
 
+		if !conditionFound {
+			visitor(t, false)
+			f.forget(string(pod.UID))
+			continue
+		}
+
 		if f.shouldAdd(t, string(pod.UID), c) {
-			visitor(t, conditionFound)
+			visitor(t, true)
 		}
 	}
 }

The demo below illustrates that the logs restart when the pod is deleted (at 00:18).

felipecrs · 2023-09-09T04:03:06Z

Wow, it looks to be working indeed as per your demo. Can't wait to try your patch out.

BTW the behavior you described is indeed misleading.

That would not happen though when using --tail=0, which is my use case (only capture live logs, not past logs).

I will think about what can be done to circumvent this.

Maybe Stern has to remember up until which point it tailed logs for a given pod, and when it starts tailing it again, it should tail from that point on.

tksm · 2023-09-09T04:15:14Z

@felipecrs Oh, sorry. I missed the --tail=0 option. Thank you for pointing it out.

As you mentioned, Stern behaves as expected when using --tail=0.

I believe it would be better to automatically set --tail=0 when the --condition option is specified in order to avoid confusion.

felipecrs · 2023-09-09T18:26:14Z

@tksm really amazing, indeed it works like a charm. I added both @superbrothers' commit and your patch to this PR now.

Something I believe I need to do before merging is to fix the condition syntax for the config file. I left it as a string there, which makes it exactly like the CLI, but perhaps I should convert it to an object like:

- condition: ready=false
+ condition:
+   name: ready
+   value: false

What do you think? Personally, I don't mind. Even the first one looks good to me.
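
For reference, a sketch of how the string form could be split into a name and a value, so both the CLI and the config file can share it. The `condition` type and `parseCondition` function are hypothetical names, and the lowercasing and the default of `true` for a bare name are assumptions, not stern's confirmed behavior:

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a hypothetical parsed form of "name[=value]".
type condition struct {
	Name  string
	Value string
}

// parseCondition splits s at the first "=". A bare name such as "ready"
// is assumed to mean "ready=true". Name and value are lowercased so that
// later comparisons can be case-insensitive.
func parseCondition(s string) (condition, error) {
	name, value, found := strings.Cut(s, "=")
	if name == "" {
		return condition{}, fmt.Errorf("invalid condition %q: empty name", s)
	}
	if !found {
		value = "true"
	} else if value == "" {
		return condition{}, fmt.Errorf("invalid condition %q: empty value", s)
	}
	return condition{Name: strings.ToLower(name), Value: strings.ToLower(value)}, nil
}

func main() {
	for _, s := range []string{"ready=false", "ready", "=false"} {
		c, err := parseCondition(s)
		fmt.Printf("%q -> %+v, err=%v\n", s, c, err)
	}
}
```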

Another thing to think about is the concern raised by @tksm above when not using --tail=0. I proposed one solution above but I'm not sure if it's worth implementing it. Another option is to simply have this caveat acknowledged somewhere, maybe in the README.

Co-authored-by: Takashi Kusumi <takashi.kusumi@gmail.com>

tksm · 2023-09-10T02:42:34Z

I'm glad to hear it worked as expected. 🎉

Config file

I believe that the first one (condition: ready=false) is better because it maintains consistency with other options. Additionally, we do not need to implement anything in that case.

--tail=0 option

For the initial implementation, I think that the caveat in README.md and the error message when --tail=0 is not set are sufficient. Maybe we should avoid automatically setting --tail=0 for future compatibility.

# Raise an error when `--tail=0` is not set
$ ./dist/stern '.*' --condition ready=false
Error: Currently, --condition and --no-follow=false only work with --tail=0
Usage:
  stern pod-query [flags]

diff --git a/cmd/cmd.go b/cmd/cmd.go
index 35e3060..2c70027 100644
--- a/cmd/cmd.go
+++ b/cmd/cmd.go
@@ -130,6 +130,9 @@ func (o *options) Validate() error {
 	if o.selector != "" && o.resource != "" {
 		return errors.New("--selector and the <resource>/<name> query can not be set at the same time")
 	}
+	if o.condition != "" && !o.noFollow && o.tail != 0 {
+		return errors.New("Currently, --condition and --no-follow=false only work with --tail=0")
+	}
 
 	return nil
 }

felipecrs · 2023-09-10T15:57:47Z

I'm sorry @tksm, but why enforce --tail=0 together with --no-follow=false? Maybe you meant --no-follow=true, where it makes more sense: if --no-follow=true, no live logs would be shown, only past logs. Thus, filtering only live logs with --tail=0 makes no sense.

I changed it in the commit.

felipecrs · 2023-09-10T16:15:08Z

I also just realized that --condition=ready=false will filter out pods that do not have a health check, like Jobs.

For example, with --condition=ready=false, logs for a pod like below will not be shown:

apiVersion: batch/v1
kind: Job
metadata:
  name: test-two
spec:
  completions: 1
  template:
    metadata:
      labels:
        app: test-two
    spec:
      terminationGracePeriodSeconds: 1
      restartPolicy: OnFailure
      containers:
      - name: test-two
        image: bash
        command:
          - bash
          - -c
          - |
            count=0
            while true; do
              echo "World Hello: $count"
              count=$((count+1))
              if [[ $count -eq 10 ]]; then
                exit 1
              fi
              sleep 1
            done

Similarly, for a pod which does not have a readinessProbe:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-two
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-two
  serviceName: test-two
  template:
    metadata:
      labels:
        app: test-two
    spec:
      terminationGracePeriodSeconds: 1
      containers:
      - name: test-two
        image: bash
        command:
          - bash
          - -c
          - |
            count=0
            while true; do
              echo "World Hello: $count"
              count=$((count+1))
              if [[ $count -eq 10 ]]; then
                exit 0
              fi
              sleep 1
            done

My use case (#213) was to filter out only healthy pods, and simple conditions like that cannot fulfil my needs.

The only solution I could think of to circumvent this is: --only-condition-pods-with-readiness. But:

The flag name is ugly as hell (don't you think? or do you have better suggestions?)
I could just simplify this whole thing to --only-unhealthy-pods and drop --condition entirely.

Feedback is appreciated.

Co-authored-by: Takashi Kusumi <takashi.kusumi@gmail.com>

felipecrs · 2023-09-10T16:30:59Z

Another thing, @tksm, I think allowing people to specify --tail more than 0 can be useful. Imagine a situation like:

I want to only display logs for pods which are not ready, but displaying their last 3 lines before they became unhealthy.

felipecrs · 2023-09-10T18:00:17Z

Ok, now it fulfils my use case:

stern . --condition=ready=false --tail=0 --only-condition-pods-with-readiness

Please let me know what you think about it. And, as always, thanks for the help.

felipecrs · 2023-09-11T22:29:02Z

Ok, I have been soaking this internally during the day and the results are really good for my use case.

I would like to ask the maintainers to give me the go ahead on the current solution (--only-condition-pods-with-readiness) before I start cleaning up the PR and working on fixing the tests.

felipecrs · 2025-01-06T18:06:45Z

@tksm, I have simplified and completed this PR. I believe it's good to be merged. Can you please check it again?

felipecrs · 2025-01-07T04:44:42Z

@superbrothers feel free to take a look at this one too. 😅

guettli · 2025-01-07T13:37:53Z

@guettli, do you mean stern could use your tool as a library to handle these conditions?

You can use my code as a library, if you want. I am unsure whether the code is actually usable as a library. Up to now this was not on my mind while coding it. Feel free to create an issue or contact me directly if you have a question!

felipecrs · 2025-01-07T13:41:50Z

Got it. Thank you. For now the current implementation of this PR should be good enough, but I'll keep an eye on it.

tksm

Hi, thanks for the implementation! 🎉

I've confirmed that it works as intended with --tail=0. The manifest I used for testing is attached below.

However, I noticed that using --condition without --tail=0 or --no-follow can lead to somewhat confusing behavior when the condition changes repeatedly. Would you consider adding a note to the documentation and help about this? Alternatively, setting --tail=0 as the default when --condition is specified might help clarify things.

Another option is to automatically set TailLines=0 when the pod's log has already been displayed once, ensuring that logs start from the next event.

Confirmation

These are three outputs I confirmed with different options.

without --condition: all logs from the pod are shown
with --condition=ready: logs are shown only while the pod's Ready condition is true
with --condition=ready=false: logs are shown only while the pod's Ready condition is false
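
The three cases above can be sketched as a single comparison against the pod's condition list. This is an illustration only: `podCondition` and `matchesCondition` are hypothetical names, and the case-insensitive comparison is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// podCondition is a hypothetical stand-in for a Kubernetes pod condition,
// whose status is one of "True", "False", or "Unknown".
type podCondition struct {
	Type   string
	Status string
}

// matchesCondition reports whether conds contains a condition named name
// whose status equals want, comparing both case-insensitively.
// A pod that lacks the condition entirely never matches.
func matchesCondition(conds []podCondition, name, want string) bool {
	for _, c := range conds {
		if strings.EqualFold(c.Type, name) {
			return strings.EqualFold(c.Status, want)
		}
	}
	return false
}

func main() {
	conds := []podCondition{{Type: "Ready", Status: "False"}}
	fmt.Println(matchesCondition(conds, "ready", "true"))  // --condition=ready
	fmt.Println(matchesCondition(conds, "ready", "false")) // --condition=ready=false
}
```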

Manifest

apiVersion: v1
kind: Pod
metadata:
  name: ready-flipper
spec:
  containers:
  - name: ready-flipper
    image: bash
    command:
      - bash
      - -c
      - |
        for _ in $(seq 2); do
          rm -f /tmp/ready; sleep 3
          for _ in $(seq 3); do echo not ready; sleep 1; done
          touch /tmp/ready; sleep 3
          for _ in $(seq 3); do echo ready; sleep 1; done
        done
        sleep 1000000
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/ready
      initialDelaySeconds: 1
      periodSeconds: 1
      failureThreshold: 1

stern/stern.go

felipecrs · 2025-01-10T19:56:45Z

Another option is to automatically set TailLines=0 when the pod's log has already been displayed once, ensuring that logs start from the next event.

@tksm, this would be the best. I'll try to implement it.

felipecrs · 2025-01-10T20:20:36Z

Another option is to automatically set TailLines=0 when the pod's log has already been displayed once, ensuring that logs start from the next event.

@tksm, this would be the best. I'll try to implement it.

Or do you have any hint on implementing it? (Or even if you'd like to implement it yourself, please go ahead).

It's like I was brain washed, I'm shocked.

tksm · 2025-01-11T11:08:17Z

Perhaps this can be achieved by skipping the f.forget() call and adding a new field to targetFilter.targetStates to track whether the pod's log has already been displayed. However, it might be complex, so I'm not sure if it's worth implementing.
(Sorry, I don't have time to implement this myself.)
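
A small sketch of the tracking idea, kept separate from stern's real `targetFilter`. The `tailState` type and its methods are hypothetical; the point is only that the "already displayed" flag has to survive the pod leaving and re-entering the matching state:

```go
package main

import (
	"fmt"
	"sync"
)

// tailState is a hypothetical tracker of which pods have already had
// their logs displayed at least once.
type tailState struct {
	mu        sync.Mutex
	displayed map[string]bool
}

func newTailState() *tailState {
	return &tailState{displayed: make(map[string]bool)}
}

// tailLines returns the configured tail value the first time a pod is
// tailed, and 0 on every later call so that only new lines are shown.
func (s *tailState) tailLines(podUID string, configured int64) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.displayed[podUID] {
		return 0
	}
	s.displayed[podUID] = true
	return configured
}

func main() {
	s := newTailState()
	fmt.Println(s.tailLines("pod-1", 10)) // first time the condition matches
	fmt.Println(s.tailLines("pod-1", 10)) // condition matches again later
}
```

The entry for a pod would have to be kept when the condition stops matching, which is why skipping the forget step matters in this approach.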

felipecrs · 2025-01-12T19:43:59Z

@tksm thank you very much for the suggestion, but I was still not able to implement anything useful (spent a few hours on it).

So, I just added a flag check for now. Perhaps we can improve this in the future, but I don't think it's worth holding this feature back because of it.

cmd/cmd.go

cmd/cmd_test.go

tksm

LGTM 🚀

Thank you very much for the contribution!

felipecrs · 2025-01-13T13:52:50Z

Thank you for patiently helping me complete this PR!

Add --condition condition-name[=condition-value]

7552fbe

Clarify --help

4ae54ad

felipecrs changed the title ~~Add --condition condition-name[=condition-value]~~ Add support for --condition Aug 25, 2023

wip

a8192f2

Fix stern not stopping when ready

31cae76

Co-authored-by: Takashi Kusumi <takashi.kusumi@gmail.com>

felipecrs force-pushed the condition branch from 1d57083 to 31cae76 Compare September 9, 2023 18:27

felipecrs and others added 2 commits September 10, 2023 13:17

Ensure --tail=0 is set

d39a3e7

Co-authored-by: Takashi Kusumi <takashi.kusumi@gmail.com>

Update README

c19b7e0

felipecrs added 3 commits September 10, 2023 14:55

Add --only-condition-pods-with-readiness

7bf930d

Allow --tail different than 0 for --condition

03388c9

Update README example

2962e79

felipecrs changed the title ~~Add support for --condition~~ Add --condition Sep 10, 2023

Optimize condition comparison a little bit

ace7303

felipecrs added 2 commits January 6, 2025 14:55

Add tests

fe19430

Drop pod readiness probe check functionality

32adf34

felipecrs marked this pull request as ready for review January 6, 2025 18:03

felipecrs requested review from superbrothers, floryut, rkmathi and tksm as code owners January 6, 2025 18:03

Merge branch 'master' into condition

c287850

tksm reviewed Jan 10, 2025

stern/stern.go (outdated)

stern/stern.go (outdated)

stern/stern.go (outdated)

Apply review comments

19a5cec

Add check for --condition without --tail=0 for now

e446e00

felipecrs requested a review from tksm January 12, 2025 19:44

tksm reviewed Jan 13, 2025

cmd/cmd.go (outdated)

Allow --no-follow

d959228

felipecrs requested a review from tksm January 13, 2025 02:58

tksm reviewed Jan 13, 2025

cmd/cmd_test.go (outdated)

Fix test case

d9eae36

felipecrs requested a review from tksm January 13, 2025 13:18

tksm approved these changes Jan 13, 2025

tksm merged commit 2576972 into stern:master Jan 13, 2025
2 checks passed

felipecrs deleted the condition branch January 13, 2025 13:51
