
[Bug]: Use a unique token for log pagination instead of a timestamp #2833

@peterschmidt85

Description


Steps to reproduce

  • Adjust the source code to display log.timestamp instead of log.message
  • Add print("next_start_time", next_start_time) at the end of the while loop (around resp = self._api_client.logs.poll in runs.py)

Run the following with each of the file, AWS, and GCP logging backends.

type: task
name: print-numbers

commands:
  - |
    python -c "for i in range(1, 100001): print(i)"
Then run:

$ dstack logs print-numbers | sort | uniq -c > print-numbers.txt
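
To check whether the collected output is truncated or contains duplicates, something like the following can be run against print-numbers.txt (it assumes the sort | uniq -c format produced above; the script itself is only illustrative):

# Every number 1..100000 should appear exactly once in print-numbers.txt.
expected = set(range(1, 100001))
seen = {}
with open("print-numbers.txt") as f:
    for line in f:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip any non-numeric lines mixed into the output
        count, value = int(parts[0]), int(parts[1])
        seen[value] = seen.get(value, 0) + count

missing = expected - seen.keys()
duplicated = {v: c for v, c in seen.items() if c > 1}
print(f"missing: {len(missing)}, duplicated: {len(duplicated)}")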

Actual behaviour

The polling function across all storage backends (file/GCP/AWS) uses start_time to paginate logs.
When too many events share the same timestamp, the output gets truncated.
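
For illustration, here is a minimal self-contained sketch (not dstack's actual code) of the failure mode: with a timestamp cursor, the only way to make progress is to advance past the last timestamp seen, which silently drops every remaining event that shares that timestamp.

from dataclasses import dataclass

@dataclass
class LogEvent:
    timestamp: int  # millisecond precision; many events can collide
    message: str

# 100 events, all written within the same millisecond
EVENTS = [LogEvent(timestamp=1000, message=f"line {i}") for i in range(100)]
PAGE_SIZE = 25

def poll(start_time: int) -> list:
    """Return up to PAGE_SIZE events with timestamp >= start_time."""
    return [e for e in EVENTS if e.timestamp >= start_time][:PAGE_SIZE]

received = []
start_time = 0
while True:
    page = poll(start_time)
    if not page:
        break
    received.extend(page)
    # A timestamp cursor must advance past the last seen timestamp to
    # avoid looping forever, but that skips every other event sharing it.
    start_time = page[-1].timestamp + 1

print(len(received))  # prints 25, not 100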

Expected behaviour

The polling function should NOT use start_time for pagination. It should always use a unique token:

  • AWS/GCP already support their own pagination tokens
  • File logging may use the line number from the file (see the sketch below)
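
A minimal sketch of the proposed approach for the file backend, assuming the token is simply the index of the next unread line (poll_file_logs and its signature are illustrative, not dstack's API):

from pathlib import Path
from typing import Optional, Tuple

PAGE_SIZE = 1000

def poll_file_logs(log_file: Path, next_token: Optional[int] = None) -> Tuple[list, Optional[int]]:
    """Return one page of log lines plus the token to resume from.

    The token is the zero-based index of the first unread line, so events
    with identical timestamps can never be skipped or duplicated.
    """
    start = next_token or 0
    lines = log_file.read_text().splitlines()
    page = lines[start:start + PAGE_SIZE]
    new_token = start + len(page) if start + len(page) < len(lines) else None
    return page, new_token

def read_all(log_file: Path) -> list:
    """Drain all pages; stop only when the backend reports no further token."""
    out = []
    token = None
    while True:
        page, token = poll_file_logs(log_file, token)
        out.extend(page)
        if token is None:
            break
    return out

AWS CloudWatch Logs (nextForwardToken) and GCP Cloud Logging (nextPageToken) already return an opaque continuation token that plays this role, so those backends could pass it through instead of recomputing start_time.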

dstack version

0.19.15

Server logs

Additional information

print-numbers.txt
print-numbers-aws.txt
