-
Notifications
You must be signed in to change notification settings - Fork 1.2k
exp show :Add --hide-queued
and --hide-failed
flag
#8318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dvc/repo/experiments/show.py
Outdated
def _get_last_error_message( | ||
celery_queue: "LocalCeleryQueue", exp_rev: str | ||
) -> str: | ||
queue_entry: Optional[ | ||
"QueueEntry" | ||
] = celery_queue.match_queue_entry_by_name( | ||
{exp_rev}, celery_queue.iter_failed() | ||
).get( | ||
exp_rev | ||
) | ||
if queue_entry: | ||
try: | ||
proc_info: "ProcessInfo" = celery_queue.proc[queue_entry.stash_rev] | ||
with open(proc_info.stdout, encoding="utf-8") as f_read: | ||
return f_read.readlines()[-1].strip() | ||
except (KeyError, FileNotFoundError): | ||
pass | ||
return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is really a reliable way to handle this case. It only works if the pipeline was run without debugging enabled.
@mattseddon can weigh in on this, but I think it would be better to just use a generic "Experiment run failed." error message, and then add a new field where we can pass the log file path to the vscode extension. And then on the vscode side they could potentially display the entire log to the user.
Actually, passing the log location seems useful in every exp case, not just for failures. This is something that the vscode extension could also get through dvc queue logs
but I think it might be easier for them to just get the log file path and read it themselves when needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan on this is to use "type" field in "error" later in which I can add info of in which part (initialization, env preparing, running, result collection) of the tasks the error comes out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I would still suggest just using a generic error message for now instead of trying to use the last log line, since the last log line is not actually guaranteed to contain anything useful.
In the long run we should probably also add an error message field to the executor info and save it here:
dvc/dvc/repo/experiments/executor/base.py
Lines 568 to 580 in d2a31eb
try: | |
logger.debug("Running repro in '%s'", os.getcwd()) | |
yield dvc | |
except CheckpointKilledError: | |
raise | |
except DvcException: | |
if log_errors: | |
logger.exception("") | |
raise | |
except Exception: | |
if log_errors: | |
logger.exception("unexpected error") | |
raise |
So on getting an exception during repro, we would just do something like
info.error = str(exc)
and then that would be included in the dump:
dvc/dvc/repo/experiments/executor/base.py
Line 494 in d2a31eb
if infofile is not None: |
This would give you a more reliable error message for the failed exp case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattseddon can weigh in on this, but I think it would be better to just use a generic "Experiment run failed." error message, and then add a new field where we can pass the log file path to the vscode extension. And then on the vscode side they could potentially display the entire log to the user.
Actually, passing the log location seems useful in every exp case, not just for failures. This is something that the vscode extension could also get through dvc queue logs but I think it might be easier for them to just get the log file path and read it themselves when needed
Yes, the log location would be helpful.
The current approach for any error is to display the provided error as a tooltip in the experiments table. E.g:
It is unlikely that we would want to put the entire log into a tooltip so we would have to change the approach there. It would be good to get as much information about the error as possible without having to parse the logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the error log handling this LGTM
It would be helpful if you could not add the error key into the dict unless there has been an error (i.e the dict is non-empty). |
5a12c6b
to
2d9964f
Compare
a6de22a
to
caf7bb2
Compare
Use a more general message to replace the error message. BTW, I found the tasks can fail for
|
Thanks @karajan1001! Could you rebase? Do you need either @pmrowla or @mattseddon to review? |
I think both the engineering and the API sides require a check. |
fix: iterative#7986 1. Add two new flags `--hide-queued` and `--hide-failed` to `exp show` 2. Allow `exp show` to show failed experiments. 3. Add unit test for the failed experiments shown. 4. Add name support for failed exp 5. Add error msg to the `exp show` output
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π
fix: #7986
exp show
.--hide-queued
and--hide-failed
toexp show
exp show
to show failed experiments.exp show
outputNeed a coordinate with VS-CODE extension. @mattseddon
Example of current output in JSON:
for failed exp
for other
2.["data"]["error"]["msg"] will show the last line of the output log.
dvc/repo/experiments/show.py