[ci] react daily test #1668

zhulinJulia24 · 2024-11-08T02:30:10Z

react daily test
Add fullbench test based on v1.1 (excluding scicode, because it takes too long and may run out of memory).
Add subjective test cases.

MaiziXiao · 2024-11-12T10:29:25Z

.github/scripts/eval_regression_chat_subjective_fullbench.py

+
+models = sorted(models, key=lambda x: x['run_cfg']['num_gpus'])
+
+judge_models = deepcopy([models[1]])


Why are we using this model here?
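For context on the question, here is a minimal sketch of what the two added lines do. The model entries and their `abbr` values are hypothetical; only the `run_cfg` / `num_gpus` keys come from the diff. It is not the actual config used in this PR.

```python
from copy import deepcopy

# Hypothetical model configs; only the 'run_cfg' -> 'num_gpus' shape is taken from the diff.
models = [
    {'abbr': 'model-a', 'run_cfg': {'num_gpus': 2}},
    {'abbr': 'model-b', 'run_cfg': {'num_gpus': 1}},
    {'abbr': 'model-c', 'run_cfg': {'num_gpus': 1}},
]

# Sort ascending by GPU count. Python's sort is stable, so entries with the
# same num_gpus keep their original relative order.
models = sorted(models, key=lambda x: x['run_cfg']['num_gpus'])

# Pick the second entry after sorting as the judge and copy it, so later
# changes to the judge config do not leak back into the evaluated models.
judge_models = deepcopy([models[1]])

# Mutating the copy leaves the original entry untouched.
judge_models[0]['abbr'] = 'judge'
```

Because the judge is chosen by position, which model ends up as `models[1]` depends on the import order and GPU counts of the configured models, which is likely why the choice is not obvious from the diff alone.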

MaiziXiao · 2024-11-12T10:31:37Z

.github/scripts/eval_regression_chat_subjective_fullbench.py

+    from opencompass.configs.models.hf_internlm.hf_internlm2_5_7b_chat import \
+        models as hf_internlm2_5_7b_chat_model  # noqa: F401, E501
+    from opencompass.configs.models.hf_internlm.lmdeploy_internlm2_5_7b_chat import \
+        models as lmdeploy_internlm2_5_7b_chat_model  # noqa: F401, E501


Why do we only test 2 models here? Is this just for function check of subjective evaluation?

Yes. It uses a subset of the datasets.

MaiziXiao

LGTM

* updaste * update * update * update * update * update * update * update * update * update * updaste * update * update * refactor summarize * update * update * update * update * update * updaste * update * update * update * update * updaste * update * update * update * update * update * updaste * updaste * update * update * update * update * update * update * update * update * update * update * update * Update daily-run-test.yml * Update daily-run-test.yml * update * update * update * update * update * Update daily-run-test.yml * update * update * update * update * update * update * update * update * update * update * update * Update daily-run-test.yml * Update daily-run-test.yml * update * update * Update daily-run-test.yml * update * update * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>

zhulin1 and others added 30 commits November 1, 2024 11:47

updaste

e996871

update

00671de

update

9ce36f4

update

bf6c6b9

update

2ce0140

update

d8cc116

update

f377946

update

4384197

update

d2cceae

update

65ddd9d

updaste

b80548d

update

a346775

Merge branch 'open-compass:main' into add_subjective_testcase

6a66fdf

update

5b83030

refactor summarize

6e6f211

update

e2ef73e

update

b3ea6e6

update

b91b7ab

update

9f161e5

update

ba0ab87

updaste

9146ace

update

913c24b

update

fae12c9

update

b5116db

update

7272b53

updaste

db6fce9

update

2e26a2f

update

82bda4d

update

249f71d

update

8e481c5

update

e0b30db

zhulinJulia24 had a problem deploying to prod November 8, 2024 03:17 — with GitHub Actions Failure

update

f548ee5

zhulinJulia24 had a problem deploying to prod November 8, 2024 03:32 — with GitHub Actions Failure

zhulinJulia24 had a problem deploying to prod November 8, 2024 03:36 — with GitHub Actions Failure

Update daily-run-test.yml

63e6330

zhulinJulia24 temporarily deployed to prod November 8, 2024 05:47 — with GitHub Actions Inactive

Update daily-run-test.yml

a8f40c2

zhulinJulia24 had a problem deploying to prod November 8, 2024 08:22 — with GitHub Actions Failure

zhulinJulia24 temporarily deployed to prod November 8, 2024 08:53 — with GitHub Actions Inactive

update

3164f09

zhulinJulia24 temporarily deployed to prod November 8, 2024 10:02 — with GitHub Actions Inactive

update

03eab4f

zhulinJulia24 had a problem deploying to prod November 11, 2024 02:10 — with GitHub Actions Error

Update daily-run-test.yml

0fa88b8

zhulinJulia24 temporarily deployed to prod November 11, 2024 02:15 — with GitHub Actions Inactive

zhulin1 added 2 commits November 12, 2024 10:57

update

2d80933

update

43504c7

zhulinJulia24 had a problem deploying to prod November 12, 2024 03:08 — with GitHub Actions Failure

update

35502bf

zhulinJulia24 had a problem deploying to prod November 12, 2024 06:10 — with GitHub Actions Failure

update

15b6d62

tonysy requested a review from MaiziXiao November 12, 2024 07:17

zhulinJulia24 temporarily deployed to prod November 12, 2024 07:20 — with GitHub Actions Inactive

MaiziXiao reviewed Nov 12, 2024

MaiziXiao approved these changes Nov 12, 2024

MaiziXiao merged commit a9d6b64 into open-compass:main Nov 12, 2024
8 checks passed

zhulinJulia24 deleted the add_subjective_testcase branch November 15, 2024 07:35
