[Feature] Add MultiPL-E & Code Evaluator #1963
Zhudongsheng75 commented on Mar 20, 2025
- Developed a client/server-based code evaluator.
- Added the MultiPL-E dataset on top of this code evaluator, so OpenCompass (OC) can now run evaluation tasks in multiple programming languages.
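To make the client/server idea concrete, here is a minimal sketch of how such an evaluator could be split into a service that executes code and a client that submits it. This is not the implementation in this PR: the handler name `EvalHandler`, the JSON fields `code`, `passed`, and `stderr`, and the helpers `evaluate` and `start_server` are all hypothetical, and the sketch only runs Python snippets, whereas a MultiPL-E service would need a sandboxed runner per language.

```python
import json
import subprocess
import sys
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EvalHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: runs submitted code, reports whether it exited cleanly."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length))
        # Run the candidate program (with its tests appended) in a child process.
        # A real service would sandbox this and pick a runner per language.
        try:
            proc = subprocess.run(
                [sys.executable, '-c', payload['code']],
                capture_output=True, text=True, timeout=10,
            )
            result = {'passed': proc.returncode == 0, 'stderr': proc.stderr}
        except subprocess.TimeoutExpired:
            result = {'passed': False, 'stderr': 'timeout'}
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def evaluate(url, code):
    """Client side: send code to the evaluation server and return its verdict."""
    req = urllib.request.Request(
        url,
        data=json.dumps({'code': code}).encode(),
        headers={'Content-Type': 'application/json'},
    )
    # Bypass any configured proxy so local requests reach the server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=30) as resp:
        return json.loads(resp.read())


def start_server():
    """Start the hypothetical evaluation service on a free local port."""
    server = HTTPServer(('127.0.0.1', 0), EvalHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The point of the split is that the evaluation harness only needs an HTTP client, while compilers, interpreters, and sandboxing live on the server side.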
Also, please update https://github.com/open-compass/opencompass/blob/main/dataset-index.yml
multiple_infer_cfg = dict(
    prompt_template=dict(type=PromptTemplate, template='Based on the provided {language} code snippet, complete the subsequent content. The initial part of the completed code must match the provided code snippet exactly:\n{prompt}'),
    retriever=dict(type=ZeroRetriever),
    inferencer=dict(type=GenInferencer, max_out_len=2048),
Please consider removing this max_out_len to avoid truncating the output in long CoT decoding scenarios.
Done
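For illustration, a minimal sketch of what the requested change amounts to: the inferencer entry simply carries no `max_out_len` unless one is passed explicitly. The helpers `build_infer_cfg` and `render_prompt` are hypothetical, the `type` values are plain strings standing in for the real classes so the snippet runs without OpenCompass installed, and `str.format` is only an assumed stand-in for how the framework fills `{language}` and `{prompt}`.

```python
# Template string copied from the diff under review.
TEMPLATE = (
    'Based on the provided {language} code snippet, complete the subsequent '
    'content. The initial part of the completed code must match the provided '
    'code snippet exactly:\n{prompt}'
)


def build_infer_cfg(max_out_len=None):
    """Build an inference config dict; omit max_out_len unless explicitly given."""
    inferencer = dict(type='GenInferencer')  # string stand-in for the real class
    if max_out_len is not None:
        inferencer['max_out_len'] = max_out_len
    return dict(
        prompt_template=dict(type='PromptTemplate', template=TEMPLATE),
        retriever=dict(type='ZeroRetriever'),
        inferencer=inferencer,
    )


def render_prompt(language, prompt):
    """Fill the template; plain str.format is an assumption, not the framework's code."""
    return TEMPLATE.format(language=language, prompt=prompt)


cfg = build_infer_cfg()  # no hard cap on output length
example = render_prompt('Lua', 'local function add(a, b)\n')
```

With no explicit cap in the dataset config, the output length limit is left to whatever the model or run configuration specifies.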
change num_repeats>1, otherwise the number in
`.cache/dataset_size.json` might be inconsistent.

Args:
The docstring is missing.
Done
LGTM
* multiple_code develop
* multiple_code update
* comments update
* index update