Comparing changes

base repository: open-compass/opencompass
base: 0.3.6
...
head repository: open-compass/opencompass
compare: 0.3.7
  • 19 commits
  • 140 files changed
  • 12 contributors

Commits on Nov 20, 2024

  1. [Update] Support new error code for Bailing model (#1702)

    * support new error code
    
    * fix the lint problems
    cuauty authored Nov 20, 2024
    05044df

Commits on Nov 21, 2024

  1. [CI] update torch version and add more datasets into daily testcase (#1701)
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    * update
    
    ---------
    
    Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
    zhulinJulia24 and zhulin1 authored Nov 21, 2024
    ed81f9d
  2. 500fb10
  3. 80e3b9e

Commits on Nov 25, 2024

  1. [Update] Update MATH dataset with model judge (#1711)

    * Update math with llm judge
    
    * Update math with llm judge
    
    * Update math with llm judge
    
    * Update math with llm judge
    
    * Update math with llm judge
    liushz authored Nov 25, 2024
    e49fcfd
  2. 5c1916e
  3. [Feature] Add Korbench dataset (#1713)

    * first version for korbench
    
    * first stage for korbench
    
    * korbench_1
    
    * korbench_1
    
    * korbench_1
    
    * korbench_1
    
    * korbench_1_revised
    
    * korbench_combined_1
    
    * korbench_combined_1
    
    * kor_combined
    
    * kor_combined
    
    * update
    
    ---------
    
    Co-authored-by: MaiziXiao <xxllcc1993@gmail.com>
    epsilondylan and MaiziXiao authored Nov 25, 2024
    300adc3

Commits on Nov 26, 2024

  1. [Update] Update Fullbench (#1712)

    * Update JudgerBench
    
    * Support O1-style Prompts
    
    * Update Code
    tonysy authored Nov 26, 2024
    f97c4ea
  2. ef695e2
  3. [Fix] Fix BailingAPI model (#1707)

    * [fix] sequence under the multiple samples
    
    * resolve the lint problems
    
    * change the parameter name
    
    * add another error code for retry
    
    * output the log for invalid response
    
    * format correction
    
    * update
    
    * update
    
    * update
    
    * update
    
    * add two model python files
    
    * update the default parameter
    
    * use random for delay
    
    * update the api example of bailing
    
    * remove the unnecessary parameter
    cuauty authored Nov 26, 2024
    bcb707d

Commits on Nov 27, 2024

  1. [Feature] Add Arc Prize Public Evaluation (#1690)

    * support arc prize
    
    * update arc-prize dataset info & update arc-prize evaluation performance
    jnanliu authored Nov 27, 2024
    f7dbe6b
  2. [Feature] Add P-MMEval (#1714)

    * Update with PMMEval
    
    * Update
    
    * Update __init__.py
    
    * Fix Bugs
    
    * Delete .pre-commit-config.yaml
    
    * Pull merge
    
    ---------
    
    Co-authored-by: liushz <qq1791167085@163.com>
    wanyu2018umac and liushz authored Nov 27, 2024
    90efcf2

Commits on Nov 28, 2024

  1. [Fix] Fix pmmeval_gen config (#1719)

    * Update with PMMEval
    
    * Update
    
    * Update __init__.py
    
    * Fix Bugs
    
    * Delete .pre-commit-config.yaml
    
    * Pull merge
    
    * Fix pmmeval_gen config
    
    ---------
    
    Co-authored-by: wanyu <wanyu2018umac@gmail.com>
    Co-authored-by: wanyu2018umac <42405907+wanyu2018umac@users.noreply.github.com>
    3 people authored Nov 28, 2024
    06ab278
  2. [Feature] Add Openai Simpleqa dataset (#1720)

    * Add Openai SimpleQA dataset
    
    * Add Openai SimpleQA dataset
    
    * Add Openai SimpleQA dataset
    
    * Update eval_simpleqa.py
    
    ---------
    
    Co-authored-by: Linchen Xiao <xxllcc1993@gmail.com>
    liushz and MaiziXiao authored Nov 28, 2024
    c437135
  3. [Fix] Update P-MMEVAL OSS data (#1722)

    * Update with PMMEval
    
    * Update
    
    * Update __init__.py
    
    * Fix Bugs
    
    * Delete .pre-commit-config.yaml
    
    * Pull merge
    
    * Fix pmmeval_gen config
    
    * Update P-MMEVAL data
    
    ---------
    
    Co-authored-by: wanyu <wanyu2018umac@gmail.com>
    Co-authored-by: wanyu2018umac <42405907+wanyu2018umac@users.noreply.github.com>
    3 people authored Nov 28, 2024
    b063779

Commits on Nov 29, 2024

  1. fe6d76f

Commits on Dec 2, 2024

  1. [Update] Update max_out_len for datasets (#1726)

    * [Update] Update max_out_len for datasets
    
    * Update eval_regression_chat_objective_fullbench.py
    
    * Update eval_regression_chat.py
    
    * Update eval_regression_chat.py
    
    * Update oc_score_baseline_fullbench.yaml
    
    ---------
    
    Co-authored-by: zhulinJulia24 <145004780+zhulinJulia24@users.noreply.github.com>
    MaiziXiao and zhulinJulia24 authored Dec 2, 2024
    9de27b4
  2. [Update] Update Korbench dataset abbr (#1729)

    Co-authored-by: yufeng zhao <zhaoyufeng@pjlab.org.cn>
    epsilondylan and yufeng zhao authored Dec 2, 2024
    98c4666

Commits on Dec 3, 2024

  1. e2a290f