Comparing changes
base repository: ollama/ollama, base: v0.11.7
head repository: ollama/ollama, compare: v0.11.8
  • 7 commits
  • 25 files changed
  • 3 contributors

Commits on Aug 25, 2025

  1. 30fb7e1

Commits on Aug 26, 2025

  1. 85ccf73
  2. convert: fix tensor sorting (#12015)

     There are two bugs here:

     1. The check for a layer id is incorrect: it should be >= 0, since
        layer 0 is valid.
     2. If both tensors have a layer identifier, only the layer ids are
        compared, which returns 0 when the tensors are in the same layer;
        instead, it should fall back to comparing the full tensor name.

     mxyng authored Aug 26, 2025 (86834a2)
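The corrected comparator described in this commit can be sketched in Go. This is an illustrative sketch only, not the actual ollama code: the `blk.<n>.` tensor-name format, the regex, and the helper names `layerID`/`compareTensors` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// blkRE matches a layer id in tensor names such as "blk.12.attn_q.weight"
// (assumed naming scheme for this sketch).
var blkRE = regexp.MustCompile(`^blk\.(\d+)\.`)

// layerID returns the tensor's layer id, or -1 when the name has none.
func layerID(name string) int {
	m := blkRE.FindStringSubmatch(name)
	if m == nil {
		return -1
	}
	id, _ := strconv.Atoi(m[1])
	return id
}

// compareTensors orders tensors by layer id, then by full name.
func compareTensors(a, b string) int {
	ia, ib := layerID(a), layerID(b)
	// Bug 1 fix: the presence check must be >= 0, because layer 0 is valid.
	if ia >= 0 && ib >= 0 && ia != ib {
		return ia - ib
	}
	// Bug 2 fix: within the same layer (or when a layer id is missing),
	// fall back to comparing the full tensor name instead of returning 0.
	return strings.Compare(a, b)
}

func main() {
	names := []string{
		"blk.10.attn_q.weight",
		"blk.0.ffn_up.weight",
		"blk.2.attn_q.weight",
		"blk.2.attn_k.weight",
	}
	sort.Slice(names, func(i, j int) bool {
		return compareTensors(names[i], names[j]) < 0
	})
	fmt.Println(names)
}
```

With both fixes, layer 0 sorts first rather than being treated as "no layer", same-layer tensors get a stable name order, and layers compare numerically (blk.2 before blk.10) rather than lexicographically.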
  3. convert(gptoss): mxfp4 to ggml layout to avoid jit conversion (#12018)

     * convert: return bytes written
     * ggml flavor mxfp4
     * simplify jit conversion
     * comment

     mxyng authored Aug 26, 2025 (59412fb)

Commits on Aug 27, 2025

  1. fix keep alive (#12041)

     mxyng authored Aug 27, 2025 (1081532)
  2. ggml: Avoid allocating CUDA primary context on unused GPUs

     The recent memory management changes caused all GPUs to be visible to
     the runner, regardless of whether they are ultimately used. This caused
     CUDA devices to allocate a primary context (~300 MB of VRAM) on each
     GPU, for each model. This is unnecessary, so we can both avoid touching
     GPUs that we exclude in the early stage of allocation and free the
     memory for any that we touch but don't use.

     The issue will continue to exist for the old engine, since it touches
     all devices during initialization.

     jessegross committed Aug 27, 2025 (9d97e6a)

Commits on Aug 28, 2025

  1. 4383a3a