Conversation

@ggerganov ggerganov commented Nov 4, 2024

TODO:

  • fix build and tests

ggerganov and others added 18 commits November 4, 2024 10:50
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
… MobileVLM model. (llama/9763)

* ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend.

- The MobileVLM model now supports GPU-accelerated inference via the Vulkan backend.
- A GGML_OP_POOL_2D (pooling) shader has been added.
- The encoding performance of the CLIP model improved from 2.8s on the CPU to 0.7s on the GPU.

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

* [fix] Correct the order of the parameters.

Fix casting to int.

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

---------

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
* ggml : RISC-V vector gemv for q4_0_8x8

* ggml : Added WIP rvv q4_0_8x8 gemm

* ggml : Added initial implementation of rvv gemm

* ggml : optimize gemm to avoid register spillover

* ggml : Fix GCC rvv load alignment issue

* ggml : Format gemm rvv code

* ggml : Fix a typo in RVV q4_0_8_8 GEMM
* ggml : fix gguf string leak when reading kv pairs fails

* ggml : avoid crashing with GGML_ABORT when the KV has an invalid type

* ggml : avoid crashing on failed memory allocations when loading a gguf file
Bring the backend in line with the others by supporting the newer
backend/device registry interfaces.

Signed-off-by: Sergio Lopez <slp@redhat.com>
This is a more or less direct translation from the Metal implementation
to GLSL.

Signed-off-by: Sergio Lopez <slp@redhat.com>
* llama : fix buffer checks for mamba and rwkv

* llama : fix missing worst case flag during reserve

* cuda : fix supports_op for norm

* disable sched SET_CAUSE
* llama : add simple-chat example

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* metal : minor fixup in FA kernel

ggml-ci

* metal : use the unrolled loop variable

* metal : remove unused var
@ggerganov ggerganov changed the title from sync : llam.cpp to sync : llama.cpp on Nov 4, 2024
slaren commented Nov 4, 2024

test-opt should just be disabled until it is updated in #988; since the opt interface has been removed, it cannot be updated in its current form.

Looks like other tests are failing too, I will update them.

slaren commented Nov 4, 2024

I disabled all tests and examples that depend on ggml_opt. They should be re-enabled or removed in #988.

@ggerganov ggerganov marked this pull request as ready for review November 4, 2024 17:37
@ggerganov ggerganov merged commit f3c1e6a into master Nov 4, 2024
4 checks passed
@ggerganov ggerganov deleted the sync branch November 4, 2024 17:42