release-v1.2.1

@yhcvb yhcvb released this 25 Jun 12:40
  • Added support for RWKV7, Qwen3, and MiniCPM4 models
  • Added support for the RV1126B platform
  • Enabled function calling capability
  • Enabled cross-attention inference
  • Optimized the callback function to support pausing inference
  • Added support for multi-batch inference
  • Optimized KV cache clearing interface
  • Improved chat template parsing with support for thinking mode selection
  • Updated the server demo to support the OpenAI-compatible format
  • Added return of model inference performance statistics
  • Added support for mrope multimodal position encoding
  • Added a new quantization optimization algorithm to improve quantization accuracy