
llm-evaluation

Here are 10 public repositories matching this topic...

ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture

  • Updated Aug 13, 2025
  • HTML

A beginner-friendly project for fine-tuning, testing, and deploying language models for sentiment analysis with a strong emphasis on quality assurance and testing methodologies.

  • Updated Mar 17, 2025
  • HTML

Comprehensive evaluation of Claude 4 Sonnet's mathematical assessment capabilities: 500 original problems revealing JSON-induced errors and systematic patterns in LLM evaluation tasks. The research reports 100% accuracy when assessing incorrect answers but only 84.3% when assessing correct ones, attributed to premature decision-making in the JSON structure.

  • Updated Jul 7, 2025
  • HTML
