[ROMM-2142] Custom SGDB title match #2220

Merged
merged 5 commits into from
Aug 7, 2025

gantoine (Member) commented Aug 6, 2025

Description
This PR changes SGDB matching to use a custom algorithm that combines token overlap, sequence similarity, and a custom Levenshtein distance, each weighted by a different ratio, to select the most likely game.
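
As a rough illustration of the approach described above (not the code from this PR), a blended score could look like the following Python sketch. The weights `W_TOKEN`, `W_SEQ`, and `W_LEV`, the function names, and the Jaccard-style token overlap are all assumptions made for the example; the ratios actually used are not shown in this thread.

```python
from difflib import SequenceMatcher

# Hypothetical weights; the PR's real ratios are not shown in this thread.
W_TOKEN, W_SEQ, W_LEV = 0.4, 0.3, 0.3


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase, whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def title_similarity(a: str, b: str) -> float:
    """Weighted blend of three metrics, each normalized to [0, 1]."""
    a_n, b_n = a.lower().strip(), b.lower().strip()
    if not a_n or not b_n:
        return 0.0
    seq = SequenceMatcher(None, a_n, b_n).ratio()
    lev = 1.0 - levenshtein(a_n, b_n) / max(len(a_n), len(b_n))
    return W_TOKEN * token_overlap(a_n, b_n) + W_SEQ * seq + W_LEV * lev
```

Blending the metrics means a title that shares whole words with the query can still score well even when its character-level distance is large, and vice versa. The right balance depends on the data, so the weights would need tuning against real SGDB results.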

Fix #2142

Checklist

  • I've tested the changes locally
  • I've updated relevant comments
  • I've assigned reviewers for this PR
  • I've added unit tests that cover the changes

trunk-io bot commented Aug 6, 2025

Running Code Quality on PRs by uploading data to Trunk will soon be removed. You can still run checks on your PRs using trunk-action - see the migration guide for more information.

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR removes the Levenshtein library dependency and replaces it with a custom implementation using Python's built-in difflib module, while also improving the game title matching algorithm with a multi-metric similarity scoring system.

  • Removes external Levenshtein dependency and implements custom distance calculation
  • Introduces advanced title similarity matching using token overlap, sequence matching, and Levenshtein distance
  • Refines matching thresholds and adds detailed logging for better debugging
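
The overview above mentions a `difflib`-based distance, thresholds, and logging, but does not show the code. One way this could be done is sketched below; it is an illustration only. The names `difflib_distance`, `best_match`, `MIN_SIMILARITY`, the logger name, and the threshold value are all hypothetical, and the PR's actual function may differ.

```python
import logging
from difflib import SequenceMatcher
from typing import List, Optional

log = logging.getLogger("sgdb_match_sketch")  # hypothetical logger name

MIN_SIMILARITY = 0.75  # hypothetical threshold, not taken from the PR


def difflib_distance(a: str, b: str) -> int:
    """Edit-distance estimate built from SequenceMatcher opcodes.

    SequenceMatcher does not guarantee a minimal edit script, so this is an
    upper bound on the true Levenshtein distance, not always equal to it.
    """
    distance = 0
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "replace":
            # Substitute the overlapping part, insert/delete the remainder.
            distance += max(i2 - i1, j2 - j1)
        elif tag == "delete":
            distance += i2 - i1
        elif tag == "insert":
            distance += j2 - j1
    return distance


def best_match(query: str, candidates: List[str]) -> Optional[str]:
    """Return the closest candidate, or None if nothing clears the threshold."""
    q = query.lower().strip()
    best_name, best_score = None, 0.0
    for name in candidates:
        c = name.lower().strip()
        longest = max(len(q), len(c))
        score = 1.0 if longest == 0 else 1.0 - difflib_distance(q, c) / longest
        log.debug("candidate=%r score=%.3f", name, score)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < MIN_SIMILARITY:
        log.info("no match for %r (best score %.3f)", query, best_score)
        return None
    return best_name
```

Returning `None` below the threshold avoids attaching artwork from an unrelated game, which is the failure mode the linked issue describes; the debug log makes it possible to see why a given candidate won or lost.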

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Removes the Levenshtein library dependency |
| backend/handler/metadata/sgdb_handler.py | Implements custom Levenshtein distance function and enhanced similarity matching algorithm |

gantoine changed the title from "Romm 2142" to "[ROMM-2142] Custom SGDB title match" on Aug 6, 2025

github-actions bot commented Aug 6, 2025

Test Results

548 tests ±0: 547 passed ✅ ±0, 1 skipped 💤 ±0, 0 failed ❌ ±0
1 suite ±0, 1 file ±0
Duration: 1m 0s ⏱️ ±0s

Results for commit f3a74bc. ± Comparison against base commit 00c9d74.

♻️ This comment has been updated with latest results.

github-actions bot commented Aug 6, 2025

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
| --- | --- | --- | --- | --- |
| 8798 | 6178 | 70% | 0% | 🟢 |

New Files

No new covered files...

Modified Files

| File | Coverage | Status |
| --- | --- | --- |
| backend/handler/metadata/sgdb_handler.py | 33% | 🟢 |
| TOTAL | 33% | 🟢 |

updated for commit: f3a74bc by action🐍

gantoine merged commit 0d5dfb6 into master on Aug 7, 2025 (9 checks passed)
gantoine deleted the romm-2142 branch on August 7, 2025 at 15:36
Successfully merging this pull request may close these issues.

[Bug] Steam Grid DB matching (incorrectly) by name? (4.0.0)