Support using MusicBrainz IDs instead of other metadata #1363

vs49688 · 2021-09-23T12:55:41Z

Converts the library to use MusicBrainz IDs for albums, artists,
albumartists, and tracks. All playlists, stars, etc. are kept.

This is an irreversible transformation; the user is given a warning
and may cancel within 10 seconds by pressing ^C.

The mediafiles table is used as the sole source-of-truth to build
an entirely new hierarchy, based off the MusicBrainz IDs.

Existing entity metadata is applied to the new entities, except in the
case where one entity may need to be split (e.g. different releases of
the same album). These split entities are left blank.

References are then updated for both play queues and playlists.

The existing entities are then deleted, and a full-rescan is triggered
to fill in any missing information.

Closes #489.

certuna · 2021-09-24T09:24:32Z

I think it's a good idea in principle, just some remarks:

this will split albums where some tracks have different or no mbz_album_id, which may catch more than a few users by surprise, in particular those who group "singletons" together into one pseudo-compilation album (e.g. "Loose Tracks"). The alternative is using Grouping or Release Type for grouping these together (and excluding them from the album list), but unfortunately those features aren't implemented yet so you'd leave those users with no effective way to manage singletons
a good thing is this will allow splitting artists with the same name

vs49688 · 2021-09-24T10:30:31Z

I think it's a good idea in principle, just some remarks:

* this will split albums where some tracks have different or no `mbz_album_id`, which may catch more than a few users by surprise, in particular those who group "singletons" together into one pseudo-compilation album (e.g. "Loose Tracks"). The alternative is using Grouping or Release Type for grouping these together (and excluding them from the album list), but unfortunately those features aren't implemented yet so you'd leave those users with no effective way to manage singletons

In this case, I'm of the mind to tell the user to strip the MusicBrainz tags from those files, retaining the current behaviour. By moving them into a "Loose Tracks" album, you're effectively creating a "new" album. If this "new" album isn't on MusicBrainz, it shouldn't have MusicBrainz tags. Also, isn't this what playlists are for?
Of course, this might be a unpopular opinion, but I think it's the more "correct" one.

Another option would be to add a "navidrome_mbignore" tag, which if set would tell Navidrome to ignore any MusicBrainz tags it finds when scanning.

* a good thing is this will allow splitting artists with the same name

Indeed, I never considered this may need to be done with artists until after I'd pushed the commit.

vs49688 · 2021-09-25T16:12:08Z

Pushed new version that should handle track/album/artist/albumartist IDs.

certuna · 2021-09-25T16:14:21Z

One issue will be: what does Navidrome do with artists that have the same mbid but different tags in the Artist field, such as "DJ Tiesto" and "Tiesto"?

It's not so easy, you might get some strange behaviour, consider this case:

DJ Tiesto - Album 1 (with mbid's)
Tiesto - Album 2 (with mbid's)
DJ Tiesto - Album 3 (without mbid's)
Tiesto - Album 4 (without mbid's)

Current behaviour: album 1 and 3 grouped under DJ Tiesto, album 2 and 4 grouped under Tiesto
What will the new behaviour be?

vs49688 · 2021-09-25T16:21:06Z

Regardless of my changes, albums 3 & 4 should be under different artists (or the artists fixed).
As for 1 & 2, perhaps I should change it to only use the mbid if it exists?

vs49688 · 2021-09-25T16:46:25Z

Actually, thinking about it some more... What do we want the behaviour to be?
Do we want Navidrome to merge artists/albums/whatever based on mbid? That'd be a pretty drastic change in behaviour, but not an unreasonable one.

All I intended was to make it not merge different releases of the same album. If the tags in the files are wrong (thus causing Navidrome to split the artists/albums), then they should be fixed.

I'm fine with either option (it doesn't affect me as I run everything through Picard and beets), @deluan what do you think?

certuna · 2021-09-26T10:30:14Z

The more I think of it, it's indeed a drastic change.

Navidrome currently does all organization based on official tag fields: Artist, Album Artist, Album Title. It's not perfect (no disambiguation possible between artists or albums with the same name), but at least it's pretty transparent why stuff happens.

Moving to organization based on custom Musicbrainz ID tags is good if you consistently use those, but that puts the onus on users to have not only an immaculately tagged collection with 'regular' tags, but also consistently have the mbz tags correct, because you'll create splits between stuff with and without mbz id's. And this decreases transparency to the user why this happens, and we'll end up like Plex where the scanner feels like a black box where "anything can happen".

For example:

currently a user can move an album to another artist just by changing the Artist tag, that's very logical and simple. But the album still retains the old artist_mbid tag, what will Navidrome do?
currently a user can move a track to another album just by changing the Album Title tag. Logical, simple, and that's what pretty much every music player (foobar, Apple Music, MusicBee, etc) allows you to do. Grouping by release_mbid breaks that.

vs49688 · 2021-09-26T14:29:43Z

The more I think of it, it's indeed a drastic change.

Navidrome currently does all organization based on official tag fields: Artist, Album Artist, Album Title. It's not perfect (no disambiguation possible between artists or albums with the same name), but at least it's pretty transparent why stuff happens.

Moving to organization based on custom Musicbrainz ID tags is good if you consistently use those, but that puts the onus on users to have not only an immaculately tagged collection with 'regular' tags, but also consistently have the mbz tags correct, because you'll create splits between stuff with and without mbz id's. And this decreases transparency to the user why this happens, and we'll end up like Plex where the scanner feels like a black box where "anything can happen".

As long as the scanning behaviour is explicitly-and-well-documented, I don't really see the problem.

For example:

* currently a user can move an album to another artist just by changing the `Artist` tag, that's very logical and simple. But the album still retains the old `artist_mbid` tag, what will Navidrome do?

* currently a user can move a track to another album just by changing the `Album Title` tag. Logical, simple, and that's what pretty much every music player (foobar, Apple Music, MusicBee, etc) allows you to do. Grouping by `release_mbid` breaks that.

How about this?

Add a ScanUseMbzID option in the configuration file, unset by default.
When unset, the behaviour is unchanged, keeping it as it is now.

When set, if an album/artist has a mbid, Navidrome will use it (and only it). This provides a solution to #489, assuming everything's tagged correctly. If the mbid is NOT set, it'll fall back and create a new/duplicate artist/album. It is up to the user to tag the files correctly and trigger a rescan.

Forgive me for being pushy on this, but I need a fix for the #489 - it's affecting large part of my library.

deluan · 2021-09-26T17:01:55Z

Hey, sorry being late for the party :)

After reading the thread, these are my two cents (mostly agreeing with latest @vs49688 comment):

The new behaviour (using mbids) has to be configurable, and turned off by default
When the feature is enabled, if no mbid is found, fallback to the current behaviour.
I think it is useful only for albums, albumartists and artists, but not for tracks.

On top of that, a migration plan is a MUST, or else users that enable this will break their playlists, stars, playcounts, etc... The migration should go over all tables with an album_id or artist_id and recalculates a new id. Foreign keys must be observed. To make things even trickier, this will have to happen only when the config option is first enabled, and cannot be disabled afterwards (or it would mess with the DB), making the implementation tricky (similar to what I had to do for PasswordEncryptionKey)

vs49688 · 2021-09-27T06:57:54Z

How'd you prefer the migrations to be done?

Duplicate entries, except with the mbid as the id, fix up references, delete old, vacuum
Disable foreign keys, update in-place

deluan · 2021-09-27T15:42:23Z

How'd you prefer the migrations to be done?

Duplicate entries, except with the mbid as the id, fix up references, delete old, vacuum

Disable foreign keys, update in-place

Whichever is faster. I'd assume it is the later, but it is worth to test it out if you can.

vs49688 · 2021-09-29T02:40:32Z

It's mostly finished, all that needs to be done now is some cleanups, and to trigger a full rescan.

Also, where's the best place for this? I've just put it in the top-level cmd package to start with, but it should probably go elsewhere. I'd like to put it in db/migrations (specifically so I can use forceFullRescan), but can't because of circular imports.

vs49688 · 2021-10-06T14:14:07Z

Ping on this?

Specific issues I'd like feedback on:

It's in the root cmd package for now, where should it actually go?
Do the DeleteMany functions I've added to the {artist,album}Repository interfaces have security implications I'm not aware of - i.e. does it automatically create a REST route for it (or something, I haven't really looked at the code for this).

The current behaviour:

If "Scanner.UseMbzIDs" is enabled:
- Converts library to use MusicBrainz IDs for albums, artists, and albumartists
- All playlists, stars, etc. are kept.
This is an irreversible transformation; adequate warning is given.
If set accidentally, the user may cancel the migration within 10 seconds by pressing ^C.
Once enabled, it will throw an error if attempted to disable.
A full rescan is required to finish the process - this must be triggered manually.

deluan · 2021-10-07T13:36:18Z

Sorry for the delay. Will take a look tonight

Not MD5'ing them because they're UUIDs and can be eyeball'd easily in the database.

Converts the library to use MusicBrainz IDs for albums, artists, albumartists, and tracks. All playlists, stars, etc. are kept. This is an irreversible transformation; the user is given a warning and may cancel within 10 seconds by pressing ^C. The mediafiles table is used as the sole source-of-truth to build an entirely new hierarchy, based off the MusicBrainz IDs. Existing entity metadata is applied to the new entities, except in the case where one entity may need to be split (e.g. different releases of the same album). These split entities are left blank. References are then updated for both play queues and playlists. The existing entities are then deleted, and a full-rescan is triggered to fill in any missing information.

github-actions · 2024-09-18T06:09:11Z

Download the artifacts for this pull request:

deluan · 2024-11-05T17:42:42Z

Hey @vs49688, as you know, I'm adding support for this in #2709, but I'm struggling with what to do when the user has more than one copy of the same album (let's say a MP3 copy and a FLAC copy), and both have same MBIDs. How are you handling this?

Another challenge is that now that we will support multiple roles and artists for each album/track, we may get artists without MBIDs, as there are no tags for Composers, Lyricist, Remixer, Performer, etc... This could cause an artist to be duplicated (one with MBID, and another with just Name). Any ideas?

vs49688 · 2024-11-06T01:34:02Z

Hey @vs49688, as you know, I'm adding support for this in #2709, but I'm struggling with what to do when the user has more than one copy of the same album (let's say a MP3 copy and a FLAC copy), and both have same MBIDs. How are you handling this?

I don't. This is very much a "curate/tag your library properly" problem, in my mind. I'd argue that a certain degree of discipline is required for libraries at scale. If I do have any duplicates, beets catches them on import.

Maybe if this happens, just spit a warning and keep the existing one?

Another challenge is that now that we will support multiple roles and artists for each album/track, we may get artists without MBIDs, as there are no tags for Composers, Lyricist, Remixer, Performer, etc... This could cause an artist to be duplicated (one with MBID, and another with just Name). Any ideas?

Hmm, yeah that's a tricky one...I'm not sure what to do either. Unless there's some kind of unique ID, you might have to fall back to basic name-matching.
Or... and this could be a complete non-starter because privacy/disable-external-services - you have the track MBID, why not hit MBs APIs and query the missing MBIDs?

deluan · 2024-11-06T01:50:03Z

Maybe if this happens, just spit a warning and keep the existing one?

Yeah, that's what I'm inclined to do.

Hmm, yeah that's a tricky one...I'm not sure what to do either. Unless there's some kind of unique ID, you might have to fall back to basic name-matching. Or... and this could be a complete non-starter because privacy/disable-external-services - you have the track MBID, why not hit MBs APIs and query the missing MBIDs?

The fall back would already be in place, as there are artists that don't have MBID, and alos for these other roles we need to create an ID for them. But this will duplicate some artists.

Hitting MB APIs would be a good solution (privacy issues aside), but it would slow down a lot the scanner/import process. And doing it asynchronously after the scan is very complex (as we need the IDs when importing the files.

Not sure what is the right solution here. I may finish the code with only names for now, and when it is merged, I may come back to this.

deluan · 2025-02-19T17:41:11Z

Hey @vs49688 did you try the BFR version (#2709) already? Hoping to get some feedback from you.

arg274 · 2025-02-20T13:20:50Z

@deluan A quick test, it seems to not merge artists in certain scenarios. For example, I have this this comp where track 12 belongs moon, who is credited in the native script (মুন) within the comp. I also have his solo album, where he is credited as "moon". BFR seems to create two separate artist entries for this. I did confirm that both my files have the same artist MBID. This bug does not trigger when the artist is not aliased/credited as something else.

vs49688 · 2025-02-20T13:22:17Z

@deluan I haven't tested it yet, I'll attempt to this weekend

deluan · 2025-02-20T13:34:35Z

@deluan A quick test, it seems to not merge artists in certain scenarios. For example, I have this this comp where track 12 belongs moon, who is credited in the native script (মুন) within the comp. I also have his solo album, where he is credited as "moon". BFR seems to create two separate artist entries for this. I did confirm that both my files have the same artist MBID. This bug does not trigger when the artist is not aliased/credited as something else.

Unfortunately Picard does not add MBIDs for all contributors (composers, performers, remixers...), so if we use MBID to disambiguate artists, you could end up with duplicated artists.

I decided against adding artist disambiguation by MBID for now, but I've been thinking on some solutions that may involve using a sidecar file. I'll give it more thinking after BFR is released.

vs49688 · 2025-02-21T04:43:02Z

I've rebased on top of 0.54.5 here: https://github.com/vs49688/navidrome/tree/v0.54.5-mbz (untested)

~~EDIT: Doesn't compile, stand by~~
EDIT2: Fixed

vs49688 · 2025-03-16T12:36:50Z

@deluan I've finally had a chance to try this. I've run into some problems with the scanner. It's stopping way too early.

time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Starting scan" fullScan=true numLibraries=1
time="2025-03-16T22:33:53+10:00" level=debug msg="Scanner: Starting phase 1: Scan all libraries and import new/updated files"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=0 elapsed=55.1ms imageCount=1 library="Music Library" plsCount=0 tracksImported=0 tracksMissing=0 folder="Adgio Hutchings/Path of Exile [2016]"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=0 elapsed=57.1ms imageCount=1 library="Music Library" plsCount=0 tracksImported=0 tracksMissing=0 folder="Casting Crowns/Until The Whole World Hears Live"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=0 elapsed=59.1ms imageCount=1 library="Music Library" plsCount=0 tracksImported=0 tracksMissing=0 folder="Chris Nelsen/Chaos Overlords OST [album]"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=23 elapsed=112.9ms imageCount=1 library="Music Library" plsCount=0 tracksImported=23 tracksMissing=0 folder=Compilations/Quadrophenia
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=15 elapsed=140.5ms imageCount=1 library="Music Library" plsCount=0 tracksImported=15 tracksMissing=0 folder="Compilations/The Twilight Saga_ Breaking Dawn, Part 1"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=13 elapsed=82.1ms imageCount=1 library="Music Library" plsCount=0 tracksImported=13 tracksMissing=0 folder="Green Day/American Idiot [482]"
time="2025-03-16T22:33:53+10:00" level=info msg="Scanner: Completed processing folder" audioCount=0 elapsed=65.7ms imageCount=0 library="Music Library" plsCount=0 tracksImported=0 tracksMissing=0 folder="Tadayoshi Makino, Inon Zur, 近藤嶺 & 石川チャミィ"
time="2025-03-16T22:33:53+10:00" level=debug msg="Scanner: Finished reading folders" lib="Music Library" numFolders=4114 path=/storage/SyncRoot/Music/music
time="2025-03-16T22:33:53+10:00" level=debug msg="Scanner: Finished loading all folders" numChanged=8 numFolders=4114

deluan · 2025-03-16T14:33:11Z

Are you upgrading from your 1363 DB? Maybe there are changes that were not considered in #2709 migration/upgrade process.

Did you try a full scan?

vs49688 · 2025-03-16T15:54:06Z

Are you upgrading from your 1363 DB? Maybe there are changes that were not considered in #2709 migration/upgrade process.

Yeah, I upgraded straight from the existing one, using:

  "PID.Track": "musicbrainz_trackid|track_legacy",
  "PID.Album": "musicbrainz_albumid|album_legacy",

Did you try a full scan?

I did now. It seemed to work. Everything's still there, including playlists.

Perhaps I'm worrying about nothing, but the primary keys in the media_file table seem inconsistent:

ffea705b-4d01-33eb-9950-daa73c5f447b
ffeadfb2597728055f47fa02b10c578a
ffebd52d-3f4f-3664-aeab-ade706bc376a
ffecb0f2-fcf5-4316-b45c-a1296f304afa
ffecf7a1-c50a-3f42-988e-1ca10cd8ec19
ffedaa76-ec5a-3d89-b7cc-bc5e8e3972be
ffee6146d1dec09ea5b145ae91d7d5f3
ffeeaf748ee726209b6e8ffba3c6ce24
fff12feb-2b9c-3dca-b6f3-50b6ac1cfd89
fff14e07-2072-3891-9ef3-50330b4a20ab
fff1c003-f007-45d9-9bc4-2354ff193240
fff1c6a4-4775-44b4-b24b-509d096d3347
fff7ce9e-272d-31fa-a802-b8d5c4404f63
fffae7dd-1411-34f6-bd82-81659289b1fc
fffb259f-26d4-37ce-81c5-14a04b5ee41d
ffff13c4-000b-31cc-8366-715f1f6a73bf
ffff5d45-9c1b-3bee-bcc5-400ebb84fb57
gBER1fBFjIBIdu0KB2i0dj
ikRLIFGrK9oQtxhSPLEqzl
jb12qWRGmx5yCpoItMwCbT
kivZUEpWZOzLAe7ntMviIh
lE17dXxiXtv7KKOTpMsXtd

Most of them are right - but I've discovered some albums (custom game rips, etc. without musicbrainz ids) are still in the DB, but not in the UI

Happy to jump on chat/Discord if that's easier

vs49688 · 2025-03-16T16:44:08Z

Closing this... It's been a long ride, and it's time for the PR to rest.

To migrate to BFR:

Follow standard migration instructions
- Do NOT use custom values for PID.Track and PID.Album
Trigger another full rescan

deluan · 2025-03-16T16:48:38Z

Most of them are right - but I've discovered some albums (custom game rips, etc. without musicbrainz ids) are still in the DB, but not in the UI

Just a note for posterity. The album_legacy and track_legacy options cannot be used in conjunction with other PID options, so:

  "PID.Track": "musicbrainz_trackid|track_legacy",
  "PID.Album": "musicbrainz_albumid|album_legacy",

This is invalid, and Navidrome is only using the MBIDs part. After removing this config, @vs49688 was able to find the missing albums.

github-actions · 2025-07-15T02:11:47Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

vs49688 mentioned this pull request Sep 23, 2021

Support multiple releases of the same album #489

Closed

vs49688 force-pushed the adedup branch from c6e7632 to 84afe46 Compare September 25, 2021 15:28

vs49688 changed the title ~~[DRAFT] scanner: use musicbrainz release id when generating album id~~ scanner/mapping: take MusicBrainz tags into account when generating IDs Sep 25, 2021

vs49688 force-pushed the adedup branch from 84afe46 to 976d59a Compare September 27, 2021 03:41

vs49688 force-pushed the adedup branch 9 times, most recently from 5298640 to 867aaf4 Compare September 29, 2021 02:39

vs49688 force-pushed the adedup branch from 867aaf4 to ba9d803 Compare September 30, 2021 01:26

vs49688 marked this pull request as ready for review October 2, 2021 15:08

deluan mentioned this pull request Oct 6, 2021

Albums get merged #1324

Closed

vs49688 added 6 commits September 18, 2024 14:58

model/artist: add DeleteMany()

01f42eb

model/mediafile: add DeleteMany()

81d294c

model/user: add GetAll()

323e54c

model/annotation: add CopyAnnotation()

a662a15

scanner/mapping: use musicbrainz ids if configured to

2532136

Not MD5'ing them because they're UUIDs and can be eyeball'd easily in the database.

vs49688 force-pushed the adedup branch from 176a610 to f8eedd9 Compare September 18, 2024 05:58

deluan force-pushed the master branch from d154eee to 8808ead Compare October 20, 2024 23:35

deluan mentioned this pull request Nov 5, 2024

BFR: Big Refactor (new scanner and DB schema changes) #2709

Merged

23 tasks

Szeraax added this to the Big Refactor milestone Nov 12, 2024

vs49688 marked this pull request as draft February 21, 2025 04:42

vs49688 closed this Mar 16, 2025

github-project-automation bot moved this from Done to Released in Navidrome roadmap Mar 16, 2025

github-actions bot added the frozen-due-to-age label Jul 15, 2025

github-actions bot locked as resolved and limited conversation to collaborators Jul 15, 2025

Uh oh!

Support using MusicBrainz IDs instead of other metadata #1363

Support using MusicBrainz IDs instead of other metadata #1363

Uh oh!

Conversation

vs49688 commented Sep 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

certuna commented Sep 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Sep 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Sep 25, 2021

Uh oh!

certuna commented Sep 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Sep 25, 2021

Uh oh!

vs49688 commented Sep 25, 2021

Uh oh!

certuna commented Sep 26, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Sep 26, 2021

Uh oh!

deluan commented Sep 26, 2021

Uh oh!

vs49688 commented Sep 27, 2021

Uh oh!

deluan commented Sep 27, 2021

Uh oh!

vs49688 commented Sep 29, 2021

Uh oh!

vs49688 commented Oct 6, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deluan commented Oct 7, 2021

Uh oh!

github-actions bot commented Sep 18, 2024

Uh oh!

deluan commented Nov 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Nov 6, 2024

Uh oh!

deluan commented Nov 6, 2024

Uh oh!

deluan commented Feb 19, 2025

Uh oh!

arg274 commented Feb 20, 2025

Uh oh!

vs49688 commented Feb 20, 2025

Uh oh!

deluan commented Feb 20, 2025

Uh oh!

vs49688 commented Feb 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Mar 16, 2025

Uh oh!

deluan commented Mar 16, 2025

Uh oh!

vs49688 commented Mar 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vs49688 commented Mar 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deluan commented Mar 16, 2025

Uh oh!

github-actions bot commented Jul 15, 2025

Uh oh!

Uh oh!

vs49688 commented Sep 23, 2021 •

edited

Loading

certuna commented Sep 24, 2021 •

edited

Loading

vs49688 commented Sep 24, 2021 •

edited

Loading

certuna commented Sep 25, 2021 •

edited

Loading

certuna commented Sep 26, 2021 •

edited

Loading

vs49688 commented Oct 6, 2021 •

edited

Loading

deluan commented Nov 5, 2024 •

edited

Loading

vs49688 commented Feb 21, 2025 •

edited

Loading

vs49688 commented Mar 16, 2025 •

edited

Loading

vs49688 commented Mar 16, 2025 •

edited

Loading