-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Support using MusicBrainz IDs instead of other metadata #1363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I think it's a good idea in principle, just some remarks:
|
In this case, I'm of the mind to tell the user to strip the MusicBrainz tags from those files, retaining the current behaviour. By moving them into a "Loose Tracks" album, you're effectively creating a "new" album. If this "new" album isn't on MusicBrainz, it shouldn't have MusicBrainz tags. Also, isn't this what playlists are for? Another option would be to add a "navidrome_mbignore" tag, which if set would tell Navidrome to ignore any MusicBrainz tags it finds when scanning.
Indeed, I never considered this may need to be done with artists until after I'd pushed the commit. |
Pushed new version that should handle track/album/artist/albumartist IDs. |
One issue will be: what does Navidrome do with artists that have the same mbid but different tags in the Artist field, such as "DJ Tiesto" and "Tiesto"? It's not so easy, you might get some strange behaviour, consider this case:
Current behaviour: album 1 and 3 grouped under DJ Tiesto, album 2 and 4 grouped under Tiesto |
Regardless of my changes, albums 3 & 4 should be under different artists (or the artists fixed). |
Actually, thinking about it some more... What do we want the behaviour to be? All I intended was to make it not merge different releases of the same album. If the tags in the files are wrong (thus causing Navidrome to split the artists/albums), then they should be fixed. I'm fine with either option (it doesn't affect me as I run everything through Picard and beets), @deluan what do you think? |
The more I think of it, it's indeed a drastic change. Navidrome currently does all organization based on official tag fields: Moving to organization based on custom Musicbrainz ID tags is good if you consistently use those, but that puts the onus on users to have not only an immaculately tagged collection with 'regular' tags, but also consistently have the mbz tags correct, because you'll create splits between stuff with and without mbz id's. And this decreases transparency to the user why this happens, and we'll end up like Plex where the scanner feels like a black box where "anything can happen". For example:
|
As long as the scanning behaviour is explicitly-and-well-documented, I don't really see the problem.
How about this? Add a When set, if an album/artist has a mbid, Navidrome will use it (and only it). This provides a solution to #489, assuming everything's tagged correctly. If the mbid is NOT set, it'll fall back and create a new/duplicate artist/album. It is up to the user to tag the files correctly and trigger a rescan. Forgive me for being pushy on this, but I need a fix for the #489 - it's affecting large part of my library. |
Hey, sorry being late for the party :) After reading the thread, these are my two cents (mostly agreeing with latest @vs49688 comment):
On top of that, a migration plan is a MUST, or else users that enable this will break their playlists, stars, playcounts, etc... The migration should go over all tables with an |
How'd you prefer the migrations to be done?
|
Whichever is faster. I'd assume it is the later, but it is worth to test it out if you can. |
5298640
to
867aaf4
Compare
It's mostly finished, all that needs to be done now is some cleanups, and to trigger a full rescan. Also, where's the best place for this? I've just put it in the top-level |
Ping on this? Specific issues I'd like feedback on:
The current behaviour:
|
Sorry for the delay. Will take a look tonight |
Not MD5'ing them because they're UUIDs and can be eyeball'd easily in the database.
Converts the library to use MusicBrainz IDs for albums, artists, albumartists, and tracks. All playlists, stars, etc. are kept. This is an irreversible transformation; the user is given a warning and may cancel within 10 seconds by pressing ^C. The mediafiles table is used as the sole source-of-truth to build an entirely new hierarchy, based off the MusicBrainz IDs. Existing entity metadata is applied to the new entities, except in the case where one entity may need to be split (e.g. different releases of the same album). These split entities are left blank. References are then updated for both play queues and playlists. The existing entities are then deleted, and a full-rescan is triggered to fill in any missing information.
Download the artifacts for this pull request: |
Hey @vs49688, as you know, I'm adding support for this in #2709, but I'm struggling with what to do when the user has more than one copy of the same album (let's say a MP3 copy and a FLAC copy), and both have same MBIDs. How are you handling this? Another challenge is that now that we will support multiple roles and artists for each album/track, we may get artists without MBIDs, as there are no tags for Composers, Lyricist, Remixer, Performer, etc... This could cause an artist to be duplicated (one with MBID, and another with just Name). Any ideas? |
I don't. This is very much a "curate/tag your library properly" problem, in my mind. I'd argue that a certain degree of discipline is required for libraries at scale. If I do have any duplicates, Maybe if this happens, just spit a warning and keep the existing one?
Hmm, yeah that's a tricky one...I'm not sure what to do either. Unless there's some kind of unique ID, you might have to fall back to basic name-matching. |
Yeah, that's what I'm inclined to do.
The fall back would already be in place, as there are artists that don't have MBID, and alos for these other roles we need to create an ID for them. But this will duplicate some artists. Hitting MB APIs would be a good solution (privacy issues aside), but it would slow down a lot the scanner/import process. And doing it asynchronously after the scan is very complex (as we need the IDs when importing the files. Not sure what is the right solution here. I may finish the code with only names for now, and when it is merged, I may come back to this. |
@deluan A quick test, it seems to not merge artists in certain scenarios. For example, I have this this comp where track 12 belongs moon, who is credited in the native script (মুন) within the comp. I also have his solo album, where he is credited as "moon". BFR seems to create two separate artist entries for this. I did confirm that both my files have the same artist MBID. This bug does not trigger when the artist is not aliased/credited as something else. |
@deluan I haven't tested it yet, I'll attempt to this weekend |
Unfortunately Picard does not add MBIDs for all contributors (composers, performers, remixers...), so if we use MBID to disambiguate artists, you could end up with duplicated artists. I decided against adding artist disambiguation by MBID for now, but I've been thinking on some solutions that may involve using a sidecar file. I'll give it more thinking after BFR is released. |
I've rebased on top of 0.54.5 here: https://github.com/vs49688/navidrome/tree/v0.54.5-mbz (untested)
|
@deluan I've finally had a chance to try this. I've run into some problems with the scanner. It's stopping way too early.
|
Are you upgrading from your 1363 DB? Maybe there are changes that were not considered in #2709 migration/upgrade process. Did you try a full scan? |
Yeah, I upgraded straight from the existing one, using: "PID.Track": "musicbrainz_trackid|track_legacy",
"PID.Album": "musicbrainz_albumid|album_legacy",
I did now. It seemed to work. Everything's still there, including playlists. Perhaps I'm worrying about nothing, but the primary keys in the
Most of them are right - but I've discovered some albums (custom game rips, etc. without musicbrainz ids) are still in the DB, but not in the UI Happy to jump on chat/Discord if that's easier |
Closing this... It's been a long ride, and it's time for the PR to rest. To migrate to BFR:
|
Just a note for posterity. The
This is invalid, and Navidrome is only using the MBIDs part. After removing this config, @vs49688 was able to find the missing albums. |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Converts the library to use MusicBrainz IDs for albums, artists,
albumartists, and tracks. All playlists, stars, etc. are kept.
This is an irreversible transformation; the user is given a warning
and may cancel within 10 seconds by pressing ^C.
The mediafiles table is used as the sole source-of-truth to build
an entirely new hierarchy, based off the MusicBrainz IDs.
Existing entity metadata is applied to the new entities, except in the
case where one entity may need to be split (e.g. different releases of
the same album). These split entities are left blank.
References are then updated for both play queues and playlists.
The existing entities are then deleted, and a full-rescan is triggered
to fill in any missing information.
Closes #489.