The problem:
My audio collection, which contains mostly FLAC but also some MP3 and maybe a few OGG files, is sourced from 1) ripping my CD collection to disk and 2) purchasing downloads. My music collection looks like artist/album/{audio files,cover,liner notes}.
For some years I have been unhappy with the hot mess that is the range of audio tags in these files, caused by uneven data quality in sources like MusicBrainz and in the tags supplied by download providers. I've been writing a series on using Groovy for opensource.com and, having learned about JAudiotagger, decided to write some Groovy scripts to do a cleanup operation.
After some testing, I've come up with the following Groovy code to deal with my tags:

    import org.jaudiotagger.audio.AudioFileIO
    import org.jaudiotagger.tag.FieldKey

    // musicLibraryDirName is defined elsewhere in the script
    new File(musicLibraryDirName).eachDir { artistDir ->
        artistDir.eachDir { albumDir ->
            albumDir.eachFile { contentFile ->
                if (contentFile.name ==~ /.*\.(flac|mp3|ogg)/) {
                    // Get the tag body
                    def af = AudioFileIO.read(contentFile)
                    def tagBody = af.tag
                    // Save the values for the known and wanted tag fields by FieldKey
                    def album = tagBody.getFirst(FieldKey.ALBUM)
                    def artist = tagBody.getFirst(FieldKey.ARTIST)
                    // etc
                    // Get a list of all the tag field ids
                    def originalTagFieldIdList = tagBody.fields.collect {
                        tagField -> tagField.id
                    }
                    // Delete all the tag fields except VENDOR by tag field id
                    originalTagFieldIdList.each { tagFieldId ->
                        if (tagFieldId != 'VENDOR')
                            tagBody.deleteField(tagFieldId)
                    }
                    // Set only the non-blank saved wanted tag fields by FieldKey
                    if (album) tagBody.setField(FieldKey.ALBUM, album)
                    if (artist) tagBody.setField(FieldKey.ARTIST, artist)
                    // etc
                    // Commit the changes
                    af.commit()
                }
            }
        }
    }

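To make the save / delete / restore sequence easier to reason about without touching real audio files, here is a minimal sketch that applies the same three steps to a plain Map standing in for the tag body. The cleanTag helper, its parameters, and the sample field values are all hypothetical and exist only for illustration; none of them is part of JAudiotagger.

```groovy
// Hypothetical stand-in: a tag body modelled as a Map of field id -> value.
// This is NOT a JAudiotagger type; it only mirrors the logic of the script above.
def cleanTag(Map<String, String> tag, List<String> wantedIds, String keepId = 'VENDOR') {
    // Step 1: save the values of the wanted fields
    def saved = wantedIds.collectEntries { id -> [(id): tag[id]] }

    // Step 2: delete every field except keepId
    // (copy the key list first so the map is not modified while iterating over it)
    def originalIds = tag.keySet().toList()
    originalIds.each { id ->
        if (id != keepId) tag.remove(id)
    }

    // Step 3: restore only the saved values that are non-null and non-empty
    saved.each { id, value ->
        if (value) tag[id] = value
    }
    tag
}

// Made-up sample data: one wanted field, one empty wanted field, one unwanted field
def tag = [VENDOR: 'some encoder', ALBUM: 'Some Album', ARTIST: '', COMMENT: 'unwanted']
def cleaned = cleanTag(tag, ['ALBUM', 'ARTIST'])

// Only VENDOR and the non-empty wanted field survive
assert cleaned == [VENDOR: 'some encoder', ALBUM: 'Some Album']
```

As in the script above, the field ids are collected into a separate list before any deletion, so the collection being iterated is never the one being modified.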
I have run this on a "test music directory" and as far as I can tell it seems to do the job, but I have two main concerns and I'm looking for advice on them:
- Is this a good general way to approach the problem, or am I completely missing something here? My concern is that saving the wanted tag fields by FieldKey, then deleting all tag fields by tag field id, then re-establishing the wanted tag fields by FieldKey again seems kind of heavy-handed or inefficient or something ugly like that.
- Though tagBody.deleteField(tagFieldId) seems to "delete tag field by tag field id", this seems wrong: the Tag interface doesn't define such a method. AbstractTag does, but there it is protected, and FlacTag doesn't extend AbstractTag; rather, it uses composition, defining its internal tag property as a VorbisCommentTag, which DOES extend AbstractTag.
Any critiques, advice, suggestions much appreciated.