
Why [ALREADY EXISTS] when folder is different Genre is different ????

Started by spirosk, November 18, 2014, 04:42:35 PM

gstark

Quote from: djginod on December 03, 2014, 05:49:12 PM
your way of working is very organised, but not the best method I guess for this program.

Yes, what you're saying is correct, and therein lies the problem: the software should be designed to help us, as users.

Not the other way around. I'm actually a professional software developer with over 30 years' experience, and it's exactly this sort of issue that I'm frequently called in to address.

For what it's worth, MediaMonkey tells me that I have 27 different versions of Gershwin's "Summertime", and SAM also tells me that I have 27 different versions of the same song. RadioDJ reports just 15. This means that almost half of the versions of this track that I have, including the great Ella Fitzgerald version, are not available to me within RadioDJ.

Is this a good thing? I don't believe so.

The core issue here is not one of tags, but of file management, and there are a few different ways it could easily be addressed. Very easily.

1: Check for dupes on the basis of the full pathname of the file. By definition, there can not be any duplicates.

Alternatively,

2: Store an MD5 hash of the actual file in the database. Calculate a new MD5 hash for any file as it's being imported, and compare that value with the hashes already stored. If the hash value matches a stored one, then the file is a duplicate.
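To make option 2 concrete, here is a minimal sketch in Python. It is not RadioDJ code: the names `file_md5` and `is_duplicate` are made up for the example, and an in-memory set stands in for a hash column in the database.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(path, known_hashes):
    """Report whether the file's content hash is already known; record it if not."""
    file_hash = file_md5(path)
    if file_hash in known_hashes:
        return True
    known_hashes.add(file_hash)
    return False


def demo():
    """Import three files: two share a file name, two share identical content."""
    known = set()  # hypothetical stand-in for a hash column in the songs table
    results = {}
    with tempfile.TemporaryDirectory() as root:
        files = {
            os.path.join(root, "ella", "Summertime.mp3"): b"ella fitzgerald audio",
            os.path.join(root, "janis", "Summertime.mp3"): b"janis joplin audio",
            os.path.join(root, "compilation", "Track 07.mp3"): b"ella fitzgerald audio",
        }
        for path, content in files.items():
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as handle:
                handle.write(content)
            results[os.path.relpath(path, root)] = is_duplicate(path, known)
    return results
```

In this sketch both files named Summertime.mp3 are accepted, because their contents differ, while the byte-identical copy in the compilation folder is flagged. Bear in mind that a hash only catches byte-identical files: the same recording re-encoded or re-tagged would produce a different hash.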


AndyDeGroo

Quote from: gstark on December 05, 2014, 02:28:37 AM
1: Check for dupes on the basis of the full pathname of the file. By definition, there can not be any duplicates.
RadioDJ actually uses the FULL PATH to decide whether a track is a duplicate. There is a unique index on the `path` column in the `songs` table.
It's somewhat strange for an experienced developer not to inspect the database to understand how it all works.

gstark

Quote from: AndyDeGroo on December 15, 2014, 06:47:53 AM
RadioDJ actually uses the FULL PATH to decide whether a track is a duplicate. There is a unique index on the `path` column in the `songs` table.
It's somewhat strange for an experienced developer not to inspect the database to understand how it all works.

Just because there's an index in the database does not mean that it's used for this purpose. I don't have access to the source code, so I'm not able to verify what is actually happening, and I would never presume to make the leap that joins the dots as you're suggesting, as I simply have no evidence to make such an assertion.

I'm going on the basis of my own observations, where non-duplicate files (based upon the full file path) have been flagged as duplicates, plus the comments of other people, including posts by Marius within this thread, which state that only the file name is used for this check.

I would respectfully suggest that if Marius says this, then that would be a rather authoritative statement. :)


Marius

Path column is set to unique if this helps.

DOWNLOADS PAGE

HOW TO FIX RADIODJ DATABASE
----------------
Please don't PM me for support requests. Use the forums instead.

gstark

Ok, Marius,

Unique is good, and it's probably how I'd expect it to be, but now I'm a little bit confused. :)

Quote from: Marius on December 16, 2014, 09:30:07 AM
Path column is set to unique if this helps.

Now, for the purposes of this thread, your earlier comment ...

Quote from: Marius on December 02, 2014, 08:12:50 PM
No matter what path is, if the filename already exists it will not be imported again.

says (to me) that the path is not used to determine the uniqueness of any file, and that the uniqueness of any file is governed solely by the filename ...

And yet, as AndyDeGroo pointed out, you have a unique index in the db, which could easily, and more correctly, determine the true uniqueness of any file.

Your earlier comment suggests that this index is not used for this purpose (I'd expect it's used to find the file for playing purposes, amongst other things). It's not for me to query your reasons, but if this index could also be used for the duplicate check, rather than just the filename, then perhaps there might be a better outcome here for all concerned?

Thanks for your help.


AndyDeGroo

As a RadioDJ user with one year of experience and the developer of a plugin which imports files, I can assure you that the full path is used for checking whether a file is a duplicate.

I can see the source code, and you could as well, because it is way too easy to disassemble .NET assemblies.
What RadioDJ does in the Tracks.ImportTrack method is this:

if (trackData.ID == 0)
{
    text = "INSERT IGNORE INTO `songs` SET ";
}
else
{
    text = "UPDATE `songs` SET ";
}

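The text variable is a string holding the start of the SQL query; more text gets concatenated onto it to form the full query for inserting or updating track info. Because `path` has a unique index, INSERT IGNORE silently skips any row whose path is already in the table.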
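To see what a unique index plus an ignoring insert does on its own, here is a small sketch using Python's built-in sqlite3 module. It is an illustration, not RadioDJ's code: RadioDJ uses MySQL, where the statement is INSERT IGNORE, whereas SQLite spells it INSERT OR IGNORE. The `songs` table and `path` column come from this thread; the `id` column and the `import_paths` function are assumptions made for the example.

```python
import sqlite3


def import_paths(paths):
    """Insert each path, letting the unique index discard exact-path repeats."""
    conn = sqlite3.connect(":memory:")
    # `id` is a hypothetical key column; `path` carries the unique index.
    conn.execute(
        "CREATE TABLE songs (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE)"
    )
    inserted = []
    for path in paths:
        # SQLite syntax; the MySQL equivalent is INSERT IGNORE.
        cursor = conn.execute(
            "INSERT OR IGNORE INTO songs (path) VALUES (?)", (path,)
        )
        # rowcount is 1 when a row was added and 0 when it was ignored.
        inserted.append(cursor.rowcount == 1)
    stored = [row[0] for row in conn.execute("SELECT path FROM songs ORDER BY id")]
    conn.close()
    return inserted, stored


inserted, stored = import_paths([
    r"D:\Jazz\Ella Fitzgerald\Summertime.mp3",
    r"D:\Rock\Janis Joplin\Summertime.mp3",
    r"D:\Jazz\Ella Fitzgerald\Summertime.mp3",  # exact repeat of the first path
])
```

Only the exact repeat is ignored. Judged by the unique index alone, two files with the same name in different folders count as two different tracks.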

I hope that it's clear now and we can put this discussion to rest.

Marius

Besides that, there is a part of the import code which also checks for the filename in the path column. If a match is found, the track will not be imported.
You must understand that this is required because the same track is often added to more than one album/compilation. If I allowed multiple imports based only on the path, those tracks would be treated as different, and you would end up playing the same track more often than the track repeat interval you set.

AndyDeGroo

Oh, I spoke too soon. Now I see that the rdjInterface.Tracks.TrackExists(string filePath) method actually checks for the file name at the end of the `path` field:
string sql = "SELECT `path` FROM `songs` WHERE `path` LIKE '%" + rdjInterface.Utils.QueryFix(Path.GetFileName(filePath)) + "' LIMIT 1;";

Similarly, rdjInterface.Tracks.TrackExists(string artist, string title, string rrr) matches the `artist` and `title` fields, whatever the third argument rrr is.
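The effect of that LIKE check can be reproduced in a few lines of Python with the built-in sqlite3 module. This is a sketch under assumptions, not the RadioDJ method: `track_exists` only imitates the quoted query, it uses a bound parameter where the original concatenates strings, and it makes no attempt to reproduce whatever QueryFix does.

```python
import ntpath
import sqlite3


def track_exists(conn, file_path):
    """Imitate the quoted check: does any stored path end with this file name?"""
    file_name = ntpath.basename(file_path)  # parse Windows-style paths
    # A bound parameter replaces string concatenation. Any % or _ inside the
    # file name would still act as a LIKE wildcard in this simplified version.
    row = conn.execute(
        "SELECT path FROM songs WHERE path LIKE ? LIMIT 1", ("%" + file_name,)
    ).fetchone()
    return row is not None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (path TEXT NOT NULL UNIQUE)")
    conn.executemany(
        "INSERT INTO songs (path) VALUES (?)",
        [
            (r"D:\Jazz\Ella Fitzgerald\Summertime.mp3",),
            (r"D:\Jazz\Various\Blue Moon.mp3",),
        ],
    )
    candidates = [
        r"D:\Rock\Janis Joplin\Summertime.mp3",   # same file name, other folder
        r"D:\Jazz\Bill Evans\Autumn Leaves.mp3",  # unrelated file name
        r"D:\Rock\Singles\Moon.mp3",              # name is a suffix of a stored name
    ]
    results = [track_exists(conn, path) for path in candidates]
    conn.close()
    return results
```

In this sketch the Janis Joplin file is reported as already existing purely because of its name. The third candidate is flagged as well: the pattern has no path separator in front of the file name, so Moon.mp3 matches the stored Blue Moon.mp3. Whether RadioDJ behaves the same way in that last case depends on what QueryFix does, which the quoted line does not show.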

gstark

Quote from: Marius on December 17, 2014, 08:02:48 AM
Besides that, there is a part of the import code which also checks for the filename in the path column. If a match is found, the track will not be imported.
You must understand that this is required because the same track is often added to more than one album/compilation. If I allowed multiple imports based only on the path, those tracks would be treated as different, and you would end up playing the same track more often than the track repeat interval you set.

Hi Marius,

Thanks for that clarification.

That solution works for those who use the program in the manner that you've suggested.

But I don't think that all of your users do it in this manner.

As I've said in an earlier post, I will have a large number of tracks, all with the same trackname, but they will all be different tracks. Different artists, for instance.

Or different performances.

Or perhaps the same album, but the original album, and a second (or third) version of the same album, perhaps remastered, or maybe a HD version of it.

In each of the instances that I've outlined here, the tracks may correctly have the same trackname, but they will mostly be different tracks, and they all should, more correctly, be imported as such.

As it stands at the moment, I have (for example) Summertime by Ella Fitzgerald, Summertime by Big Joe Turner, Summertime by Janis Joplin ... Herbie Hancock, Toots Thielemans, Bill Evans, Brian Wilson, George Benson, Billie Holiday, Sarah Vaughan ... 27 different versions of Summertime in fact, but while they are all versions of the same song, written by George Gershwin, they are each very different tracks. Identifying each of these only by the trackname property of the filepath fails to take into account the fact that each of these is a very different performance, by, as you can see, very different artists.

Perhaps you could make this part of the import process subject to a setting in the options, so that we can choose whether we wish to define the term "duplicate tracks" by purely the filename, or not?
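As a sketch of what such a setting might look like, here is a small Python function with a switchable duplicate policy. Everything in it is hypothetical: the `is_duplicate` name, the `mode` values and the case-insensitive comparison (assumed because Windows paths are normally case-insensitive) are illustrations, not existing RadioDJ options.

```python
import ntpath


def is_duplicate(new_path, existing_paths, mode="filename"):
    """Decide whether new_path counts as a duplicate under the chosen policy.

    mode="filename": any existing path with the same file name is a duplicate.
    mode="path":     only an identical full path is a duplicate.
    """
    if mode == "path":
        return new_path.lower() in {p.lower() for p in existing_paths}
    if mode == "filename":
        new_name = ntpath.basename(new_path).lower()
        return any(ntpath.basename(p).lower() == new_name for p in existing_paths)
    raise ValueError("unknown duplicate mode: " + mode)


existing = [r"D:\Jazz\Ella Fitzgerald\Summertime.mp3"]
candidate = r"D:\Rock\Janis Joplin\Summertime.mp3"

blocked_by_name = is_duplicate(candidate, existing, mode="filename")
blocked_by_path = is_duplicate(candidate, existing, mode="path")
```

With the filename policy the Janis Joplin file is rejected; with the path policy it is imported, and only a second import of the very same path would be refused.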

And yes, I'm perfectly happy to have actual duplicates of some tracks, such as from compilations, in my DB.

Cheers.


power_fm

This might sound like double work, but I have created a single folder on an external drive that holds all the songs we use. Each file has only the song title as its file name. When we want to add new songs to RadioDJ, we copy them into this folder first to see whether Windows detects a clash with an existing file name. I take note of any file name clashes and use Mp3tag to rename them. All this happens before we load the songs into RadioDJ.

We call it the "superlist", and the bonus is that we have a backup of all our music in a separate location from the RadioDJ library.
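The same clash check can be done without copying anything. Here is a rough Python sketch that walks one or more folders and lists file names that occur more than once; the `find_name_clashes` function and the folder names in the demo are made up for the example.

```python
import os
import tempfile
from collections import defaultdict


def find_name_clashes(folders):
    """Group files under the given folders by file name (case-insensitive)."""
    by_name = defaultdict(list)
    for folder in folders:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                by_name[name.lower()].append(os.path.join(root, name))
    # Keep only the names that occur in more than one place.
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}


def demo():
    """Build a tiny superlist and an incoming folder, then count the clashes."""
    with tempfile.TemporaryDirectory() as root:
        layout = [
            ("superlist", "Summertime.mp3"),
            ("incoming", "summertime.mp3"),
            ("incoming", "Autumn Leaves.mp3"),
        ]
        for folder, name in layout:
            os.makedirs(os.path.join(root, folder), exist_ok=True)
            with open(os.path.join(root, folder, name), "wb") as handle:
                handle.write(b"placeholder")
        clashes = find_name_clashes(
            [os.path.join(root, "superlist"), os.path.join(root, "incoming")]
        )
        return {name: len(paths) for name, paths in clashes.items()}
```

Run against the superlist and a folder of new songs, it reports the names that would need renaming before the import.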