Reference file not showing in results when combining dupe-from and dupe-count

unbarredstream
Posts: 5
Joined: Wed Mar 12, 2025 3:26 am

Reference file not showing in results when combining dupe-from and dupe-count

Post by unbarredstream »

When searching for duplicates with dupe-from: and combining it with dupe-count:, if there are a lot of results (for example, video length duplicates), the reference file may be omitted from the results.

Also, in this specific case with video length, how does the count limit work? Does it return the first X results, or the nearest X results? Maybe it needs a new filter that limits by length.
therube
Posts: 5727
Joined: Thu Sep 03, 2009 6:48 pm

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by therube »

What are the specific search(es) you are using?
therube
Posts: 5727
Joined: Thu Sep 03, 2009 6:48 pm

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by therube »

It appears that count: will simply limit the number of results returned, regardless of other search criteria.

video: dupe:length count:9


dupe-count: looks like it will return at most the specified number of results for each duplicated length.

video: dupe:length dupe-count:3
(so if you have 99 files of the same length, only 3 of them will be displayed)
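
To make the difference concrete, here is a minimal Python sketch of the two behaviours as described in this post. It is only a model of what the searches appear to do, not Everything's implementation; the function names and the sample files are made up for illustration.

```python
from collections import defaultdict

def limit_total(files, n):
    """Global cap: keep the first n results overall (models count:n)."""
    return files[:n]

def limit_per_group(files, key, n):
    """Per-group cap: keep at most n results per duplicated value (models dupe-count:n)."""
    seen = defaultdict(int)
    kept = []
    for f in files:
        k = key(f)
        if seen[k] < n:
            seen[k] += 1
            kept.append(f)
    return kept

# Hypothetical (name, length in seconds) pairs.
files = [("a.mp4", 60), ("b.mp4", 60), ("c.mp4", 60), ("d.mp4", 60),
         ("e.mp4", 90), ("f.mp4", 90)]

print(limit_total(files, 3))                            # three results in total
print(limit_per_group(files, key=lambda f: f[1], n=3))  # up to three per length
```

With the global cap only three files come back in total; with the per-group cap, each length keeps up to three files, so the 90-second pair survives as well.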


I don't know that there is such a thing as a "reference file".
(Now, maybe you can "create" a reference file kind of condition, where the dupe:lengths / dupe-count: are keyed off of that...)
unbarredstream
Posts: 5
Joined: Wed Mar 12, 2025 3:26 am

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by unbarredstream »

By "reference file" I mean the file in the folder given to dupe-from:, since you're trying to find duplicates of the files in that specific folder. For example, when using "video: dupe-from:C:\test dupe:length dupe-count:5", if my reference folder has a one-minute (1:00) video but I have 30 one-minute videos in my database, the original video no longer appears in the results.

In this case, when I have a lot of unrelated duplicates, I use the file size and the file preview to find re-encoded versions, but if the reference file doesn't appear in the results I can't compare its size and preview against the other videos.

So my suggestion is a way to further filter files when using dupe-from:, for example "video: dupe-from:C:\test dupe:length dupe-size:+-5%", meaning "videos duplicated from this folder where the length is the same and the size is within 5% of the reference file's size". It could take a percentage of the size or an exact amount, like "dupe-size:+100mb", meaning anything from the reference size up to that size + 100 MB.
void
Developer
Posts: 19901
Joined: Fri Oct 16, 2009 11:31 pm

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by void »

dupe-count:x will only show x dupes.

dupe-from: is still applied to all dupes.
However, a result from the dupe-from: path might be omitted when using dupe-count:
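
A small Python sketch of how that omission can happen. It assumes the limit keeps the first x files of a duplicate group in the current sort order, which is a guess and not something confirmed in this thread; the paths are hypothetical.

```python
def first_n(group, n):
    # Assumption: the limit keeps the first n files in the current sort order.
    return group[:n]

REFERENCE_DIR = "C:\\test\\"

# 30 hypothetical one-minute videos; only one of them lives under C:\test.
group = [f"B:\\archive\\clip{i:02d}.mp4" for i in range(29)]
group.append("C:\\test\\original.mp4")
group.sort()  # sorted by path, the C:\test file ends up last

shown = first_n(group, 5)
has_reference = any(p.startswith(REFERENCE_DIR) for p in shown)
print(shown)
print(has_reference)
```

Under that assumption, whether the dupe-from: file survives depends only on where it happens to sort within its group.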

Please consider putting the sizes into buckets, and check for length and bucket dupes, eg:

video: dupe-from:C:\test addcolumn:a a:=INT($size:/5000000) dupe:a-descending;length


Adjust 5000000 to be about 5% of your average video size.
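
For anyone wondering what the bucket column does with real numbers, here is a short Python sketch of the same arithmetic. The file sizes are hypothetical; the point is that buckets are fixed intervals, so two files that are close in size can still land in different buckets when they straddle a boundary.

```python
BUCKET_WIDTH = 5_000_000  # bytes, the value used in the search above

def bucket(size_bytes):
    # Same arithmetic as INT($size:/5000000) for non-negative sizes.
    return size_bytes // BUCKET_WIDTH

near_edge = (99_000_000, 101_000_000)     # about 2% apart, straddling a boundary
same_bucket = (100_000_000, 104_900_000)  # about 4.9% apart, inside one bucket

print([bucket(s) for s in near_edge])
print([bucket(s) for s in same_bucket])
```

So a bucket match means "same fixed interval", not "within 5% of each other".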
unbarredstream
Posts: 5
Joined: Wed Mar 12, 2025 3:26 am

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by unbarredstream »

Your formula searches for exact duplicates within fixed 5% buckets; I'm looking for a way to filter by a range.

I came up with something like this:

addcolumn:b,c b:=INT($size:*1.05) c:=INT($size:*0.95) b-format:size:kb c-format:size:kb

With this I get two columns: b with size + 5% and c with size - 5%. How can I filter the results from dupe-from: using these values?

This

size:c..b

doesn't seem to work. How do I get length duplicates with the size in that range?
void
Developer
Posts: 19901
Joined: Fri Oct 16, 2009 11:31 pm

Re: Reference file not showing in results when combining dupe-from and dupe-count

Post by void »

With this I get 2 columns, b with size + 5% and c with size - 5%. How can I filter the results from dupe-from using these values?
Can't be done with Everything 1.5.
The closest option is sorting sizes into buckets.

size:c..b

doesn't seem to work. How do I get length duplicates with the size in that range?
Can't be done with Everything 1.5.
size: needs well-defined ranges before the search executes.
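
Since the range depends on each reference file, one possible workaround is to do the comparison outside Everything. The Python sketch below assumes you already have the length duplicates as (path, length, size) rows from some source; the row layout, the function names and the sample data are all hypothetical, and nothing here is an Everything feature.

```python
def within_tolerance(size, ref_size, tolerance=0.05):
    """True if size lies within ref_size plus or minus the given fraction."""
    return abs(size - ref_size) <= ref_size * tolerance

def candidates(reference, files, tolerance=0.05):
    """Files with the reference's length whose size is within the tolerance."""
    _, ref_length, ref_size = reference
    return [f for f in files
            if f[1] == ref_length and within_tolerance(f[2], ref_size, tolerance)]

# Hypothetical rows: (path, length in seconds, size in bytes).
reference = ("C:\\test\\original.mp4", 60, 100_000_000)
files = [
    ("D:\\v\\reencode.mp4", 60, 97_000_000),  # same length, 3% smaller
    ("D:\\v\\other.mp4", 60, 150_000_000),    # same length, far larger
    ("D:\\v\\longer.mp4", 75, 100_000_000),   # same size, different length
]

print(candidates(reference, files))
```

Unlike the buckets, the range here is centred on the reference file, so a file just across a bucket boundary is still matched.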