Find Duplicates

Discussion related to "Everything" 1.5 Alpha.
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Find Duplicates

Post by void »

Everything 1.5 adds support for finding unique or duplicate items in your results.

Everything can find files/folders with duplicated/unique names, sizes, dates and other properties.



Usage
Find duplicated results
Find unique results (including first duplicated) AKA: Distinct
Find unique results (not duplicated)
Dupe lines
Group colors
Advanced



Usage

To instantly find files in your results with the same size:
Right click the size column header and click Find size duplicates.
-or-
Include the following in your search:
dupe:size


To find files with the same content:
Include the following in your search:
dupe:size;sha256

The sha256 is only calculated for files with the same size.
Calculating sha256 will take a very long time.
For the best performance, combine the dupe:size;sha256 search with other search filters.



Find duplicated results

Everything can find files/folders with duplicate properties in the current results.
Finding duplicates is done after your search.



To find duplicated results:
  • Right click a column header and click Find <property> duplicates.
For example, to find size-duplicated files/folders:
  • Right click the size column header and click Find size duplicates.
Everything will remove files/folders from the result list that are not duplicated.
DUPE text is shown in the status bar on the right.
Double click the DUPE text in the status bar or change the search to clear the find-duplicates view.





To find duplicated files/folders from a search:
  • Include the following in your search:
    dupe:<property>
If no property is specified, Everything will default to the name property.

For example, to find files/folders with the same size, include the following in your search:
dupe:size



To find files/folders with multiple duplicated properties:
  • Include the following in your search:
    dupe:<property-list>
    where <property-list> is a semicolon (;) delimited list of property names.
A maximum of 8 properties are supported.
If you want to compare more than 8 properties, use column formulas.

For example, to find files/folders with the same name and size, include the following in your search:
dupe:name;size



To find files with duplicated content:
  • Include the following in your search:
    dupe:size;sha256
Everything will first compare file sizes; if the sizes match, Everything will gather the SHA256 hash to compare content.
The SHA256 hash is not indexed and will take a very long time to gather.
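The size-first approach described above can be sketched in Python. This is only an illustration of the idea, not how Everything is implemented; the function names sha256_of and find_content_duplicates are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_content_duplicates(paths):
    """Return sorted lists of paths whose size and SHA-256 both match."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot be a content duplicate, so skip hashing
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Skipping files with a unique size is what keeps the slow hashing step small, which is also why narrowing the search first helps.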



To find duplicated files between two folders:
  • Include the following in your search:
    "C:\path\to\folder A" | "C:\path\to\folder B" dupe:size
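With dupe:size alone, Everything compares file sizes only.
To compare content as well, use dupe:size;sha256: Everything will first compare file sizes; if the sizes match, Everything will gather the SHA256 hash to compare content.
The SHA256 hash is not indexed and will take a very long time to gather.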



Find unique results (including first duplicated)

Find unique (including first duplicated) is also known as distinct in Everything.

To find distinct results:
  • Right click a column header, right click Find <property> duplicates and click Find unique (including first duplicated).
For example, to find distinct names in the results:
  • Right click the Name column header, right click Find Name duplicates and click Find unique (including first duplicated).
Everything will remove files/folders from the result list that are duplicated (except the first duplicated item).
DISTINCT text is shown in the status bar on the right.
Double click the DISTINCT text in the status bar or change the search to clear the find-distinct view.



To find distinct files/folders from a search:
  • Include the following in your search:
    distinct:<property>
If no property is specified, Everything will default to the name property.

For example, to find one file/folder from each folder, include the following in your search:
distinct:path



Find unique results (not duplicated)

To find unique results:
  • Right click a column header, right click Find <property> duplicates and click Find unique (not duplicated).
For example, to find unique names in the results:
  • Right click the Name column header, right click Find Name duplicates and click Find unique (not duplicated).
Everything will remove files/folders from the result list that are duplicated.
UNIQUE text is shown in the status bar on the right.
Double click the UNIQUE text in the status bar or change the search to clear the find-unique view.



To find unique files/folders from a search:
  • Include the following in your search:
    unique:<property>
If no property is specified, Everything will default to the name property.

For example, to find unique filenames, include the following in your search:
unique:name



Dupe lines

Horizontal dividing lines are shown when values change between items.
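As a rough illustration (a sketch, not Everything's drawing code), the places where a dividing line appears can be worked out by comparing each item with the one before it. The name divider_positions is made up for this example.

```python
def divider_positions(values):
    """Return each index i where a divider sits between item i-1 and item i."""
    return [i for i in range(1, len(values)) if values[i] != values[i - 1]]


sizes = [10, 10, 25, 25, 25, 40]
print(divider_positions(sizes))  # [2, 5]
```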



To customize the line color:
  • In Everything 1.5, from the Tools menu, click Options.
  • Click the Advanced tab on the left.
  • To the right of Show settings containing, search for:
    dupe_line
  • Select dupe_line_color (or dupe_line_dark_color for dark mode).
  • Click the color button.
  • Select a new color and click OK.
  • Click OK.
To hide the dividing line:
  • In Everything 1.5, from the Tools menu, click Options.
  • Click the Advanced tab on the left.
  • To the right of Show settings containing, search for:
    dupe_line
  • Select dupe_lines.
  • Set the value to: false
  • Click OK.


Group colors

Group colors shows results in different colors.
Items sharing the same property values are shown in the same color.





To show files/folders that share the same properties in different colors:
  • Include the following in your search:
    groupcolors:
Everything will now show files/folders in different colors.
Files and folders that share the same properties will be colored the same.

Colors are generated by a hash of the property values.
Please use colors as a guide only.
Different colors guarantee that the property values differ.
However, the same color doesn't always mean the same property values.
Different property values can generate the same colors.
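To see why the same color does not guarantee the same value, here is a minimal sketch. It assumes a simple XOR hash over the text of the value and a made-up six-color palette; the real hash and palette in Everything may differ.

```python
PALETTE = ["red", "orange", "yellow", "green", "blue", "purple"]  # hypothetical palette


def xor_hash(text):
    """XOR all UTF-8 bytes of the text together (illustrative, not Everything's hash)."""
    value = 0
    for byte in text.encode("utf-8"):
        value ^= byte
    return value


def color_for(value):
    """Pick a palette entry from the hash of the value rendered as text."""
    return PALETTE[xor_hash(str(value)) % len(PALETTE)]


# "ab" and "ba" are different values but XOR to the same hash, so they collide.
print(color_for("ab") == color_for("ba"))
# Different colors can only come from different hashes, hence different values.
print(color_for("ab") != color_for("abc"))
```

The color is a pure function of the value, so two items with different colors must have different values, while two items with the same color may or may not match.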



To always show files/folders that share the same properties in different colors when finding duplicates/distinct/unique items:
  • Copy and paste the following into your Everything search box:
    /dupe_group_colors=1
  • Press ENTER in your Everything search box.
  • If successful, dupe_group_colors=1 is shown in the status bar for a few seconds.
dupe_group_colors

group_colors



Advanced

Everything finds duplicated items by:
  • Sorting your results by the desired properties.
  • Walking over each item, comparing the properties between the last item and the current one.
  • If the properties are not duplicated, the item is removed from the results.
Everything will sort by the properties specified in distinct:, dupe: or unique:
To override this sort, use the distinct-sort:, dupe-sort: or unique-sort: search function before your distinct:, dupe: or unique: call.
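The sort-then-walk steps above can be sketched in Python as follows. This is a simplified model under my own assumptions (the function name find and its parameters are made up), not Everything's source code.

```python
from itertools import groupby


def find(items, key, mode, presort=None):
    """Sort, then walk adjacent items, keeping them according to mode.

    mode is "dupe", "distinct" or "unique".
    presort optionally overrides the sort key (like the *-sort: functions).
    """
    ordered = sorted(items, key=presort or key)
    kept = []
    # groupby only merges neighbours, mirroring the last-item/current-item comparison.
    for _, run in groupby(ordered, key=key):
        run = list(run)
        if mode == "dupe" and len(run) > 1:
            kept.extend(run)       # keep every member of a duplicated run
        elif mode == "distinct":
            kept.append(run[0])    # keep the first item of every run
        elif mode == "unique" and len(run) == 1:
            kept.extend(run)       # keep only items that have no duplicate
    return kept


files = [("a.txt", 10), ("b.txt", 20), ("c.txt", 10),
         ("d.txt", 30), ("e.txt", 20), ("f.txt", 10)]


def size(f):
    return f[1]


print([n for n, _ in find(files, size, "dupe")])      # ['a.txt', 'c.txt', 'f.txt', 'b.txt', 'e.txt']
print([n for n, _ in find(files, size, "distinct")])  # ['a.txt', 'b.txt', 'd.txt']
print([n for n, _ in find(files, size, "unique")])    # ['d.txt']
```

Because only neighbours are compared, the presort decides which item counts as "first" in each run, which is why overriding the sort changes what distinct: returns.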

Examples:

To find the latest modified file/folder in each folder, search for:
distinct-sort:path;date-modified distinct:path


To find the largest file/folder size in each folder, search for:
distinct-sort:path;size distinct:path
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

Thank you, thank you, thank you (for the added, dupe_lines) :-).
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

Let me 'dupe' that: ;)

Thank you, thank you, thank you
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Thank you for your feedback.

I feel more tweaking is needed.
Sometimes it can be hard to see the line if there is a selection or focus.



The lines are much easier to see when using a bottom margin on listview items:





To set the listview item margin bottom:
  • In Everything 1.5, from the Tools menu, click Options.
  • Click the Advanced tab on the left.
  • To the right of Show settings containing, search for:
    listview
  • Select listview_item_margin_bottom.
  • Set the value to: 1
    (where 1 is the margin bottom in logical pixels - set to 0 to disable)
  • Click OK.


listview_item_margin_bottom
voidMetal theme
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

Just want to reiterate, that this addition is monumental :-).


My "color" issues, simply are no longer.

Previously, when I had to "find duplicates", I then ALSO had to verify (by scanning other columns) that what I found was in fact what I was looking for. I no longer have to take that long & error-prone step.
eagleeyez
Posts: 43
Joined: Thu Mar 17, 2016 3:58 pm

Re: Find Duplicates

Post by eagleeyez »

Just a tiny request, if you don't mind, to make the consecutive colors a bit different from the previous ones to make it easier to spot the different groups since some are so close to each other that it is hard to tell which one is which.
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Everything calculates the color from the XOR hash of the value as text.

This is currently done for performance reasons.

I will consider building a database of colors for each item to make consecutive colors more contrasting.
Thank you for the suggestion.
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

@eagleeyez

And you have tried with the dupe_lines setting?
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 12:12 am Right click a column header and click Find duplicates.
... change the search to clear the find-duplicates view.
Suggestion for consistency in menus.
FindDuplicates_01.png
FindDuplicates_03.png
FindDuplicates_04.png
The first image shows "Find name Duplicates" in the context menu.
The second image introduces a new context menu item "Clear Find Duplicates" while leaving the "Find name Duplicates" in the context menu.
"Clear Find Duplicates" is transient, whereas "Find name Duplicates" is static.

In the third menu item (not a context menu, granted), tick-marks are used to signal that a menu item is chosen/active.

I would argue for a consistent menu style throughout Everything; my preference would be for tick-marks rather than menu lists that change as the situation warrants.
Thanks, Chris
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

So what, you would want a 'Clear ...' to display regardless of whether there was anything to actually clear?


And it gets even better, where you can context-menu the context-menu (i.e., right-click 'Find Names') for even more options.
Granted, that is certainly not discover-able, but once you know it is there, it is a non-issue.
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

therube wrote: Wed May 31, 2023 3:30 pm So what, you would want a 'Clear ...' to display regardless of whether there was anything to actually clear?
Hi The Rube. Quite the opposite. I would do away with the "Clear..." item, and rely on a tick-mark appearing to the left of the context menu item, just as we see in my third image for the View menu of the main menu system.

And it gets even better, where you can context-menu the context-menu (i.e., right-click 'Find Names') for even more options.
Granted, that is certainly not discover-able, but once you know it is there, it is a non-issue.
To my mind you hit the nail on the head with "discover".
Inconsistency in an application or system makes learning [how to use it] harder. Right now most users have to discover how to use Everything by trial and error or by posing questions in the forums.

Back to my example: Once the idea of a tick-mark is gathered, that idea can be applied and used throughout the system. Having two separate methods burns up a user's learning-time needlessly ("Now I know I used the "clear" menu item last week; where is it gone to? ...").

Cheers, Chris
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

Oh, you're saying if Find Name Duplicates is not enabled, it is not ticked, & if enabled, it is ticked.
And if enabled (so ticked) & you want to clear it, you simply un-tick it.

Oh.

But a confusing, possibly, part would be if Find Name is ticked, & you right clicked Path, you would then need to have a (ticked) Find Name & a not-ticked Find Path.

Where currently there is simply a "generic" 'Clear Find Duplicates' (regardless of what Dup is enabled).


Granted, if you ticked, Find Path Dup, Find Name Dup would automatically be un-ticked, but it might be confusing why a Find Name Dup item was showing up when you were right-clicking on Path.
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

therube wrote: Wed May 31, 2023 4:20 pm Oh, you're saying if Find Name Duplicates is not enabled, it is not ticked, & if enabled, it is ticked. And if enabled (so ticked) & you want to clear it, you simply un-tick it.
Quite so! But it gets more confusing ...
But a confusing, possibly, part would be if Find Name is ticked, & you right clicked Path, you would then need to have a (ticked) Find Name & a not-ticked Find Path.
Are you suggesting that we can have more than one "Find ... Duplicates" in action at any one time?
Where currently there is simply a "generic" 'Clear Find Duplicates' (regardless of what Dup is enabled).
... which suggests that this "Clear Find Duplicates" is indeed generic and so must be clearing all other "Find ... Duplicates" since it does not specify which "Find ... Duplicates" is being cleared?
Granted, if you ticked, Find Path Dup, Find Name Dup would automatically be un-ticked, but it might be confusing why a Find Name Dup item was showing up when you were right-clicking on Path.
You mean like this, where I have right-clicked "Find ... Duplicates" on-who-knows-how-many column headings (well, How many?!!) before trying "Find ... Duplicates" on the Size column.
FindDuplicates_05.png
For those of you who like learning-by-trial-and-error, answer this question before trying it out on your own system:-
Suppose that your system holds a great many MSWord DOC files. (You can use XLS or MP3 or TXT files if appropriate.)
(1) Use Everything to list all your files
Ext:doc

(2) Note the number of items (16,254)
(3) Apply “Find ... Duplicates” to the Name column
(4) Note the number of items (4,584)
(5) Apply “Find ... Duplicates” to the Path column
(6) Note the number of items (3,734)
(7) Apply “Find ... Duplicates” to the DateModified column
(8) Note the number of items (1,871)
(9) Apply “Find ... Duplicates” to the Size column
(10) Note the number of items (1,794)
Exercise (1) Explain why the number of reported items is steadily decreasing as we proceed from step 1 through step 10.
(11) Exit Everything
(12) Use Everything to list all your files
Ext:doc

Exercise (2) PREDICT the number of reported items BEFORE finding duplicates by Size, Datemodified, Path and finally Name. (Steps 10 through 2 in that sequence)

Cheers, Chris
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Find <property> duplicates will remove items that are not duplicated from your current results.

You can apply Find <property> duplicates to different properties.



Clear find duplicates will perform a new search to restore all your results.
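A small Python sketch of this behaviour (a model built on made-up sample data, not Everything's code) shows why the item count can only shrink with each step, and why the order of the steps can change the final count. The helper name find_duplicates is made up for this example.

```python
from collections import Counter


def find_duplicates(results, prop):
    """Keep only the results whose value for prop occurs more than once."""
    counts = Counter(item[prop] for item in results)
    return [item for item in results if counts[item[prop]] > 1]


results = [
    {"name": "a.doc", "size": 10},
    {"name": "a.doc", "size": 10},
    {"name": "b.doc", "size": 10},
    {"name": "b.doc", "size": 20},
    {"name": "c.doc", "size": 20},
]

# Each step filters the CURRENT results, so later steps see fewer items.
by_name = find_duplicates(results, "name")              # drops c.doc
by_name_then_size = find_duplicates(by_name, "size")    # drops the lone size 20

by_size = find_duplicates(results, "size")              # every size is duplicated
by_size_then_name = find_duplicates(by_size, "name")    # drops c.doc

print(len(results), len(by_name), len(by_name_then_size))  # 5 4 3
print(len(results), len(by_size), len(by_size_then_name))  # 5 5 4
```

In this model, clearing corresponds to going back to the full results list rather than undoing a single step.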
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 12:12 am Everything 1.5 adds support for finding unique or duplicates items in your results.
Group colors shows results in different colors.
Items sharing the same property values are shown in the same color.
...
Colors are generated by a hash of the property values.
Please use colors as a guide only.
Different colors guarantees the property changes.
However, same colors doesn't always mean the same property values.
Different property values can generate the same colors.

Hello Void.
I am confused by the four underlined statements in the extract above.
It seems to me that Everything will TRY to show a different colour for items that differ in the hash of the property value, but there is no guarantee that different property values will generate distinct hash values, so the distinct items may be assigned the same colour.

(Also it doesn't help that I can't differentiate several shades of pastel green, several shades of pastel blue etc.)

Your disclaimer "Please use colors as a guide only." suggests to a beginner that the mechanism isn't working as intended.

Cheers, Chris
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

Also it doesn't help that I can't differentiate several shades of pastel green, several shades of pastel blue etc.
That's why viewtopic.php?p=57283#p57283, dupe_lines, is so important, IMHO.
However, same colors doesn't always mean the same property values.
Because of (color) collisions.
You could have two "sets" of "blue", together, yet the Properties of 1 set is different from the Properties of the other.
Even if that should happen, dupe_lines demarcates the two sets, so you then have |blue|dupe_lines|blue|.
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

therube wrote: Fri Jun 02, 2023 2:59 pm That's why viewtopic.php?p=57283#p57283, dupe_lines, is so important, IMHO.
Agreed! In fact, after lunch drained some of the knowledge out of my head I had enough free space to go back and continue my reading. Came across Lines, gave them a quick whirl and now, still very much a junior at dupes, I feel that Lines is much better than colouring.
You could have two "sets" of "blue", together, yet the Properties of 1 set is different from the Properties of the other.
Even if that should happen, dupe_lines demarcates the two sets, so you then have |blue|dupe_lines|blue|.
Disagree. I think of a "set" of results as being files that satisfy a defined sequence (e.g. name, size, date match)

Thinking [still] only of colours, we are talking here of a computer program whose job it is to walk through a Result List and, at each break in sequence, assign a different colour (to indicate the start of a new group).
Untitled.png
The programmer dictates the program's action, so using the list above, the PROGRAM says "The most recent colour I assigned was RED and the next colour in sequence is ORANGE, so I shall switch to orange now ..."

Each break in sequence is dictated by the user, who says something like "I consider two (or more) files to be duplicates if the name, size, and date match" (in increasingly excruciating detail). The user having defined "break", the programmer writes code to detect "break", and when a break in sequence is detected, the programmer writes code to advance to the next colour.

Cheers, Chris
therube
Posts: 5060
Joined: Thu Sep 03, 2009 6:48 pm

Re: Find Duplicates

Post by therube »

FastStone Capture uses tick-marks for some of its menu items.
(Once, if, I figure out how to capture a capture of it capturing said tick-mark, I'll post it.)
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

I probably missed something important, but why not just use 2 alternating colours, like the Alternate row color setting for regular results?

EDIT:
Never mind; there is a whole thread dedicated to that ...
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

I did experiment with alternating colors.

The issue is picking the color for the first result.
Making the first color always off makes scrolling feel janky as the whole window needs to be repainted and items will not stay the same color.

Assigning a color to each item is not practical.
I am trying to avoid doing this as I want to keep finding duplicates instant.

Maybe the improved visuals will be worth the slowdown..
I will keep experimenting..
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sun Jun 11, 2023 11:46 pm ... makes scrolling feel janky as the whole window needs to be repainted and items will not stay the same color. ... I am trying to avoid doing this as I want to keep finding duplicates instant.
Now this starts to make sense to me.
If the Result List is being built as-you-go, then by definition the first item loaded into the Result list MUST be unique; that is, not {yet} a member of any group of duplicates.
But then, I imagine that there is no need to load any item into the Result List until it has already been shown to be a duplicate.
That is, the first item loaded into the Result List must turn out to be a member of a group of duplicates.

If a set of Found Items are identified by number, and the first twenty found items are all unique, except for 3,5,7 and 9 - which will turn out to be members of a group of four duplicates - then as soon as 5 is found, we need to go back and load 3 & 5 into the Result List and colour them {say} Red. So far so good. And as 7, and then 9 are found we say "Ahhah! Two more members of the Red Group" and load 7 and then 9 into the group coloured Red.

But this supposes that in doing all of this that we are working through a Sorted list of results.

And if the list is in sequence, then from where does this {stuttering?} Jankiness arise?

I am back to being confused.

{{{ after a few more hours of thought }}} I suspect that my confusion lies in the foundation for my process.
If the Coloring phase starts with a sorted list of items, then I see no problem in grouping. (I think that all these arguments apply equally to lines that separate groups of similar items).
But if the coloring phase starts with an unsorted list of items, then the Result List will be a continual work-in-progress until the last of the unsorted items has been processed.
In the latter case, were our eye/brain fast enough to recognize the individual creation of a Result List, we might see this effect; perhaps a collection of millions of files and a search for a rare Content: would do the trick?

If the Jinking/Stuttering is caused by using as input an unsorted list of items - that is, by using items the instant that they are found - then a solution for the colouring/lines Jinking/Stuttering problem might be a Boolean switch that allows the user to inhibit colouring/lines until the result list is assembled in full.
How does Everything deal with a user like me, who assembles a Result List showing Duplicates, sees 500 items and then decides to ask for colouring/lines? That is, a means to delay colouring is a means to inhibit Jinking/Stuttering.

Cheers, Chris
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

I think I understand (a bit) ..

When dupes are coloured (two colours) and you start scrolling, you would have to keep track of all the dupes currently shown *and* their current colour.
Example:
List shows
A
A
A
B
B
C
C
D
D
D
D
D
E
E

First entry is black, second is white. So A,C and E's are black; B and D's are white

Now Scroll 4 lines down. B is the first entry shown. Should B be in black or in white?
When not keeping track, all B,C,D and E colours will flip, causing stroboscopic effects

Now scroll 4 lines up ...
Or drag the scrollbar 500.000 records down. And up again... Should A still be in black?


Now I understand the calculated hash to decide the colour: that won't change wherever you scroll.
(Very clever!!)
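The scrolling argument can be made concrete with a short Python sketch. It is only a model of the two ideas: alternating_colors and hashed_colors are made-up names, and the byte-sum parity used here is a stand-in for whatever hash Everything really uses.

```python
from itertools import groupby

ROWS = list("AAABBCCDDDDDEE")


def alternating_colors(rows, first_visible):
    """Colour visible groups black/white in turn, starting from the first visible row."""
    visible = rows[first_visible:]
    colors = {}
    for index, (value, _) in enumerate(groupby(visible)):
        colors[value] = "black" if index % 2 == 0 else "white"
    return colors


def hashed_colors(rows, first_visible):
    """Colour each visible value from its own text only, ignoring scroll position."""
    return {value: "black" if sum(value.encode()) % 2 == 0 else "white"
            for value in rows[first_visible:]}


# Alternating: B is white at the top, but black once B is the first visible group.
print(alternating_colors(ROWS, 0)["B"], alternating_colors(ROWS, 4)["B"])  # white black
# Hashed: B keeps its colour wherever the view starts.
print(hashed_colors(ROWS, 0)["B"], hashed_colors(ROWS, 4)["B"])            # black black
```

Keeping alternating colours stable would mean remembering the colour of every group above the viewport, which is the bookkeeping the hash avoids.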


FWIW: For me personally the dupe_lines setting solves this more than adequately.
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

NotNull wrote: Mon Jun 12, 2023 9:15 pm I think I understand (a bit) ... FWIW: For me personally the dupe_lines setting solves this more than adequate.
Hi NotNull. My understanding of your comments is that the sorting, finding of duplicates and so on is executing perfectly.
But the decision to implement colour-grouping might be seen as an ineffective method of distinction of groups.

The hash-solution you mention might be brilliant, but if it solves an unnecessary problem, well then ...

Cheers, Chris
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

ChrisGreaves wrote: Mon Jun 12, 2023 9:27 pm The hash-solution you mention might be brilliant, but if it solves an unnecessary problem, well then ...
Agreed. For me the separation line between dupe A and dupe B works better than colouring. Even when using just 2 colours.



(But I really like these clever, elegant solutions. Makes me smile)
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Everything 1.5.0.1353a adds distinct-sort:, dupe-sort: and unique-sort:.

Everything 1.5.0.1353a will no longer use the sort order specified with sort: to presort as this was causing too much confusion.
(Users expected the results to be sorted as specified)

You will now need to specify distinct-sort:, dupe-sort: or unique-sort: to override the presorting when finding duplicates.



For example, to find files with a distinct name and the largest file size, search for:
distinct-sort:name;size distinct:name

The distinct-sort:name;size search function will presort your results by name then size before Everything finds distinct names.
This will result in files with the largest sizes being returned.
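As a rough Python model of that search (a sketch with made-up sample data, not Everything's code): presort by name and then by size, then keep the first item of each name. The sketch assumes the size sort puts the largest file first, which is what the description above implies.

```python
def distinct_by_name_largest(files):
    """Presort by name, then size (largest first), then keep the first item per name."""
    presorted = sorted(files, key=lambda f: (f["name"], -f["size"]))
    kept, previous = [], object()  # sentinel that never equals a real name
    for item in presorted:
        if item["name"] != previous:  # value changed, so this is the first of its run
            kept.append(item)
        previous = item["name"]
    return kept


files = [
    {"name": "report.doc", "size": 10},
    {"name": "notes.txt", "size": 5},
    {"name": "report.doc", "size": 50},
]

for item in distinct_by_name_largest(files):
    print(item["name"], item["size"])
# notes.txt 5
# report.doc 50
```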



Everything 1.5.0.1353a also removed the ugly dupe line at the bottom.
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

So distinct-sort: works different from sort: ?

sort: Sort order is primary sort, secondary sort
distinct-sort: Sort order is secondary sort, primary sort

Correct or wrong?


Would it be an option to make the distinct: (and family) do an implicit temporary sort when needed?
So distinct:name does an automatic temporary sort by name to find the distinct items. And sort: just affects the output of these search results without changing the results itself. When no sort: is specified, results are kept sorted in the distinct: order.
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

So distinct-sort: works different from sort: ?
Both functions sort the same way, except:
distinct-sort: will sort BEFORE the distinct: call and sort: will now sort AFTER the distinct: call.

Using distinct-sort: will also override the presort done by distinct: giving the user full control on how distinct: finds duplicated items.
Without using distinct-sort: the properties specified with distinct: are used to perform the presort.


sort: Sort order is primary sort, secondary sort
distinct-sort: Sort order is secondary sort, primary sort

Correct or wrong?
sort:primary-sort;secondary-sort;...
distinct-sort:primary-sort;secondary-sort;...


Would it be to an option to make the distinct: (and family) do an implicit temporary sort when needed?
So distinct:name does an automatic temporary sort by name to find the distinct items. And sort: just affects the output of these search results without changing the results itself. When no sort: is specified, results are kept sorted in the distinct: order.
This is how it should work at the moment.

The current sort will be lost.
Results will be sorted by the properties specified in distinct-sort: or distinct:

For example, if I sort by date modified descending, use distinct:name, results will be sorted by name.
Are you asking for results to remain sorted by date modified descending?
(I did avoid doing this as it's extra work for Everything / slower results and the sort now being by name is quite useful)

You can now use sort: to specify the sort when using distinct:
For example, the following will perform a distinct search on name and sort by date modified descending:
distinct:name sort:dm
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

This is the first of five? replies to a single posting by Void.
void wrote: Sat Feb 04, 2023 12:12 am Everything 1.5 adds support for finding unique or duplicates items in your results.
Everything can find files/folders with duplicated/unique names, sizes, dates and other properties. ...
Group colors
To show files/folders that share the same properties in different colors:
Include the following in your search:
groupcolors:
I had reason to dive into duplicates again, and used group by colours. I got lost after using "groupcolors:", so dived back into research.
Starting at this page and this section, I expected to be able to turn OFF "groupcolors:" with some sort of NOT switch, thus "!groupcolors:" or perhaps "~groupcolors:". I know that I turned them off a year ago, but here arises a suggestion:-

Where a setting can be made by the use of a string that turns ON that setting, as in "groupcolors:", might there be a universal action, such as ~ or !, or even a general suffix so: "groupcolors:default" that would restore the setting to the installation default?
This convention might apply to ALL sorts of switches, so that a beginner who is exploring Everything can at least crawl out of the pond and start afresh?
I add that this morning I have found the forum posts to turn OFF "groupcolors:", but it took me a lot of searching.
(Perhaps because I am 77 and still on my first coffee of the day)
Thanks, Chris
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 12:12 am Everything 1.5 adds support for finding unique or duplicates items in your results.
Everything can find files/folders with duplicated/unique names, sizes, dates and other properties.
... Double click the DUPE text in the status bar or change the search to clear the find-duplicates view. ...
Indeed, Duplicates are cleared and the search box returns to my original "147 objects found", but why doesn't removing "dupes" also clear away the colour bands that I introduced when using "groupcolors:"?
I have probably misunderstood something here, but I have been thinking of "groupcolors:" as being applied to, or useful with, Duplicates, so that clearing the duplicates effect ought to clear the associated "groupcolors:".
Thanks, Chris
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 12:12 am Everything 1.5 adds support for finding unique or duplicates items in your results.
...
To always show files/folders that share the same properties in different colors when finding duplicates/distinct/unique items:
  • Copy and paste the following into your Everything search box:
    /dupe_group_colors=1
  • Press ENTER in your Everything search box.
  • If successful, dupe_group_colors=1 is shown in the status bar for a few seconds.
I reasoned that "/dupe_group_colors=0" should disable the setting, and following the links to dupe_group_colors and group_colors confirmed my supposition.

So I pasted "/dupe_group_colors=0" in the search box, tapped <Enter>, saw the brief note in the status-bar, and reran my search:-
20240225_05.png
Now I am back to my original 147 DOCument objects in the folder tree.
Nothing stops me!
I reset again with "/dupe_group_colors=0", observed the status-bar, exited Everything, reran the search and got the same results as shown above.
Again, I suspect that I am doing something wrong, still, with these colours, but what?
Thanks, Chris
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 12:12 am Everything 1.5 adds support for finding unique or duplicates items in your results.
...
To customize the line color:
  • In Everything 1.5, from the Tools menu, click Options.
  • Click the Advanced tab on the left.
  • To the right of Show settings containing, search for:
    dupe_line
  • Select dupe_line_color (or dupe_line_dark_color for dark mode).
  • Click the color button.
  • Select a new color and click OK.
  • Click OK.
I am going to get a reputation, I know, but someone has to have it :D
20240225_06.png
Has the advanced tab disappeared sometime in the past five months?
Thanks, Chris
horst.epp
Posts: 1458
Joined: Fri Apr 04, 2014 3:24 pm

Re: Find Duplicates

Post by horst.epp »

Not for me, in the current version 1.5.0.1367a.
Time to update your outdated version :)
Screenshot - 25.02.2024 , 16_12_21.png
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

ChrisGreaves wrote: Sun Feb 25, 2024 3:07 pm Has the advanced tab disappeared sometime in the past five months?
(If I had to guess: Higher powers have concluded that some people can not be given the nuclear codes of Everything :D ;) )

Joking aside: the Advanced Options section was introduced in 1344a.
ChrisGreaves
Posts: 697
Joined: Wed Jan 05, 2022 9:29 pm

Re: Find Duplicates

Post by ChrisGreaves »

NotNull wrote: Sun Feb 25, 2024 6:22 pm ...the Advanced Options section was introduced in 1344a.
Thank you NotNull; I shall upgrade today and then rerun my searches.
Cheers, Chris
kazzybash
Posts: 110
Joined: Mon Mar 02, 2020 9:55 pm

Re: Find Duplicates

Post by kazzybash »

Hi all, just to check that I am not doing something wrong: with no filters applied, and being 100% sure that there are name and size duplicates on my drives, they should come up in Everything when querying with 'dupe:', right? I don't understand why I get no results (right-clicking the Name column does give me duplicates). I am using Everything Alpha 1367. Kind regards, Kazzy
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Please try the following search to find name duplicates:

dupe:



Please try the following search to find size AND name duplicates:

dupe:size;name
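
As a rough illustration of what matching on several properties at once means, here is a minimal Python sketch. It is not how Everything implements dupe:, and the names find_dupes and files are hypothetical. An item only counts as a duplicate when every listed property matches another item.

```python
from collections import defaultdict

def find_dupes(items, properties):
    """Keep only items that share ALL listed property values with another item."""
    groups = defaultdict(list)
    for item in items:
        # The group key is the tuple of all requested properties, e.g. (size, name).
        key = tuple(item[prop] for prop in properties)
        groups[key].append(item)
    # A group with a single member has no duplicate, so it is dropped.
    return [item for group in groups.values() if len(group) > 1 for item in group]

# Hypothetical sample data standing in for a result list.
files = [
    {"name": "a.txt", "size": 10},
    {"name": "a.txt", "size": 10},
    {"name": "a.txt", "size": 20},
    {"name": "b.txt", "size": 10},
]

size_and_name_dupes = find_dupes(files, ["size", "name"])
```

With this sample data, only the first two entries share both size and name, while grouping on size alone would also pull in b.txt.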



There might be an existing find duplicates search.
Please try clearing your existing find duplicates.
To clear your find duplicates:
Double click the DUPE text in the status bar.



To find name duplicates from the column header:
Right click the Name column header and click Find Name Duplicates.



(Does your column header context menu look different?)
kazzybash
Posts: 110
Joined: Mon Mar 02, 2020 9:55 pm

Re: Find Duplicates

Post by kazzybash »

hello void, my header looks the same as yours in the image. Thanks for your reply! Just as in this thread the problem seems to solve itself... nice, but strange! I don't know how Everything works, but maybe the program needs a full computer restart to function properly sometimes.

Anyway, I do have two further questions :mrgreen: .

1)

I was wondering, maybe this has been addressed, if it would be possible/how it is possible to select duplicates (in order to further process them - deletion, renaming, etc) in the results. I am doing this by hand now, but are there ways to achieve this by code already?

2)

When I search for duplicate folders dupe:size I get results of folder-parent sized X bytes and its nested folder sized identically. However I do not consider those two identical (since when I delete the nested folder that contains the true content, the size of the parent folder is dramatically reduced to 0 bytes :P ). I would need to add the hierarchical level (say: folder: depth:2..2 dupe:size) to the query, am I correct?


My go-to program in this department is the excellent (not unlike Everything) AllDup. Even though AllDup is very good, I have already found a use for the dupe function in Everything.


Kind regards, kazzy
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

kazzybash wrote: I was wondering, maybe this has been addressed, if it would be possible/how it is possible to select duplicates (in order to further process them - deletion, renaming, etc) in the results. I am doing this by hand now, but are there ways to achieve this by code already?
I recommend checking each file before deleting.

Deleting/renaming must be done manually.



The following will list files in Folder A, where they exist with the same name and size in folder B:

"c:\folder\A\" $size:==getproperty("C:\folder\B\"$name:,"size")

Use this as a guide only, as the file content may differ.



The following will list files in Folder A, where they exist with the same name, size and sha256 in folder B:

"c:\folder\A\" $size:==getproperty("C:\folder\B\"..$name:,"size") $sha256:==getproperty("C:\folder\B\"..$name:,"sha256")

Please note: this search will be really slow.
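
For anyone who wants to double check the result outside Everything, the same comparison can be sketched in Python. This is only an illustration under assumptions: the function names sha256_of and files_also_in are hypothetical, and only files directly inside folder A are examined. The hash is computed only when the sizes already match, since hashing is the slow part.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_also_in(folder_a, folder_b):
    """List files in folder_a that have a same-named, same-size, same-sha256 file in folder_b."""
    matches = []
    for file_a in sorted(Path(folder_a).iterdir()):
        file_b = Path(folder_b) / file_a.name
        if not (file_a.is_file() and file_b.is_file()):
            continue  # no file with the same name in folder B
        if file_a.stat().st_size != file_b.stat().st_size:
            continue  # sizes differ, so the content cannot be identical
        if sha256_of(file_a) == sha256_of(file_b):
            matches.append(file_a)
    return matches
```

Calling files_also_in(r"C:\folder\A", r"C:\folder\B") would return the matching paths from folder A. It only lists them and changes nothing on disk.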


kazzybash wrote: When I search for duplicate folders dupe:size I get results of folder-parent sized X bytes and its nested folder sized identically. However I do not consider those two identical (since when I delete the nested folder that contains the true content, the size of the parent folder is dramatically reduced to 0 bytes :P ). I would need to add the hierarchical level (say: folder: depth:2..2 dupe:size) to the query, am I correct?
Include your folders of interest in your search, for example:

parent:"c:\folder\A" | parent:"c:\folder\B" dupe:size
kosp
Posts: 9
Joined: Sat Mar 23, 2024 9:49 pm

Re: Find Duplicates

Post by kosp »

Hello,
Can you find duplicates with multiple properties?
For example, find duplicates by size and also by dimension,
so it will only show you the files that have the same size and the same dimension.

Is that possible, and how?

Also, can you automatically select the first choice of all results or the second choice of all results and delete them?
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: Find Duplicates

Post by NotNull »

kosp wrote: Mon Mar 25, 2024 2:33 pm can you find duplicates with multiple properties
Yes, as described in the opening post:
void wrote: Sat Feb 04, 2023 12:12 am To find files/folders with multiple duplicated properties:

Include the following in your search:
dupe:<property-list>
where <property-list> is a semicolon (;) delimited list of property names.

kosp wrote: Mon Mar 25, 2024 2:33 pm also can you automatically select the first choice of all results or the second choice of all results and delete them?
Everything will not delete files automatically.
raccoon
Posts: 1017
Joined: Thu Oct 18, 2018 1:24 am

Re: Find Duplicates

Post by raccoon »

kosp wrote: Mon Mar 25, 2024 2:33 pm also can you automatically select the first choice of all results or the second choice of all results and delete them?
Click a column header, sort the results according to your desires (e.g., Path), and then select the files and press the Delete key, and then the Enter key to confirm. You can also choose the Dupe function to exclude first match, allowing you to Select All (Ctrl+A) and Delete.
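
To make the "exclude first match" idea concrete, here is a small Python sketch. It is an illustration only, not Everything's code, and the names extra_copies and results are hypothetical. It keeps the first item of each group and returns the rest, so which copy survives depends entirely on the sort order going in.

```python
def extra_copies(items, key):
    """Return every item except the first one seen for each key value."""
    seen = set()
    extras = []
    for item in items:
        value = key(item)
        if value in seen:
            extras.append(item)  # a later copy: candidate for review/deletion
        else:
            seen.add(value)      # first match in its group: kept
    return extras

# Hypothetical (path, size) results, already sorted by path.
results = [
    ("C:\\a\\1.jpg", 100),
    ("C:\\a\\2.jpg", 200),
    ("C:\\b\\1.jpg", 100),
    ("C:\\c\\1.jpg", 100),
]

candidates = extra_copies(results, key=lambda row: row[1])
```

Here the two later 100-byte files end up in candidates and the copy under C:\a is kept. The sketch only builds a list, so checking each file before deleting still applies.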
void
Developer
Posts: 17158
Joined: Fri Oct 16, 2009 11:31 pm

Re: Find Duplicates

Post by void »

Everything 1.5.0.1384a improves finding duplicates.

Added a dupe-from: search function.

Find duplicates where there is at least one duplicate from the specified folder. (includes subfolders)

For example:
*.jpg dupe-from:"C:\My Photos" dupe:size
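
One way to picture the dupe-from: filter is the Python sketch below. It is a guess at the behaviour from the description above, not the actual implementation, and dupes_from and items are hypothetical names. A duplicate group is kept, in full, only if at least one of its members lives under the given folder.

```python
from collections import defaultdict
from pathlib import PureWindowsPath

def dupes_from(items, folder, key):
    """Keep duplicate groups that have at least one member under folder (subfolders included)."""
    folder = PureWindowsPath(folder)
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    kept = []
    for group in groups.values():
        # PureWindowsPath compares case-insensitively, like Windows paths do.
        in_folder = any(folder in PureWindowsPath(item["path"]).parents for item in group)
        if len(group) > 1 and in_folder:
            kept.extend(group)
    return kept

# Hypothetical sample results.
items = [
    {"path": r"C:\My Photos\2023\a.jpg", "size": 100},
    {"path": r"D:\Backup\a.jpg", "size": 100},
    {"path": r"D:\Backup\b.jpg", "size": 200},
    {"path": r"E:\Other\b.jpg", "size": 200},
    {"path": r"C:\My Photos\c.jpg", "size": 300},
]

photo_dupes = dupes_from(items, r"C:\My Photos", key=lambda item: item["size"])
```

In this sample the two 200-byte files are duplicates of each other but neither is under C:\My Photos, so that group is dropped, and c.jpg is dropped because it has no duplicate.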




Added a dupe-min: search function.

Find duplicates where the number of duplicates is greater than or equal to the specified minimum count.

Examples:
dupe-min:3 dupe:size
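
A minimal Python sketch of the dupe-min: idea follows. It assumes that the count refers to the number of items sharing the same value, including the item itself, which the description above does not spell out. The names dupes_with_min and sizes are hypothetical.

```python
from collections import Counter

def dupes_with_min(items, key, minimum):
    """Keep items whose key value occurs at least `minimum` times in the list."""
    counts = Counter(key(item) for item in items)
    return [item for item in items if counts[key(item)] >= minimum]

# Hypothetical file sizes from a result list.
sizes = [10, 10, 10, 20, 20, 30]

at_least_three = dupes_with_min(sizes, key=lambda size: size, minimum=3)
```

With minimum=3 only the three 10-byte entries remain; the pair of 20-byte entries is a duplicate group but falls below the threshold.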



Added a dupe-max: search function.

Limit the number of duplicates to the specified count.

Examples:
dupe-max:30 dupe:size