Why Don't Other Softwares Use the MFT?

Off-topic posts of interest to the "Everything" community.
therube
Posts: 1595
Joined: Thu Sep 03, 2009 6:48 pm

Postby therube » Fri Mar 04, 2016 4:57 pm

Why Don't Other Softwares (generally) Use the MFT?

I say this as I'm running my duplicate file cleaner, which I know will take a good 15+ minutes to scan a directory tree of 1M files (including 30K directories).

Doing the same, including indexing sizes, in Everything is a matter of seconds.
Once that is done, sizedupe: has already found the same-size candidates, so it's all down to computing & comparing hashes.

(In my case, I'm doing a name+size+content compare.)
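
For illustration, here is a minimal sketch of that name+size+content compare. It is not how any particular duplicate cleaner works; `find_duplicates` and `file_digest` are made-up names, SHA-256 is an arbitrary choice of hash, and names are compared case-insensitively on the assumption of a Windows volume. The first pass is the slow tree walk that an existing index would replace; the second pass hashes only files that already collide on name and size.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root that share name, size, and content."""
    # Pass 1: metadata only. This is the part a ready-made index
    # of names and sizes would make nearly free.
    by_name_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Assumption: case-insensitive names, as on Windows.
            by_name_size[(name.lower(), size)].append(path)

    # Pass 2: hash only candidates that collide on name and size.
    duplicates = []
    for paths in by_name_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        duplicates.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return duplicates
```

Calling `find_duplicates(r"D:\photos")` (a placeholder path) returns a list of groups, each group being a sorted list of paths with identical name, size, and content.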

So why don't they?
If they did, just think how much more efficient they could be.

void
Site Admin
Posts: 3122
Joined: Fri Oct 16, 2009 11:31 pm

Re: Why Don't Other Softwares Use the MFT?

Postby void » Sat Mar 05, 2016 2:00 am

Why Don't Other Softwares (generally) Use the MFT?


A couple of reasons I can think of:
  • Existing file scans (FindFirstFile), while slow, work well enough (support for network drives and non-NTFS volumes, easy to implement).
  • Lack of portability.
  • Basic API: the current FSCTL_ENUM_USN_DATA API only allows indexing file names (no sizes).
  • You must build the full directory tree with FSCTL_ENUM_USN_DATA, which can require large amounts of RAM and can be expensive to sort/look up parent directories.
  • If you want to index sizes you need to implement your own NTFS driver (not an easy task, though there are a few helpful resources).
  • There is no MFT on ReFS volumes (Microsoft's new file system).
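
To make the directory-tree point concrete: each USN record carries a file reference number, a parent file reference number, and a name, but no full path. The sketch below shows one way to resolve paths from such records. It is only an illustration; `Record`, `build_paths`, the `"<orphan>"` marker, and the idea that the caller passes in the root directory's ID are all assumptions for the example, and no actual FSCTL call is made. Note that every record has to be held in memory before any path can be resolved, which is where the RAM cost comes from.

```python
from typing import Dict, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for the interesting fields of a USN record."""
    file_id: int    # FileReferenceNumber
    parent_id: int  # ParentFileReferenceNumber
    name: str       # FileName


def build_paths(records, root_id, root_path="C:"):
    """Resolve every record to a full path by following parent links."""
    # The whole volume's records must be indexed by ID first.
    by_id: Dict[int, Record] = {r.file_id: r for r in records}
    # Cache of already-resolved IDs, seeded with the root directory.
    cache: Dict[int, str] = {root_id: root_path}

    def resolve(file_id):
        # Walk up until a cached ancestor is found...
        chain = []
        current = file_id
        while current not in cache:
            rec = by_id.get(current)
            if rec is None:
                # Parent was never enumerated; mark it and stop.
                cache[current] = "<orphan>"
                break
            chain.append(rec)
            current = rec.parent_id
        # ...then fill in paths on the way back down.
        base = cache[current]
        for rec in reversed(chain):
            base = base + "\\" + rec.name
            cache[rec.file_id] = base
        return cache[file_id]

    return {r.file_id: resolve(r.file_id) for r in records}
```

The cache means each directory is resolved once no matter how many files sit below it, but the two dictionaries still grow with the number of files on the volume.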

Link
Posts: 14
Joined: Thu Nov 03, 2011 10:08 pm

Re: Why Don't Other Softwares Use the MFT?

Postby Link » Wed Mar 09, 2016 3:34 pm

  • Why would you need to write your own NTFS driver? Yes, it may be more compatible, but it's a real PITA to write and debug.
  • There are existing APIs for getting file size, though they are very slow. That can be done lazily in a background thread, updating the virtual listview as the new information arrives. It's only expensive (excessive hard drive usage) during the initial scan.
    Alternatively, you can read the MFT directly, combined with the data given via the NTFS journal, to get file size. That's super fast. The FAT table can also be read directly.
  • Who uses ReFS volumes? Isn't it still experimental?
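
A rough sketch of the lazy background approach from the second bullet, under stated assumptions: `fetch_sizes_in_background` and the `on_size` callback are made-up names, and a real GUI would have to hand each result over to its UI thread before touching the listview.

```python
import os
import threading


def fetch_sizes_in_background(paths, on_size):
    """Start a daemon thread that stats each path and reports its size.

    on_size(path, size) is a hypothetical callback; size is None when
    the file cannot be read. The callback runs on the worker thread.
    """
    def worker():
        for path in paths:
            try:
                size = os.path.getsize(path)
            except OSError:
                size = None
            on_size(path, size)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

For example, `sizes = {}` followed by `fetch_sizes_in_background(paths, sizes.__setitem__)` fills the dictionary as results come in, while the name list is already on screen.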

