Regexp matching of foreign characters

Jerry
Posts: 51
Joined: Wed May 05, 2010 8:32 pm

Regexp matching of foreign characters

Post by Jerry » Wed May 05, 2010 8:41 pm

Hi all,
I think I mentioned this once privately to David, but what is the status of getting the regexp facility to match foreign characters, for example umlauts, diacritics, etc.? I guess it might be a general issue of matching Unicode, and I realize it may depend on a third-party regexp library. But I would really like to see this capability in what is otherwise a fantastic program.

As an example, say I have music files by Handel with some spelled Handel and others as Händel. I want to be able to just enter the regexp H.ndel and catch them all. Similarly, Jan.cek to match Janacek or Janácek.
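To make the request concrete, here is a minimal Python sketch of the behavior being asked for. It is not how "Everything" works; it only shows a Unicode-aware regex engine treating ä as a single character. The helper name `find_matches` and the file names are made up for illustration, and the NFC normalization step is my own assumption to cover names stored as "a" plus a combining mark.

```python
import re
import unicodedata

def find_matches(pattern, names):
    """Return the names that contain a match for pattern, after NFC normalization."""
    regex = re.compile(pattern)
    # NFC folds "a" + combining diaeresis into the single code point "ä",
    # so that "." can match it as one character.
    return [n for n in names if regex.search(unicodedata.normalize("NFC", n))]

# Hypothetical file names, for illustration only.
names = ["Handel - Messiah.mp3", "Händel - Water Music.mp3", "Haydn - Creation.mp3"]
print(find_matches(r"H.ndel", names))
```

With a character-oriented engine like this, H.ndel picks up both spellings and skips the Haydn file.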

Jerry

therube
Posts: 2579
Joined: Thu Sep 03, 2009 6:48 pm

Re: Regexp matching of foreign characters

Post by therube » Thu May 06, 2010 2:08 am

Just to point out that the wildcard (i.e. not regex) '?' character also will not work.

So H?ndel will find Handel, but not Händel.

void
Site Admin
Posts: 5479
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regexp matching of foreign characters

Post by void » Fri May 07, 2010 10:38 am

The regex parser only supports ASCII characters.
Adding support for Unicode regex is on my "Things to do" list.
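As a rough illustration of why a matcher that only understands single-byte characters can miss Händel, here is a small Python sketch. This is an assumption about the general failure mode, not a description of the actual parser in "Everything": it compares character-oriented matching with byte-oriented matching over UTF-8.

```python
import re

name = "Händel"

# Character-oriented matching: "ä" is one code point, so "." matches it.
unicode_hit = re.search(r"H.ndel", name) is not None

# Byte-oriented matching on UTF-8: "ä" is two bytes (0xC3 0xA4),
# and "." consumes only one of them, so the pattern fails.
byte_hit = re.search(rb"H.ndel", name.encode("utf-8")) is not None

print(unicode_hit, byte_hit)
```

The first search succeeds and the second does not, which is the same symptom as H.ndel finding Handel but not Händel.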

therube wrote:
Just to point out that the wildcard (i.e. not regex) '?' character also will not work.

So H?ndel will find Handel, but not Händel.

By default, "Everything" will only search for ASCII characters.
You will need to disable fast ASCII search for H?ndel to work correctly.

To disable fast ASCII search with "Everything" 1.2.1.371:
  • Exit "Everything".
  • Add the following line to the end of your Everything.ini file:

    disable_fast_ascii_search=1
  • Restart "Everything".
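If you would rather script the ini edit than do it by hand, something along these lines might work. It is only a sketch: the helper name `ensure_setting` is made up, and it assumes the ini file can be read and written as UTF-8 text, which I have not verified for Everything.ini. Exit "Everything" first, as in the steps above.

```python
from pathlib import Path

def ensure_setting(ini_path, key, value):
    """Append key=value to the file unless a line already sets that key."""
    path = Path(ini_path)
    # Assumption: the file is UTF-8 text; adjust the encoding if yours differs.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    lines = text.splitlines()
    if any(line.strip().startswith(key + "=") for line in lines):
        return False  # already present; leave the file untouched
    lines.append(f"{key}={value}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```

Calling ensure_setting with the path to your Everything.ini, "disable_fast_ascii_search" and 1 appends the line once. If the key is already there with a different value, the function leaves it alone, so check the file by hand in that case.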
To disable fast ASCII search with "Everything" 1.2.1.351a or later:
  • In "Everything", from the Tools menu, click Options.
  • Click the View tab.
  • In the Advanced settings, from the General folder, uncheck fast ASCII search.
  • Click OK.
I am considering disabling fast ASCII search by default for the next release of "Everything".
