Opened 8 years ago

Closed 19 months ago

Last modified 16 months ago

#3432 closed defect (fixed)

wxDir::FindFirst/Next() unexpectedly match pattern against short file names too

Reported by: mentat_kk Owned by:
Priority: normal Milestone:
Component: wxMSW Version: stable-latest
Keywords: wxDir base simple Cc: mentat_kk, biol75
Blocked By: Blocking:
Patch: yes

Description

filespec parameter does not seem to work properly (as I
understand).

wxString spec("*.bin");
wxDir::GetAllFiles(path, &fileList, spec, wxDIR_FILES);

also adds files with extensions bin2, bin234 etc...

I checked the tracker but couldn't see any reports,
sorry if I missed anything..

I use wx 2.6.2 (I know I should update) on windows.

Attachments (2)

fix3432.diff download (2.4 KB) - added by catalin 19 months ago.
3432_testcase.diff download (1.8 KB) - added by catalin 19 months ago.

Download all attachments as: .zip

Change History (24)

comment:1 Changed 8 years ago by biol75

Is this on MSW ? If so it is a well known MS windows
"feature"; they never expected files with more than 3
letters in the extensions from old DOS days

:-(
chris

comment:2 Changed 8 years ago by mentat_kk

Yes it is on MS Windows (MSW), I see the problem..

but I guess there is a workaround, you can check the
extension with whatever win32 method returned and force to
exactly match with the spec and add it to the returned list.
Should be easy.. am I right?

comment:3 Changed 8 years ago by biol75

Is this on MSW ? If so it is a well known MS windows
"feature"; they never expected files with more than 3
letters in the extensions from old DOS days

:-(
chris

comment:4 Changed 8 years ago by mentat_kk

can you please re-read my previous response, looks like you
copy/pasted yours, I won't..

comment:5 Changed 8 years ago by vadz

I think this is a bug in Borland CRT, do you use Borland by
chance?

comment:6 Changed 8 years ago by mentat_kk

nope, I use VC++ 2003, sorry I forgot to tell..

comment:7 Changed 6 years ago by wojdyr

  • Component set to wxMSW
  • Keywords wxDir base added

comment:8 Changed 21 months ago by oneeyeman

Vadim,
This is a legitimate bug which I just confirmed by doing this:

C:\wxWidgets\samples\widgets\icons>dir *.xpm

Volume in drive C has no label.
Volume Serial Number is E48C-AC10

Directory of C:\wxWidgets\samples\widgets\icons

01/18/2013 05:50 PM 678 bmpbtn.xpm
01/18/2013 05:50 PM 1,556 bmpcombobox.xpm
01/18/2013 05:50 PM 1,551 button.xpm
01/18/2013 05:50 PM 1,553 checkbox.xpm
01/18/2013 05:50 PM 561 choice.xpm
01/18/2013 05:50 PM 1,553 choicebk.xpm
01/18/2013 05:50 PM 4,946 clrpicker.xpm
01/18/2013 05:50 PM 1,553 combobox.xpm
01/18/2013 05:50 PM 5,091 datepick.xpm
01/18/2013 05:50 PM 1,552 dirctrl.xpm
01/18/2013 05:50 PM 5,303 dirpicker.xpm
01/18/2013 05:50 PM 5,321 filepicker.xpm
01/18/2013 05:50 PM 4,824 fontpicker.xpm
01/18/2013 05:50 PM 1,550 gauge.xpm
01/18/2013 05:50 PM 1,553 hyperlnk.xpm
01/18/2013 05:50 PM 1,553 listbook.xpm
01/18/2013 05:50 PM 1,552 listbox.xpm
01/18/2013 05:50 PM 1,553 notebook.xpm
01/18/2013 05:50 PM 1,555 odcombobox.xpm
01/18/2013 05:50 PM 1,550 radiobox.xpm
01/18/2013 05:50 PM 1,547 scrolbar.xpm
01/18/2013 05:50 PM 1,551 slider.xpm
01/18/2013 05:50 PM 1,340 spinbtn.xpm
01/18/2013 05:50 PM 1,294 statbmp.xpm
01/18/2013 05:50 PM 1,552 statbox.xpm
01/18/2013 05:50 PM 1,553 stattext.xpm
01/18/2013 05:50 PM 1,549 text.xpm
01/18/2013 05:50 PM 4,997 timepick.xpm
01/18/2013 05:50 PM 1,551 toggle.xpm

29 File(s) 63,842 bytes

0 Dir(s) 11,869,315,072 bytes free

C:\wxWidgets\samples\widgets\icons>dir *.xpm1

Volume in drive C has no label.
Volume Serial Number is E48C-AC10

Directory of C:\wxWidgets\samples\widgets\icons

01/18/2013 05:50 PM 5,091 datepick.xpm1

1 File(s) 5,091 bytes
0 Dir(s) 11,869,315,072 bytes free

This from running "cmd" on Windows XP SP2.
And this means we should eliminate unneeded files with different extension.

Will look into it.

comment:9 Changed 21 months ago by oneeyeman

But maybe all it needs is mentioning this in the documentation as it's just too many possibilities exist.

Something like:

"wxDir::GetAllFiles() will return all matching file names. In case of the matching pattern, like "*.bin" it will return "*.bin", "*.bin2", etc."

It is much easier to deal with situation like this in the user code as the user knows which files (s)he is looking for.

comment:10 Changed 21 months ago by vadz

  • Cc vadz removed
  • Keywords simple added

The simplest fix is certainly to check that the files returned by the (buggy) CRT function do have the correct extension. This shouldn't be difficult to do.

comment:11 Changed 21 months ago by catalin

  • Version set to 2.9-svn

FindFirstFile and FindNextFile search "includes the long and short file names", as MS docs say.
For a file named test.bin2 the short name will be something like test~1.bin so will be found when searching for *.bin.
Maybe this needs only a documentation update to mention that it includes short names, or in case of a code change, that it does *not* include short names.

comment:12 follow-up: Changed 21 months ago by vadz

  • Status changed from new to confirmed
  • Summary changed from wxDir::GetAllFiles filespec problem to wxDir::FindFirst/Next() unexpectedly match pattern against short file names too

Thanks for the explanation, this at least explains why does this happens, so it's good to know.

But I think short names are an anachronism which nobody really uses any more, so I'd still be for checking the returned file name with a simple wxString::Matches() (this should be enough for Windows file patterns, otherwise we also have wxMatchWild() in source:wxWidgets/trunk/include/wx/filefn.h#L619).

IMO nobody expects to get "foo.binary" when searching for "*.bin", so no code should be broken by this change.

comment:13 in reply to: ↑ 12 Changed 21 months ago by catalin

  • Patch set

Replying to vadz:

But I think short names are an anachronism which nobody really uses any more, so I'd still be for checking the returned file name with a simple wxString::Matches() (this should be enough for Windows file patterns, otherwise we also have wxMatchWild() in source:wxWidgets/trunk/include/wx/filefn.h#L619).

Patch attached using wxMatchWild() because its code looks better. Otherwise they both had the same results.

comment:14 Changed 19 months ago by catalin

It would be good if somebody can check the attached patch. I have some time if it needs to be changed.

comment:15 follow-up: Changed 19 months ago by vadz

The code of wxString::Matches() appears to have been at least somewhat optimized, so I think it should be faster (but I didn't measure it). More importantly, looking at wxMatchWild() more carefully, I don't think it does what we need here. First of all, it ignores anything starting with dot by default. This could be disabled by passing it false for dot_special argument but this isn't done by the patch. Second, it handles backslashes in some way whereas I'm pretty sure that they're not special at all for Windows here. Third, I'm not sure about whether "better looking" should apply to the code which uses goto to jump inside an else section of a conditional, this looks pretty gross. So I'd use wxString::Matches() finally...

Other than that, why does the patch do AfterLast('\\')? It's not clear why is this necessary exactly and it seems suspicious that it's called from FindFirst() but not when FindNext() is called from Read().

Also, for consistency, I'd add "spec" parameter before "finddata" in FindNext() to use the same order as in FindFirst().

And, finally, there should be no problem with moving GetNameFromFindData() before FindFirst() to avoid the need to forward declare it.

comment:16 in reply to: ↑ 15 Changed 19 months ago by catalin

Replying to vadz:

I'm not sure about whether "better looking" should apply to the code which uses goto to jump inside an else section of a conditional, this looks pretty gross.

Oh, yes, I didn't notice it. Probably because I had already noticed the goto in wxString::Matches(), though it doesn't look as bad.. That's a surprising frequency of goto use..

So I'd use wxString::Matches() finally...

Ok.

Other than that, why does the patch do AfterLast('\\')?

That must be a leftover. I probably thought that ::FindFirstFile() finds the file including its path but it's not like that, it only gets the file name.

Also, for consistency, I'd add "spec" parameter before "finddata" in FindNext() to use the same order as in FindFirst().

Ok.

And, finally, there should be no problem with moving GetNameFromFindData() before FindFirst() to avoid the need to forward declare it.

There is. In the current code only GetNameFromFindData() calls FindNext(). After the changes both FindFirst() and FindNext() call GetNameFromFindData().

I'll update the patch with the above changes.

Changed 19 months ago by catalin

comment:17 Changed 19 months ago by catalin

Patch updated. I've moved the functions around and it might be more difficult to read.

comment:18 Changed 19 months ago by vadz

Thanks, I'm applying this with some minor changes, please recheck them and let me know if I broke anything:

  1. Added FreeFindData() call to FindFirst() as I think we were leaking the find handle otherwise in case of a search which was successful for Windows but unsuccessful from our point of view.
  2. Corrected check for fd validity in FindFirst(), it must be compared with INVALID_HANDLE_VALUE, not 0 (and we already have a helper function to do it).
  3. Extracted the check into a separate function and check filter.empty() there as I couldn't convince myself that it was always non-empty even inside FindNext().
  4. Got rid of recursion in FindNext(), it just seems unnecessary.
  5. Also got rid of spec argument there (idem).

Considering the complexity of this code I think it would be really great to add a test for this case to DirTestCase::Enum() in tests/file/dir.cpp. If you could please do it, it would be great. TIA!

comment:19 Changed 19 months ago by VZ

  • Resolution set to fixed
  • Status changed from confirmed to closed

(In [73790]) Check that files returned from wxDir::FindXXX() match the filter.

Native Windows functions used by wxDir check the filter against both the short
and the long name resulting in unexpected results, e.g. searching for
"foo.baz" would find "foo.bazaar".

Fix this by explicitly rechecking that we have a valid match ourselves.

Closes #3432.

Changed 19 months ago by catalin

comment:20 Changed 19 months ago by catalin

Replying to vadz:

I'm applying this with some minor changes

They all look fine to me.

Considering the complexity of this code I think it would be really great to add a test for this case to DirTestCase::Enum() in tests/file/dir.cpp.

I've added a patch with a test case but to DirTestCase::Traverse() because from what I could see DirTestCase::Enum() deals only with dirs..

comment:21 Changed 18 months ago by VZ

(In [73836]) Add test for correct short/long file names in wxDir.

Verify that enumerating the files using the pattern *.foo doesn't match
*.foo.bar files, as it used to do under MSW.

See #3432.

comment:22 Changed 16 months ago by VZ

(In [74243]) Restore case-insensitivity for file name matching under Windows.

This was broken by the changes of r73790, see #3432. Fix this by converting
both the file name and the wildcard mask to the upper case before checking
whether the former matches the latter.

Closes #15243.

Note: See TracTickets for help on using tickets.