Opened 7 years ago

Closed 6 years ago

Last modified 6 years ago

#4115 closed defect (fixed)

wxExecute fails to escape spaces in argv on MS Windows

Reported by: bdimm Owned by:
Priority: normal Milestone:
Component: wxMSW Version:
Keywords: Cc: bdimm
Blocked By: Blocking:
Patch: yes

Description

The "argv" version of wxExecute, i.e. the one with signature:
wxExecute(char , int, wxProcess *)
does not quote the strings or escape the spaces in the strings passed in the first argument on Windows, but it does on Linux. This creates problems for writing code that works on both platforms.

For example:
char *argv[] = {"myprog", "without quotes", "\"with quotes\"", NULL};
wxExecute(argv, wxEXEC_ASYNC, this);

On Linux, myprog will be invoked with argv:

0 -> myprog
1 -> without quotes
2 -> "with quotes"

In contrast, Windows will have argv:

0 -> myprog
1 -> without
2 -> quotes
3 -> with quotes

On Windows, the single argument "without quotes" is split into two arguments by the time myprog receives it, because wxWidgets 2.8.4 on Windows doesn't quote the arguments or escape spaces. If you try to remedy this problem by quoting all arguments before calling wxExecute, your code will no longer work on Linux since myprog will receive extraneous quotes.

I would argue that the Linux behavior is the correct one -- argv indicates which items should be considered to be separate arguments, so wxWidgets should handle any platform-dependent manipulations like escaping spaces or adding quotes (and escaping any quotes that those quotes enclose) that are necessary. The Windows version should be changed to behave consistently with the Linux version.

Attachments (2)

wxexecute.diff download (601 bytes) - added by briis 6 years ago.
wxexecute2.diff download (643 bytes) - added by briis 6 years ago.

Download all attachments as: .zip

Change History (24)

comment:1 Changed 6 years ago by briis

  • Patch set
  • Status changed from new to confirmed

This is still the case with wxMSW in 2.9-trunk. Attached patch fixes this.

Changed 6 years ago by briis

comment:2 Changed 6 years ago by neis

See also #4607, especially, what happens to "with\" some \"quotes" before and after your patch? And what is actually the expected behaviour?

comment:3 Changed 6 years ago by briis

"with\" some \"quotes" will be passed on as-is in my patch. #4607 raises a point, though: Are filenames with quotes allowed on FAT and/or NTFS? In any case, I think filenames containing quotes are a bad practice, but if they're legal it should of course work. If they're illegal wxExecute should probably assert.

This is an area where cross-platform is a bit tricky. If we want all the users' filenames to be truly portable we will have to allow only a very limited subset of characters. I do believe it is reasonable, though, to have wxExecute for different platforms behave differently in that regard, as long as the behavior is legal on each system.

Come to think of it, it should probably not even assert, it should report an error (in my code, I'd throw an exception); the filename-with-quotes might come from a user, i.e. outside the developer's control...

Opinions?

comment:4 follow-up: Changed 6 years ago by vadz

Quotes are not allowed in Windows file names, see http://support.microsoft.com/kb/177506 and AFAIK this applies to all file systems.

I agree that wxExecute() should return an error (and log the error message explaining what's wrong exactly) when an invalid file name is used.

comment:5 Changed 6 years ago by bdimm

I'm puzzled by all of this talk about whether or not quotes are allowed in filenames. Bug #4607 is dealing with a situation where a filename is passed to a function. This bug, however, is passing arbitrary data to a program on the command line. There is no reason why the strings in argv have to be filenames (aside from argv[0], of course).

The patch by briis doesn't apply the quotes if the first and last characters of the string are quotes. I suppose this is done for backward compatibility, so as not to break old code that manually inserted quotes on MSW because wxExecute() wasn't behaving correctly. However, that means that MSW won't behave the same as Linux, i.e. we won't have cross-platform compatibility. The Linux rule is very simple -- whatever we pass as argv to wxExecute is exactly what the receiving program should receive in its argv, including quote and spaces. If we are going to have cross-platform compatibility, I don't think we can also have backward compatibility. Of course, true cross-platform compatibility won't be possible if MSW can't pass quotes in command line arguments. I realize you can't have quotes in filenames, but command line arguments aren't filename, so I'm not sure what the situation is there (i.e. I don't know if it is possible to escape a quote character in all versions of MSW -- I have a very vague memory that it may be possible on NT/XP but not on Win98).

comment:6 in reply to: ↑ description ; follow-up: Changed 6 years ago by briis

  • Status changed from confirmed to infoneeded_new

Oh, dear. I got totally distracted by the reference to #4607. The arguments on the command line are, of course, not necessarily filenames.

The logic in the patch is relatively simple: If there is no space in the argument OR the argument is already quoted, don't add quotes, otherwise (spaces and no quotes) add them. Given your original example:

char *argv[] = {"myprog", "without quotes", "\"with quotes\"", NULL};
wxExecute(argv, wxEXEC_ASYNC, this);

On Linux, myprog will be invoked with argv:

0 -> myprog
1 -> without quotes
2 -> "with quotes"

In contrast, Windows will have argv:

0 -> myprog
1 -> without
2 -> quotes
3 -> with quotes

With the patch, argv in myprog will look like:

0 -> myprog
1 -> without quotes
2 -> with quotes

Are you saying that you believe it should be

0 -> myprog
1 -> without quotes
2 -> "with quotes"

instead?

That is doable, at least with cmd.exe ... I'm not sure about COMMAND.COM ...

But you are right that the filename business I and others started mixing up in this is not really a wxExecute problem, but a problem that myprog must handle when executed. argv[1] and argv[2] above may very well not be filenames...

comment:7 Changed 6 years ago by briis

Just ran a test program through COMMAND.COM and it too accepts the escaped version. Of course argv[0] is MYPROG in that case, but otherwise no problems...

comment:8 in reply to: ↑ 6 ; follow-up: Changed 6 years ago by bdimm

  • Status changed from infoneeded_new to new

Replying to briis:

Are you saying that you believe it should be

0 -> myprog
1 -> without quotes
2 -> "with quotes"

instead?

Yes, that's what I'm saying, but others may disagree. My reasons for saying it should be that way are:
1) The MSW and Linux/Unix versions won't be compatible otherwise.
2) The rule for the Linux approach is simple, and therefore easy to document and unlikely to lead to surprises.
3) If the quotes are dropped for argv[2] in my example, how would the user call wxExecute in a situation where the quotes really did need to be included?

comment:9 in reply to: ↑ 4 Changed 6 years ago by neis

Replying to vadz:

Quotes are not allowed in Windows file names, see
http://support.microsoft.com/kb/177506 and AFAIK this applies to all file systems.

The point I was trying to make is that Windows doesn't care where exactly you put the quotes, '"some file"', 'some" "file', and 'some" file"' all refer to the same file, see this windows session (language specific part of "dir" output snipped):

C:\>copy con some" file"
Z

1 Datei(en) kopiert.

C:\>dir some" "file

30.06.2008 12:11 0 some file

C:\>dir "some file"

30.06.2008 12:11 0 some file


Too me, this means, if you start ignoring quotes, you should ignore all of them,
not only those surrounding an argument but even those contained in "the middle".

comment:10 in reply to: ↑ 8 Changed 6 years ago by neis

Replying to bdimm:

Replying to briis:

Are you saying that you believe it should be

0 -> myprog
1 -> without quotes
2 -> "with quotes"

instead?

Yes, that's what I'm saying, but others may disagree. My reasons for saying it should be that way are:
1) The MSW and Linux/Unix versions won't be compatible otherwise.
2) The rule for the Linux approach is simple, and therefore easy to document and unlikely to lead to surprises.
3) If the quotes are dropped for argv[2] in my example, how would the user call wxExecute in a situation where the quotes really did need to be included?

The problem is, if the argument happens to be a filename (which is not that rare) and if it was entered by the user "with quotes" (because he remembered that spaces need quoting), this will result in an error on windows but succeed on Linux/Unix (though with a somewhat strange filename), which I wouldn't call very compatible behaviour either.
But I have to agree with 2) and 3), so I guess that's still the best way to go, even if it doesn't really help for #4607.

comment:11 Changed 6 years ago by briis

I don't think quotes should be ignored. wxExecute is in a different position from the #4607 issue here, because the arguments are likely not filenames, though they may be. wxExecute has no way to tell, so it can't start tampering.

I redid the patch, so any existing quotes are escaped before considering adding quotes for spaces. This results in the original quotes appearing in the executed program's argv, as is the case on linux (discussed above).

Changed 6 years ago by briis

comment:12 follow-up: Changed 6 years ago by vadz

  • Resolution set to fixed
  • Status changed from new to closed

I agree that argv version should preserve the argument contents exactly, if you use user-entered file names to construct the wxExecute() argument you can just use the string version and hope the user didn't forget the quotes (BTW to help with this we could provide some wxQuoteArgument() function).

So I applied wxexecute2.diff as r54441, thanks!

P.S. briis: please let us know your name if you want to have a proper attribution in docs/changes.txt, thanks.

comment:13 in reply to: ↑ 12 ; follow-ups: Changed 6 years ago by briis

Replying to vadz:

P.S. briis: please let us know your name if you want to have a proper attribution in docs/changes.txt, thanks.

Shouldn't you be able to get that from trac somehow? Anyway, my name is Brian Ravnsgaard Riis, which is also entered into my trac account. Is it possible to enable others to see my name in trac?

comment:14 in reply to: ↑ 13 Changed 6 years ago by vadz

Replying to briis:

Replying to vadz:

P.S. briis: please let us know your name if you want to have a proper attribution in docs/changes.txt, thanks.

Shouldn't you be able to get that from trac somehow?

I'd love to but I don't see how.

Anyway, my name is Brian Ravnsgaard Riis, which is also entered into my trac account. Is it possible to enable others to see my name in trac?

No idea but if you find it, please let me know.

Anyhow, I've updated docs/changes.txt, thanks!

comment:15 follow-up: Changed 6 years ago by bdimm

This is probably obvious, but I'll say it just in case: It's going to be very important to warn users to modify their code to be compatible with this patch when it is released. Anyone who recognized this problem and kludged their code to insert quotes on MSW to avoid problems with spaces is going to need to remove the kludge because the patched version is going to escape those added quotes and really mess things up. Don't get me wrong, escaping the quotes (as the patch does) is the right thing to do, but it does create a backwards compatibility problem.

This raises another issue -- how would someone deal with the backwards compatibility problem if they dynamically linked wxWidgets? All of the version tools:
http://docs.wxwidgets.org/stable/wx_versionfunctions.html
seem to be macros, so they reflect the version that is present at compile-time, not the version that is available at run-time, so I don't see how you could write code that adapts its behavior (e.g. whether or not to insert extra quotes) to the version of the DLL found at run-time. Am I missing something?

comment:16 in reply to: ↑ 13 Changed 6 years ago by wojdyr

Replying to briis:

Shouldn't you be able to get that from trac somehow? Anyway, my name is Brian Ravnsgaard Riis, which is also entered into my trac account. Is it possible to enable others to see my name in trac?

No, not without hacking Trac
it's discussed here: http://trac.edgewall.org/ticket/3737

comment:17 in reply to: ↑ 15 ; follow-up: Changed 6 years ago by vadz

Replying to bdimm:

This is probably obvious, but I'll say it just in case: It's going to be very important to warn users to modify their code to be compatible with this patch when it is released. Anyone who recognized this problem and kludged their code to insert quotes on MSW to avoid problems with spaces is going to need to remove the kludge because the patched version is going to escape those added quotes and really mess things up.

I didn't realize this patch could create compatibility problems. What is an example of a (more or less realistic) code which worked before but wouldn't work now?

Don't get me wrong, escaping the quotes (as the patch does) is the right thing to do, but it does create a backwards compatibility problem.

If the potential for breakage is not vanishingly small the patch should be unapplied from 2.8 branch.

This raises another issue -- how would someone deal with the backwards compatibility problem if they dynamically linked wxWidgets? All of the version tools:
http://docs.wxwidgets.org/stable/wx_versionfunctions.html
seem to be macros, so they reflect the version that is present at compile-time, not the version that is available at run-time, so I don't see how you could write code that adapts its behavior (e.g. whether or not to insert extra quotes) to the version of the DLL found at run-time. Am I missing something?

All ABI-compatible wx versions (i.e. the ones you can link to dynamically) are supposed to behave in the same way so there should be no need for run-time version checking. Of course, maybe we should still introduce it, just in case...

comment:18 in reply to: ↑ 17 Changed 6 years ago by bdimm

Replying to vadz:

I didn't realize this patch could create compatibility problems. What is an example of a (more or less realistic) code which worked before but wouldn't work now?

Compatibility problems will occur if the programmer has basically implemented the patch directly in their own code to compensate for wxExecute not working correctly. Addition of the patch will cause the same fix to be applied twice. If wxExecute will be applied with any arguments that are filenames (which is inevitable since argv[0] is a filename), the programmer must have already kludged in the equivalent of the patch or nothing would work if the filename contains spaces, and spaces are very common on Windows thanks to spaces in numerous top-level directories (e.g. "C:\Program Files\").

wxFileDialog fname_dialog(this, wxT("Select a file to operate on"), ...);
if (fname_dialog.ShowModal() == wxID_OK)
   {
   wxString fname = fname_dialog.GetPath();
#ifdef WIN32
   fname = "\"" + fname + "\"";  // kludge to make wxExecute() work for filenames containing spaces on MSW
#endif
   char *argv[3];
   argv[0] = "myprogram";
   argv[1] = fname.c_str();
   argv[2] = NULL;
   wxExecute(argv, wxEXEC_ASYNC, this);
   }

If the patch is NOT applied, and fname is "file with spaces" you get (on Windows):

myprogram "file with spaces"

If the patch IS applied, you get:

myprogram "\"file with spaces\""

which I believe will fail since I don't think filenames are allowed to contain quotes (and even if it didn't fail, it would be wrong).

comment:19 Changed 6 years ago by vadz

Thanks for the explanation, I agree this is a problem but I hope that keeping the argument unchanged if it's already quoted should be enough to preserve the compatibility in the great majority of cases while still making the behaviour more useful than before. This means we still don't handle quoted parameters in the same way under MSW and Unix but IMO it's a reasonable trade-off.

Any comments about r54695 are welcome, thanks!

comment:20 follow-up: Changed 6 years ago by bdimm

  • Resolution fixed deleted
  • Status changed from closed to reopened

First, I want to say that my example above was based on the wxexecute2.diff code provided by briis. The r54695 code is different -- it skips quoting the argument if the argument already contains quotes as the leftmost and rightmost characters, so it is actually more similar to wxexecute.diff than wxexecute2.diff, and it suffers from the same problems that we've already discussed for wxexecute.diff:

1) As I mentioned in comment 8, it is virtually impossible to call the function with an argument that is supposed to be enclosed in quotes when it is received by the program being called. Are we actually going to have documentation that says "You must manually escape quotes and spaces if the first and last characters are quotes, but otherwise you shouldn't escape quotes."?

2) As mentioned by neis in comment 9, you are assuming that anyone that kludged around the problems in the old version of wxExecute did so by wrapping quotes around the outside, but that's not the only way to do it. These are equivalent:

dir "with spaces"
dir with" "spaces

So your patch works fine in the first case, but not the second.

vadz, I really disagree with your philosophy here. Your patch is looking at the function's inputs and guessing at what the programmer intended (i.e. were the quotes intended to be quotes, or were they intended to escape spaces). Guessing is simply not acceptable. I don't care whether your guess is right 99% of the time, or 99.999% of the time. I aim, even if unsuccessfully, for my application to work correctly 100% of the time, and I can't possibly achieve that if the library I depend on fails to do what is expected 1% or even 0.001% of the time. Functions in a good library should be very clear about their behavior, and they should be as simple as possible, so programs built upon them have minimal risk of errors. It should be clear how to take arbitrary input from the keyboard and pass it through the functions without the data being mutilated.

The old version of wxExecute would fail often, since spaces are common, but the kludge to work around the problem was clear and 100% reliable -- escape quotes and spaces on Windows only. With patch r54695, failures are far less frequent, but it is virtually impossibly to work around the failures. To write code that works correctly 100% of the time, you essentially have to subvert the guessing that is done by the patch.

Here is my proposal: If you want to keep backward compatibility for those that have already kludged around the problem with spaces on Windows, keep the argv version of wxExecute as it originally was, but deprecate it. Add a new function called wxExecuteArgv that has the correct, fully cross-platform, behavior -- i.e. the user NEVER escapes any characters in the argv array on any operating system. As people migrate from wxExecute to wxExecuteArgv their code will get simpler (no more kludges needed), and it will be 100% clear what their inputs are intended to mean with wxExecuteArgv (i.e. no guessing about what quotes were intended to mean). Agree?

comment:21 in reply to: ↑ 20 ; follow-up: Changed 6 years ago by vadz

  • Resolution set to fixed
  • Status changed from reopened to closed

Replying to bdimm:

1) As I mentioned in comment 8, it is virtually impossible to call the function with an argument that is supposed to be enclosed in quotes when it is received by the program being called.

I think this is sufficiently rare to be able to live with this limitation in 2.8. I'd be surprised if people needing this were not already using the wxString overload of wxExecute anyhow considering the hoops they'd need to jump through to make it work under different platforms.

2) As mentioned by neis in comment 9, you are assuming that anyone that kludged around the problems in the old version of wxExecute did so by wrapping quotes around the outside, but that's not the only way to do it. These are equivalent:

dir "with spaces"
dir with" "spaces

So your patch works fine in the first case, but not the second.

I really, really don't think that anybody would take care to tokenize the string being passed to wxExecute() just in order to quote spaces in the middle. This is a nice theoretical point but I do wonder if you seriously think that anybody has existing code doing anything else than prepending and appending the quote to a value potentially containing spaces.

vadz, I really disagree with your philosophy here. Your patch is looking at the function's inputs and guessing at what the programmer intended

I'm not sure if you realize that this guessing is only done in 2.8 version. There is none in the svn trunk. And in 2.8 it is unavoidable to have this hack because we obviously can't just break the vast majority (the 99.99% of your argument) of the existing code in a stable branch. So the only alternative to doing what I did is to revert the fix completely and just document that you need to use quotes under Windows but not under Unix and only in versions of wx up to 2.8 but not 2.9+ which doesn't need quotes anywhere. I don't think this the ideal thing to do, do you?

But, again, this is the only other choice for 2.8. Adding new function (wxExecuteArgv) there is useless because relatively few people are going to switch to it in 2.8 lifetime and I'm reluctant to introduce it when we have so many wxExecute() overloads already.

So for 2.8 the choice is between doing nothing and applying this. I realize the problems with my changes but I believe that on balance it's still better than doing nothing. If anybody has solid arguments against it, I'd like to know about them.

comment:22 in reply to: ↑ 21 Changed 6 years ago by neis

Replying to vadz:

I really, really don't think that anybody would take care to tokenize the string being passed to wxExecute() just in order to quote spaces in the middle. This is a nice theoretical point but I do wonder if you seriously think that anybody has existing code doing anything else than prepending and appending the quote to a value potentially containing spaces.

Just from the theoretical point of view, if I need to escape spaces, I tend to replace spaces by backslash followed by space on linux and a space surrounded by quotes on Windows (nicely minimizes the #ifdef to just an argument of the replace function). But then, this argument doesn't really apply here, since you don't want/need the escaping on Linux anyway.

Note: See TracTickets for help on using tickets.