Ticket #10064 (closed defect: fixed)
OutputString() in xml module crashes if mb_str() fails
| Reported by: | jgeorgal | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | base | Version: | 2.8.8 |
| Keywords: | OutputString, xml, encoding, conversion, crash | Cc: | |
| Blocked By: | | Patch: | no |
| Blocking: | | | |
Description
The Write() call in the inline function "OutputString" at xml/xml.cpp:762 crashes if the wxString::mb_str() call on the previous line fails.
Specifically, the affected code is the following:
const wxWX2MBbuf buf(str.mb_str(*(convFile ? convFile : &wxConvUTF8)));
stream.Write((const char*)buf, strlen((const char*)buf));
The problem is that if mb_str() fails to convert the string to the target encoding, it returns NULL. strlen() is then called on that NULL pointer, and the application crashes with a NULL pointer dereference.
The solution to the above is obvious. However, I think it would be better for encoding conversions to "never" fail. In my case wxMBConv failed to convert just a single character ("RIGHT SINGLE QUOTATION MARK", U+2019) from Unicode (UTF-16) to iso-8859-7, and because of this single failure the whole string conversion failed, returning NULL. I'd rather have gotten back a string with the non-convertible characters replaced with e.g. '?' than NULL.
So, I'd propose to have wxMBConv classes take a:
int (*failed_char_conversion_func_ptr)(int character, const wxString& sourceEncoding, const wxString& targetEncoding)
callback (or something similar) to be called whenever a character conversion fails, returning the character(s) that should be used instead of the nonconvertible one. This can also be a global callback - I guess - since character conversion fall-back logic is application specific.
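To make the idea a bit more concrete, here is a minimal sketch of a per-character fallback, again in plain standard C++ and not using any wxMBConv API. Everything in it is hypothetical: the FailedCharFallback type is a simplified version of the signature above (it drops the encoding names and returns a single byte), and ISO-8859-1 is used as the target instead of iso-8859-7 only because its mapping needs no lookup table (code points 0x00 to 0xFF map to the same byte value).

```cpp
#include <string>

// Hypothetical fallback hook: receives the code point that could not be
// converted and returns the single byte to emit in its place.
using FailedCharFallback = char (*)(char32_t codePoint);

// Example policy: replace every non-convertible character with '?'.
inline char QuestionMarkFallback(char32_t) { return '?'; }

// Converts UTF-32 text to ISO-8859-1. Characters outside 0x00..0xFF are
// handed to the fallback instead of failing the whole conversion.
inline std::string ToLatin1(const std::u32string& text,
                            FailedCharFallback onFail = QuestionMarkFallback)
{
    std::string out;
    out.reserve(text.size());
    for (char32_t cp : text)
    {
        if (cp <= 0xFF)
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        else
            out.push_back(onFail(cp)); // application-specific replacement
    }
    return out;
}
```

With the default policy, a string containing U+2019 comes back with a '?' in that position rather than the whole conversion failing. An application that wants different behavior passes its own function.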
What do you guys think?
Thanks,
Giannis
PS. I forgot to mention that I observed this behavior on a Windows XP SP3 machine. I don't know if the above is reproducible on GNU/Linux or Macs.
