Opened 9 months ago

Closed 9 months ago

Last modified 9 months ago

#15901 closed defect (fixed)

Bug in conversion of UTF8 to wchar

Reported by: andyr Owned by:
Priority: normal Milestone:
Component: base Version: dev-latest
Keywords: utf8 Cc:
Blocked By: Blocking:
Patch: yes

Description

See the patch: the current code fails to decrement srcLen in the inner loop. As a result, psz is incremented past the end of the input buffer, garbage characters are appended to the output, and the returned size is too large. This only happens when the input contains multi-byte UTF-8 characters.

The reason no-one's noticed is that:

  • although the returned string is too long, it does contain a null byte in the right place, so the extra garbage chars may not be noticed.
  • psz reads beyond the end of the input but this is only noticed if it causes an illegal address crash, which it frequently doesn't.
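
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a length-bounded UTF-8 decoder. It is not the wxWidgets code and not the attached patch: the function name DecodeUtf8Bounded and its signature are hypothetical, and it does not reject overlong encodings or surrogates. It only shows why the remaining length has to be decremented for every continuation byte consumed in the inner loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration, not wxMBConvUTF8):
// decodes at most srcLen bytes of UTF-8 from src into code points.
// Returns false on malformed or truncated input.
bool DecodeUtf8Bounded(const char* src, std::size_t srcLen,
                       std::vector<char32_t>& out)
{
    assert(src != nullptr || srcLen == 0);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(src);

    while (srcLen > 0) {
        const unsigned char lead = *p++;
        --srcLen;

        char32_t cp;
        int extra; // number of continuation bytes that must follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else return false; // stray continuation byte or invalid lead byte

        while (extra-- > 0) {
            if (srcLen == 0)
                return false;           // sequence cut off by the length limit
            const unsigned char c = *p;
            if ((c & 0xC0) != 0x80)
                return false;           // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
            ++p;
            --srcLen; // without this, the outer loop reads past the buffer
        }
        out.push_back(cp);
    }
    return true;
}
```

If the marked decrement is omitted, each multi-byte character leaves srcLen too large by the number of its continuation bytes, so the outer loop keeps reading that many bytes beyond the stated length. That matches the symptoms in this report: extra garbage characters in the output and a returned size that is too large.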

Attachments (1)

strconv.patch (835 bytes) - added by andyr 9 months ago.
Patch to fix this.

Change History (3)

Changed 9 months ago by andyr

Patch to fix this.

comment:1 Changed 9 months ago by VZ

  • Resolution set to fixed
  • Status changed from new to closed

(In [75728]) Fix bug with non-NUL-terminated inputs in wxMBConvUTF8.

We read beyond the provided maximal length as we didn't update the remaining
length while parsing the remaining bytes of a UTF-8-encoded code point.

Fix this and add a test for it.

Closes #15901.

comment:2 Changed 9 months ago by VZ

(In [75733]) Fix bug with non-NUL-terminated inputs in wxMBConvUTF8.

We read beyond the provided maximal length as we didn't update the remaining
length while parsing the remaining bytes of a UTF-8-encoded code point.

Fix this and add a test for it.

Closes #15901.
