Opened 9 years ago

Closed 8 years ago

#9638 closed defect (fixed)

wxCharBuffer doesn't copy wxString::mb_str() return value

Reported by: crjjrc Owned by: vaclavslavik
Priority: high Milestone: 2.9.0
Component: base Version: stable-latest
Keywords: Cc: net147@…
Blocked By: Blocking:
Patch: yes

Description

The other day I tried running ./myapp -h to get a usage statement from wxCmdLineParser, but no message was printed. I tried other apps and received various results: some statements appeared, others were interrupted by a non-printable character, some even worked okay. Of the samples, only sockets/baseserver exhibited the problem.

I've used gdb to track down the problem to wxMessageOutputStderr::Output. It appears that this line is causing trouble:

const wxWX2MBbuf buf = AppendLineFeedIfNeeded(str).mb_str();

When the problem manifests itself, gdb shows that buf's content is maligned. Perhaps since the temporary variable on the RHS goes out of scope sooner than it should? To test a workaround, I tried changing the code to:

wxString strn = AppendLineFeedIfNeeded(str);
const wxWX2MBbuf buf = strn.mb_str();
const wxWX2MBbuf buf2 = AppendLineFeedIfNeeded(str).mb_str();

Now, gdb shows that buf is valid while buf2 (the original, assigned via a temporary) is not:

(gdb)
wxMessageOutputStderr::Output (this=0x8073da0, str=@0xbfea6b5c)

at ../src/common/msgout.cpp:146

146 const wxWX2MBbuf buf = strn.mb_str();
(gdb) n
147 const wxWX2MBbuf buf2 = AppendLineFeedIfNeeded(str).mb_str();
(gdb)
149 if ( buf ) {
(gdb) print buf
$1 = {<wxCharTypeBuffer<char>> = {

m_str = 0x8074864 "Usage: baseserver [-h] [--verbose] [-t] [-e] [-m <num>] [-p <num>]\n -h, --help \tshow this help message\n --verbose \tgenerate verbose log messages\n -t, --threads \tUse thread based worker"...,
m_owned = false}, <No data fields>}

(gdb) print buf2
$2 = {<wxCharTypeBuffer<char>> = {m_str = 0x8074a24 "",

m_owned = false}, <No data fields>}

I don't why, but it appears like buf2's string is going out of scope and the buffer is getting overwritten. If the temporary's causing the problem, I've attached my patch to create a named string that stays in scope.

$ g++ --version
g++ (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
$ wx-config --version
2.9.0

  • Chris

Attachments (2)

msgout.patch download (458 bytes) - added by crjjrc 9 years ago.
Possible fix for temporary string going out of scope too early
buffer_copy_constructor_warning_fix_r59887.patch download (409 bytes) - added by net147 8 years ago.
Compiler warning fix.

Download all attachments as: .zip

Change History (19)

Changed 9 years ago by crjjrc

Possible fix for temporary string going out of scope too early

comment:1 Changed 9 years ago by vadz

  • Keywords worksforme added
  • Status changed from new to infoneeded_new

I really don't see anything wrong with the code (except that it may forget to print '\n' if conversion fails) and I don't see the problem in the baseserver sample, could you please provide a minimal example or a patch to the console sample reproducing it?

comment:2 Changed 9 years ago by crjjrc

  • Status changed from infoneeded_new to new

Hmm... How can I add a patch that produces the error when the error is already present? But it doesn't matter, because your patch r54373 involves doing what my own patch did: introduces a named string. Is it not the case that the storage for the conversion belongs to the wxString being converted? If so, what exactly is the scope of a temporary (anonymous) variable? Is there any guarantee that the storage will be available in the next statement?

  const wxWX2MBbuf buf = AppendLineFeedIfNeeded(str).mb_str();

The temporary here holds the value returned by AppendLifeFeedIfNeeded(str). I wager that this temporary is being released immediately after the statement finishes. A quick example confirms my suspicion:

#include <iostream>

class Test {
   public:
      Test() : string(NULL) {
         std::cout << "calling constructor" << std::endl;
      };
      ~Test() {
         std::cout << "calling destructor" << std::endl;
         delete[] string;
      }
      const char *GetString() {
         string = new char[101];
         for (int i = 0; i < 101; ++i) {
            string[i] = '0' + i % 10;
         }
         return string;
      }
      char *string;
};

int main(int argc, char **argv) {

   const char *str = Test().GetString();
   std::cout << "str: " << str << std::endl;

   return 0;

}

When I compile with my g++, I get:

calling constructor
calling destructor
str: 01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890

The temporary is destroyed before its storage is printed. By the time str is printed, it points to freed memory. By coincidence, the buffer hasn't been overwritten. Is not this the same setup in wxMessageStderr::Output()?

I'd be happy to test on the console sample, but in current trunk it doesn't compile with TEST_ALL set to 1.

  • Chris

comment:3 Changed 9 years ago by vadz

  • Keywords worksforme removed
  • Resolution set to fixed
  • Status changed from new to closed

Wait, I think I understand the problem: you're using the ANSI build, right?

Because wxWX2MBbuf is wxCharBuffer in Unicode build so the contents of the (temporary) value returned by mb_str() is copied into this variable. However in ANSI build it is indeed char * because mb_str() there just returns a pointer to the internal string buffer and in this case it indeed doesn't work.

comment:4 Changed 9 years ago by crjjrc

  • Resolution fixed deleted
  • Status changed from closed to reopened

I have removed --enable-unicode from my configure options, but I understand that Unicode is the default. In fact:

$ wx-config --list

    Default config is gtk2-unicode-debug-2.9

  Default config will be used for output

So, no, I'm not using the ANSI build. For what it's worth, here's a page discussing anonymous variables' expression scope:

http://www.learncpp.com/cpp-tutorial/814-anonymous-variables-and-objects/

  • Chris

comment:5 Changed 9 years ago by vadz

  • Resolution set to fixed
  • Status changed from reopened to closed

I don't understand how it may not work in Unicode build. And I don't understand what does it have to do with temporaries scope neither: of course the temporary buffer returned by mb_str() is only alive until the next sequence point but, again, its contents, not a reference to it, is copied into the local variable buf. And, FWIW, creating a named wxString object shouldn't change anything at all, it's the buffer object returned by mb_str() which we're interested in.

So while I'm still closing this (as you said it was fixed) I remain rather worried by the fact that there is probably something we're both missing here.

comment:6 Changed 9 years ago by crjjrc

Well, if this helps at all (and if you can follow), here's a gdb session that shows the converted buffer getting modified after the statement finishes.

(gdb) break 'wxMessageOutputStderr::Output(wxString const&)'
Breakpoint 1 at 0xb7eb65f3: file ../src/common/msgout.cpp, line 143.
(gdb) run
Starting program: /home/cjohnson/cj_tmp/tarballs/wxWidgets/wxWidgets/gtk_build_debug/samples/sockets/baseserver -h
[Thread debugging using libthread_db enabled]
[New Thread 0xb7a656c0 (LWP 18055)]
[Switching to Thread 0xb7a656c0 (LWP 18055)]

Breakpoint 1, wxMessageOutputStderr::Output (this=0x80743d0, str=@0xbfba905c) at ../src/common/msgout.cpp:143
143         const wxString strWithLF = AppendLineFeedIfNeeded(str);
(gdb) n
144         const wxWX2MBbuf buf = strWithLF.mb_str();
(gdb) n
145         const wxWX2MBbuf buf2 = AppendLineFeedIfNeeded(str).mb_str();

I'm using your patched version with this last problematic line added in.

(gdb) s
wxGet_wxConvLibc () at ../include/wx/strconv.h:565
565     WX_DECLARE_GLOBAL_CONV(wxMBConv, wxConvLibc)
(gdb) s
wxMessageOutputStderr::AppendLineFeedIfNeeded (this=0x80743d0, str=@0xbfba905c) at ../src/common/msgout.cpp:134
134         wxString strLF(str);
(gdb) s
wxString (this=0xbfba8fd8, stringSrc=@0xbfba905c) at ../include/wx/string.h:900
900       wxString(const wxString& stringSrc) : m_impl(stringSrc.m_impl) { }
(gdb) finish
Run till exit from #0  wxString (this=0xbfba8fd8, stringSrc=@0xbfba905c) at ../include/wx/string.h:900
wxMessageOutputStderr::AppendLineFeedIfNeeded (this=0x80743d0, str=@0xbfba905c) at ../src/common/msgout.cpp:135
135         if ( strLF.empty() || *strLF.rbegin() != '\n' )
(gdb) n
139     }
(gdb) s
wxString::mb_str (this=0xbfba8fd8, conv=@0xb7ef1778) at ../src/common/string.cpp:442
442         if ( conv.IsUTF8() )
(gdb) 
wxMBConvLibc::IsUTF8 (this=0xb7ef1778) at ../include/wx/strconv.h:190
190         virtual bool IsUTF8() const { return wxLocaleIsUtf8; }
(gdb) 
wxString::mb_str (this=0xbfba8fd8, conv=@0xb7ef1778) at ../src/common/string.cpp:443
443             return wxCharBuffer::CreateNonOwned(m_impl.c_str());
(gdb) 
wxCharTypeBuffer<char>::CreateNonOwned (
    str=0x8075054 "Usage: baseserver [-h] [--verbose] [-t] [-e] [-m <num>] [-p <num>]\n  -h, --help      \tshow this help message\n  --verbose       \tgenerate verbose log messages\n  -t, --threads   \tUse thread based worker"...) at ../include/wx/buffer.h:51
51              wxCharTypeBuffer buf;
(gdb) 
wxCharTypeBuffer (this=0xbfba8f8c, str=0x0) at ../include/wx/buffer.h:38
38                m_owned(true)
(gdb) 
39          {
(gdb) 
wxCharTypeBuffer<char>::CreateNonOwned (
    str=0x8075054 "Usage: baseserver [-h] [--verbose] [-t] [-e] [-m <num>] [-p <num>]\n  -h, --help      \tshow this help message\n  --verbose       \tgenerate verbose log messages\n  -t, --threads   \tUse thread based worker"...) at ../include/wx/buffer.h:52
52              buf.m_str = wx_const_cast(CharType*, str);
(gdb) 
53              buf.m_owned = false;
(gdb) 
54              return buf;
(gdb) 
wxCharBuffer (this=0xbfba9000, buf=@0xbfba8f8c) at ../include/wx/buffer.h:167
167             : wxCharTypeBufferBase(buf) {}
(gdb) 
wxCharTypeBuffer (this=0xbfba9000, src=@0xbfba8f8c) at ../../../include/wx/buffer.h:97
97              CopyFrom(src);
(gdb) 
wxCharTypeBuffer<char>::CopyFrom (this=0xbfba9000, src=@0xbfba8f8c) at ../../../include/wx/buffer.h:150
150             m_owned = src.m_owned;
(gdb) 
151             m_str = src.DoRelease();
(gdb) 
wxCharTypeBuffer<char>::DoRelease (this=0xbfba8f8c) at ../../../include/wx/buffer.h:143
143             CharType *p = m_str;
(gdb) 
144             ((wxCharTypeBuffer *)this)->m_str = NULL;
(gdb) print p
$1 = 0x8075054 "Usage: baseserver [-h] [--verbose] [-t] [-e] [-m <num>] [-p <num>]\n  -h, --help      \tshow this help message\n  --verbose       \tgenerate verbose log messages\n  -t, --threads   \tUse thread based worker"...

All is good up to here. The temporary gets its buffer NULLed and the wxCharTypeBuffer takes it over. Now I set a watchpoint on the buffer.

(gdb) watch *((char *)0x8075054)
Hardware watchpoint 2: *(char *) 134697044
(gdb) s
145             return p;
(gdb) 
~wxCharTypeBuffer (this=0xbfba8f8c) at ../../../include/wx/buffer.h:60
60              if ( m_owned)
(gdb)
wxString::mb_str (this=0xbfba8fd8, conv=@0xb7ef1778) at ../src/common/string.cpp:458
458     }
(gdb)
~wxString (this=0xbfba8fd8) at ../include/wx/string.h:405
405     {
(gdb)
~ConvertedBuffer (this=0xbfba8fe0) at ../../../include/wx/string.h:2661
2661              { free(m_buf); }
(gdb)
~ConvertedBuffer (this=0xbfba8fdc) at ../../../include/wx/string.h:2661
2661              { free(m_buf); }
(gdb)
Hardware watchpoint 2: *(char *) 134697044

Old value = 85 'U'
New value = 0 '\0'
0xb7ad257c in ?? () from /lib/tls/i686/cmov/libc.so.6
Current language:  auto; currently asm


And there we have something altering the buffer. The U from Usage is zeroed out.

  • Chris

comment:7 Changed 9 years ago by crjjrc

Ah, I forgot the backtrace at the time of the watchpoint being trigged:

{{{(gdb) bt
#0 0xb7ad257c in ?? () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7ad64f0 in free () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7c91b11 in operator delete () from /usr/lib/libstdc++.so.6
#3 0xb7c6d79d in std::string::_Rep::_M_destroy () from /usr/lib/libstdc++.so.6
#4 0xb7c6f571 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string () from /usr/lib/libstdc++.so.6
#5 0xb7d3fd1b in ~wxString (this=0xbfba8fd8) at ../include/wx/string.h:405
#6 0xb7e4a678 in wxMessageOutputStderr::Output (this=0x80743d0, str=@0xbfba905c) at ../src/common/msgout.cpp:145
#7 0xb7e4a943 in wxMessageOutput::DoPrintfUtf8 (this=0x80743d0, format=0x80742d0 "%s") at ../src/common/msgout.cpp:100
#8 0xb7d4e91f in wxMessageOutputBase::Printf<wxCStrData> (this=0x80743d0, f1=@0xbfba90f8, a1=@0xbfba9130) at ../include/wx/msgout.h:40
#9 0xb7d484cd in wxCmdLineParser::Usage (this=0xbfba9188) at ../src/common/cmdline.cpp:1019
#10 0xb7d3d7b7 in wxAppConsoleBase::OnCmdLineHelp (this=0x806e500, parser=@0xbfba9188) at ../src/common/appbase.cpp:520
#11 0xb7d3dc20 in wxAppConsoleBase::OnInit (this=0x806e500) at ../src/common/appbase.cpp:228
#12 0x0804fc97 in Server::OnInit (this=0x806e500) at ../../../samples/sockets/baseserver.cpp:307
#13 0x08050dc8 in wxAppConsoleBase::CallOnInit (this=0x806e500) at ../../../include/wx/app.h:77
#14 0xb7da4880 in wxEntry (argc=@0xb7ef03ac, argv=0x805a028) at ../src/common/init.cpp:450
#15 0xb7da4b42 in wxEntry (argc=@0xbfba92e0, argv=0xbfba9364) at ../src/common/init.cpp:478
#16 0x08050af0 in main (argc=2105344, argv=0x179) at ../../../samples/sockets/baseserver.cpp:218
}}}

msgout.cpp:145 is the buf2 assignment seen in my previous post.

  • Chris

comment:8 Changed 9 years ago by vadz

  • Priority changed from normal to high
  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Summary changed from Failing wxMessageOutputStderr/wxCmdLineParser -h to wxCharBuffer doesn't copy wxString::mb_str() return value

OK, there is a real big problem here (which arose with the switch to using std::string by default), we need to fix it.

Thanks a lot for digging into it!

comment:9 Changed 9 years ago by vaclavslavik

  • Owner set to vaclavslavik
  • Status changed from reopened to accepted

comment:10 Changed 9 years ago by byte

I can't repeat problem now, but I see that in wxCharTypeBuffer class, in method "DoRelease" we must set "m_owned" to false:

CharType *DoRelease() const

{

CharType *p = m_str;
((wxCharTypeBuffer *)this)->m_str = NULL;
((wxCharTypeBuffer *)this)->m_owned = false;
return p;

}

bcz there are tests "if ( m_owned )".

comment:11 Changed 9 years ago by byte

Sorry, I looked at old version.

comment:12 follow-up: Changed 9 years ago by byte

hmm, new implementation:

wxCharTypeBuffer,

void DecRef()
{

if ( m_data == &NullData ) exception, not ref-counted

return;

if ( --m_data->m_ref == 0 )

delete m_data;

m_data = &NullData;

}

must be

void DecRef()
{

if ( m_data == &NullData ) exception, not ref-counted

return;

if ( --m_data->m_ref == 0 )
{

delete m_data;
m_data = &NullData;

}

}

??

comment:13 Changed 9 years ago by byte

And may be better remove "self->m_data->m_str = NULL;" from wxCharTypeBuffer::release() ?

comment:14 in reply to: ↑ 12 Changed 9 years ago by vadz

Replying to byte:

hmm, new implementation:
...
must be
...

Sorry, I absolutely don't see why should it be changed like this (in fact I'm pretty sure it shouldn't). Please explain what problem exactly are you trying to solve here.

comment:15 Changed 8 years ago by vaclavslavik

  • Resolution set to fixed
  • Status changed from accepted to closed

Should be fixed by r59887.

comment:16 follow-up: Changed 8 years ago by net147

  • Cc net147@… added
  • Resolution fixed deleted
  • Status changed from closed to reopened

Compiling with MinGW GCC 4.3.3 TDM-1 gives a warning for every file that includes buffer.h:

..\..\include/wx/buffer.h: In copy constructor 'wxCharTypeBuffer<T>::wxCharTypeB
uffer(const wxCharTypeBuffer<T>&) [with T = char]':
..\..\include/wx/buffer.h:318:   instantiated from here
..\..\include/wx/buffer.h:247: warning: base class 'class wxScopedCharTypeBuffer
<char>' should be explicitly initialized in the copy constructor

Changed 8 years ago by net147

Compiler warning fix.

comment:17 in reply to: ↑ 16 Changed 8 years ago by vaclavslavik

  • Resolution set to fixed
  • Status changed from reopened to closed

Replying to net147:

Compiling with MinGW GCC 4.3.3 TDM-1 gives a warning for every file that includes buffer.h:

Thanks, fixed in r59900 (differently, the warning exposed a real issue).

Note: See TracTickets for help on using tickets.