Qt4-preview-feedback Archive, May 2005
Possible GCC bug affecting Qt 4 on FC3
Message 1 in thread
I've encountered a strange problem with recent snapshots on my Fedora Core 3
system. I was using the following code sequence (I initially had other code in
between the two lines but the problem remained after I'd removed it):
QCoreApplication app(argc, argv);
return app.exec();
yet Qt was complaining that in order to call exec() I had to instantiate
QCoreApplication first, which I'd obviously done (actually the error message
refers to QApplication - perhaps this should be changed?).
By inserting some debugging code, I noticed that although
QCoreApplication::self was set to a valid value immediately before returning
from the QCoreApplication constructor, immediately after it had somehow been
set to zero, thus exec() was failing because this is the variable it checks.
A watchpoint in gdb detects the initialisation to the correct value, but
apparently not the zeroing.
I could only conclude that this was a bug in the compiler
(gcc-c++-3.4.3-22.fc3) so I tried the experimental GCC 4.0 compiler that
also ships with FC3 and this doesn't suffer from the same problem.
Could my conclusion be correct? Annoyingly, a minimal test containing only
the two lines above doesn't reproduce the problem so there must be something
about my code that the compiler doesn't like, however surely no other code
could be executing immediately after the constructor (I do use threads but
none have been initialised at this point) so it can't be due to a bug in my
code.
--
[ signature omitted ]
Message 2 in thread
On Friday 27 May 2005 14:14, Mark Sawle wrote:
> By inserting some debugging code, I noticed that although
> QCoreApplication::self was set to a valid value immediately before returning
> from the QCoreApplication constructor, immediately after it had somehow been
> set to zero, thus exec() was failing because this is the variable it checks.
> A watchpoint in gdb detects the initialisation to the correct value, but
> apparently not the zeroing.
Amazing, I just encountered the same bug.
QCoreApplication::init: self=0xbfffeff0
QCoreApplication::init: QCoreApplication::instance()=(nil)
Those two debug lines were added right after self is assigned in QCoreApplication::init(),
which is really amazing given that instance() simply returns self (!)
> I could only conclude that this was a bug in the compiler
> (gcc-c++-3.4.3-22.fc3)
Yes, that was my conclusion too. I use gcc (GCC) 3.4.3 (Mandrakelinux 10.2 3.4.3-7mdk)
In a way I'm happy to see that it's not Mandrake-specific, and that RedHat's gcc-3.4.3 has the same bug :)
Obviously this isn't something TT can fix, but you're right about posting it here,
so that others can save some debugging time if they hit this problem...
Now we must get RedHat-FC and Mandrake to fix their compilers...
--
[ signature omitted ]
Message 3 in thread
David Faure wrote:
> On Friday 27 May 2005 14:14, Mark Sawle wrote:
[...]
>>I could only conclude that this was a bug in the compiler
>>(gcc-c++-3.4.3-22.fc3)
>
>
> Yes, that was my conclusion too. I use gcc (GCC) 3.4.3 (Mandrakelinux 10.2 3.4.3-7mdk)
> In a way I'm happy to see that it's not Mandrake-specific, and that RedHat's gcc-3.4.3 has the same bug :)
>
> Obviously this isn't something TT can fix, but you're right about posting it here,
> so that others can save some debugging time if they hit this problem...
>
I don't know anything about this specific problem.... However....
The behavior described reminds me exactly of the types of problems I've
seen when I've screwed up the independence of static objects (e.g.
inadvertently using one static object to initialize another one).
Presumably the instance() is a model of the singleton pattern - and
presumably it uses some static data to facilitate this. But that's not
necessarily significant.
Often, I've had static object initialization problems show up as a crash
immediately after main() begins executing - e.g. just blows out on
whatever happens to be in the first few lines of main().
Also, crash vs. nocrash typically depends on what other code is linked
(e.g. other static objects being included in the compilation)
irrespective if the nonstatic parts of that code are ever executed.
I.e. a crash mystically goes away or reappears depending what
combination of other modules are compiled into the application.
Such problems tend to be virtually impossible to reproduce with a
minimal implementation (because it is the combinatorial effect that is
causing them), whereas a compiler bug would more likely depend on the
syntax of a few specific lines of code.
Whether it works or not might change from compiler to compiler and
version to version, since the order of static object initialization
across translation units is unspecified and the implementation is free
to decide it.
It's interesting however, that it happens in two (presumably)
independent programs. Although there is likely a common set of Qt
headers/modules and/or some other common library(ies) involved. Perhaps
there's a way to compare what files/modules/libraries are in common?
So maybe it is a compiler bug... or something else...
Dave
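For readers who have not met the static initialization problem described above, here is a minimal C++ sketch of the usual workaround, construct-on-first-use. It is not taken from Qt or from either poster's program: Registry, Client, registry() and clientSawName() are hypothetical names chosen for illustration, and the sketch assumes the failure mode is one static object reading another before it has been constructed.

```cpp
#include <string>

// Hypothetical type standing in for any static object that other
// static objects depend on during start-up.
struct Registry {
    std::string name;
    Registry() : name("registry") {}
};

// Construct-on-first-use: the function-local static is initialised the
// first time the function is called, so a caller never sees it
// unconstructed, whatever order the translation units are linked in.
Registry &registry()
{
    static Registry instance;
    return instance;
}

// A static object that depends on the registry in its own constructor.
struct Client {
    std::string seen;
    Client() : seen(registry().name) {}
};

// Safe: registry() builds its object on demand during this initialisation.
static Client client;

// Accessor so the result can be inspected from main() or a test.
const std::string &clientSawName()
{
    return client.seen;
}
```

Had Registry been a plain namespace-scope static defined in another source file, Client's constructor could have run first and read an unconstructed object, and whether it did would depend on what else was linked in.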
Message 4 in thread
On Friday 27 May 2005 20:21, Dave Knopp wrote:
> The behavior described reminds me exactly of the types of problems I've
> seen when I've screwed up the independence of static objects (e.g.
> inadvertently using one static object to initialize another one).
Hmm, sorry but this has nothing to do with static objects :)
I know about problems due to undefined static object initialization.
But here it's about one line setting a static pointer to a value, and the
next line retrieving that value (via a static method) and getting 0.
All that from a normal method, not from any global static objects.
That *is* a compiler bug for sure.
However, for some reason it does depend on whether I run a GUI Qt app or a
core-only Qt app.
--
[ signature omitted ]
Message 5 in thread
Hi,
> I've encountered a strange problem with recent snapshots on my Fedora Core 3
> system. I was using the following code sequence (I initially had other code in
> between the two lines but the problem remained after I'd removed it):
Which exact version of gcc? I'm running Fedora Core 3 and I haven't seen
anything like that so far.
$ gcc --version
gcc (GCC) 3.4.3 20050227 (Red Hat 3.4.3-22.fc3)
--
[ signature omitted ]
Message 6 in thread
Dimitri <dimitri@xxxxxxxxxxxxx> wrote:
> > I've encountered a strange problem with recent snapshots on my Fedora
> > Core 3 system. I was using the following code sequence (I initially had
> > other code in between the two lines but the problem remained after I'd
> > removed it):
>
> Which exact version of gcc? I'm running Fedora Core 3 and I haven't seen
> anything like that so far.
>
> $ gcc --version
> gcc (GCC) 3.4.3 20050227 (Red Hat 3.4.3-22.fc3)
It's the same version you're using.
--
[ signature omitted ]
Message 7 in thread
Hi,
I can't reproduce any problem with:
$ cat foo.cc
#include <QtCore/QCoreApplication>
int main(int argc, char *argv[]) {
    QCoreApplication app(argc, argv);
    return app.exec();
}
$
$ qmake -project
$ qmake
$ make
$ ./foo
^C
$
This is on a fully up-to-date Fedora core 3 workstation.
Valgrind doesn't complain either.
--
[ signature omitted ]
Message 8 in thread
Dimitri <dimitri@xxxxxxxxxxxxx> wrote:
> I can't reproduce any problem with:
>
> $ cat foo.cc
> #include <QtCore/QCoreApplication>
> int main(int argc, char *argv[]) {
> QCoreApplication app(argc, argv);
> return app.exec();
> }
> $
> $ qmake -project
> $ qmake
> $ make
> $ ./foo
> ^C
> $
It works fine for me too (I tried this before posting) - clearly there's
something else that contributes to the problem but I have no idea what.
--
[ signature omitted ]
Message 9 in thread
Dimitri wrote:
> Hi,
>
> I can't reproduce any problem with:
>
> $ cat foo.cc
> #include <QtCore/QCoreApplication>
> int main(int argc, char *argv[]) {
> QCoreApplication app(argc, argv);
> return app.exec();
> This is on a fully up-to-date Fedora core 3 workstation.
g++ (GCC) 3.4.4 20050314 (prerelease) (Debian 3.4.3-12)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
works fine here (GNU/debian). So it must be redhat/mandrake specific, maybe
they share some patches in 3.4 gcc tree :-)
--
[ signature omitted ]
Message 10 in thread
Hi,
> Could my conclusion be correct? Annoyingly, a minimal test containing only
> the two lines above doesn't reproduce the problem so there must be something
> about my code that the compiler doesn't like, however surely no other code
> could be executing immediately after the constructor (I do use threads but
> none have been initialised at this point) so it can't be due to a bug in my
> code.
Have you tried Valgrind on your program?
--
[ signature omitted ]
Message 11 in thread
On Saturday 28 May 2005 00:25, Dimitri wrote:
> Hi,
>
> > Could my conclusion be correct? Annoyingly, a minimal test containing only
> > the two lines above doesn't reproduce the problem so there must be something
> > about my code that the compiler doesn't like, however surely no other code
> > could be executing immediately after the constructor (I do use threads but
> > none have been initialised at this point) so it can't be due to a bug in my
> > code.
>
> Have you tried Valgrind on your program?
Personally I did, and of course it doesn't tell us anything, given that it's a compiler bug :
$ valgrind --tool=addrcheck --num-callers=50 assistant
QCoreApplication::init: setting self to 0x9c5fe090
QCoreApplication::init: self=0x9c5fe090
QCoreApplication::init: QCoreApplication::instance()=(nil)
QPaintDevice: Must construct a QApplication before a QPaintDevice
==12303==
==12303== Process terminating with default action of signal 6 (SIGABRT): dumping core
==12303== at 0x34C40B86: kill (in /lib/tls/libc-2.3.4.so)
==12303== by 0x34143BE6: gsignal (vg_intercept.c:93)
==12303== by 0x34C42048: abort (in /lib/tls/libc-2.3.4.so)
==12303== by 0x34997671: qt_message_output(QtMsgType, char const*) (qglobal.cpp:1281)
==12303== by 0x34997AD2: qFatal(char const*, ...) (qglobal.cpp:1493)
==12303== by 0x343153C4: QPaintDevice::QPaintDevice() (qpaintdevice_x11.cpp:75)
==12303== by 0x342B0E6E: QPixmap::QPixmap() (qpixmap.cpp:77)
Debug lines added by myself in QCoreApplication::init() :
--- kernel/qcoreapplication.cpp (revision 418644)
+++ kernel/qcoreapplication.cpp (working copy)
@@ -394,6 +394,9 @@
Q_ASSERT_X(!self, "QCoreApplication", "there should be only one application object");
QCoreApplication::self = this;
+ qDebug( "QCoreApplication::init: setting self to %p", this );
+ qDebug( "QCoreApplication::init: self=%p", QCoreApplication::self );
+ qDebug( "QCoreApplication::init: QCoreApplication::instance()=%p", QCoreApplication::instance() );
QThread::initialize();
I believe that if Mandrake's gcc has this bug, it's quite likely that the FC3 bug is the same,
given that the symptoms are identical.
--
[ signature omitted ]
Message 12 in thread
Hi,
>>Have you tried Valgrind on your program?
>
> Personally I did, and of course it doesn't tell us anything, given that it's a compiler bug :
Even if it's a compiler bug, Valgrind can still tell interesting things.
Unfortunately in this case it doesn't help much more than the debugger.
I guess one would have to look at the generated assembler code or
simplify the source code until a test case can be sent to Mandrake, Red
Hat, and gcc.
Does this happen in release mode or debug mode too?
--
[ signature omitted ]
Message 13 in thread
On Saturday 28 May 2005 14:40, Dimitri wrote:
> Hi,
>
> >>Have you tried Valgrind on your program?
> >
> > Personally I did, and of course it doesn't tell us anything, given that it's a compiler bug :
>
> Even if it's a compiler bug, Valgrind can still tell interesting things.
> Unfortunately in this case it doesn't help much more than the debugger.
> I guess one would have to look at the generated assembler code or
> simplify the source code until a test case can be sent to Mandrake, Red
> Hat, and gcc.
Yeah (but my abilities to read x86 assembler are quite limited).
> Does this happen in release mode or debug mode too?
I just checked and it happens in both.
--
[ signature omitted ]
Message 14 in thread
On Monday 30 May 2005 20:28, David Faure wrote:
> On Saturday 28 May 2005 14:40, Dimitri wrote:
> > Hi,
> >
> > >>Have you tried Valgrind on your program?
> > >
> > > Personally I did, and of course it doesn't tell us anything, given that it's a compiler bug :
> >
> > Even if it's a compiler bug, Valgrind can still tell interesting things.
> > Unfortunately in this case it doesn't help much more than the debugger.
> > I guess one would have to look at the generated assembler code or
> > simplify the source code until a test case can be sent to Mandrake, Red
> > Hat, and gcc.
OK, Thiago Macieira investigated it and made a test case. The bug is only in
gcc-3.4.x with -fvisibility patch, i.e. the RedHat/Mandriva version, not in upstream gcc.
We'll try to make the report propagate up to whoever can fix the visibility patch.
I think this closes the discussion for this list, there's nothing TT can do about a bad gcc patch, IMHO :)
--
[ signature omitted ]
class __attribute__((visibility("default"))) QCoreApplication
{
    friend class QCoreApplicationPrivate;
public:
    static QCoreApplication *instance() { return self; }
private:
    void init();
    static QCoreApplication *self;
};

class __attribute__((visibility("default"))) QCoreApplicationPrivate
{
public:
    static bool checkInstance(const char *method);
};

bool QCoreApplicationPrivate::checkInstance(const char *function)
{
    bool b = (QCoreApplication::self != 0);
    return b;
}

QCoreApplication *QCoreApplication::self = 0;

void QCoreApplication::init()
{
    QCoreApplication::self = this;
}
// Compile with:
// g++ -c -pipe -g -fvisibility=hidden -fvisibility-inlines-hidden -Wall -fPIC something.ii
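The test case above only compiles to an object file. A hypothetical runnable harness around the same shape of code might look like the sketch below. App, AppPrivate, SKETCH_VISIBLE and selfSurvivesInit() are invented names, not Qt's. It performs the check discussed in the thread: write the static pointer in init(), then read it back through the inline accessor and through the friend class. It is not claimed to reproduce the bug, which according to the thread needs the patched gcc-3.4.x and the visibility flags shown above.

```cpp
// Hypothetical macro: apply the visibility attribute only where the
// compiler understands GCC-style attributes.
#ifdef __GNUC__
#  define SKETCH_VISIBLE __attribute__((visibility("default")))
#else
#  define SKETCH_VISIBLE
#endif

// Stand-in for QCoreApplication: one static pointer, written by init()
// and read by an inline static accessor.
class SKETCH_VISIBLE App
{
    friend class AppPrivate;
public:
    App() { init(); }
    ~App() { self = 0; }
    static App *instance() { return self; }
private:
    void init();
    static App *self;
};

// Stand-in for QCoreApplicationPrivate: reads the pointer from an
// out-of-line function.
class SKETCH_VISIBLE AppPrivate
{
public:
    static bool checkInstance();
};

bool AppPrivate::checkInstance()
{
    return App::self != 0;
}

App *App::self = 0;

void App::init()
{
    App::self = this;
}

// Returns true when the value stored by init() is seen by both readers.
bool selfSurvivesInit()
{
    App app;
    const bool viaAccessor = (App::instance() == &app);
    const bool viaFriend = AppPrivate::checkInstance();
    return viaAccessor && viaFriend;
}
```

On a correct compiler selfSurvivesInit() should return true; the symptom reported in this thread corresponds to the accessor returning a null pointer immediately after the assignment.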
Message 15 in thread
On Monday 30 May 2005 22:06, David Faure wrote:
> there's nothing TT can do about a bad gcc patch, IMHO :)
Hmm, there is, of course:
By default, Qt shouldn't use -fvisibility with gcc-3.4.x (even when it claims to support it),
since it's broken there.
Can this be added to Qt's configure?
If distros fix the gcc bug, they should still be able to force the use of -fvisibility,
but by default, we can save many users from hitting this bug by simply not defaulting
to -fvisibility with gcc-3.4.x (it's fine to still do it for gcc-4.x of course).
--
[ signature omitted ]
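One way to express the proposed default in source form is a guarded export macro. This is only a sketch of the idea: SKETCH_EXPORT, SKETCH_FORCE_VISIBILITY, SKETCH_USES_VISIBILITY and ExportedThing are hypothetical names, and it does not show how Qt's configure script or its real export macros work. It assumes GCC's predefined __GNUC__ macro is an adequate version test.

```cpp
// Use the visibility attribute by default only with gcc 4.x and later.
// With older compilers it stays off unless the builder explicitly
// defines the (hypothetical) SKETCH_FORCE_VISIBILITY override.
#if defined(__GNUC__) && (__GNUC__ >= 4 || defined(SKETCH_FORCE_VISIBILITY))
#  define SKETCH_EXPORT __attribute__((visibility("default")))
#  define SKETCH_USES_VISIBILITY 1
#else
#  define SKETCH_EXPORT
#  define SKETCH_USES_VISIBILITY 0
#endif

// Any class that must be visible outside the library carries the macro.
class SKETCH_EXPORT ExportedThing
{
public:
    int answer() const;
};

int ExportedThing::answer() const
{
    return 42;
}

// Reports which branch of the guard was taken, for diagnostics.
bool sketchUsesVisibility()
{
    return SKETCH_USES_VISIBILITY != 0;
}
```

A distribution that has fixed its gcc-3.4.x could then opt back in by building with -DSKETCH_FORCE_VISIBILITY alongside -fvisibility=hidden, while everyone else gets the safe default.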