
Qt-interest Archive, December 2007
Qt default optimisation flags


Message 1 in thread

The default compiler optimisation flags set by Qt are:

-O1 (Visual Studio)
-Os (GCC)

But I did some experiments with my genetic algorithm code and found 
these weren't the best settings for me. If you are interested the 
results are here:

http://successfulsoftware.net/2007/12/18/optimising-your-application/

best regards

Andy Brice
http://www.perfecttableplan.com

--
 [ signature omitted ] 

Message 2 in thread

Are the binary MSVC releases in my download area at Trolltech also
built with the -O1 flag? If so, should one build Qt manually for
release builds?
Regards Knut

On 12/18/07, Andy Brice <andy@xxxxxxxxxxxxxxx> wrote:
> The default compiler optimisation flags set by Qt are:
>
> -O1 (Visual Studio)
> -Os (GCC)
>
> But I did some experiments with my genetic algorithm code and found
> these weren't the best settings for me. If you are interested the
> results are here:
>
> http://successfulsoftware.net/2007/12/18/optimising-your-application/
>
> best regards
>
> Andy Brice
> http://www.perfecttableplan.com
>


-- 
 [ signature omitted ] 

Message 3 in thread

Andy Brice wrote:
>The default compiler optimisation flags set by Qt are:
>
>-O1 (Visual Studio)
>-Os (GCC)
>
>But I did some experiments with my genetic algorithm code and found
>these weren't the best settings for me. If you are interested the
>results are here:
>
>http://successfulsoftware.net/2007/12/18/optimising-your-application/
>
>best regards
>
>Andy Brice
>http://www.perfecttableplan.com

Thanks Andy,

That was a very interesting read. Ten times speed improvement is something 
to be proud of.

I did some optimisation of QtDBus between Qt 4.2 and 4.3, but I only 
achieved 6x speed improvement :-)

In any case, for Linux, I recommend using valgrind --tool=callgrind to 
profile your application. It runs the application entirely within an 
emulated environment. If you turn on additional data collection 
(callgrind's cache simulation), you'll also be able to determine whether 
you're having cache misses.

Another good profiler for Linux is oprofile. Valgrind produces estimates 
based on the instruction count, while oprofile is time-based. Both tools 
are useful, and each can find issues the other won't.

Finally, when you're investigating a multi-threaded approach to 
PerfectTablePlan, I suggest you take a look at QtConcurrent. Distributing 
a genetic algorithm should be more-or-less simple with QtConcurrent. For 
example, each individual in a generation gets a thread and you wait for 
them all to be done.
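
As a rough illustration of the "one task per individual, then wait for them 
all" idea, here is a minimal sketch in standard C++. It uses std::async as a 
stand-in for QtConcurrent (an assumption, so that the example builds without 
Qt; in Qt, QtConcurrent::blockingMapped plays a similar role). The Individual 
type and the fitness function are hypothetical placeholders, not anything 
from PerfectTablePlan.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical individual: a chromosome made of integer genes.
struct Individual {
    std::vector<int> genes;
};

// Hypothetical fitness function: simply the sum of the genes.
// A real genetic algorithm would do its expensive evaluation here.
int fitness(const Individual &ind)
{
    return std::accumulate(ind.genes.begin(), ind.genes.end(), 0);
}

// Evaluate every individual of one generation concurrently, then wait
// for all results before returning.
std::vector<int> evaluateGeneration(const std::vector<Individual> &generation)
{
    std::vector<std::future<int>> pending;
    pending.reserve(generation.size());
    for (const Individual &ind : generation) {
        // std::launch::async requests that the task run on its own thread.
        pending.push_back(
            std::async(std::launch::async, fitness, std::cref(ind)));
    }

    std::vector<int> scores;
    scores.reserve(pending.size());
    for (std::future<int> &result : pending)
        scores.push_back(result.get()); // blocks until that task is done
    return scores;
}
```

Scores come back in the same order as the individuals, so selection and 
crossover can stay single-threaded. One thread per individual is the simplest 
form; a thread pool (which QtConcurrent provides) is likely a better fit for 
large populations.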

-- 
 [ signature omitted ] 



Message 4 in thread

Thiago Macieira wrote:
>Andy Brice wrote:
>>The default compiler optimisation flags set by Qt are:
>>
>>-O1 (Visual Studio)
>>-Os (GCC)
>>
>>But I did some experiments with my genetic algorithm code and found
>>these weren't the best settings for me. If you are interested the
>>results are here:
>>
>>http://successfulsoftware.net/2007/12/18/optimising-your-application/
>>
>>best regards
>>
>>Andy Brice
>>http://www.perfecttableplan.com
>
>Thanks Andy,
>
>That was a very interesting read. Ten times speed improvement is
> something to be proud of.

Of course, I forgot to say the other thing I was going to say:

Any Qt application that uses the Tulip classes (Qt 4's template 
containers) is extremely slow if built without optimisation. This is 
especially true with gcc, where -O0 means no optimisation at all (even 
some warnings are disabled in -O0 because the code that detects the issue 
is tied to the optimiser).

The reason is that the Tulip classes consist mostly of inline code. There 
are only a few non-inline functions to be called, mostly when the 
collection needs to be reallocated or rehashed. But without optimisation, 
nothing is inlined and every function call is made out-of-line, which 
means there's a huge overhead of function calls.
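
To make the effect concrete, here is a minimal sketch of the pattern. 
IntList is a hypothetical stand-in for an inline-heavy container; it is not 
Qt code and only mimics the shape of the accessors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container whose accessors are all defined inline,
// in the spirit of the Tulip classes (this is not Qt code).
class IntList {
public:
    void append(int value) { data_.push_back(value); }
    std::size_t size() const { return data_.size(); }
    // Tiny inline accessor: an optimising build can reduce this to a
    // single load; with gcc -O0 it stays a real function call.
    int at(std::size_t i) const { return data_[i]; }

private:
    std::vector<int> data_;
};

// Two accessor calls per iteration: size() and at().
// With optimisation the compiler can inline both; without it, each
// iteration pays for two out-of-line calls.
long sumAll(const IntList &list)
{
    long total = 0;
    for (std::size_t i = 0; i < list.size(); ++i)
        total += list.at(i);
    return total;
}
```

Building this once with -O0 and once with -O2 and comparing the generated 
assembly for sumAll (for example with g++ -S) should show the calls to size() 
and at() disappearing in the optimised build.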

[gcc's -O0 should be understood as "generate dumb code" -- if you read the 
assembly output, you know you can do better than gcc. I remember looking 
into the disassembly of a portion of Qt on one of our Itanium boxes and I 
found a code sequence that was doing the following to a value: copy from 
r33 to r14, save from r14 to memory, load from memory to r37, call other 
function.]

-- 
 [ signature omitted ] 



Message 5 in thread

Thiago Macieira wrote:

> Thanks Andy,
> 
> That was a very interesting read. Ten times speed improvement is something 
> to be proud of.

Or maybe I should be embarrassed it was 10 times slower than it should 
have been in the first place. ;0)

> 
> I did some optimisation of QtDBus between Qt 4.2 and 4.3, but I only 
> achieved 6x speed improvement :-)

I once managed an improvement of over 4 or 5 orders of magnitude (I 
forget) in some raster processing code I wrote. I started with a very 
basic approach and then added lots of shortcuts until it was fast enough.

> 
> In any case, for Linux, I recommend using valgrind --tool=callgrind to 
> profile your application. It runs the application entirely within an 
> emulated environment. If you turn on some more information collection, 
> you'll be able to determine if you're having cache misses too.
> 
> Another good profiler for Linux is oprofile. Valgrind does estimates based 
> on the instruction count, while oprofile is time-based. Both tools can be 
> useful and be able to find issues the other won't.

I have used valgrind in the past and it is great. But I don't support 
PerfectTablePlan on Linux. Only Windows and Mac. Linux doesn't pay the 
bills!

> 
> Finally, when you're investigating a multi-threaded approach to 
> PerfectTablePlan, I suggest you take a look at QtConcurrent. Distributing 
> a genetic algorithm should be more-or-less simple with QtConcurrent. For 
> example, each individual in a generation gets a thread and you wait for 
> them all to be done.

Interesting idea. I will look into that at some point.

best regards

Andy Brice
http://www.perfecttableplan.com
http://www.successfulsoftware.net

--
 [ signature omitted ] 

Message 6 in thread

Andy Brice wrote:
> Interesting idea. I will look into that at some point.

Intel's Threading Building Blocks (TBB) may be something to use for
algorithms like this.

 http://www.intel.com/cd/software/products/asmo-na/eng/294797.htm
 http://www.intel.com/software/products/tbb/

TBB seems to scale very nicely with cores.

As for Qt optimization, -O1 or -O2 doesn't really matter unless you are
building something that is very CPU-intensive. But then the bottleneck is
generally your algorithm, not Qt.

- Adam

--
 [ signature omitted ] 

Message 7 in thread

> I have used valgrind in the past and it is great. But I don't support
> PerfectTablePlan on Linux. Only Windows and Mac. Linux doesn't pay the
> bills!

Having Qt apps unofficially work on Linux just so that you, the developer, 
can run Valgrind on them is, in my opinion, worth it. The ease with which 
it catches memory leaks and the amount it can help speed up your code help 
pay the bills by reducing your development time and making a better product.

-Benjamin Meyer

--
 [ signature omitted ]