The default compiler optimisation flags set by Qt are:

-O1 (Visual Studio)
-Os (GCC)

But I did some experiments with my genetic algorithm code and found these weren't the best settings for me. If you are interested the results are here:

http://successfulsoftware.net/2007/12/18/optimising-your-application/

best regards

Andy Brice
http://www.perfecttableplan.com
Are the binary MSVC releases in my download area at Trolltech also built with the -O1 flag? If so, one should probably build Qt manually for releases?

Regards
Knut

On 12/18/07, Andy Brice <andy@xxxxxxxxxxxxxxx> wrote:
> The default compiler optimisation flags set by Qt are:
>
> -O1 (Visual Studio)
> -Os (GCC)
>
> But I did some experiments with my genetic algorithm code and found
> these weren't the best settings for me. If you are interested the
> results are here:
>
> http://successfulsoftware.net/2007/12/18/optimising-your-application/
>
> best regards
>
> Andy Brice
> http://www.perfecttableplan.com
Andy Brice wrote:
>The default compiler optimisation flags set by Qt are:
>
>-O1 (Visual Studio)
>-Os (GCC)
>
>But I did some experiments with my genetic algorithm code and found
>these weren't the best settings for me. If you are interested the
>results are here:
>
>http://successfulsoftware.net/2007/12/18/optimising-your-application/
>
>best regards
>
>Andy Brice
>http://www.perfecttableplan.com

Thanks Andy,

That was a very interesting read. A tenfold speed improvement is something to be proud of.

I did some optimisation of QtDBus between Qt 4.2 and 4.3, but I only achieved a 6x speed improvement :-)

In any case, for Linux, I recommend using valgrind --tool=callgrind to profile your application. It runs the application entirely within an emulated environment. If you turn on some more information collection, you'll also be able to determine whether you're suffering from cache misses.

Another good profiler for Linux is oprofile. Valgrind produces estimates based on the instruction count, while oprofile is time-based. Both tools are useful, and each can find issues the other won't.

Finally, when you're investigating a multi-threaded approach to PerfectTablePlan, I suggest you take a look at QtConcurrent. Distributing a genetic algorithm should be more or less simple with QtConcurrent. For example, each individual in a generation gets a thread and you wait for them all to be done.
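The "evaluate every individual in parallel, then wait for them all" idea can be sketched roughly as follows. This is only an illustration in standard C++ using std::async, not QtConcurrent itself (QtConcurrent offers comparable map-style helpers that run on a thread pool). The names Individual, fitness and evaluateGeneration are hypothetical, and the fitness function is a trivial placeholder rather than anything from PerfectTablePlan.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical individual: a candidate solution encoded as a list of genes.
using Individual = std::vector<int>;

// Hypothetical placeholder fitness: the sum of the genes.
// A real genetic algorithm would do something far more expensive here.
int fitness(const Individual& individual) {
    return std::accumulate(individual.begin(), individual.end(), 0);
}

// Score every individual of one generation concurrently, then wait for all
// of the results before returning them in the original order.
std::vector<int> evaluateGeneration(const std::vector<Individual>& generation) {
    std::vector<std::future<int>> pending;
    pending.reserve(generation.size());
    for (const Individual& individual : generation) {
        // std::launch::async requests asynchronous execution on another thread.
        pending.push_back(std::async(std::launch::async,
                                     [&individual] { return fitness(individual); }));
    }

    std::vector<int> scores;
    scores.reserve(pending.size());
    for (std::future<int>& result : pending) {
        scores.push_back(result.get()); // blocks until this task has finished
    }
    assert(scores.size() == generation.size());
    return scores;
}
```

One thread per individual is the simplest form of the idea; for large populations a thread pool, or one task per chunk of individuals, would probably be a better fit.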
Thiago Macieira wrote:
>Andy Brice wrote:
>>The default compiler optimisation flags set by Qt are:
>>
>>-O1 (Visual Studio)
>>-Os (GCC)
>>
>>But I did some experiments with my genetic algorithm code and found
>>these weren't the best settings for me. If you are interested the
>>results are here:
>>
>>http://successfulsoftware.net/2007/12/18/optimising-your-application/
>>
>>best regards
>>
>>Andy Brice
>>http://www.perfecttableplan.com
>
>Thanks Andy,
>
>That was a very interesting read. Ten times speed improvement is
> something to be proud of.

Of course, I forgot to say the other thing I was going to say:

Any Qt application that uses the Tulip classes is extremely slow if built without optimisation. This is especially true with gcc, where -O0 means no optimisation at all (even some warnings are disabled at -O0 because the code that detects the issue is tied to the optimiser).

The reason is that the Tulip classes consist mostly of inline code. There are only a few non-inline functions to be called, mostly when the collection needs to be reallocated or rehashed. But when optimisation is off, nothing is inlined: every one of those calls is made out-of-line, which adds a huge function-call overhead.

[gcc's -O0 should be understood as "generate dumb code" -- if you read the assembly output, you know you can do better than gcc. I remember looking into the disassembly of a portion of Qt on one of our Itanium boxes and I found a code sequence that was doing the following to a value: copy from r33 to r14, save from r14 to memory, load from memory to r37, call other function.]
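To make the point about inline accessors concrete, here is a small sketch. IntList is a hypothetical stand-in for a container whose accessors are defined inline in the header; it is not a Qt class and makes no claim about how the Tulip classes are actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container with small accessors defined inline in the class body.
class IntList {
public:
    void append(int value) { items_.push_back(value); }
    std::size_t size() const { return items_.size(); }
    int at(std::size_t index) const {
        assert(index < items_.size());
        return items_[index];
    }

private:
    std::vector<int> items_;
};

// Without optimisation, each size() and at() in this loop is typically
// compiled as a real function call. With optimisation enabled, the compiler
// can usually inline them, leaving little more than a plain array walk.
long sumAll(const IntList& list) {
    long total = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        total += list.at(i);
    }
    return total;
}
```

Compiling a file like this with -O0 and again with -O2 and comparing the generated assembly for sumAll is one way to see the difference for yourself.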
Thiago Macieira wrote:
> Thanks Andy,
>
> That was a very interesting read. Ten times speed improvement is something
> to be proud of.

Or maybe I should be embarrassed it was 10 times slower than it should have been in the first place. ;0)

> I did some optimisation of QtDBus between Qt 4.2 and 4.3, but I only
> achieved 6x speed improvement :-)

I once managed an improvement of 4 or 5 orders of magnitude (I forget which) in some raster processing code I wrote. I started with a very basic approach and then added lots of shortcuts until it was fast enough.

> In any case, for Linux, I recommend using valgrind --tool=callgrind to
> profile your application. It runs the application entirely within an
> emulated environment. If you turn on some more information collection,
> you'll be able to determine if you're having cache misses too.
>
> Another good profiler for Linux is oprofile. Valgrind does estimates based
> on the instruction count, while oprofile is time-based. Both tools can be
> useful and be able to find issues the other won't.

I have used Valgrind in the past and it is great. But I don't support PerfectTablePlan on Linux, only Windows and Mac. Linux doesn't pay the bills!

> Finally, when you're investigating a multi-threaded approach to
> PerfectTablePlan, I suggest you take a look at QtConcurrent. Distributing
> a genetic algorithm should be more-or-less simple with QtConcurrent. For
> example, each individual in a generation gets a thread and you wait for
> them all to be done.

Interesting idea. I will look into that at some point.

best regards

Andy Brice
http://www.perfecttableplan.com
http://www.successfulsoftware.net
Andy Brice wrote:
> Interesting idea. I will look into that at some point.

Intel's Threading Building Blocks (TBB) may be worth using for algorithms like this.

http://www.intel.com/cd/software/products/asmo-na/eng/294797.htm
http://www.intel.com/software/products/tbb/

TBB seems to scale very nicely with the number of cores.

As for Qt optimisation, -O1 versus -O2 doesn't really matter unless you are building something very CPU-intensive. And even then the bottleneck is generally your algorithm, not Qt.

- Adam
> I have used Valgrind in the past and it is great. But I don't support
> PerfectTablePlan on Linux. Only Windows and Mac. Linux doesn't pay the
> bills!

Having your Qt app work unofficially on Linux, just so that you as the developer can run Valgrind on it, is in my opinion worth it. The ease with which it catches memory leaks, and the amount it can help you speed up your code, helps pay the bills by reducing your development time and making a better product.

-Benjamin Meyer