Qt-interest Archive, May 2007
Painting a large number of rectangles as fast as possible [Qt 4.2.3]

Message 1 in thread

Hi,

what is the fastest way to get a large number of rectangles on the screen on
unaccelerated hardware?

I have to display files of an internal vector format. Average pictures
consist of about 10 million rectangles. I have considered using
QGraphicsView, but the memory consumption is too high: creating 10 million
QGraphicsRectItem objects would need 1200 MB of memory. The current
viewer needs only 220 MB.

My current solution is the following one:

After each window resize or zoom event I paint the current visible picture
on a QImage, convert it to a QPixmap and use QPainter::drawPixmap() to draw
the needed regions of my widget. I cache the QPixmap and use it till the
next zoom or resize event occurs. One optimization I do is to disregard all
rectangles that have a size smaller than 50% of a pixel.
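
A minimal sketch of the culling step described above, in plain C++ without
Qt. The names Rect, worthPainting and selectVisible are hypothetical, and
"smaller than 50% of a pixel" is assumed here to mean the on-screen area;
the real viewer may use a different criterion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type in picture (world) coordinates.
struct Rect {
    double x, y, w, h;
};

// Returns true if the rectangle overlaps the viewport and, after scaling by
// `scale` device pixels per world unit, covers at least `minArea` of a pixel.
// Assumption: "size" is interpreted as area in device pixels.
bool worthPainting(const Rect& r, const Rect& viewport, double scale,
                   double minArea = 0.5)
{
    assert(scale > 0.0);
    const bool offScreen =
        r.x + r.w < viewport.x || r.x > viewport.x + viewport.w ||
        r.y + r.h < viewport.y || r.y > viewport.y + viewport.h;
    if (offScreen)
        return false;
    const double areaInPixels = (r.w * scale) * (r.h * scale);
    return areaInPixels >= minArea;
}

// Collects the indices of all rectangles that survive the cull, keeping
// their original order so the paint (z) order is unchanged.
std::vector<std::size_t> selectVisible(const std::vector<Rect>& rects,
                                       const Rect& viewport, double scale)
{
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < rects.size(); ++i) {
        if (worthPainting(rects[i], viewport, scale))
            keep.push_back(i);
    }
    return keep;
}
```

The indices returned by selectVisible() would then be painted onto the
QImage in order. An area test drops long, very thin wires at low zoom, so
a per-edge threshold may suit routing data better.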

The problem is that painting on the QImage is quite slow. Painting 11
million rectangles takes about 16 seconds on my machine. This is too much
for a viewer.

Directly painting to a QPixmap is slower. It takes about 50 seconds.

Painting directly on the widget takes about 100 seconds. This is the slowest
solution, and here the time is paid for each update. I have appended an
OProfile profile that shows where the runtime is spent when painting to the
widget directly. It seems that a lot of runtime is wasted in memory
allocation and deallocation.

3015     10.4325  libc-2.4.so              _int_malloc
2507      8.6747  libc-2.4.so              _int_free
1501      5.1938  libQtGui.so.4.2.3        QPolygonClipper<qt_float_point, qt_float_point, float>::clipPolygon(qt_float_point const*, int, qt_float_point**, int*, bool)
1277      4.4187  libQtGui.so.4.2.3        QPainterPath::toSubpathPolygons(QMatrix const&) const
1228      4.2491  libc-2.4.so              malloc_consolidate
940       3.2526  libQtGui.so.4.2.3        QPainterPath::toFillPolygons(QMatrix const&) const
897       3.1038  libc-2.4.so              free
887       3.0692  libc-2.4.so              malloc
857       2.9654  libQtCore.so.4.2.3       QListData::detach()
724       2.5052  libc-2.4.so              memcpy
669       2.3149  libc-2.4.so              _int_realloc
640       2.2145  libQtGui.so.4.2.3        QList<QPolygonF>::detach_helper()
612       2.1176  libQtGui.so.4.2.3        QVector<QPointF>::realloc(int, int)
492       1.7024  libc-2.4.so              realloc


To sum it up:

QImage + QPixmap conversion:  16 seconds
QPixmap:                      50 seconds
Widget:                      100 seconds

My question now is whether there are ways to speed the whole thing up.
Using OpenGL is not an option because our X11 terminals do not have the
required hardware.

There has to be a way to make it faster because the legacy app paints the
whole window within 3 seconds and as far as I can see it draws directly to
X11.

Greetings
Christoph Bartoschek

--
 [ signature omitted ] 

Message 2 in thread

> QImage + QPixmap conversion:  16 seconds
> QPixmap:                      50 seconds
> Widget:                      100 seconds
> 
> My question is now, whether there are ways to speed the whole thing up?
> Using OpenGL is not an option because our X11 terminals do not have the
> required hardware.
>
> There has to be a way to make it faster because the legacy app paints the
> whole window within 3 seconds and as far as I can see it draws directly to
> X11.

Without knowing more details about the structure of the rectangles, it's
impossible to say.

Maybe there is some hierarchy in the data that can be used to speed
things up? Maybe it's possible to discard big chunks of the data because
they are off-screen?

Painting 11 million rectangles every refresh seems to be a lot. What
kind of application is this?

Cheers,
Peter

--
 [ signature omitted ] 

Message 3 in thread

Peter Prade wrote:

> Without knowing more details about the structure of the rectangles, it's
> impossible to say.

The vector format can be used for any data. However we use it mostly for
displaying a subset of the routing wires of VLSI designs. Therefore more
than 99.9% of all objects are rectangles.

> maybe there is some hierarchy in the data that can be used to speed
> things up? maybe it's possible to discard big chunks of the data because
> they are off-screen?

There is no imposed hierarchy, but I already discard all objects that are
off-screen during painting.

> painting 11 million rectangles every refresh seems to be a lot - what
> kind of application is this?

I cache the painting result in a QPixmap and copy the needed portions to the
widget.
I mentioned painting on each refresh only to show that this is too slow. But
if the user wants to see the whole picture one has to paint all objects.
That could possibly be large enough.

To speed up higher zoom levels I could use a BSP tree or something similar
to find the relevant objects faster. However, my experiments with the Qt BSP
implementation had the problem that orderings in the z-dimension were
unstable.
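
One way to get a spatial lookup that keeps the z-order stable is sketched
below in plain C++. This is not the Qt BSP implementation; it is a uniform
grid, and GridIndex and Rect are hypothetical names. It assumes that each
rectangle's id is its position in the file and that this position defines
the paint order, so sorting the query result by id restores the z-order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type in picture (world) coordinates.
struct Rect {
    double x, y, w, h;
};

// Uniform grid over a picture of size width x height, with cells x cells buckets.
class GridIndex {
public:
    GridIndex(double width, double height, int cells)
        : cellW_(width / cells), cellH_(height / cells), n_(cells),
          buckets_(static_cast<std::size_t>(cells) * cells)
    {
        assert(width > 0 && height > 0 && cells > 0);
    }

    // Registers rectangle `id` in every grid cell it touches.
    void insert(std::size_t id, const Rect& r)
    {
        forEachCell(r, [&](std::size_t cell) { buckets_[cell].push_back(id); });
    }

    // Returns candidate ids for the window, sorted ascending and without
    // duplicates. Ascending id == original paint order (assumption above).
    std::vector<std::size_t> query(const Rect& window) const
    {
        std::vector<std::size_t> ids;
        forEachCell(window, [&](std::size_t cell) {
            ids.insert(ids.end(), buckets_[cell].begin(), buckets_[cell].end());
        });
        std::sort(ids.begin(), ids.end());
        ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
        return ids;
    }

private:
    // Maps a coordinate to a cell index, clamped to the grid.
    int clampCell(double v, double cellSize) const
    {
        const int c = static_cast<int>(v / cellSize);
        return std::max(0, std::min(n_ - 1, c));
    }

    template <typename F>
    void forEachCell(const Rect& r, F f) const
    {
        const int x0 = clampCell(r.x, cellW_), x1 = clampCell(r.x + r.w, cellW_);
        const int y0 = clampCell(r.y, cellH_), y1 = clampCell(r.y + r.h, cellH_);
        for (int y = y0; y <= y1; ++y)
            for (int x = x0; x <= x1; ++x)
                f(static_cast<std::size_t>(y) * n_ + x);
    }

    double cellW_, cellH_;
    int n_;
    std::vector<std::vector<std::size_t>> buckets_;
};
```

The query only narrows the candidates to those sharing a grid cell with the
window; an exact intersection test is still needed before painting. Long
wires end up in many buckets, so the cell count trades memory for lookup
speed.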

Greetings
Christoph

--
 [ signature omitted ] 

Message 4 in thread

Christoph Bartoschek wrote:
> Peter Prade wrote:
> 
>> Without knowing more details about the structure of the rectangles, it's
>> impossible to say.
> 
> The vector format can be used for any data. However we use it mostly for
> displaying a subset of the routing wires of VLSI designs. Therefore more
> than 99.9% of all objects are rectangles.
> 
>> maybe there is some hierarchy in the data that can be used to speed
>> things up? maybe it's possible to discard big chunks of the data because
>> they are off-screen?
> 
> There is no imposed hierarchy, but I already discard all objects that are
> off-screen during painting.
> 
>> painting 11 million rectangles every refresh seems to be a lot - what
>> kind of application is this?
> 
> I cache the painting result in a QPixmap and copy the needed portions to the
> widget.
> I mentioned painting on each refresh only to show that this is too slow. But
> if the user wants to see the whole picture one has to paint all objects.
> That could possibly be large enough.
> 
> To speed up higher zoom levels I could use a BSP tree or something similar
> to find the relevant objects faster. However, my experiments with the Qt BSP
> implementation had the problem that orderings in the z-dimension were
> unstable.

How about drawing only a few rectangles during resizing, while a second
thread draws everything into a QPixmap? When the second thread is done,
draw the result in the widget.

Stefan

--
 [ signature omitted ] 

Message 5 in thread

Weinzierl Stefan wrote:

> How about drawing only a few rectangles during resizing, while a second
> thread draws everything into a QPixmap? When the second thread is done,
> draw the result in the widget.

This is a nice idea. Maybe I will try it.

Thanks
Christoph

--
 [ signature omitted ] 

Message 6 in thread

Christoph Bartoschek wrote:
> Weinzierl Stefan wrote:
>
>> How about drawing only a few rectangles during resizing, while a second
>> thread draws everything into a QPixmap? When the second thread is done,
>> draw the result in the widget.
>
> This is a nice idea. Maybe I will try it.
>
> Thanks
> Christoph


Except -- QPixmap is not thread safe. If you do that you have to use
QImage.

Anyway, you should seriously consider using OpenGL directly, because the
scenario you are describing (drawing millions of quads) simply begs for
hardware acceleration. Then you may not even need to do the culling
yourself, OpenGL will do this for you.

    Paul.

--
 [ signature omitted ] 

Message 7 in thread

Paul Koshevoy wrote:

> Except -- QPixmap is not thread safe. If you do that you have to use
> QImage.

I thought of painting a part of the picture in each thread and then merging
them together. But I would also use a QImage because my data says that it
is faster than QPixmap.

> Anyway, you should seriously consider using OpenGL directly, because the
> scenario you are describing (drawing millions of quads) simply begs for
> hardware acceleration. Then you may not even need to do the culling
> yourself, OpenGL will do this for you.

I would like to do that. But our X11 terminals do not support OpenGL, and
some notebooks have graphics chips without proper support from the vendors.

Christoph

--
 [ signature omitted ] 

Message 8 in thread

Hi,

> I have to display files of an internal vector format. Average pictures
> consist of about 10 million rectangles. I have considered to use
> QGraphicsView but the memory consumption is too high. To create 10 million
> QGraphicsRectItem objects one would need 1200 MB of memory. The current
> viewer needs only 220 MB.

Indeed, QGraphicsView is not intended for general-purpose graphics:
	http://doc.trolltech.com/4.2/graphicsview.html
Very often users think that QGraphicsView (or QCanvas in Qt 3) is the widget
of choice for drawing lines, rectangles, etc. It's not; just use a plain
QWidget for that.

Also note that a 1600 x 1200 display is less than 2 million pixels. There must
be some way to pre-process these 10 million rectangles to display only those
that can be seen.

--
 [ signature omitted ] 

Message 9 in thread

Dimitri wrote:

> Also note that a 1600 x 1200 display is less than 2 million pixels. There
> must be some way to pre-process these 10 million rectangles to display only
> those that can be seen.

That's why I omit rectangles that are smaller than a pixel during painting.
The numbers I show are with this optimization enabled. If I do not paint
anything and only apply this check to all rectangles, it takes 0.4 seconds.
This means that all the remaining runtime is spent in painting.

In addition to preprocessing the data, are there any other tricks to speed
up painting?

Christoph

--
 [ signature omitted ] 

Message 10 in thread

Christoph Bartoschek wrote:
> Dimitri wrote:
> 
>> Also note that a 1600 x 1200 display is less than 2 million pixels. There
>> must be some way to pre-process these 10 million rectangles to display only
>> those that can be seen.
> 
> That's why I omit rectangles that are smaller than a pixel during painting.
> The numbers I show are with this optimization enabled. If I do not paint
> anything but only apply this check on all rectangles it takes 0.4 Seconds.
> This means that all other runtime is spent in painting.
> 
> In addition to preprocessing the data, are there any other tricks to speed
> up painting?

The drawing speed will also depend on how exactly you draw the
rectangles with Qt, and whether you have both a pen and a brush set. If you
have a QPainter::drawRect() call for every single rectangle, it could be
an idea to batch them up and use QPainter::drawRects() instead.
Make sure you turn the pen off if you only need to draw filled
rectangles, instead of setting the pen color to the same as the fill.
A lot of time might be spent on stroking the rectangles.
Another thing that will influence the drawing speed is whether or not
you use colors with alpha components. Drawing without any alpha
components is considerably faster under X11, since the core X11 library
calls can be used directly.

--
 [ signature omitted ] 

Message 11 in thread

Trond Kjernaasen wrote:

> The drawing speed will also depend on how exactly you draw the
> rectangles with Qt, and if you have both a pen and a brush set. If you
> have a QPainter::drawRect() call for every single rectangle, it could be
> an idea to batch them up and use QPainter::drawRects() instead.

The speedup of this is not significant.

> Make sure you turn the pen off if you only need to draw filled
> rectangles, instead of setting the pen color to the same as the fill.

This is 5% faster, but most of the rectangles vanish from the picture.
Drawing them with a pen results in what the users expect.

> A lot of time might be spent on stroking the rectangles.
> Another thing that will influence the drawing speed is whether or not
> you use colors with alpha components. Drawing without any alpha
> components is considerably faster under X11, since the core X11 library
> calls can be used directly.

Drawing to a QPixmap uses X11. I've measured that my test program sends
300 MB of data through the network. Even with Fast Ethernet that takes more
time than I want to spend.

Drawing to a QImage uses no X11, and oprofile says that 70% of the time is
spent in qt_memfill32_sse2. I've measured the distribution of the count
argument of this function and was surprised to see that 98% of all calls are
with count=1.

I do not use alpha and I can see that this is much faster than a testcase
with alpha. I've also disabled antialiasing for the rectangles which gives
me a big speedup.

If QPainter could tell me which pixels are set by a drawRect operation, I
could use this information to quickly determine which rectangles are
covered by others and omit them from painting. However, I do not see a method
that can give me the information. Is there anything?
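
The idea can be sketched without QPainter, under loud assumptions: the
rectangles are opaque, already transformed to integer device pixels, and a
rectangle {x, y, w, h} is assumed to cover exactly the pixels
[x, x+w) x [y, y+h). That fill rule is hypothetical and is not taken from
Qt. PixelRect and visibleAfterOcclusion are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle in integer device pixels.
struct PixelRect {
    int x, y, w, h;
};

// Walks the rectangles from topmost (last painted) to bottommost and keeps
// only those that touch at least one pixel not yet covered by a later one.
// Returns the surviving indices in the original paint order.
std::vector<std::size_t> visibleAfterOcclusion(const std::vector<PixelRect>& rects,
                                               int width, int height)
{
    assert(width > 0 && height > 0);
    std::vector<char> covered(static_cast<std::size_t>(width) * height, 0);
    std::vector<std::size_t> keep;

    for (std::size_t i = rects.size(); i-- > 0;) {
        const PixelRect& r = rects[i];
        // Clip the rectangle to the image bounds.
        const int x0 = std::max(r.x, 0), x1 = std::min(r.x + r.w, width);
        const int y0 = std::max(r.y, 0), y1 = std::min(r.y + r.h, height);

        bool contributes = false;
        for (int y = y0; y < y1; ++y) {
            char* row = &covered[static_cast<std::size_t>(y) * width];
            for (int x = x0; x < x1; ++x) {
                if (!row[x]) {
                    row[x] = 1;          // first (topmost) owner of this pixel
                    contributes = true;
                }
            }
        }
        if (contributes)
            keep.push_back(i);
    }

    std::reverse(keep.begin(), keep.end());  // back to bottom-to-top order
    return keep;
}
```

This pass touches the same pixels the painter would, only in a one-byte mask
instead of a 32-bit image, so whether it saves time overall is untested here.
It would pay off mainly if the result can be reused across repaints at the
same zoom level.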

Greetings
Christoph Bartoschek

--
 [ signature omitted ] 

Message 12 in thread

On Sunday 06 May 2007 12:12, Christoph Bartoschek wrote:
> Drawing to a QImage uses no X11, and oprofile says that 70% of the time is
> spent in qt_memfill32_sse2. I've measured the distribution of the count
> argument of this function and was surprised to see that 98% of all calls are
> with count=1.

qt_memfill32_sse2 was introduced for 4.3, so I guess you're using the beta 
package? Could you try the attached patch and let me know whether it makes a 
difference?

Thanks,
--
 [ signature omitted ] 
--- src/gui/painting/qdrawhelper_p.h~	2007-03-15 11:46:50.000000000 +0100
+++ src/gui/painting/qdrawhelper_p.h	2007-05-07 09:31:20.000000000 +0200
@@ -321,12 +321,30 @@
 
 template<> inline void qt_memfill(quint32 *dest, quint32 color, int count)
 {
+    if (count <= 4) {
+        switch (count) {
+        case 4: *dest++ = color;
+        case 3: *dest++ = color;
+        case 2: *dest++ = color;
+        case 1: *dest++ = color;
+        }
+        return;
+    }
     extern void (*qt_memfill32)(quint32 *dest, quint32 value, int count);
     qt_memfill32(dest, color, count);
 }
 
 template<> inline void qt_memfill(quint16 *dest, quint16 color, int count)
 {
+    if (count <= 4) {
+        switch (count) {
+        case 4: *dest++ = color;
+        case 3: *dest++ = color;
+        case 2: *dest++ = color;
+        case 1: *dest++ = color;
+        }
+        return;
+    }
     extern void (*qt_memfill16)(quint16 *dest, quint16 value, int count);
     qt_memfill16(dest, color, count);
 }

Message 13 in thread

Håvard Wall wrote:

> On Sunday 06 May 2007 12:12, Christoph Bartoschek wrote:
>> Drawing to a QImage uses no X11, and oprofile says that 70% of the time is
>> spent in qt_memfill32_sse2. I've measured the distribution of the count
>> argument of this function and was surprised to see that 98% of all calls
>> are with count=1.
> 
> qt_memfill32_sse2 was introduced for 4.3, so I guess you're using the beta
> package? 

Yes, I've changed to 4.3 to test the newest code.

> Could you try the attached patch and let me know whether it makes 
> a difference?

No, there is no difference. The runtimes first seemed to be better for the
old code (106 sec instead of 113 sec), but several iterations of compiling
the old and new code revealed that both versions run anywhere from 104 to
116 seconds. The strange thing is that for a single compilation the numbers
seem to be stable.

I guess the current code already hits the limit of the memory bus. It seems
that the only way to speed it up is to reduce the memory accesses.

Thanks
Christoph


--
 [ signature omitted ] 

Message 14 in thread

Christoph Bartoschek wrote:
[snip]

> I do not use alpha and I can see that this is much faster than a testcase
> with alpha. I've also disabled antialiasing for the rectangles which gives
> me a big speedup.

Looking at the profiler output you posted in your initial mail, I can
see that functions like QPolygonClipper::clipPolygon() and
QPainterPath::toSubpathPolygons()/toFillPolygons() are high on the list.
Note that those functions are only called by the X11 engine if it can't
use the XDrawRectangle()/XFillRectangle() calls directly. That's an
indication that you are using some painter properties that cause the engine
to fall back to drawing polygons, which you must try to avoid at all costs
when optimizing for speed. What pen/brush settings are you using? Are you
using any kind of transforms?

> If QPainter could say to me which pixels are set by a drawRect operation, I
> could use this information to quickly determine which rectangles are
> covered by others and omit them for painting. However I do not see a method
> that can give me the information. Is there anything?

Right, there is no such functionality in QPainter.

Regards,
--
 [ signature omitted ] 

Message 15 in thread

Trond Kjernaasen wrote:

> Looking at the profiler output you posted in your initial mail, I can
> see that functions like QPolygonClipper::clipPolygon() and
> QPainterPath::toSubpathPolygons()/toFillPolygons() are high on the list.
> Note that those functions are only called by the X11 engine if it can't
> use the XDrawRectangle()/XFillRectangle() calls directly. That's an
> indication that you are using some painter properties that cause the engine
> to fall back to drawing polygons, which you must try to avoid at all costs
> when optimizing for speed. What pen/brush settings are you using? Are you
> using any kind of transforms?

The profile was an example of what happens if I draw directly to the widget.
This was the slowest solution. That's why I now paint to a QImage. The main
benefit is that no data is sent over the network for painting. The
pen/brush settings are the default ones; I just set the color. I use a
transformation matrix that handles zooming into the picture.

>> If QPainter could say to me which pixels are set by a drawRect operation,
>> I could use this information to quickly determine which rectangles are
>> covered by others and omit them for painting. However I do not see a
>> method that can give me the information. Is there anything?
> 
> Right, there is no such functionality in QPainter.

OK, how is the painting of a QRectF specified? How is QRectF(2.5, 3.5,
6, 7) painted? Given this information, one could deduce the painted area.

Christoph

--
 [ signature omitted ] 
