Qt-interest Archive, February 2008
Re: Parse a huge XML file
Message 1 in thread
OK, I think I will forget this tag index idea.
However, I am still very interested in a way to decompress a gz stream "on
the fly" to feed QXmlStreamReader. Right now I receive the 5GB over HTTP, store
it in a file, decompress it to a file (60GB) and parse it. An all-in-one
process would be much appreciated... Maybe I can pipe through a QProcess
calling gunzip, but this is not a satisfactory solution.
Thanks,
Etienne
Message 2 in thread
On Fri, February 1, 2008 11:03, Etienne Sandré wrote:
> OK, I think I will forget this tag index idea.
>
> However, I am still very interested in a way to decompress a gz stream "on
> the fly" to feed QXmlStreamReader. Right now I receive the 5GB over HTTP, store
> it in a file, decompress it to a file (60GB) and parse it. An all-in-one
> process would be much appreciated... Maybe I can pipe through a QProcess
> calling gunzip, but this is not a satisfactory solution.
Why not, that's a _good_ solution!
Another possibility would be to use "minizip" for example. I use it in one
of my Qt projects (http://www.mameworld.net/mamecat) and it's fast and good
enough even for these interactive purposes (which means it should be fine
for you as well :):
http://www.winimage.com/zLibDll/minizip.html
HTH, René
--
[ signature omitted ]
Message 3 in thread
On Fri, February 1, 2008 11:23, R. Reucher wrote:
> On Fri, February 1, 2008 11:03, Etienne Sandré wrote:
>> OK, I think I will forget this tag index idea.
>>
>> However, I am still very interested in a way to decompress a gz stream
>> "on the fly" to feed QXmlStreamReader. Right now I receive the 5GB over
>> HTTP, store it in a file, decompress it to a file (60GB) and parse it.
>> An all-in-one process would be much appreciated... Maybe I can pipe
>> through a QProcess calling gunzip, but this is not a satisfactory solution.
> Why not, that's a _good_ solution!
Sorry, not clear enough... what I meant was to use a "pipe" through QProcess!
http://doc.trolltech.com/4.3/qprocess.html#readyReadStandardOutput
René
--
[ signature omitted ]
Message 4 in thread
That's a good solution for Unix apps, but I would like Windows and Mac users
to be able to use it as well. This would require bundling an executable with
the application package and checking for the executable name (gzip, gzip.exe,
etc.).
Regards,
Etienne
2008/2/1, R. Reucher <rene.reucher@xxxxxxxxxxxxx>:
>
> On Fri, February 1, 2008 11:23, R. Reucher wrote:
> > On Fri, February 1, 2008 11:03, Etienne Sandré wrote:
> >> OK, I think I will forget this tag index idea.
> >>
> >> However, I am still very interested in a way to decompress a gz stream
> >> "on the fly" to feed QXmlStreamReader. Right now I receive the 5GB over
> >> HTTP, store it in a file, decompress it to a file (60GB) and parse it.
> >> An all-in-one process would be much appreciated... Maybe I can pipe
> >> through a QProcess calling gunzip, but this is not a satisfactory
> >> solution.
> > Why not, that's a _good_ solution!
> Sorry, not clear enough... what I meant was to use a "pipe" through
> QProcess!
>
> http://doc.trolltech.com/4.3/qprocess.html#readyReadStandardOutput
>
> René
> --
> [ signature omitted ]
>
> --
> To unsubscribe - send a mail to qt-interest-request@xxxxxxxxxxxxx with
> "unsubscribe" in the subject or the body.
> List archive and information: http://lists.trolltech.com/qt-interest/
>
>
Message 5 in thread
Hi,
Etienne Sandré wrote:
> That's a good solution for unix apps, but I would like Windows and Mac
> users to use it as well. This would require to include an executable
> with the application package, and to check for the executable name
> (gzip, gzip.exe, etc..)
I'm not sure if anyone mentioned this yet:
http://trolltech.com/products/qt/addon/solutions/catalog/4/Utilities/qtiocompressor/
Tim
----------------------------------------------------------------------
dr. t. dewhirst [t] +44 (0)1738 450 465
director [w] www.bugless.co.uk
bugless software development ltd.
[a] algo business centre, glenearn road, perth, PH2 0NJ
--
[ signature omitted ]
Message 6 in thread
On Fri, February 1, 2008 13:41, Etienne Sandré wrote:
> That's a good solution for unix apps, but I would like Windows and Mac
> users to use it as well. This would require to include an executable with
> the application package, and to check for the executable name (gzip,
> gzip.exe, etc..)
I don't see your point, but anyway, then use my other suggestion. Minizip
works under Windows as well. You can integrate and freely redistribute it
with your app - as long as it's an open source project, that is.
Regards, René
--
[ signature omitted ]
Message 7 in thread
Maybe you can use QtIOCompressor:
The class works on top of a QIODevice subclass, compressing data before it is written and decompressing it when it is read. Since QtIOCompressor works on streams, it does not have to see the entire data set before compressing or decompressing it. This can reduce the memory requirements when working on large data sets.
I plan to use it on networked XML streams...
Julien.
-----Original Message-----
From: etienne.sandre.chardonnal@xxxxxxxxx [mailto:etienne.sandre.chardonnal@xxxxxxxxx] On behalf of Etienne Sandré
Sent: Friday, 1 February 2008 13:42
To: qt-interest@xxxxxxxxxxxxx
Subject: Re: Parse a huge XML file
That's a good solution for Unix apps, but I would like Windows and Mac users to be able to use it as well. This would require bundling an executable with the application package and checking for the executable name (gzip, gzip.exe, etc.).
Regards,
Etienne
Message 8 in thread
Unfortunately:
"Only available for Qt Solutions license holder with a valid Support and
Maintenance agreement (authentication required)."
Etienne
Julien MONAT wrote:
> Maybe you can use QtIOCompressor:
> The class works on top of a QIODevice subclass, compressing data
> before it is written and decompressing it when it is read. Since
> QtIOCompressor works on streams, it does not have to see the entire
> data set before compressing or decompressing it. This can reduce the
> memory requirements when working on large data sets.
> I plan to use it on networked XML streams...
>
> Julien.
--
[ signature omitted ]
Message 9 in thread
Hi Dimitri,
why don't you suggest the third way:
http://doc.trolltech.com/4.3/qxmlstreamreader.html ?
From reading the documentation for this class, I understood that it is a
replacement for DOM and faster than SAX.
Is that right?
Veronique.
Dimitri wrote:
> Hi,
>
>> Since this is too big to fit in memory, I want to build an index on
>> it. Is there a way with Qt classes to parse XML data without having
>> the whole document in memory, for instance with a non-blocking parser
>> that sends an event for each XML tag it encounters, but without storing
>> data incrementally in a QDomDocument?
>
> Use the SAX parser instead of DOM:
> http://doc.trolltech.com/4.3/qtxml.html#the-qt-sax2-classes
>
> --
> Dimitri
>
--
[ signature omitted ]