Qt-interest Archive, February 2008
[qt4-interest] QXmlStreamWriter puts strange bytes inside xml directive
Message 1 in thread
Hi,
I wrote a single-source program which uses QXmlStreamWriter to write a
simple XML file.
// main.cpp
#include <QCoreApplication>
#include <QtXml>
#include <QFile>
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);
    QFile file("test.txt");
    if (!file.open(QIODevice::WriteOnly)) {
        cerr << "Unable to open output file" << endl;
        return 1;
    }
    QXmlStreamWriter out(&file);
    out.setCodec("UTF-8");
    out.writeStartDocument();
    out.writeStartElement("tag");
    out.writeEndElement();
    out.writeEndDocument();
    return 0;
}
# test.pro
TEMPLATE = app
TARGET =
DEPENDPATH += .
INCLUDEPATH += .
QT += xml
QT -= gui
CONFIG += console
# Input
SOURCES += main.cpp
///////////////////////////////////////////////////////////////////////
The output of g++ -v is
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile
--enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
The output of uname -a is
Linux localhost.localdomain 2.6.23.8-63.fc8 #1 SMP Wed Nov 21 18:51:08 EST
2007 i686 i686 i386 GNU/Linux
I use Qt 4.3.3.
The hex dump of the xml directive is
0000:0000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 ef <?xml version="ï
0000:0010 bb bf 31 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d »¿1.0" encoding=
0000:0020 22 55 54 46 2d 38 22 3f 3e 3c 74 61 67 3e 3c 2f "UTF-8"?><tag></
I don't know why there are extra bytes (ef bb bf) between the opening " and
the 1.0 value inside the version attribute.
Also, if I remove the statement
out.setCodec("UTF-8");
those bytes disappear.
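To pin down where the stray bytes sit, here is a minimal sketch in plain C++ (no Qt) that searches a byte buffer for the sequence ef bb bf. The names findUtf8Bom and dumpedPrologue are made up for illustration; the bytes are copied from the first two rows of the hex dump above.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Returns the offset of the first ef bb bf sequence in `data`,
// or -1 if the sequence does not occur. (Hypothetical helper.)
long findUtf8Bom(const std::vector<unsigned char>& data)
{
    static const std::array<unsigned char, 3> bom = {{0xEF, 0xBB, 0xBF}};
    auto it = std::search(data.begin(), data.end(), bom.begin(), bom.end());
    return it == data.end() ? -1 : static_cast<long>(it - data.begin());
}

// Bytes from the hex dump, up to the closing quote of the version value.
std::vector<unsigned char> dumpedPrologue()
{
    return {
        0x3c, 0x3f, 0x78, 0x6d, 0x6c, 0x20, 0x76, 0x65,
        0x72, 0x73, 0x69, 0x6f, 0x6e, 0x3d, 0x22, 0xef,
        0xbb, 0xbf, 0x31, 0x2e, 0x30, 0x22
    };
}
```

Calling findUtf8Bom(dumpedPrologue()) gives offset 15 (0x0f), that is, directly after the opening quote of the version value.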
Message 2 in thread
> The same problem under Windows (I use Qt 4.3.2).
> As far as I understand, there is a problem in QXmlStreamWriter.
> The method writeStartDocument looks like this:
>
> void QXmlStreamWriter::writeStartDocument()
> {
>     writeStartDocument(QLatin1String("1.0"));
> }
>
> Here 1.0 is declared as a QLatin1String.
>
> Later on, in writeStartDocument, this string is treated as a Unicode
> string.
> What you see in the output file is the Unicode prefix written before 1.0!
>
> Regards
> karl-heinz
> www.techdrivers.de
>
Message 3 in thread
Thank you for the quick answer, although I don't understand what you mean
by Unicode prefix. If you remove the setCodec("UTF-8") call, do those bytes
disappear?
Message 4 in thread
As far as I can see, you can omit the setCodec call, since (at least under
Windows) QXmlStreamWriter writes UTF-8 anyway.
What I mean by Unicode prefix is the so-called BOM (byte order mark).
See Wikipedia: http://en.wikipedia.org/wiki/Byte-order_mark
best regards
karl-heinz
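
For reference, a BOM is normally expected only as the very first bytes of a stream. The following is a rough sketch in plain C++ (no Qt) that classifies the leading bytes of a buffer; the enum Bom and the function leadingBom are hypothetical names, and UTF-32 marks are deliberately ignored to keep it short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical classification of the mark found at offset 0.
enum class Bom { None, Utf8, Utf16BigEndian, Utf16LittleEndian };

// Looks only at the first bytes of `bytes`; a mark anywhere else is not
// reported. UTF-32 is not handled, so ff fe 00 00 is reported as UTF-16 LE.
Bom leadingBom(const std::string& bytes)
{
    auto at = [&](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return Bom::Utf8;
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return Bom::Utf16BigEndian;
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return Bom::Utf16LittleEndian;
    return Bom::None;
}
```

Applied to the file from the first message, this would report Bom::None, because the ef bb bf bytes there sit inside the version attribute and not at offset 0.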
>
> -------- Original Message --------
> Date: Thu, 28 Feb 2008 12:45:21 +0100
> From: "Manuel Fiorelli" <manuel.fiorelli@xxxxxxxxx>
> To: qt-interest@xxxxxxxxxxxxx
> Subject: Re: [qt4-interest] QXmlStreamWriter puts strange bytes inside
> xml directive
>
>
> Thank you for the quick answer, although I don't understand what you
> mean by Unicode prefix. If you remove the setCodec("UTF-8") call, do those
> bytes disappear?
>
>
Message 5 in thread
---------- Forwarded message ----------
From: Manuel Fiorelli <manuel.fiorelli@xxxxxxxxx>
Date: 28-feb-2008 13.58
Subject: Re: [qt4-interest] QXmlStreamWriter puts strange bytes inside xml
directive
To: Karl-Heinz Reichel <khReichel@xxxxxx>
Now I understand perfectly what you meant. Shouldn't the BOM occur only at
the beginning of a Unicode file? May we consider this behavior a bug, or at
least undesirable behavior?
I know that the default encoding for QXmlStreamWriter is UTF-8, but I saw
that if one removes the call to setCodec, then the BOM disappears: can
anyone confirm that?