Discussion:
[Interest] Faster QXmlStreamWriter?
Vadim Peretokin
2018-04-22 14:30:29 UTC
I'm looking at optimizing an application's save performance - it is too slow
right now and causes too long a freeze. We're using QXmlStreamWriter,
here's the implementation
<https://github.com/Mudlet/Mudlet/blob/development/src/XMLexport.cpp>.

Profiling suggests that Qt's write element / attribute methods are too slow (
screenshot <https://imgur.com/a/UUVSp4I>, results attached). Is there
anything I can do to speed it up? I tried giving it a QBuffer instead of a
QFile, but the difference was marginal.

A lot of time seems to be spent resizing the buffer, and unfortunately
there doesn't seem to be a way to pre-set the QBuffer size to something
reasonable to begin with.

Anyone have tips?
Giuseppe D'Angelo
2018-04-22 14:53:07 UTC
I didn't check the actual code, but you can always resize a QBuffer's
internal byte array:

QBuffer b;
b.buffer().resize(1234); // or reserve()
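
A rough illustration of why reserving helps, written as a plain C++ sketch rather than Qt code: std::string stands in for the QByteArray behind the QBuffer, and countReallocations is a hypothetical helper, not a Qt API. It counts how often the capacity changes while fixed-size chunks are appended:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `count` copies of `chunk` to a std::string and returns how many
// times the string's capacity changed, i.e. how often it had to reallocate.
// reserveBytes == 0 means "do not reserve anything up front".
// Assumption: std::string is only a stand-in for QBuffer's QByteArray.
std::size_t countReallocations(std::size_t reserveBytes,
                               const std::string &chunk,
                               std::size_t count)
{
    assert(!chunk.empty());  // appending nothing would never grow the string

    std::string out;
    if (reserveBytes > 0)
        out.reserve(reserveBytes);  // one allocation up front

    std::size_t reallocations = 0;
    std::size_t lastCapacity = out.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        out += chunk;
        if (out.capacity() != lastCapacity) {  // storage was regrown
            ++reallocations;
            lastCapacity = out.capacity();
        }
    }
    return reallocations;
}
```

With a reservation at least as large as the final size, the function should report zero reallocations; with 0 it reports however many growth steps the standard library needs.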

HTH,
--
Giuseppe D'Angelo | ***@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
Vadim Peretokin
2018-04-23 05:15:48 UTC
Thanks! That helped remove the resize operation from the profiling data,
but QBuffer::writeData is still there, and the overall time didn't change
much.

Is there anything else I can look at, or is this a dead end that requires
a different serialization solution altogether (a lot of work)?
Thiago Macieira
2018-04-23 06:32:02 UTC
Correct, QBuffer is not optimised for speed. I ran into that while developing
QCborStreamWriter. The solution was to bypass it for speed.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Vadim Peretokin
2018-04-24 05:02:20 UTC
OK, that's a pity. I'll look into using pugixml instead, and if anyone else
knows a faster XML serializer, I'd be happy to hear about it.
Thiago Macieira
2018-04-24 05:24:13 UTC
Any chance you can choose a different serialisation format?
Vadim Peretokin
2018-04-24 08:39:56 UTC
We're thinking through the consequences of that. If pugixml still isn't fast
enough, we'll look at a different format. Recommendations welcome! :)
Jason H
2018-04-24 14:36:20 UTC
Konstantin Tokarev
2018-04-24 14:39:08 UTC
If your serialized data is intended to be read by Qt applications only, QDataStream may be a good choice.
-- 
Regards,
Konstantin
Thiago Macieira
2018-04-24 15:52:53 UTC
Also slow.

Binary QJsonDocument is the fastest, followed closely by QCborValue (new in
5.12).
Thiago Macieira
2018-04-24 16:10:42 UTC
Followed by regular text-form JSON.
Julien Cugnière
2018-04-24 16:39:37 UTC
Out of curiosity, what makes QDataStream slow? From your other comment
on QBuffer, it sounds like it might be related to the QIODevice
interface. Does that mean it's impossible to get best performance
through QIODevice, because of some design flaw? Or is there something
else that makes QDataStream and QBuffer slow?
Thiago Macieira
2018-04-24 17:49:29 UTC
I benchmarked the full result, but didn't investigate what makes QDataStream
slow. But it suffers from the same problem that QXmlStreamWriter does: it uses
a QBuffer to write to a QByteArray (see [1] and [2]). QCborStreamWriter does
the same, actually -- I've only fixed QCborStreamReader to bypass it.

QDataStream's format is also quite big, storing all QStrings as UTF-16 with a
4-byte length prefix. So "Hello World" takes 4 + 2*11 = 26 bytes, whereas in
CBOR it takes 1 + 11. The larger format could mean we reallocate more often
and that causes delays.
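
The size arithmetic above can be checked with a small sketch. Both function names are hypothetical, and the sketch assumes an ASCII string, so one character is one UTF-16 code unit and one UTF-8 byte. The CBOR header widths follow the standard encoding: lengths below 24 fit in the initial byte, and longer ones need 1, 2, 4 or 8 extra bytes.

```cpp
#include <cstdint>

// Bytes QDataStream needs for a QString of `utf16Units` code units, per the
// layout described above: a 4-byte length prefix plus 2 bytes per code unit.
constexpr std::uint64_t qdatastreamStringBytes(std::uint64_t utf16Units)
{
    return 4 + 2 * utf16Units;
}

// Bytes CBOR needs for a text string with `utf8Bytes` bytes of payload:
// one initial byte, an extended length field for longer strings, then payload.
constexpr std::uint64_t cborTextStringBytes(std::uint64_t utf8Bytes)
{
    return (utf8Bytes < 24            ? 1    // length fits in the initial byte
            : utf8Bytes <= 0xFF       ? 2    // + 1-byte length
            : utf8Bytes <= 0xFFFF     ? 3    // + 2-byte length
            : utf8Bytes <= 0xFFFFFFFF ? 5    // + 4-byte length
                                      : 9)   // + 8-byte length
           + utf8Bytes;
}

// "Hello World" has 11 characters.
static_assert(qdatastreamStringBytes(11) == 26, "4 + 2*11");
static_assert(cborTextStringBytes(11) == 12, "1 + 11");
```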

Another issue is that QDataStream tries to store all integers in big endian,
unless you tell it otherwise. Since most machines are little-endian, that's a
waste of CPU cycles, as you're doing the endian conversion twice, for little
gain, however fast it is.
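
To make the "conversion twice" point concrete, here is a minimal sketch of the byte swap a little-endian machine has to perform to produce a big-endian integer, once when writing and once more when reading. byteSwap32 is a hypothetical helper for illustration, not the routine Qt uses. In Qt itself, QDataStream::setByteOrder(QDataStream::LittleEndian) is the way to "tell it otherwise".

```cpp
#include <cstdint>

// Reverses the byte order of a 32-bit value. A little-endian host must do
// this to store a big-endian integer, and again to load it back.
constexpr std::uint32_t byteSwap32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Writing swaps once, reading swaps back: two operations that cancel out.
static_assert(byteSwap32(0x01020304u) == 0x04030201u, "write side");
static_assert(byteSwap32(byteSwap32(0x01020304u)) == 0x01020304u, "round trip");
```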

Finally, yes, QIODevice is not designed for performance. It's quite slow and
there's little we can do about it. There have been a lot of behind-the-scenes
improvements, by way of smarter buffering and sharing of more code between the
main QIODevice classes (QFile, QProcess, QTcpSocket, QNetworkReply
implementations). QBuffer is usually forgotten in that. Still, the main
problem is QIODevice's *design* and we're unable to change it.

[1] https://code.woboq.org/qt5/qtbase/src/corelib/serialization/qdatastream.cpp.html#_ZN11QDataStreamC1EP10QByteArray6QFlagsIN9QIODevice12OpenModeFlagEE
[2] https://code.woboq.org/qt5/qtbase/src/corelib/serialization/qdatastream.cpp.html#_ZN11QDataStreamC1ERK10QByteArray
Maurice Kalinowski
2018-04-25 05:46:43 UTC

Hey,

Would you still have these benchmarks somewhere? I am currently experimenting with various serialization approaches as well, and so far have come to different conclusions, probably because I used different methods. I would still like to see how you approached it for plain-text JSON.

BR,
Maurice
Thiago Macieira
2018-04-25 06:36:15 UTC
I posted to dev in the CBOR thread a few months ago. Basically, the procedure
was:

1) find a large-ish JSON file as a test seed
2) make it bigger (replicate it 100x or more in an array) -- I used something
like 60 MB in binary JSON form, which is just about half the maximum
3) use the examples/corelib/serialization/convert tool[1] to convert to other
formats
4) use the same tool with the "null" output to benchmark reading[2]
5) use the tool with binary JSON as source to benchmark writing

I benchmarked using Linux's perf: the stat subcommand for overall timings,
and perf record + perf annotate to analyse where the issues were.

[1] https://codereview.qt-project.org/217410. The tool should work for all
other formats right now if you just remove "cborconverter.cpp" from the .pro
file.

[2] I modified main.cpp to read and write using QBuffer instead of QFile to
minimise the cost of QIODevice.
Thiago Macieira
2018-04-25 07:10:36 UTC
Attached is a sample shell session of how to use the tool and what it can do.
alexander golks
2018-04-25 05:48:03 UTC
Thiago Macieira
2018-04-25 06:43:11 UTC
Post by alexander golks
QJsonObject size maximum length 128MB
https://bugreports.qt.io/browse/QTBUG-47629
That's one of the reasons I'm adding CBOR support. The new implementation
scales much better than the binary JSON memory and on-disk format and has no
such limitation. Plus it's a standardised format, defined by the IETF and used
by several new IoT protocols.

Since it uses QVector internally, it's currently limited to 2 GB vectors per
level, so each CBOR array can contain at most 2^31 / 16 = 2^27 elements. But
that's on each level: each element can be an array of 2^27 elements again.
Maps are limited to half that, so 2^26. There's also a limit on the total size
of strings in a map or array.
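
The per-level limits quoted above are simple to re-derive. The constants below are taken from the text (2 GB per vector, 16 bytes per element); the names are hypothetical and do not come from the Qt sources.

```cpp
#include <cstdint>

// Figures from the text: a QVector holds at most 2 GB, and "2^31 / 16"
// implies 16 bytes per element.
constexpr std::uint64_t kMaxVectorBytes = std::uint64_t(1) << 31;
constexpr std::uint64_t kBytesPerElement = 16;

// Upper bound on the elements of one CBOR array level.
constexpr std::uint64_t maxArrayElements()
{
    return kMaxVectorBytes / kBytesPerElement;
}

// A map is stated to hold half as many entries as an array.
constexpr std::uint64_t maxMapEntries()
{
    return maxArrayElements() / 2;
}

static_assert(maxArrayElements() == (std::uint64_t(1) << 27), "2^27 elements");
static_assert(maxMapEntries() == (std::uint64_t(1) << 26), "2^26 entries");
```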

The code was carefully written so that in Qt 6, when we switch to 64-bit size
types for Qt containers, the limitations disappear.

And better: since CBOR is a full superset of JSON, the backend can be used to
hold JSON too. So Qt 6 QJsonDocument & family will have the 128 MB limit
removed, at the expense of the binary JSON format requiring parsing.
Konstantin Tokarev
2018-04-25 14:13:18 UTC
That's sad, as we lose the only solution inside Qt for serialization without
parsing.
Thiago Macieira
2018-04-25 15:45:41 UTC
That was going to happen anyway, because of the 128 MB size limit. If we did
nothing else, we'd create a new format with a much expanded size limit, which
means the current format would need to be parsed and converted.
Thiago Macieira
2018-04-25 16:02:55 UTC
By the way, QJsonDocument::fromBinaryData does still perform a correctness
check, to make sure it won't crash later reading corrupt data. You can skip
this step and then loading your data is extremely fast.

Here are my numbers comparing loading that 60+ MB file in both binary JSON
format (with validation) and CBOR:

Binary JSON:
69,844846 task-clock:u (msec)
196.906.259 cycles:u
422.255.714 instructions:u
[There's no readAll(); 70.2% of the time is spent inside
QJsonPrivate::Object::isValid]

JSON:
255,809132 task-clock:u (msec)
771.771.000 cycles:u
2.690.966.058 instructions:u
[80.2% inside QJsonPrivate::Parser::parseValue, 58.7% inside
QJsonPrivate::Parser::parseString and 16.3% inside QUtf8Functions::fromUtf8]

CBOR:
239,059121 task-clock:u (msec)
562.474.857 cycles:u
1.431.590.428 instructions:u
[71.6% inside QCborValue::fromCbor, 65.0% inside
QCborContainerPrivate::decodeStringFromCbor, 25.5% inside
QCborStreamReader::readStringChunk plus 12.6% inside QUtf8::isValidUtf8]

So it's just under 4x slower, but we're still talking about consuming over
250 MB/s of data.
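
As a sanity check of those two figures, the sketch below recomputes them from the task-clock values above. It assumes, for the sake of the estimate only, that each input is about 60 MB; the thread gives that size for the binary JSON form, and the text JSON and CBOR files may well differ. All names are hypothetical.

```cpp
// task-clock values in milliseconds, copied from the perf output above
constexpr double kBinaryJsonMs = 69.844846;
constexpr double kJsonMs = 255.809132;
constexpr double kCborMs = 239.059121;

// How many times slower a run is than the baseline run.
constexpr double slowdown(double ms, double baselineMs)
{
    return ms / baselineMs;
}

// Throughput in MB/s for `megabytes` of input processed in `ms` milliseconds.
constexpr double throughputMBps(double megabytes, double ms)
{
    return megabytes / (ms / 1000.0);
}

// Both text JSON and CBOR land between 3x and 4x the binary JSON time.
static_assert(slowdown(kCborMs, kBinaryJsonMs) > 3.0 &&
              slowdown(kCborMs, kBinaryJsonMs) < 4.0, "CBOR vs binary JSON");
static_assert(slowdown(kJsonMs, kBinaryJsonMs) > 3.0 &&
              slowdown(kJsonMs, kBinaryJsonMs) < 4.0, "JSON vs binary JSON");
// Assumed 60 MB input: CBOR decoding comes out slightly above 250 MB/s.
static_assert(throughputMBps(60.0, kCborMs) > 250.0, "CBOR throughput");
```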

PS: YMMV, especially if you don't use CPU-optimised UTF-8 methods like I do.
You need to compile your own Qt to get those.
Konstantin Tokarev
2018-04-25 16:55:55 UTC
Post by Thiago Macieira
By the way, QJsonDocument::fromBinaryData does still perform a correctness
check, to make sure it won't crash later reading corrupt data. You can skip
this step and then loading your data is extremely fast.
For example, QtWebChannel implementations in QtWebKit and QtWebEngine
use QJsonDocument as a wire format for exchanging data between processes.
Validation is of course skipped, because producer and consumer are reliable.
Gunnar Roth
2018-04-29 20:58:02 UTC
Hi Thiago
PS: YMMV, especially if you don't use CPU-optimised UTF-8 methods like I do.
You need to compile your own Qt to get those.
Could you please give more information about this? I thought the SSE2 code
for UTF-8 is enabled by default, at least in 64-bit builds. What am I
missing here?

Regards,
Gunnar Roth
Vadim Peretokin
2018-05-10 17:17:17 UTC
For anyone interested: I went with pugixml (https://pugixml.org), and the
speed savings are worth it. It is also fairly easy to transition from
QXmlStreamWriter to pugixml and, as a bonus, it does not choke on
non-printable characters.
Hamish Moffatt
2018-04-28 03:21:07 UTC
Is compatibility guaranteed on binary QJsonDocument between Qt versions?
Can I safely exchange data both ways between current Qt 5.8 and future
5.x versions?


Hamish
Thiago Macieira
2018-04-28 03:54:39 UTC
Post by Hamish Moffatt
Is compatibility guaranteed on binary QJsonDocument between Qt versions?
Yes.
Post by Hamish Moffatt
Can I safely exchange data both ways between current Qt 5.8 and future
5.x versions?
Yes.

But as indicated in this thread, this may not be the fastest method in the
future. We need to fix the 128 MB size limit, which means the current format
will be parsed into something different.
alexander golks
2018-04-24 06:10:15 UTC
We had good experiences using tinyxml2 as a replacement for qxml. I don't know if it works for you for serializing, too.
We had to extend it a bit to be able to clone and deep-compare XMLNodes, but it works great.

alex
--
/*
* printk ("scsi%d : Oh no Mr. Bill!\n", host->host_no);
* linux-2.6.6/drivers/scsi/53c7xx.c
*/