% InterIMAP benchmark metrics and comparison
% [Guilhem Moulin](mailto:guilhem@fripost.org)

The [IMAP `QRESYNC` extension][RFC 7162] allows efficient mailbox
synchronization, in terms of I/O as well as CPU usage.  In this document
we give some benchmark metrics to compare [InterIMAP]'s network usage with
so-called full synchronization solutions such as [OfflineIMAP].  The
timings are to be taken with a grain of salt, though: they likely won't
reflect real-world situations as the emails are stored in RAM for this
benchmark, and all network access is on the loopback interface.  (Moreover,
neither SSL/TLS nor STARTTLS is used below.  Either would add another 2-3
round-trips per connection.)

These metrics show how [InterIMAP] scales linearly with the number of
*mailboxes* — pretty much regardless of how many messages they contain (at
least as long as the server can cope with large mailboxes) — while
[OfflineIMAP] scales with the number of *messages* on active mailboxes.

While [InterIMAP] performs significantly better (especially given that it
can be relied upon to synchronize flag changes, unlike [OfflineIMAP]'s
“quick” mode), it should be noted that efficiency comes at the expense of
flexibility.  In particular it's not possible to exclude old messages from
synchronization (mailboxes can be excluded but finer granularity is not
possible).  And of course not all IMAP servers support [`QRESYNC`][RFC 7162]
and other extensions [InterIMAP] requires.  Furthermore [InterIMAP] is
single-threaded and doesn't use pipelining at the moment.  (Concurrency
opens a can of worms, and given the below metrics it simply doesn't seem
worth the trouble ☺)

-----------------------------------------------------------------------

The script used to compute these metrics can be found [here][benchmark-script].
We use [Dovecot] as IMAP server; the “remote” mailbox store is in
[multi-dbox][dbox] format (initially populated with random messages of average
size ~4kiB, and randomly pruned to avoid having only contiguous UIDs) while
[maildir] is used “locally”.  The configuration files were not tuned for
performance (however [InterIMAP] takes advantage of Dovecot's support of the
[IMAP `COMPRESS` extension][RFC 4978] as it is its default behavior).

The *user* (resp. *system*) column denotes the number of CPU-seconds
used by the process in user (resp. kernel) mode.  The *real* column is
the elapsed real (wall clock) time.  Network measurements are obtained
by placing packet counters on the interface.
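
To make the relation between the timing columns concrete, here is a
minimal Python sketch that collects the same figures for one child
process.  It is *not* the benchmark script linked above: the `measure`
helper and its key names are made up for illustration, and it assumes a
Unix system where the standard `resource` module is available.  The *CPU*
column in the tables below appears to be (user + system) / real, for
instance (4.88 + 0.83) / 5.23 ≈ 109%.

```python
import resource
import subprocess
import sys
import time


def measure(argv):
    """Run argv to completion and return its timings (hypothetical helper)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=True)
    real = time.monotonic() - start          # elapsed wall-clock time
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    user = after.ru_utime - before.ru_utime      # CPU-seconds in user mode
    system = after.ru_stime - before.ru_stime    # CPU-seconds in kernel mode
    return {
        "user": user,
        "system": system,
        "real": real,
        # Share of wall-clock time spent on a CPU; a multi-threaded
        # process can exceed 100%.
        "cpu_percent": 100.0 * (user + system) / real,
        # Peak resident set size among waited-for children (KiB on Linux).
        "max_rss": after.ru_maxrss,
    }


# Example: time a trivial Python child process.
result = measure([sys.executable, "-c", "sum(range(10**6))"])
print(result)
```

A process that mostly waits on the network or on the server shows a low
CPU percentage even when its *real* time is large.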

[RFC 4978]: https://tools.ietf.org/html/rfc4978
[RFC 7162]: https://tools.ietf.org/html/rfc7162
[InterIMAP]: interimap.1.html
[OfflineIMAP]: https://www.offlineimap.org/
[benchmark-script]: https://git.guilhem.org/interimap/plain/benchmark/run
[Dovecot]: https://dovecot.org
[dbox]: https://doc.dovecot.org/admin_manual/mailbox_formats/dbox/
[maildir]: https://doc.dovecot.org/admin_manual/mailbox_formats/maildir/

-----------------------------------------------------------------------

Single mailbox  {#single-mailbox}
==============

We create a mailbox on the remote server, populate it with a number of
messages, and synchronize it locally.  We then collect metrics for no-op
synchronization (i.e., of mailboxes that are already in sync), and
reconciliation after receiving a *single* message on the remote server.

[OfflineIMAP]'s network usage remains low in “quick” mode for large
mailboxes that are already in sync, but as soon as a mail arrives the
performance degrades by *several orders of magnitude*.  On the other
hand [InterIMAP] has very little overhead on large mailboxes (also
memory-wise), and when a message is delivered there is barely more
traffic than what's required for the transfer of said message.
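
The "several orders of magnitude" claim can be checked against the
100000-message tables later in this section.  The short Python sketch
below only redoes that arithmetic on the inbound traffic figures; the
variable names are illustrative.

```python
import math

KIB = 1024

# Inbound traffic on the 100000-message mailbox, copied from the tables
# of this section.
offlineimap_quick_noop = 2722            # bytes, no-op (in sync)
offlineimap_quick_recon = 7637 * KIB     # bytes, after one new message
interimap_noop = 1441                    # bytes, no-op (in sync)
interimap_recon = 4532                   # bytes, after one new message

offlineimap_ratio = offlineimap_quick_recon / offlineimap_quick_noop
interimap_ratio = interimap_recon / interimap_noop

print(f"offlineimap -q: x{offlineimap_ratio:.0f} "
      f"(about 10^{math.log10(offlineimap_ratio):.1f})")
print(f"interimap:      x{interimap_ratio:.1f}")
```

For `offlineimap -q` the inbound traffic grows roughly 2900-fold, between
three and four orders of magnitude, while for `interimap` it grows about
threefold, most of which is the new message itself.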

100 messages
------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.05s    0.01s   0.07s   85%    21368k     1439B / 1017B          13 / 15
offlineimap -q   0.04s    0.01s   0.27s   23%    19748k     2497B / 1236B          16 / 20
offlineimap      0.05s    0.01s   0.32s   22%    19268k     10kiB / 1456B          21 / 23

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.08s   83%    21116k     4516B / 1412B          17 / 19
offlineimap -q   0.06s    0.00s   0.32s   22%    19968k     15kiB / 1670B          23 / 26
offlineimap      0.06s    0.00s   0.32s   22%    18616k     14kiB / 1284B          25 / 19

1000 messages
-------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.05s    0.01s   0.07s   84%    21204k     1449B / 965B           13 / 14
offlineimap -q   0.06s    0.01s   0.33s   24%    19068k     2664B / 1236B          19 / 20
offlineimap      0.09s    0.02s   0.37s   30%    19868k     75kiB / 1508B          26 / 24

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.08s   78%    21212k     4524B / 1333B          17 / 16
offlineimap -q   0.08s    0.03s   0.33s   37%    22284k     80kiB / 1775B          29 / 28
offlineimap      0.10s    0.01s   0.32s   36%    20116k     80kiB / 1597B          24 / 25

10000 messages
--------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.09s   75%    20980k     1449B / 965B           13 / 14
offlineimap -q   0.10s    0.03s   0.37s   37%    36708k     2719B / 1184B          20 / 19
offlineimap      0.50s    0.09s   0.78s   75%    45424k    746kiB / 2080B          37 / 35

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.12s   54%    21136k     4530B / 1205B          17 / 16
offlineimap -q   0.51s    0.08s   0.76s   77%    42860k    751kiB / 2608B          43 / 44
offlineimap      0.62s    0.16s   0.88s   89%    47996k    750kiB / 2222B          38 / 37

100000 messages
---------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.16s   38%    21080k     1441B / 1017B          13 / 15
offlineimap -q   1.06s    0.10s   1.40s   83%   201376k     2722B / 1236B          20 / 20
offlineimap      4.88s    0.83s   5.23s  109%   280716k   7626kiB / 5564B         138 / 102

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.48s   15%    22876k     4532B / 1362B          17 / 19
offlineimap -q   5.09s    0.75s   5.38s  108%   277336k   7637kiB / 9941B         261 / 185
offlineimap      4.92s    0.76s   5.22s  108%   279592k   7631kiB / 5603B         144 / 102

-----------------------------------------------------------------------

75 mailboxes  {#multi-mailbox}
============

We create 75 mailboxes on the remote server, populate them with an equal
number of messages, and synchronize them locally.  We then collect
metrics for no-op synchronization (i.e., of mailboxes that are already
in sync), and reconciliation after the following changes are applied to
the remote server:

  - 3 *new* messages (two on mailbox #2, one on mailbox #3); and
  - 5 existing messages *EXPUNGEd* (two each on mailboxes #3 and #4, one
    on mailbox #5).

The results are not surprising given the metrics from the [above
section](#single-mailbox).  In “quick” mode [OfflineIMAP] still performs
reasonably well when the mailboxes are in sync (even though it iterates
through each mailbox and the extra round-trips increase network traffic
compared to the single mailbox case), but performance decreases
significantly when a message is delivered to a large mailbox.  Once
again [InterIMAP] has very little network overhead regardless of mailbox
size; it does take longer on very large mailboxes, but the bottleneck is
the IMAP server ([InterIMAP] is just twiddling its thumbs waiting for Dovecot
to compute `STATUS` responses).
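
To illustrate why the client-side cost depends on the number of
mailboxes rather than on their size, here is a minimal Python sketch of
the kind of comparison a `STATUS`-driven client can make.  It is an
assumption for illustration, not [InterIMAP]'s actual code: the
`MailboxState` type and the `mailboxes_to_sync` function are
hypothetical, and the state values are invented.  The three fields are
the standard `UIDVALIDITY`, `UIDNEXT` and `HIGHESTMODSEQ` status items.

```python
from typing import Dict, List, NamedTuple


class MailboxState(NamedTuple):
    uidvalidity: int
    uidnext: int
    highestmodseq: int


def mailboxes_to_sync(cached: Dict[str, MailboxState],
                      status: Dict[str, MailboxState]) -> List[str]:
    """Return the mailboxes whose server state differs from the cache.

    One comparison per mailbox, whatever the number of messages.
    """
    return [name for name, current in status.items()
            if cached.get(name) != current]


# 75 mailboxes that were in sync (made-up state values).
cached = {f"mailbox{i}": MailboxState(1, 101, 100) for i in range(1, 76)}

# Server state after the changes listed above.
status = dict(cached)
status["mailbox2"] = MailboxState(1, 103, 102)  # two new messages
status["mailbox3"] = MailboxState(1, 102, 103)  # one new, two expunged
status["mailbox4"] = MailboxState(1, 101, 102)  # two expunged
status["mailbox5"] = MailboxState(1, 101, 101)  # one expunged

print(mailboxes_to_sync(cached, status))
```

Only the four modified mailboxes are reported; the other 71 need no
further commands.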

100 messages per mailbox
------------------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.06s    0.00s   0.12s   55%    21712k     1949B / 898B           11 / 13
offlineimap -q   0.32s    0.08s   0.43s   92%    22400k     36kiB / 7260B          93 / 99
offlineimap      0.97s    0.32s   1.32s   98%    22648k    606kiB / 19kiB         243 / 251

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.07s    0.00s   0.15s   53%    21860k     10kiB / 1634B          19 / 19
offlineimap -q   0.34s    0.11s   0.59s   77%    21248k     81kiB / 8697B         109 / 117
offlineimap      0.93s    0.35s   1.30s   98%    22804k    620kiB / 20kiB         252 / 253

1000 messages per mailbox
-------------------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.05s    0.01s   0.31s   22%    22028k     1944B / 898B           11 / 13
offlineimap -q   0.97s    0.22s   1.22s   97%    23920k     36kiB / 7000B          90 / 94
offlineimap      4.87s    1.54s   5.01s  127%    25040k   5507kiB / 26kiB         393 / 388

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.08s    0.00s   0.29s   28%    22132k     10kiB / 1931B          20 / 19
offlineimap -q   1.25s    0.32s   1.45s  108%    27276k    344kiB / 9038B         119 / 123
offlineimap      4.72s    1.70s   5.05s  127%    26464k   5521kiB / 27kiB         399 / 392

10000 messages per mailbox
--------------------------

### No-op (in sync) ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.07s    0.00s   1.57s    4%    21896k     1942B / 898B           11 / 13
offlineimap -q  12.10s    3.98s  11.67s  137%    58624k     37kiB / 10kiB          94 / 168
offlineimap     55.49s   23.68s  51.50s  153%    70652k     54MiB / 57kiB        1072 / 996

### Reconciliation ###

                  user   system    real   CPU   max RSS   traffic (in/out)    packets (in/out)
--------------  ------  -------  ------  ----  --------  ------------------  ------------------
  interimap      0.08s    0.00s   1.73s    5%    23108k     10kiB / 1624B          20 / 23
offlineimap -q  14.60s    5.22s  14.00s  141%    64988k   3028kiB / 15kiB         203 / 263
offlineimap     57.24s   25.92s  53.72s  154%    76560k     54MiB / 89kiB        1981 / 1625

-----------------------------------------------------------------------

Live synchronization  {#live-sync}
====================

97 mailboxes, 500000 messages in total:

  - 2 with 100000 messages;
  - 10 with 10000 messages;
  - 20 with 5000 messages;
  - 45 with 2000 messages; and
  - 20 with 500 messages.

The two local mail stores (respectively for [InterIMAP] and
[OfflineIMAP]) are initially in sync with the remote server, and we keep
long-running “autorefresh” synchronization processes alive for 6h, with
updates being regularly applied to the remote server: every 5 seconds,

  - a new message is delivered to a random mailbox with 5% probability
    (once every 100s on average);
  - a random message is EXPUNGEd with 5% probability (once every 100s on
    average); and
  - a random message is marked as seen with 10% probability (once every
    50s on average).
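
The averages quoted in the list, and the number of updates to expect
over the 6h run, follow from the tick length and the probabilities.  The
Python sketch below redoes that arithmetic; the constant and function
names are illustrative only.

```python
TICK_SECONDS = 5                 # one draw every 5 seconds
DURATION_SECONDS = 6 * 3600      # processes are kept alive for 6h

# Per-tick probabilities from the list above.
PROBABILITIES = {"delivery": 0.05, "expunge": 0.05, "mark_seen": 0.10}

ticks = DURATION_SECONDS // TICK_SECONDS


def mean_interval(p: float) -> float:
    """Average number of seconds between two events of probability p."""
    return TICK_SECONDS / p


def expected_events(p: float) -> float:
    """Expected number of events of probability p over the whole run."""
    return ticks * p


for name, p in PROBABILITIES.items():
    print(f"{name}: every {mean_interval(p):.0f}s on average, "
          f"about {expected_events(p):.0f} events in 6h")
```

Over 4320 ticks this gives about 216 deliveries, 216 expunges and 432
flag changes on average.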

`interimap` is configured to sync every *30s*.  `offlineimap` is
configured to quick sync every *30s*, with a regular sync every *1h*.

                 user    system   max RSS   traffic (in/out)    packets (in/out)
-----------  --------  --------  --------  ------------------  ------------------
  interimap    12.95s     0.26s    24276k    743kiB / 257kiB       2207 / 4143
offlineimap  5327.79s  1495.78s   394044k    942MiB / 7840kiB       87k / 126k

Long-lived synchronization for large and busy mail stores is where
[InterIMAP] truly shines, in terms of CPU as well as network usage.
(The amount of CPU time spent in kernel mode is so low because the
process spends most of its time sleeping or in blocking calls waiting
for the server to compute `STATUS` responses.  Smart servers like
Dovecot should cache states though, and hence be able to serve these
responses quickly.)  Thanks to the [`QRESYNC`][RFC 7162]-based
synchronization there is no need for complex client-side computation,
nor for sending vast amounts of data over the network.  (To be fair,
while the amount of CPU time spent in user mode remains low, the local
IMAP server might do a bit of extra work which is not counted here.  But
here again caching helps avoid expensive directory traversal.)  The
performance gain is most appreciated on battery-powered devices, as
well as devices behind slow and/or high-latency network connections ☺.
Moreover [InterIMAP] *does* synchronize flag updates at every step, while
[OfflineIMAP] normally skips these in “quick” mode and so might *delay* flag
updates for up to one hour.
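
To put the live synchronization table in perspective, the following
Python sketch derives per-round averages from it.  It assumes exactly
one round every 30s over the 6h run (720 rounds), ignoring start-up and
the time each round takes, so the results are rough estimates rather
than measurements.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: one sync round every 30s for 6h, none skipped or delayed.
rounds = 6 * 3600 // 30

# Figures copied from the live synchronization table.
inbound_bytes = {"interimap": 743 * KIB, "offlineimap": 942 * MIB}
cpu_seconds = {"interimap": 12.95 + 0.26, "offlineimap": 5327.79 + 1495.78}

for tool in inbound_bytes:
    per_round_kib = inbound_bytes[tool] / rounds / KIB
    per_round_cpu = cpu_seconds[tool] / rounds
    print(f"{tool}: {per_round_kib:.1f}kiB and "
          f"{per_round_cpu:.3f}s of CPU per round")

traffic_ratio = inbound_bytes["offlineimap"] / inbound_bytes["interimap"]
cpu_ratio = cpu_seconds["offlineimap"] / cpu_seconds["interimap"]
print(f"traffic ratio: x{traffic_ratio:.0f}, CPU ratio: x{cpu_ratio:.0f}")
```

Under that assumption `interimap` receives about 1kiB per round, against
roughly 1.3MiB for `offlineimap`: around 1300 times less inbound traffic
and around 500 times less CPU time.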