% interimap(1)
% [Guilhem Moulin](mailto:guilhem@fripost.org)
% July 2015

Name
====

InterIMAP - Fast bidirectional synchronization for QRESYNC-capable IMAP servers

Synopsis
========

`interimap` [*OPTION* ...] [*COMMAND*] [*MAILBOX* ...]

Description
===========

`interimap` performs stateful synchronization between two IMAP4rev1
servers.
Such synchronization is made possible by the [`QRESYNC` IMAP
extension][RFC 7162]; for convenience reasons servers must also support
the [`LIST-EXTENDED`][RFC 5258], [`LIST-STATUS`][RFC 5819] (or
[`NOTIFY`][RFC 5465]) and [`UIDPLUS`][RFC 4315] IMAP extensions.
See also the **[supported extensions](#supported-extensions)** section
below.

Stateful synchronization is only possible for mailboxes supporting
persistent message Unique Identifiers (UID) and persistent storage of
mod-sequences (MODSEQ); any non-compliant mailbox will cause `interimap`
to abort.
Furthermore, because UIDs are allocated not by the client but by the
server, `interimap` needs to keep track of associations between local
and remote UIDs for each mailbox.
The synchronization state of a mailbox consists of its `UIDNEXT` and
`HIGHESTMODSEQ` values on each server; it is then assumed that each
message with UID smaller than `UIDNEXT` has been replicated to the
other server, and that the metadata (such as flags) of each message with
MODSEQ at most `HIGHESTMODSEQ` has been synchronized.
Conceptually, the synchronization algorithm is derived from [RFC 4549]
with the [RFC 7162] (sec. 6) amendments, and works as follows:

 1. `SELECT` (on both servers) a mailbox the current `UIDNEXT` or `HIGHESTMODSEQ`
    values of which differ from the values found in the database (for
    either server).  Use the `QRESYNC` `SELECT` parameter from [RFC
    7162] to list changes (vanished messages and flag updates) since
    `HIGHESTMODSEQ` to messages with UID smaller than `UIDNEXT`.

 2. Propagate these changes onto the other server: get the corresponding
    UIDs from the database, then:
     a. issue a `UID STORE` command, followed by `UID EXPUNGE`, to
        remove messages that have not already been deleted on both
        servers; and
     b. issue some `UID STORE` commands to propagate flag updates (send
        a single command for each flag list in order to reduce the
        number of round trips).

    (Conflicts may occur if the metadata of a message has been updated
    on both servers with different flag lists; in that case, `interimap`
    issues a warning and updates the message on each server with the
    union of both flag lists.)
    Repeat this step if the server sent some updates in the meantime.
    Otherwise, update the `HIGHESTMODSEQ` value in the database.

 3. Process new messages (if the current `UIDNEXT` value of the mailbox
    differs from the one found in the database) by issuing a `UID
    FETCH` command; process each received message on-the-fly by issuing
    an `APPEND` command with the message's `RFC822` body, `FLAGS` and
    `INTERNALDATE`.
    Repeat this step if the server received new messages in the
    meantime.  Otherwise, update the `UIDNEXT` value in the database.
    Go back to step 2 if the server sent some metadata (such as flag)
    updates in the meantime.

 4. Go back to step 1 to proceed with the next unsynchronized mailbox.
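
The bookkeeping behind steps 1 and 2 can be pictured with a minimal Python sketch.
It is an illustration under stated assumptions, not `interimap`'s actual code: `SyncState`, `needs_sync` and `merge_flags` are hypothetical names, and flags are modelled as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncState:
    """Synchronization state of one mailbox on one server (hypothetical type)."""
    uidnext: int
    highestmodseq: int


def needs_sync(stored: SyncState, current: SyncState) -> bool:
    """Step 1: a mailbox is worth selecting only if UIDNEXT or
    HIGHESTMODSEQ differs from the value kept in the database."""
    return (current.uidnext != stored.uidnext
            or current.highestmodseq != stored.highestmodseq)


def merge_flags(local_flags, remote_flags):
    """Step 2 conflict rule: both servers receive the union of both flag lists."""
    return sorted(set(local_flags) | set(remote_flags))
```

For instance, a message flagged `\Seen` locally and `\Flagged` remotely ends up with both flags on each server.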

Commands
========

By default, `interimap` synchronizes each mailbox listed by the `LIST ""
"*"` IMAP command;
the *list-mailbox*, *list-select-opts* and *ignore-mailbox* options from
the [configuration file](#configuration-file) can be used to shrink that
list and save bandwidth.
However, if extra arguments are provided on the command line,
`interimap` ignores these options and synchronizes the given
*MAILBOX*es instead.  Note that each *MAILBOX* is taken “as is”; in
particular, it must be [UTF-7 encoded][RFC 2152], unquoted, and the list
wildcards ‘\*’ and ‘%’ are passed verbatim to the IMAP server.
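
Since *MAILBOX* arguments are passed verbatim, non-ASCII names have to be encoded beforehand.
The Python sketch below assumes the server expects the modified UTF-7 variant that [RFC 3501] (sec. 5.1.3) defines for mailbox names, which differs slightly from plain [RFC 2152] UTF-7; `encode_mailbox_name` is a hypothetical helper, not something shipped with `interimap`.

```python
import base64


def encode_mailbox_name(name: str) -> str:
    """Encode a mailbox name to IMAP modified UTF-7 (RFC 3501, sec. 5.1.3)."""
    out = []
    pending = []  # run of characters outside printable ASCII

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            out.append("&" + b64.replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # '&' starts a shifted sequence, so a literal '&' becomes '&-'
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)
```

With this helper, a mailbox named `Entwürfe` would be given on the command line as `Entw&APw-rfe`.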

If the synchronization was interrupted during a previous run while some
messages were being replicated (but before the `UIDNEXT` or
`HIGHESTMODSEQ` values were updated), `interimap` performs a “full
synchronization” on these messages: downloading the whole UID and flag
lists from each server allows `interimap` to detect messages that have
been removed or whose flags have changed in the meantime.
Finally, after propagating the offline changes for these messages,
`interimap` resumes the synchronization for the rest of the mailbox.

Specifying one of the commands below makes `interimap` perform an action
other than the default [`QRESYNC`][RFC 7162]-based synchronization.

`--repair` [*MAILBOX* ...]

:   List the database anomalies and try to repair them.  (Consider only
    the given *MAILBOX*es if non-optional arguments are provided.)
    This is done by performing a so-called “full synchronization”,
    namely:
      1/ download all UIDs along with their flag list both from the
         local and remote servers;
      2/ ensure that each entry in the database corresponds to an
         existing UID; and
      3/ ensure that both flag lists match.
    Any message found on a server but not in the database is replicated
    on the other server (which, in the worst case, might lead to a
    duplicate message).
    Flag conflicts are resolved by updating each message to the union of
    both lists.

`--delete` *MAILBOX* [*MAILBOX* ...]

:   Delete the given *MAILBOX*es on each target (by default each server
    plus the database, unless `--target` specifies otherwise) where it
    exists.
    Note that per the [IMAP4rev1 standard][RFC 3501] deletion is not
    recursive.  Thus *MAILBOX*'s children are not deleted.

`--rename` *SOURCE* *DEST*

:   Rename the mailbox *SOURCE* to *DEST* on each target (by default
    each server plus the database, unless `--target` specifies
    otherwise) where it exists.
    `interimap` aborts if *DEST* already exists on any target.
    Note that per the [IMAP4rev1 standard][RFC 3501] renaming is
    recursive.  Thus *SOURCE*'s children are moved to become *DEST*'s
    children instead.
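
The three checks of the “full synchronization” behind `--repair` can be sketched in Python as follows.
This is only a model of the comparison under simplifying assumptions: `full_sync_plan` is a hypothetical function, the database is reduced to a list of (local UID, remote UID) pairs, and each server is reduced to a mapping from UID to a set of flags.

```python
def full_sync_plan(db_pairs, local_flags, remote_flags):
    """Compare the database with the UID and flag lists of both servers.

    db_pairs     -- iterable of (local_uid, remote_uid) associations
    local_flags  -- dict mapping each local UID to its set of flags
    remote_flags -- dict mapping each remote UID to its set of flags
    """
    stale = []          # database entries pointing at a vanished UID
    flag_updates = {}   # (local_uid, remote_uid) -> union of both flag sets
    known_local, known_remote = set(), set()

    for luid, ruid in db_pairs:
        known_local.add(luid)
        known_remote.add(ruid)
        if luid not in local_flags or ruid not in remote_flags:
            stale.append((luid, ruid))
        elif local_flags[luid] != remote_flags[ruid]:
            flag_updates[(luid, ruid)] = local_flags[luid] | remote_flags[ruid]

    # Messages unknown to the database are replicated to the other server.
    to_remote = sorted(set(local_flags) - known_local)
    to_local = sorted(set(remote_flags) - known_remote)
    return stale, flag_updates, to_remote, to_local
```

The sketch only reports anomalies; how each one is repaired is left to the tool.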

Options
=======

`--config=`*FILE*

:   Specify an alternate [configuration file](#configuration-file).
    Relative paths start from *$XDG_CONFIG_HOME/interimap*, or *~/.config/interimap*
    if the `XDG_CONFIG_HOME` environment variable is unset.

`--target={local,remote,database}`

:   Limit the scope of a `--delete` or `--rename` command to the given
    target.  Can be repeated to act on multiple targets.  By default all
    three targets are considered.

`--watch`[`=`*seconds*]

:   Don't exit after a successful synchronization.  Instead, keep
    synchronizing forever.  Sleep for the given number of *seconds* (by
    default 1 minute if `--notify` is unset, and 15 minutes if
    `--notify` is set) between two synchronizations.  Setting this
    option enables `SO_KEEPALIVE` on the socket for *type*s other than
    `tunnel`.

`--notify`

:   Whether to use the [IMAP `NOTIFY` extension][RFC 5465] to instruct
    the server to automatically send updates to the client.  (Both local
    and remote servers must support [RFC 5465] for this to work.)
    This greatly reduces IMAP traffic since `interimap` can rely on
    server notifications instead of manually polling for updates.
    If the connection remains idle for 15 minutes (configurable with
    `--watch`), then `interimap` sends a `NOOP` command to avoid being
    logged out for inactivity.

`-q`, `--quiet`

:   Try to be quiet.

`--debug`

:   Turn on debug mode.  Debug messages are written to the given *logfile*.
    Note that this includes all IMAP traffic (except literals).
    Depending on the chosen authentication mechanism, this might include
    authentication credentials.

`-h`, `--help`

:   Output a brief help and exit.

`--version`

:   Show the version number and exit.
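
The lookup rule for `--config` (relative paths start from *$XDG_CONFIG_HOME/interimap*, falling back to *~/.config/interimap*) can be sketched in Python.
`resolve_config_path` is a hypothetical helper, and treating an empty `XDG_CONFIG_HOME` like an unset one is an assumption of this sketch.

```python
import os
from pathlib import Path


def resolve_config_path(config="config", environ=None) -> Path:
    """Return the file a given --config value would point to."""
    env = os.environ if environ is None else environ
    path = Path(config)
    if path.is_absolute():
        return path  # absolute paths are used as is
    base = env.get("XDG_CONFIG_HOME")
    # Assumption: an empty variable is handled like an unset one.
    root = Path(base) if base else Path.home() / ".config"
    return root / "interimap" / path
```

With `XDG_CONFIG_HOME=/tmp/xdg`, the value `work` would resolve to `/tmp/xdg/interimap/work`.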

Configuration file
==================

Unless told otherwise by the `--config=FILE` command-line option,
`interimap` reads its configuration from *$XDG_CONFIG_HOME/interimap/config*
(or *~/.config/interimap/config* if the `XDG_CONFIG_HOME` environment
variable is unset) as an [INI file].
The syntax of the configuration file is a series of `OPTION=VALUE`
lines organized under some `[SECTION]`; lines starting with a ‘#’ or
‘;’ character are ignored as comments.
The `[local]` and `[remote]` sections define the two IMAP servers to
synchronize.
Valid options are:

*database*

:   SQLite version 3 database file to use to keep track of associations
    between local and remote UIDs, as well as the `UIDVALIDITY`,
    `UIDNEXT` and `HIGHESTMODSEQ` of each known mailbox on both servers.
    Relative paths start from *$XDG_DATA_HOME/interimap*, or
    *~/.local/share/interimap* if the `XDG_DATA_HOME` environment
    variable is unset.  This option is only available in the default
    section.
    (Default: `HOST.db`, where *HOST* is taken from the `[remote]` or
    `[local]` sections, in that order.)

*list-reference*

:   An optional “reference name” to use for the initial `LIST` command,
    indicating the context in which the *MAILBOX*es are interpreted.
    For instance, by specifying `list-reference=perso/` in the `[local]`
    section, *MAILBOX* names are interpreted relative to `perso/` on the
    local server; in other words the remote mailbox hierarchy is mapped
    to the `perso/` sub-hierarchy on the local server.  This is useful
    for synchronizing multiple remote servers against different
    namespaces belonging to the same local IMAP server (using a
    different InterIMAP instance for each local namespace ↔ remote
    synchronization).

    (Note that if the reference name is not a level of mailbox hierarchy
    and/or does not end with the hierarchy delimiter, by [RFC 3501] its
    interpretation by the IMAP server is implementation-dependent.)

*list-mailbox*

:   A space-separated list of mailbox patterns to use when issuing the
    initial `LIST` command (overridden by the *MAILBOX*es given as
    command-line arguments).
    Names containing special characters such as spaces or brackets need
    to be enclosed in double quotes.  Within double quotes C-style
    backslash escape sequences can be used (‘\\t’ for a horizontal tab,
    ‘\\n’ for a new line, ‘\\\\’ for a backslash, etc.), as well as
    hexadecimal escape sequences ‘\\xHH’.
    Furthermore, non-ASCII names must be [UTF-7 encoded][RFC 2152].
    Two wildcards are available, and passed verbatim to the IMAP server:
    a ‘\*’ character matches zero or more characters, while a ‘%’
    character matches zero or more characters up to the hierarchy
    delimiter.
    This option is only available in the default section.
    (The default pattern, `*`, matches all visible mailboxes on the
    server.)

*list-select-opts*

:   An optional space-separated list of selectors for the initial `LIST`
    command.  (Requires a server supporting the [`LIST-EXTENDED` IMAP
    extension][RFC 5258].)  Useful values are `SUBSCRIBED` (to list only
    subscribed mailboxes), `REMOTE` (to also list remote mailboxes on a
    server supporting mailbox referrals), and `RECURSIVEMATCH` (to
    list parent mailboxes with children matching one of the above
    *list-mailbox* patterns).  This option is only available in the
    default section.

*ignore-mailbox*

:   An optional Perl Compatible Regular Expressions ([PCRE]) covering
    mailboxes to exclude: any ([UTF-7 encoded][RFC 2152] and unquoted)
    mailbox listed in the initial `LIST` responses is ignored if it
    matches the given expression.
    Note that the *MAILBOX*es given as command-line arguments bypass the
    check and are always considered for synchronization.  This option is
    only available in the default section.

*logfile*

:   A file name to use to log debug and informational messages.  (By
    default these messages are written to the error output.)  This
    option is only available in the default section.

*type*

:   One of `imap`, `imaps` or `tunnel`.
    `type=imap` and `type=imaps` are respectively used for IMAP and IMAP
    over SSL/TLS connections over an INET socket.
    `type=tunnel` causes `interimap` to create an unnamed pair of
    connected sockets for interprocess communication with a *command*
    instead of opening a network socket.
    Note that specifying `type=tunnel` in the `[remote]` section makes
    the default *database* `localhost.db`.
    (Default: `imaps`.)

*host*

:   Server hostname, for `type=imap` and `type=imaps`.
    (Default: `localhost`.)

*port*

:   Server port.
    (Default: `143` for `type=imap`, `993` for `type=imaps`.)

*proxy*

:   An optional SOCKS proxy to use for TCP connections to the IMAP
    server (`type=imap` and `type=imaps` only), formatted as
    `PROTOCOL://[USER:PASSWORD@]PROXYHOST[:PROXYPORT]`.
    If `PROXYPORT` is omitted, port 1080 is assumed.
    Only [SOCKSv5][RFC 1928] is supported (with optional
    [username/password authentication][RFC 1929]), in two flavors:
    `socks5://` to resolve *hostname* locally, and `socks5h://` to let
    the proxy resolve *hostname*.

*command*

:   Command to use for `type=tunnel`.  Must speak the [IMAP4rev1
    protocol][RFC 3501] on its standard output, and understand it on its
    standard input.

*STARTTLS*

:   Whether to use the [`STARTTLS`][RFC 2595] directive to upgrade to a
    secure connection.  Setting this to `YES` for a server not
    advertising the `STARTTLS` capability causes `interimap` to
    immediately abort the connection.
    (Ignored for *type*s other than `imap`.  Default: `YES`.)

*auth*

:   Space-separated list of preferred authentication mechanisms.
    `interimap` uses the first mechanism in that list that is also
    advertised (prefixed with `AUTH=`) in the server's capability list.
    Supported authentication mechanisms are `PLAIN` and `LOGIN`.
    (Default: `PLAIN LOGIN`.)

*username*, *password*

:   Username and password to authenticate with.  Can be required for non
    pre-authenticated connections, depending on the chosen
    authentication mechanism.

*compress*

:   Whether to use the [`IMAP COMPRESS` extension][RFC 4978] for servers
    advertising it.
    (Default: `NO` for the `[local]` section, `YES` for the `[remote]`
    section.)

*null-stderr*

:   Whether to redirect *command*'s standard error to `/dev/null` for
    `type=tunnel`.  (Default: `NO`.)

*SSL_protocols*

:   A space-separated list of SSL protocols to enable or disable (if
    prefixed with an exclamation mark `!`).  Known protocols are `SSLv2`,
    `SSLv3`, `TLSv1`, `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`.  Enabling a
    protocol is a short-hand for disabling all other protocols.
    (Default: `!SSLv2 !SSLv3 !TLSv1 !TLSv1.1`, i.e., only enable TLSv1.2
    and above.)

*SSL_cipher_list*

:   The cipher list to send to the server.  Although the server
    determines which cipher suite is used, it should take the first
    supported cipher in the list sent by the client.  See
    [`ciphers`(1ssl)] for more information.

*SSL_fingerprint*

:   Fingerprint of the server certificate's Subject Public Key Info, in
    the form `[ALGO$]DIGEST_HEX` where `ALGO` is the used algorithm (by
    default `sha256`).
    Attempting to connect to a server with a non-matching certificate
    SPKI fingerprint causes `interimap` to abort the connection during
    the SSL/TLS handshake.

    You can use the following command to compute the SHA-256 digest of
    a certificate's Subject Public Key Info:

        openssl x509 -in /path/to/server/certificate.pem -pubkey \
        | openssl pkey -pubin -outform DER \
        | openssl dgst -sha256

*SSL_verify*

:   Whether to verify the server certificate chain.
    Note that using *SSL_fingerprint* to specify the fingerprint of the
    server certificate is an orthogonal authentication measure as it
    ignores the CA chain.
    (Default: `YES`.)

*SSL_CApath*

:   Directory to use for server certificate verification if
    `SSL_verify=YES`.
    This directory must be in “hash format”, see [`verify`(1ssl)] for
    more information.

*SSL_CAfile*

:   File containing trusted certificates to use during server
    certificate authentication if `SSL_verify=YES`.
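
To make the file layout concrete, the Python sketch below embeds a small example configuration and derives the default *database* name from it.
Everything specific in it is an assumption made for illustration: the host name, user name, password, tunnel command and `ignore-mailbox` pattern are hypothetical, `load` and `default_database` are not part of `interimap`, and the synthetic `[__top__]` header exists only because Python's `configparser` cannot read options placed before the first section.

```python
import configparser

EXAMPLE_CONFIG = """\
# Options before the first section header belong to the default section.
list-mailbox = *
ignore-mailbox = ^virtual(?:/|$)

[local]
type = tunnel
command = /usr/lib/dovecot/imap

[remote]
type = imaps
host = imap.example.org
username = alice
password = hypothetical-password
"""


def load(text):
    """Parse the configuration text; '[__top__]' stands for the default section."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep the case of names such as SSL_verify
    parser.read_string("[__top__]\n" + text)
    return parser


def default_database(cfg):
    """One reading of the documented default: HOST.db, or localhost.db
    when the [remote] section uses type=tunnel."""
    remote, local = cfg["remote"], cfg["local"]
    if remote.get("type", fallback="imaps") == "tunnel":
        return "localhost.db"
    host = remote.get("host") or local.get("host") or "localhost"
    return f"{host}.db"
```

For the example above, `default_database(load(EXAMPLE_CONFIG))` yields `imap.example.org.db`, which would be looked up relative to *$XDG_DATA_HOME/interimap*.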

Supported extensions
====================

`interimap` takes advantage of servers supporting the following
extensions to the [IMAP4rev1 protocol][RFC 3501] (those marked as
“recommended” give the most significant performance gain):

 * `LITERAL+` ([RFC 2088], recommended);
 * `MULTIAPPEND` ([RFC 3502], recommended);
 * `COMPRESS=DEFLATE` ([RFC 4978], recommended);
 * `NOTIFY` ([RFC 5465], recommended);
 * `SASL-IR` ([RFC 4959]); and
 * `UNSELECT` ([RFC 3691]).
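
Whether a given server qualifies can be checked against its `CAPABILITY` response.
The following Python sketch assumes the capability list is available as a single space-separated string; `check_capabilities` is a hypothetical helper that reflects the requirements stated in this manual, not `interimap`'s own checks.

```python
# Extensions required according to the Description section.
REQUIRED = ("QRESYNC", "LIST-EXTENDED", "UIDPLUS")
# Optional extensions listed in this section.
OPTIONAL = ("LITERAL+", "MULTIAPPEND", "COMPRESS=DEFLATE",
            "NOTIFY", "SASL-IR", "UNSELECT")


def check_capabilities(capability_line):
    """Return (missing required extensions, usable optional extensions)."""
    caps = {word.upper() for word in capability_line.split()}
    missing = [name for name in REQUIRED if name not in caps]
    # Either LIST-STATUS or NOTIFY is enough.
    if not {"LIST-STATUS", "NOTIFY"} & caps:
        missing.append("LIST-STATUS or NOTIFY")
    usable = [name for name in OPTIONAL if name in caps]
    return missing, usable
```

An empty first element means the server advertises everything needed for synchronization.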

Known bugs and limitations
==========================

 * Using `interimap` on two identical servers with a non-existent or
   empty *database* will duplicate each message due to the absence of
   local ↔ remote UID association.  Hence one needs to manually empty
   the mail store on one end when migrating to `interimap` from another
   synchronization solution.

 * `interimap` is single threaded and doesn't use IMAP command
   pipelining.  Synchronization could be sped up by sending
   independent commands (such as the initial `LIST` and `STATUS`
   commands) to both servers in parallel, and for a given server, by
   sending independent commands (such as flag updates) in a pipeline.

 * Because the [IMAP protocol][RFC 3501] doesn't have a specific
   response code for when a message is moved to another mailbox (either
   using the `MOVE` command from [RFC 6851], or via `COPY` + `STORE` +
   `EXPUNGE`), moving a message causes `interimap` to believe that it
   was deleted while another one (which is replicated again) was added
   to the other mailbox in the meantime.

 * `PLAIN` and `LOGIN` are the only authentication mechanisms currently
   supported.

 * `interimap` will probably not work with non [RFC][RFC 3501]-compliant
   servers.  In particular, no work-around is currently implemented
   besides the tunables in the [configuration file](#configuration-file).
   Moreover, few IMAP servers have been tested so far.

Standards
=========

 * M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas and L. Jones,
   _SOCKS Protocol Version 5_,
   [RFC 1928], March 1996.
 * M. Leech, _Username/Password Authentication for SOCKS V5_,
   [RFC 1929], March 1996.
 * J. Myers, _IMAP4 non-synchronizing literals_,
   [RFC 2088], January 1997.
 * D. Goldsmith and M. Davis,
   _A Mail-Safe Transformation Format of Unicode_,
   [RFC 2152], May 1997.
 * C. Newman, _Using TLS with IMAP, POP3 and ACAP_,
   [RFC 2595], June 1999.
 * M. Crispin, _Internet Message Access Protocol - Version 4rev1_,
   [RFC 3501], March 2003.
 * M. Crispin,
   _Internet Message Access Protocol (IMAP) - `MULTIAPPEND` Extension_,
   [RFC 3502], March 2003.
 * A. Melnikov,
   _Internet Message Access Protocol (IMAP) `UNSELECT` command_,
   [RFC 3691], February 2004.
 * M. Crispin,
   _Internet Message Access Protocol (IMAP) - `UIDPLUS` extension_,
   [RFC 4315], December 2005.
 * A. Melnikov,
   _Synchronization Operations for Disconnected IMAP4 Clients_,
   [RFC 4549], June 2006.
 * A. Gulbrandsen, _The IMAP `COMPRESS` Extension_,
   [RFC 4978], August 2007.
 * R. Siemborski and A. Gulbrandsen, _IMAP Extension for Simple
   Authentication and Security Layer (SASL) Initial Client Response_,
   [RFC 4959], September 2007.
 * A. Gulbrandsen and A. Melnikov,
   _The IMAP `ENABLE` Extension_,
   [RFC 5161], March 2008.
 * B. Leiba and A. Melnikov,
   _Internet Message Access Protocol version 4 - `LIST` Command Extensions_,
   [RFC 5258], June 2008.
 * A. Gulbrandsen, C. King and A. Melnikov,
   _The IMAP `NOTIFY` Extension_,
   [RFC 5465], February 2009.
 * A. Melnikov and T. Sirainen,
   _IMAP4 Extension for Returning `STATUS` Information in Extended LIST_,
   [RFC 5819], March 2010.
 * A. Gulbrandsen and N. Freed,
   _Internet Message Access Protocol (IMAP) - `MOVE` Extension_,
   [RFC 6851], January 2013.
 * A. Melnikov and D. Cridland,
   _IMAP Extensions: Quick Flag Changes Resynchronization (`CONDSTORE`)
   and Quick Mailbox Resynchronization (`QRESYNC`)_,
   [RFC 7162], May 2014.

[RFC 7162]: https://tools.ietf.org/html/rfc7162
[RFC 5258]: https://tools.ietf.org/html/rfc5258
[RFC 5819]: https://tools.ietf.org/html/rfc5819
[RFC 4315]: https://tools.ietf.org/html/rfc4315
[RFC 4549]: https://tools.ietf.org/html/rfc4549
[RFC 2152]: https://tools.ietf.org/html/rfc2152
[RFC 3501]: https://tools.ietf.org/html/rfc3501
[RFC 1928]: https://tools.ietf.org/html/rfc1928
[RFC 1929]: https://tools.ietf.org/html/rfc1929
[RFC 2595]: https://tools.ietf.org/html/rfc2595
[RFC 4978]: https://tools.ietf.org/html/rfc4978
[RFC 2088]: https://tools.ietf.org/html/rfc2088
[RFC 3502]: https://tools.ietf.org/html/rfc3502
[RFC 4959]: https://tools.ietf.org/html/rfc4959
[RFC 3691]: https://tools.ietf.org/html/rfc3691
[RFC 6851]: https://tools.ietf.org/html/rfc6851
[RFC 5161]: https://tools.ietf.org/html/rfc5161
[RFC 5465]: https://tools.ietf.org/html/rfc5465

[INI file]: https://en.wikipedia.org/wiki/INI_file
[PCRE]: https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions
[`ciphers`(1ssl)]: https://www.openssl.org/docs/manmaster/apps/ciphers.html
[`verify`(1ssl)]: https://www.openssl.org/docs/manmaster/apps/verify.html