% pullimap(1)
% [Guilhem Moulin](mailto:guilhem@fripost.org)
% March 2016

Name
====

PullIMAP - Pull mails from an IMAP mailbox and deliver them to an SMTP session

Synopsis
========

`pullimap` [**\-\-config=***FILE*] [**\-\-idle**[**=***SECONDS*]]
[**\-\-no-delivery**] [**\-\-quiet**] *SECTION*

Description
===========

`pullimap` retrieves messages from an IMAP mailbox and delivers them to
an SMTP or LMTP transmission channel.  It can also remove old messages
after a configurable retention period.

A *statefile* is used to keep track of the mailbox's `UIDVALIDITY` and
`UIDNEXT` values.  While `pullimap` is running, the *statefile* is also
used to keep track of UIDs being delivered, which avoids duplicate
deliveries in case the process is interrupted.
See the **[control flow](#control-flow)** section below for details.

Options
=======

`--config=`*FILE*

:   Specify an alternate [configuration file](#configuration-file).
    Relative paths start from *$XDG_CONFIG_HOME/pullimap*, or *~/.config/pullimap*
    if the `XDG_CONFIG_HOME` environment variable is unset.

`--idle`[`=`*seconds*]

:   Don't exit after a successful poll.  Instead, keep the connection open
    and issue `IDLE` commands (requires an IMAP server supporting [RFC
    2177]) to watch for updates in the mailbox.  This also enables
    `SO_KEEPALIVE` on the socket.
    Each `IDLE` command is terminated after at most *seconds* (29
    minutes by default) to avoid being logged out for inactivity.

`--no-delivery`

:   Update the *statefile*, but skip SMTP/LMTP delivery.  This is mostly
    useful for initializing the *statefile* when migrating to `pullimap`
    from another similar program such as [`fetchmail`(1)] or
    [`getmail`(1)].

`-q`, `--quiet`

:   Try to be quiet.

`--debug`

:   Turn on debug mode.  Debug messages, which include all IMAP traffic
    besides literals, are written to the given *logfile*.  The `LOGIN`
    and `AUTHENTICATE` commands are however redacted (in order to avoid
    disclosing authentication credentials) unless the `--debug` flag is
    set multiple times.

`-h`, `--help`

:   Output a brief help and exit.

`--version`

:   Show the version number and exit.
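
The path lookup described for `--config` can be sketched as follows.
This is only an illustration in Python, not `pullimap`'s own code: the
function name `resolve_config_path` is hypothetical, and treating an empty
`XDG_CONFIG_HOME` like an unset one, as well as using absolute paths
unchanged, are assumptions.

```python
import os


def resolve_config_path(arg=None, environ=None):
    """Return the configuration file path for an optional --config value."""
    environ = os.environ if environ is None else environ
    # Base directory: $XDG_CONFIG_HOME/pullimap, else ~/.config/pullimap.
    # Assumption: an empty variable is handled like an unset one.
    xdg = environ.get("XDG_CONFIG_HOME")
    if not xdg:
        home = environ.get("HOME") or os.path.expanduser("~")
        xdg = os.path.join(home, ".config")
    base = os.path.join(xdg, "pullimap")
    if arg is None:
        # No --config given: fall back to the default file name.
        return os.path.join(base, "config")
    # os.path.join() discards `base` when `arg` is absolute, so only
    # relative paths end up under the base directory.
    return os.path.join(base, arg)
```

For instance, `resolve_config_path("alt.conf", {"XDG_CONFIG_HOME": "/x"})`
yields `/x/pullimap/alt.conf` on a POSIX system.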

Configuration file
==================

Unless told otherwise by the `--config=FILE` command-line option,
`pullimap` reads its configuration from *$XDG_CONFIG_HOME/pullimap/config*
(or *~/.config/pullimap/config* if the `XDG_CONFIG_HOME` environment variable
is unset) as an [INI file].
The syntax of the configuration file is a series of `OPTION=VALUE`
lines organized under some `[SECTION]`; lines starting with a ‘#’ or
‘;’ character are ignored as comments.
Valid options are:

*statefile*

:   State file to use to keep track of the *mailbox*'s `UIDVALIDITY` and
    `UIDNEXT` values.  Relative paths start from
    *$XDG_DATA_HOME/pullimap*, or *~/.local/share/pullimap* if the
    `XDG_DATA_HOME` environment variable is unset.
    (Default: the parent section name of the option.)

*mailbox*

:   The IMAP mailbox ([UTF-7 encoded][RFC 2152] and unquoted) to pull
    messages from.  Support for persistent message Unique Identifiers
    (UID) is required.  (Default: `INBOX`.)

*deliver-method*

:   `PROTOCOL:[ADDRESS]:PORT` where to deliver messages.  Both
    [SMTP][RFC 5321] and [LMTP][RFC 2033] servers are supported, and
    [SMTP pipelining][RFC 2920] is used when possible.
    (Default: `smtp:[127.0.0.1]:25`.)

*deliver-ehlo*

:   Hostname to use in `EHLO` or `LHLO` commands.
    (Default: `localhost.localdomain`.)

*deliver-rcpt*

:   Message recipient.  Note that the local part needs to be quoted if it
    contains special characters; see [RFC 5321] for details.
    (Default: the username associated with the effective uid of the
    `pullimap` process.)

*purge-after*

:   Retention period (in days), after which messages are removed from
    the IMAP server.  (The value is at best 24h accurate due to the IMAP
    `SEARCH` criterion ignoring time and timezone.)
    If *purge-after* is set to `0` then messages are deleted immediately
    after delivery.  Otherwise `pullimap` issues an IMAP `SEARCH` (or
    extended `SEARCH` on servers advertising the [`ESEARCH`][RFC 4731]
    capability) command to list old messages; if `--idle` is set then
    the `SEARCH` command is issued again every 12 hours.

*type*

:   One of `imap`, `imaps` or `tunnel`.
    `type=imap` and `type=imaps` are respectively used for IMAP and IMAP
    over SSL/TLS connections over an INET socket.
    `type=tunnel` causes `pullimap` to create an unnamed pair of
    connected sockets for interprocess communication with a *command*
    instead of opening a network socket.
    (Default: `imaps`.)

*host*

:   Server hostname, for `type=imap` and `type=imaps`.
    (Default: `localhost`.)

*port*

:   Server port.
    (Default: `143` for `type=imap`, `993` for `type=imaps`.)

*proxy*

:   An optional SOCKS proxy to use for TCP connections to the IMAP
    server (`type=imap` and `type=imaps` only), formatted as
    `PROTOCOL://[USER:PASSWORD@]PROXYHOST[:PROXYPORT]`.
    If `PROXYPORT` is omitted, it defaults to 1080.
    Only [SOCKSv5][RFC 1928] is supported (with optional
    [username/password authentication][RFC 1929]), in two flavors:
    `socks5://` to resolve *hostname* locally, and `socks5h://` to let
    the proxy resolve *hostname*.

*command*

:   Command to use for `type=tunnel`.  Must speak the [IMAP4rev1
    protocol][RFC 3501] on its standard output, and understand it on its
    standard input.  The value is passed to `` `/bin/sh -c` `` if it
    contains shell metacharacters; otherwise it is split into words and
    the resulting list is passed to `execvp`(3).

*STARTTLS*

:   Whether to use the [`STARTTLS`][RFC 2595] directive to upgrade to a
    secure connection.  Setting this to `YES` for a server not
    advertising the `STARTTLS` capability causes `pullimap` to
    immediately abort the connection.
    (Ignored for *type*s other than `imap`.  Default: `YES`.)

*auth*

:   Space-separated list of preferred authentication mechanisms.
    `pullimap` uses the first mechanism in that list that is also
    advertised (prefixed with `AUTH=`) in the server's capability list.
    Supported authentication mechanisms are `PLAIN` and `LOGIN`.
    (Default: `PLAIN LOGIN`.)

*username*, *password*

:   Username and password to authenticate with.  Can be required for non
    pre-authenticated connections, depending on the chosen
    authentication mechanism.

*compress*

:   Whether to use the [`IMAP COMPRESS` extension][RFC 4978] for servers
    advertising it.  (Default: `YES`.)

*null-stderr*

:   Whether to redirect *command*'s standard error to `/dev/null` for
    `type=tunnel`.  (Default: `NO`.)

*SSL_protocols*

:   A space-separated list of SSL protocols to enable or disable (if
    prefixed with an exclamation mark `!`).  Known protocols are `SSLv2`,
    `SSLv3`, `TLSv1`, `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`.  Enabling a
    protocol is a short-hand for disabling all other protocols.
    (Default: `!SSLv2 !SSLv3 !TLSv1 !TLSv1.1`, i.e., only enable TLSv1.2
    and above.)

*SSL_cipher_list*

:   The cipher list to send to the server.  Although the server
    determines which cipher suite is used, it should take the first
    supported cipher in the list sent by the client.  See
    [`ciphers`(1ssl)] for more information.

*SSL_fingerprint*

:   Fingerprint of the server certificate's Subject Public Key Info, in
    the form `[ALGO$]DIGEST_HEX` where `ALGO` is the used algorithm (by
    default `sha256`).
    Attempting to connect to a server with a non-matching certificate
    SPKI fingerprint causes `pullimap` to abort the connection during
    the SSL/TLS handshake.
    The following command can be used to compute the SHA-256 digest of a
    certificate's Subject Public Key Info:

        openssl x509 -in /path/to/server/certificate.pem -pubkey \
        | openssl pkey -pubin -outform DER \
        | openssl dgst -sha256

*SSL_verify*

:   Whether to verify the server certificate chain.
    Note that using *SSL_fingerprint* to specify the fingerprint of the
    server certificate is an orthogonal authentication measure as it
    ignores the CA chain.
    (Default: `YES`.)

*SSL_CApath*

:   Directory to use for server certificate verification if
    `SSL_verify=YES`.
    This directory must be in “hash format”; see [`verify`(1ssl)] for
    more information.

*SSL_CAfile*

:   File containing trusted certificates to use during server
    certificate authentication if `SSL_verify=YES`.
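
To make the file layout concrete, the sketch below embeds a small
configuration and reads it with Python's `configparser`.  It is an
illustration only: the section name `work`, the host, and the credentials
are made up, `load_section` is a hypothetical helper, and `pullimap` itself
does not parse its configuration this way.  Only the defaults with fixed
values listed above are filled in.

```python
import configparser

# Hypothetical configuration: section name and values are placeholders.
SAMPLE = """\
; lines starting with ';' or '#' are comments
[work]
type=imaps
host=imap.example.org
username=alice
password=correct-horse
deliver-method=lmtp:[127.0.0.1]:24
purge-after=30
"""

# Fixed defaults taken from the option descriptions above.
DEFAULTS = {
    "mailbox": "INBOX",
    "deliver-method": "smtp:[127.0.0.1]:25",
    "deliver-ehlo": "localhost.localdomain",
    "type": "imaps",
    "host": "localhost",
    "STARTTLS": "YES",
    "auth": "PLAIN LOGIN",
    "compress": "YES",
    "null-stderr": "NO",
    "SSL_verify": "YES",
}


def load_section(text, section):
    """Return the options of one [SECTION], with defaults filled in."""
    parser = configparser.ConfigParser(
        delimiters=("=",),            # values such as smtp:[...]:25 contain ':'
        comment_prefixes=("#", ";"),
        interpolation=None,           # keep '%' in passwords literal
    )
    parser.optionxform = str          # keep option names case-sensitive
    parser.read_string(text)
    options = dict(DEFAULTS)
    options.update(dict(parser[section].items()))
    # The default port depends on the connection type.
    if "port" not in options and options["type"] in ("imap", "imaps"):
        options["port"] = "143" if options["type"] == "imap" else "993"
    # The statefile defaults to the name of the enclosing section.
    options.setdefault("statefile", section)
    return options
```

With this sample, `load_section(SAMPLE, "work")` reports port `993` and a
*statefile* named `work`, since neither is set explicitly.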

Control flow
============

`pullimap` opens the *statefile* corresponding to a given configuration
*SECTION* with `O_DSYNC` to ensure that written data is flushed to the
underlying hardware by the time [`write`(2)] returns.  Moreover an
exclusive lock is placed on the file descriptor immediately after
opening to prevent multiple `pullimap` processes from accessing the
*statefile* concurrently.
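
A minimal sketch of this open-then-lock pattern, in Python for
illustration.  The helper name `open_statefile`, the `0600` file mode, and
the use of `flock`(2) with a non-blocking lock are assumptions; the text
above does not say which locking call `pullimap` uses.

```python
import fcntl
import os


def open_statefile(path):
    """Open (creating if needed) a statefile for synchronous writes and lock it."""
    # With O_DSYNC, write() returns only once the data has reached the
    # underlying storage.  getattr() keeps the sketch importable on
    # platforms lacking the flag.
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        # Exclusive, non-blocking lock: fail at once if another process
        # already holds the statefile.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd
```

A second call on the same path raises `OSError` for as long as the first
descriptor stays open, which is the behavior that keeps two processes from
sharing one *statefile*.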

Each *statefile* consists of a series of 32-bit big-endian integers.
Usually there are only two integers: the first is the *mailbox*'s
`UIDVALIDITY` value, and the second is the *mailbox*'s last seen
`UIDNEXT` value (`pullimap` then assumes that all messages with UID
smaller than this `UIDNEXT` value have already been retrieved and
delivered).
The [IMAP4rev1 specification][RFC 3501] does not guarantee that untagged
`FETCH` responses are sent ordered by UID in response to a `UID FETCH`
command.  Thus it would be unsafe for `pullimap` to update the `UIDNEXT`
value in its *statefile* while the `UID FETCH` command is in progress.
Instead, for each untagged `FETCH` response received while the `UID
FETCH` command is in progress, `pullimap` delivers the message `RFC822`
body to the SMTP or LMTP server (specified with *deliver-method*) then
appends the message UID to the *statefile*.
When the `UID FETCH` command eventually terminates, `pullimap` updates
the `UIDNEXT` value in the *statefile* and truncates the file down to 8
bytes.  Keeping track of message UIDs as they are received avoids
duplicate deliveries in the event of a crash or connection loss while the
`UID FETCH` command is in progress.
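
The layout described above can be decoded in a few lines.  The Python
sketch below is illustrative only: `parse_statefile` is a hypothetical
name, and reading the integers as unsigned is an assumption based on UIDs
being unsigned 32-bit values in [RFC 3501].  An empty file (for instance
before the first run) is not handled.

```python
import struct


def parse_statefile(data):
    """Split raw statefile bytes into (uidvalidity, uidnext, pending_uids)."""
    if len(data) < 8 or len(data) % 4 != 0:
        raise ValueError("expected at least two 32-bit integers")
    # '>' selects big-endian, 'I' an unsigned 32-bit integer.
    values = struct.unpack(">%dI" % (len(data) // 4), data)
    # Anything after the first 8 bytes is a UID appended while a
    # UID FETCH command was in progress.
    return values[0], values[1], list(values[2:])
```

A file holding exactly 8 bytes yields an empty list of pending UIDs,
which is the usual state between polls.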

In more detail, `pullimap` works as follows:

 1. Issue a `UID FETCH` command to retrieve message `ENVELOPE` and
    `RFC822` (and `UID`) with UID greater than or equal to the `UIDNEXT`
    value found in the *statefile*.
    While the `UID FETCH` command is in progress, perform the following
    for each untagged `FETCH` response sent by the server:

     i. if no SMTP/LMTP transmission channel was opened, open one to the
        server specified with *deliver-method* and send an `EHLO` (or
        `LHLO`) command with the domain specified by *deliver-ehlo* (the
        channel is kept open and shared for all messages retrieved while
        the `UID FETCH` IMAP command is in progress);

     i. perform a mail transaction (using [SMTP pipelining][RFC 2920] if
        possible) to deliver the retrieved message `RFC822` body to the
        SMTP or LMTP session; and

     i. append the message UID to the *statefile*.

 2. If an SMTP/LMTP transmission channel was opened, send a `QUIT` command
    to terminate it gracefully.

 3. Issue a `UID STORE` command to mark all retrieved messages (and
    stalled UIDs found in the *statefile* after the eighth byte) as
    `\Seen`.

 4. Update the *statefile* with the new UIDNEXT value (bytes 5-8).

 5. Truncate the *statefile* down to 8 bytes (so that it contains only
    two 32-bit integers, respectively the *mailbox*'s current
    `UIDVALIDITY` and `UIDNEXT` values).

 6. If `--idle` was set, issue an `IDLE` command; stop idling and go
    back to step 1 when a new message is received (or when the `IDLE`
    timeout expires).
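
The bookkeeping in steps 1, 4 and 5 can be simulated without any network
access.  The Python sketch below is a simplified model, not `pullimap`'s
implementation: `pull_once`, `fetched` and `deliver` are hypothetical
names, step 3 (`UID STORE`) is left out, skipping UIDs already recorded in
the *statefile* is an assumption about how duplicates are avoided, and the
new `UIDNEXT` is assumed to be the highest delivered UID plus one.

```python
import io
import struct


def pull_once(state, fetched, deliver):
    """Run one poll cycle against `state`, a seekable binary file object.

    `fetched` stands in for the untagged FETCH responses, as (uid, body)
    pairs in server order; `deliver` stands in for the SMTP/LMTP
    mail transaction.
    """
    state.seek(0)
    raw = state.read()
    uidvalidity, uidnext = struct.unpack(">2I", raw[:8])
    # UIDs found after byte 8 were delivered by an interrupted run.
    done = set(struct.unpack(">%dI" % ((len(raw) - 8) // 4), raw[8:]))
    for uid, body in fetched:
        if uid < uidnext or uid in done:
            continue                          # already delivered
        deliver(body)                         # step 1.ii
        state.seek(0, io.SEEK_END)
        state.write(struct.pack(">I", uid))   # step 1.iii
        done.add(uid)
    if done:
        # Assumption: next UID to fetch is the highest delivered UID + 1.
        uidnext = max(uidnext, max(done) + 1)
    state.seek(4)
    state.write(struct.pack(">I", uidnext))   # step 4: bytes 5-8
    state.truncate(8)                         # step 5
    return uidnext
```

Feeding it an `io.BytesIO` that already lists a UID after the eighth byte
shows the recovery path: that message is skipped, the others are handed to
`deliver`, and the buffer ends up 8 bytes long again.  An in-memory buffer
does not model the `O_DSYNC` durability guarantee.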

Standards
=========

 * M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas and L. Jones,
   _SOCKS Protocol Version 5_,
   [RFC 1928], March 1996.
 * M. Leech, _Username/Password Authentication for SOCKS V5_,
   [RFC 1929], March 1996.
 * J. Myers, _Local Mail Transfer Protocol_,
   [RFC 2033], October 1996.
 * J. Myers, _IMAP4 non-synchronizing literals_,
   [RFC 2088], January 1997.
 * D. Goldsmith and M. Davis,
   _A Mail-Safe Transformation Format of Unicode_,
   [RFC 2152], May 1997.
 * B. Leiba, _IMAP4 `IDLE` command_,
   [RFC 2177], June 1997.
 * C. Newman, _Using TLS with IMAP, POP3 and ACAP_,
   [RFC 2595], June 1999.
 * N. Freed, _SMTP Service Extension for Command Pipelining_,
   [RFC 2920], September 2000.
 * M. Crispin, _Internet Message Access Protocol - Version 4rev1_,
   [RFC 3501], March 2003.
 * M. Crispin,
   _Internet Message Access Protocol (IMAP) - `UIDPLUS` extension_,
   [RFC 4315], December 2005.
 * A. Melnikov and D. Cridland, _IMAP4 Extension to SEARCH Command for
   Controlling What Kind of Information Is Returned_,
   [RFC 4731], November 2006.
 * R. Siemborski and A. Gulbrandsen, _IMAP Extension for Simple
   Authentication and Security Layer (SASL) Initial Client Response_,
   [RFC 4959], September 2007.
 * A. Gulbrandsen, _The IMAP `COMPRESS` Extension_,
   [RFC 4978], August 2007.
 * J. Klensin, _Simple Mail Transfer Protocol_,
   [RFC 5321], October 2008.

[RFC 4315]: https://tools.ietf.org/html/rfc4315
[RFC 2177]: https://tools.ietf.org/html/rfc2177
[RFC 2595]: https://tools.ietf.org/html/rfc2595
[RFC 4959]: https://tools.ietf.org/html/rfc4959
[RFC 2152]: https://tools.ietf.org/html/rfc2152
[RFC 2088]: https://tools.ietf.org/html/rfc2088
[RFC 5321]: https://tools.ietf.org/html/rfc5321
[RFC 2033]: https://tools.ietf.org/html/rfc2033
[RFC 2920]: https://tools.ietf.org/html/rfc2920
[RFC 3501]: https://tools.ietf.org/html/rfc3501
[RFC 4978]: https://tools.ietf.org/html/rfc4978
[RFC 1928]: https://tools.ietf.org/html/rfc1928
[RFC 1929]: https://tools.ietf.org/html/rfc1929
[RFC 4731]: https://tools.ietf.org/html/rfc4731

[INI file]: https://en.wikipedia.org/wiki/INI_file
[`fetchmail`(1)]: http://www.fetchmail.info/
[`getmail`(1)]: http://pyropus.ca/software/getmail/
[`write`(2)]: http://man7.org/linux/man-pages/man2/write.2.html
[`ciphers`(1ssl)]: https://www.openssl.org/docs/manmaster/apps/ciphers.html
[`verify`(1ssl)]: https://www.openssl.org/docs/manmaster/apps/verify.html