% pullimap(1)
% [Guilhem Moulin](mailto:guilhem@fripost.org)
% March 2016

Name
====

PullIMAP - Pull mails from an IMAP mailbox and deliver them to an SMTP session

Synopsis
========

`pullimap` [**\-\-config=***FILE*] [**\-\-idle**[**=***SECONDS*]]
[**\-\-no-delivery**] [**\-\-quiet**] *SECTION*

Description
===========

`pullimap` retrieves messages from an IMAP mailbox and delivers them to
an SMTP or LMTP transmission channel.  It can also remove old messages
after a configurable retention period.

A *statefile* is used to keep track of the mailbox's `UIDVALIDITY` and
`UIDNEXT` values.  While `pullimap` is running, the *statefile* is also
used to keep track of UIDs being delivered, which avoids duplicate
deliveries in case the process is interrupted.
See the **[control flow](#control-flow)** section below for details.

Options
=======

`--config=`*FILE*

:   Specify an alternate [configuration file](#configuration-file).
    Relative paths start from *$XDG_CONFIG_HOME/pullimap*, or *~/.config/pullimap*
    if the `XDG_CONFIG_HOME` environment variable is unset.

`--idle`[`=`*seconds*]

:   Don't exit after a successful poll.  Instead, keep the connection open
    and issue `IDLE` commands (requires an IMAP server supporting [RFC
    2177]) to watch for updates in the mailbox.  This also enables
    `SO_KEEPALIVE` on the socket.
    Each `IDLE` command is terminated after at most *seconds* (29
    minutes by default) to avoid being logged out for inactivity.

`--no-delivery`

:   Update the *statefile*, but skip SMTP/LMTP delivery.  This is mostly
    useful for initializing the *statefile* when migrating to `pullimap`
    from another similar program such as [`fetchmail`(1)] or
    [`getmail`(1)].

`-q`, `--quiet`

:   Try to be quiet.

`--debug`

:   Turn on debug mode.  Debug messages, which include all IMAP traffic
    besides literals, are written to the given *logfile*.  The `LOGIN`
    and `AUTHENTICATE` commands are however redacted (in order to avoid
    disclosing authentication credentials) unless the `--debug` flag is
    set multiple times.

`-h`, `--help`

:   Output a brief help and exit.

`--version`

:   Show the version number and exit.

Configuration file  {#configuration-file}
==================

Unless told otherwise by the `--config=FILE` command-line option,
`pullimap` reads its configuration from *$XDG_CONFIG_HOME/pullimap/config*
(or *~/.config/pullimap/config* if the `XDG_CONFIG_HOME` environment variable
is unset) as an [INI file].
The configuration file consists of `OPTION=VALUE` lines grouped under
`[SECTION]` headers; lines starting with a ‘#’ or ‘;’ character are
ignored as comments.
Valid options are:

*statefile*

:   State file to use to keep track of the *mailbox*'s `UIDVALIDITY` and
    `UIDNEXT` values.  Relative paths start from
    *$XDG_DATA_HOME/pullimap*, or *~/.local/share/pullimap* if the
    `XDG_DATA_HOME` environment variable is unset.
    (Default: the parent section name of the option.)

*mailbox*

:   The IMAP mailbox ([UTF-7 encoded][RFC 2152] and unquoted) to pull
    messages from.  Support for persistent message Unique Identifiers
    (UID) is required.  (Default: `INBOX`.)

*deliver-method*

:   `PROTOCOL:[ADDRESS]:PORT` where to deliver messages.  Both
    [SMTP][RFC 5321] and [LMTP][RFC 2033] servers are supported, and
    [SMTP pipelining][RFC 2920] is used when possible.
    (Default: `smtp:[127.0.0.1]:25`.)

*deliver-ehlo*

:   Name to use in `EHLO` or `LHLO` commands.
    (Default: `localhost.localdomain`.)

*deliver-rcpt*

:   Message recipient.  Note that the local part needs to be quoted if it
    contains special characters; see [RFC 5321] for details.
    (Default: the username associated with the effective user ID of the
    `pullimap` process.)

*purge-after*

:   Retention period (in days), after which messages are removed from
    the IMAP server.  (The value is at best 24h accurate due to the IMAP
    `SEARCH` criterion ignoring time and timezone.)
    If *purge-after* is set to `0` then messages are deleted immediately
    after delivery.  Otherwise `pullimap` issues an IMAP `SEARCH` (or
    extended `SEARCH` on servers advertising the [`ESEARCH`][RFC 4731]
    capability) command to list old messages; if `--idle` is set then
    the `SEARCH` command is issued again every 12 hours.

*type*

:   One of `imap`, `imaps` or `tunnel`.
    `type=imap` and `type=imaps` are respectively used for IMAP and IMAP
    over SSL/TLS connections over an INET socket.
    `type=tunnel` causes `pullimap` to create an unnamed pair of
    connected sockets for inter-process communication with a *command*
    instead of opening a network socket.
    (Default: `imaps`.)

*host*

:   Server hostname or IP address, for `type=imap` and `type=imaps`.
    The value can optionally be enclosed in square brackets to force its
    interpretation as an IP literal (hence skip name resolution).
    (Default: `localhost`.)

*port*

:   Server port.
    (Default: `143` for `type=imap`, `993` for `type=imaps`.)

*proxy*

:   Optional SOCKS proxy to use for TCP connections to the IMAP server
    (`type=imap` and `type=imaps` only), formatted as
    `PROTOCOL://[USER:PASSWORD@]PROXYHOST[:PROXYPORT]`.
    If `PROXYPORT` is omitted, it defaults to 1080.
    Only [SOCKSv5][RFC 1928] is supported (with optional
    [username/password authentication][RFC 1929]), in two flavors:
    `socks5://` to resolve *hostname* locally, and `socks5h://` to let
    the proxy resolve *hostname*.

*command*

:   Command to use for `type=tunnel`.  Must speak the [IMAP4rev1
    protocol][RFC 3501] on its standard output, and understand it on its
    standard input.  The value is passed to `` `/bin/sh -c` `` if it
    contains shell metacharacters; otherwise it is split into words and
    the resulting list is passed to `execvp`(3).

*STARTTLS*

:   Whether to use the [`STARTTLS`][RFC 2595] directive to upgrade to a
    secure connection.  Setting this to `YES` for a server not
    advertising the `STARTTLS` capability causes `pullimap` to
    immediately abort the connection.
    (Ignored for *type*s other than `imap`.  Default: `YES`.)

*auth*

:   Space-separated list of preferred authentication mechanisms.
    `pullimap` uses the first mechanism in that list that is also
    advertised (prefixed with `AUTH=`) in the server's capability list.
    Supported authentication mechanisms are `PLAIN` and `LOGIN`.
    (Default: `PLAIN LOGIN`.)

*username*, *password*

:   Username and password to authenticate with.  Can be required for non
    pre-authenticated connections, depending on the chosen
    authentication mechanism.

*compress*

:   Whether to use the [`IMAP COMPRESS` extension][RFC 4978] for servers
    advertising it.  (Default: `YES`.)

*null-stderr*

:   Whether to redirect *command*'s standard error to `/dev/null` for
    `type=tunnel`.  (Default: `NO`.)

*SSL_protocols*

:   Space-separated list of SSL/TLS protocol versions to explicitly
    enable (or disable if prefixed with an exclamation mark `!`).
    Potentially known protocols are `SSLv2`, `SSLv3`, `TLSv1`,
    `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`, depending on the OpenSSL
    version used.
    Enabling a protocol is a short-hand for disabling all other
    protocols.

    *DEPRECATED*: Use *SSL_protocol_min* and/or *SSL_protocol_max*
    instead.

*SSL_protocol_min*, *SSL_protocol_max*

:   Set the minimum and maximum SSL/TLS protocol version to use for the
    connection.  Potentially recognized values are `SSLv3`, `TLSv1`,
    `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`, depending on the OpenSSL
    version used.

*SSL_cipherlist*, *SSL_ciphersuites*

:   Set the cipher list for TLSv1.2 and below, and the TLSv1.3 cipher suites.
    The combination of these lists is sent to the server, which then
    determines which cipher to use (normally the first supported one
    from the list sent by the client).  The default suites depend on the
    OpenSSL version and its configuration, see [`ciphers`(1ssl)] for
    more information.

*SSL_fingerprint*

:   Space-separated list of acceptable fingerprints for the server
    certificate's Subject Public Key Info, in the form
    `[ALGO$]DIGEST_HEX` where `ALGO` is the digest algorithm (by default
    `sha256`).
    Attempting to connect to a server with a non-matching certificate
    SPKI fingerprint causes `pullimap` to abort the connection during
    the SSL/TLS handshake.
    The following command can be used to compute the SHA-256 digest of a
    certificate's Subject Public Key Info:

        $ openssl x509 -in /path/to/server/certificate.pem -pubkey \
            | openssl pkey -pubin -outform DER \
            | openssl dgst -sha256

    Specifying multiple digest values can be useful in key rollover
    scenarios and/or when the server supports certificates of different
    types (for instance a dual-cert RSA/ECDSA setup).  In that case the
    connection is aborted when none of the specified digests matches.

*SSL_verify*

:   Whether to 1/ verify the server certificate chain; and 2/ match its
    Subject Alternative Name (SAN) or Subject CommonName (CN) against
    the value of the *host* option.
    (Default: `YES`.)

    Note that using *SSL_fingerprint* to specify the fingerprint of the
    server certificate provides an independent server authentication
    measure as it directly pins its key material and ignores its chain of
    trust.

*SSL_CAfile*

:   File containing trusted certificates to use during server
    certificate verification when `SSL_verify=YES`.

    Trusted CA certificates are loaded from the default system locations
    unless one (or both) of *SSL_CAfile* or *SSL_CApath* is set.

*SSL_CApath*

:   Directory to use for server certificate verification when
    `SSL_verify=YES`.
    This directory must be in “hash format”, see [`verify`(1ssl)] for
    more information.

    Trusted CA certificates are loaded from the default system locations
    unless one (or both) of *SSL_CAfile* or *SSL_CApath* is set.

*SSL_hostname*

:   Name to use for the TLS SNI (Server Name Indication) extension.  The
    default value is taken from the *host* option when it is a hostname,
    and is the empty string when it is an IP literal.
    Setting *SSL_hostname* to the empty string explicitly disables SNI.
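
To make the layout of the configuration file concrete, the following
Python sketch parses a small sample with the standard `configparser`
module.  It is only an illustration and not how `pullimap` itself reads
its configuration: the section name `private`, the host name, the
credentials and the helper names `load_section` and
`parse_deliver_method` are all made up, and only the option names and
default values are taken from the descriptions above.

```python
import configparser
import re

# Hypothetical sample configuration; every value here is a placeholder.
SAMPLE = """\
# lines starting with '#' or ';' are ignored as comments
[private]
type=imaps
host=imap.example.net
username=alice
password=not-a-real-password
deliver-method=lmtp:[127.0.0.1]:2424
purge-after=90
"""

# A few of the defaults documented above, for options left unset.
DEFAULTS = {
    "mailbox": "INBOX",
    "deliver-method": "smtp:[127.0.0.1]:25",
    "deliver-ehlo": "localhost.localdomain",
    "type": "imaps",
    "host": "localhost",
}


def load_section(text, section):
    """Return the options of one section merged over the defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of names such as SSL_verify
    parser.read_string(text)
    options = dict(DEFAULTS)
    options.update(parser[section])
    # The statefile option defaults to the name of its parent section.
    options.setdefault("statefile", section)
    return options


def parse_deliver_method(value):
    """Split PROTOCOL:[ADDRESS]:PORT into its three components."""
    # Assumption: the protocol is spelled in lowercase, as in the default.
    match = re.fullmatch(r"(smtp|lmtp):\[([^\]]+)\]:(\d+)", value)
    if match is None:
        raise ValueError(f"invalid deliver-method: {value!r}")
    protocol, address, port = match.groups()
    return protocol, address, int(port)


options = load_section(SAMPLE, "private")
print(options["mailbox"], options["statefile"])
print(parse_deliver_method(options["deliver-method"]))
```

With such a file in place, running `pullimap private` would select the
`[private]` section, since the section name is given as the *SECTION*
argument shown in the synopsis.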

Control flow  {#control-flow}
============

`pullimap` opens the *statefile* corresponding to a given configuration
*SECTION* with `O_DSYNC` to ensure that written data is flushed to the
underlying hardware by the time [`write`(2)] returns.  Moreover an
exclusive lock is placed on the file descriptor immediately after
opening to prevent multiple `pullimap` processes from accessing the
*statefile* concurrently.
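
The effect of this opening sequence can be reproduced with a short
Python sketch.  It assumes a Unix-like system and an `flock`(2)-style
lock; the locking primitive, the file mode `0600` and the helper name
`open_statefile` are assumptions made for the example, not details
taken from `pullimap`.

```python
import fcntl
import os
import tempfile


def open_statefile(path):
    """Open (or create) a statefile with O_DSYNC and lock it exclusively."""
    # O_DSYNC is not available on every platform, hence the fallback to 0.
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        # LOCK_NB makes a second locker fail at once instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "statefile")
    first = open_statefile(path)
    try:
        second = open_statefile(path)
        os.close(second)
        second_locked = True
    except BlockingIOError:
        # The first descriptor still holds the exclusive lock.
        second_locked = False
    os.close(first)

print("second lock acquired:", second_locked)
```

The second call stands in for a concurrent `pullimap` process using the
same *statefile*: it is refused for as long as the first descriptor
remains open.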

Each *statefile* consists of a series of 32-bit big-endian integers.
Usually there are only two integers: the first is the *mailbox*'s
`UIDVALIDITY` value, and the second is the *mailbox*'s last seen
`UIDNEXT` value (`pullimap` then assumes that all messages with UID
smaller than this `UIDNEXT` value have already been retrieved and
delivered).
The [IMAP4rev1 specification][RFC 3501] does not guarantee that untagged
`FETCH` responses are sent ordered by UID in response to a `UID FETCH`
command.  Thus it would be unsafe for `pullimap` to update the `UIDNEXT`
value in its *statefile* while the `UID FETCH` command is in progress.
Instead, for each untagged `FETCH` response received while the `UID
FETCH` command is in progress, `pullimap` delivers the message `RFC822`
body to the SMTP or LMTP server (specified with *deliver-method*) then
appends the message UID to the *statefile*.
When the `UID FETCH` command eventually terminates, `pullimap` updates
the `UIDNEXT` value in the *statefile* and truncates the file down to 8
bytes.  Keeping track of message UIDs as they are received avoids
duplicate deliveries in the event of a crash or connection loss while the
`UID FETCH` command is in progress.
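
As an illustration of this layout, the following Python sketch decodes
the content of a *statefile*.  The integers are assumed to be unsigned
(IMAP UIDs are non-zero 32-bit values), and the function name
`decode_statefile` as well as the sample values are hypothetical.

```python
import struct


def decode_statefile(data):
    """Decode statefile bytes into (uidvalidity, uidnext, pending_uids)."""
    if len(data) < 8 or len(data) % 4 != 0:
        raise ValueError("expected at least two 32-bit integers")
    # ">" selects big-endian byte order, "I" an unsigned 32-bit integer.
    values = struct.unpack(f">{len(data) // 4}I", data)
    uidvalidity, uidnext = values[:2]
    return uidvalidity, uidnext, list(values[2:])


# Hypothetical content: UIDVALIDITY 1, UIDNEXT 42, followed by two UIDs
# that were delivered before an interrupted UID FETCH could complete.
sample = struct.pack(">4I", 1, 42, 42, 44)
print(decode_statefile(sample))
```

A file holding more than 8 bytes therefore indicates that a previous run
was interrupted, and the trailing integers are the UIDs that were
already delivered.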

In more detail, `pullimap` works as follows:

 1. Issue a `UID FETCH` command to retrieve message `ENVELOPE` and
    `RFC822` (and `UID`) with UID greater than or equal to the `UIDNEXT`
    value found in the *statefile*.
    While the `UID FETCH` command is in progress, perform the following
    for each untagged `FETCH` response sent by the server:

     i. if no SMTP/LMTP transmission channel was opened, open one to the
        server specified with *deliver-method* and send an `EHLO` (or
        `LHLO`) command with the domain specified by *deliver-ehlo* (the
        channel is kept open and shared for all messages retrieved while
        the `UID FETCH` IMAP command is in progress);

     i. perform a mail transaction (using [SMTP pipelining][RFC 2920] if
        possible) to deliver the retrieved message `RFC822` body to the
        SMTP or LMTP session; and

     i. append the message UID to the *statefile*.

 2. If an SMTP/LMTP transmission channel was opened, send a `QUIT` command
    to terminate it gracefully.

 3. Issue a `UID STORE` command to mark all retrieved messages (and
    stalled UIDs found in the *statefile* after the eighth byte) as
    `\Seen`.

 4. Update the *statefile* with the new UIDNEXT value (bytes 5-8).

 5. Truncate the *statefile* down to 8 bytes (so that it contains only
    two 32-bit integers, respectively the *mailbox*'s current
    `UIDVALIDITY` and `UIDNEXT` values).

 6. If `--idle` was set, issue an `IDLE` command; stop idling and go
    back to step 1 when a new message is received (or when the `IDLE`
    timeout expires).
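
The *statefile* bookkeeping performed in steps 1, 4 and 5 can be
simulated with a few lines of Python.  This is a sketch under stated
assumptions and not the actual implementation: it needs a Unix-like
system, leaves out the IMAP and SMTP/LMTP traffic entirely, and the
helper names, the UIDs and the new `UIDNEXT` value `46` are invented
for the example.

```python
import os
import struct
import tempfile


def append_uid(fd, uid):
    """Step 1: record a delivered message UID at the end of the file."""
    os.lseek(fd, 0, os.SEEK_END)
    os.write(fd, struct.pack(">I", uid))


def finish_fetch(fd, uidnext):
    """Steps 4 and 5: store the new UIDNEXT, then drop the recorded UIDs."""
    os.pwrite(fd, struct.pack(">I", uidnext), 4)  # bytes 5-8
    os.ftruncate(fd, 8)


def read_all(fd):
    """Return every 32-bit integer currently stored in the file."""
    size = os.fstat(fd).st_size
    return list(struct.unpack(f">{size // 4}I", os.pread(fd, size, 0)))


fd, path = tempfile.mkstemp()
try:
    os.write(fd, struct.pack(">2I", 1, 42))  # UIDVALIDITY=1, UIDNEXT=42
    for uid in (43, 42, 45):  # FETCH responses may arrive out of UID order
        append_uid(fd, uid)
    during_fetch = read_all(fd)
    finish_fetch(fd, 46)
    after_fetch = read_all(fd)
finally:
    os.close(fd)
    os.remove(path)

print(during_fetch)
print(after_fetch)
```

If the process stopped before `finish_fetch` was reached, the three
recorded UIDs would still be present on the next run, which is what
allows already delivered messages to be recognized.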

Standards
=========

 * M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas and L. Jones,
   _SOCKS Protocol Version 5_,
   [RFC 1928], March 1996.
 * M. Leech, _Username/Password Authentication for SOCKS V5_,
   [RFC 1929], March 1996.
 * J. Myers, _Local Mail Transfer Protocol_,
   [RFC 2033], October 1996.
 * J. Myers, _IMAP4 non-synchronizing literals_,
   [RFC 2088], January 1997.
 * D. Goldsmith and M. Davis,
   _A Mail-Safe Transformation Format of Unicode_,
   [RFC 2152], May 1997.
 * B. Leiba, _IMAP4 `IDLE` command_,
   [RFC 2177], June 1997.
 * C. Newman, _Using TLS with IMAP, POP3 and ACAP_,
   [RFC 2595], June 1999.
 * N. Freed, _SMTP Service Extension for Command Pipelining_,
   [RFC 2920], September 2000.
 * M. Crispin, _Internet Message Access Protocol - Version 4rev1_,
   [RFC 3501], March 2003.
 * M. Crispin,
   _Internet Message Access Protocol (IMAP) - `UIDPLUS` extension_,
   [RFC 4315], December 2005.
 * A. Melnikov and D. Cridland, _IMAP4 Extension to SEARCH Command for
   Controlling What Kind of Information Is Returned_,
   [RFC 4731], November 2006.
 * A. Gulbrandsen, _The IMAP `COMPRESS` Extension_,
   [RFC 4978], August 2007.
 * R. Siemborski and A. Gulbrandsen, _IMAP Extension for Simple
   Authentication and Security Layer (SASL) Initial Client Response_,
   [RFC 4959], September 2007.
 * J. Klensin, _Simple Mail Transfer Protocol_,
   [RFC 5321], October 2008.

[RFC 4315]: https://tools.ietf.org/html/rfc4315
[RFC 2177]: https://tools.ietf.org/html/rfc2177
[RFC 2595]: https://tools.ietf.org/html/rfc2595
[RFC 4959]: https://tools.ietf.org/html/rfc4959
[RFC 2152]: https://tools.ietf.org/html/rfc2152
[RFC 2088]: https://tools.ietf.org/html/rfc2088
[RFC 5321]: https://tools.ietf.org/html/rfc5321
[RFC 2033]: https://tools.ietf.org/html/rfc2033
[RFC 2920]: https://tools.ietf.org/html/rfc2920
[RFC 3501]: https://tools.ietf.org/html/rfc3501
[RFC 4978]: https://tools.ietf.org/html/rfc4978
[RFC 1928]: https://tools.ietf.org/html/rfc1928
[RFC 1929]: https://tools.ietf.org/html/rfc1929
[RFC 4731]: https://tools.ietf.org/html/rfc4731

[INI file]: https://en.wikipedia.org/wiki/INI_file
[`fetchmail`(1)]: https://www.fetchmail.info/
[`getmail`(1)]: http://pyropus.ca/software/getmail/
[`write`(2)]: https://man7.org/linux/man-pages/man2/write.2.html
[`ciphers`(1ssl)]: https://www.openssl.org/docs/manmaster/man1/openssl-ciphers.html
[`verify`(1ssl)]: https://www.openssl.org/docs/manmaster/man1/openssl-verify.html