Internet Engineering Task Force (IETF)                  B. Gondwana, Ed.
Request for Comments: 8474                                      FastMail
Updates: 3501                                             September 2018
Category: Standards Track
ISSN: 2070-1721
IMAP Extension for Object Identifiers
Abstract
This document updates RFC 3501 (IMAP4rev1) with persistent identifiers on mailboxes and messages to allow clients to more efficiently reuse cached data when resources have changed location on the server.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8474.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
IMAP stores are often used by many clients. Each client may cache data from the server so that it doesn't need to redownload information. [RFC3501] states that a mailbox can be uniquely referenced by its name and UIDVALIDITY, and a message within that mailbox can be uniquely referenced by its mailbox (name + UIDVALIDITY) and unique identifier (UID). The triple of mailbox name, UIDVALIDITY, and UID is guaranteed to be immutable.
[RFC4315] defines a COPYUID response that allows a client that copies messages to know the mapping between the UIDs in the source and destination mailboxes and, hence, update its local cache.
If a mailbox is successfully renamed by a client, that client will know that the same messages exist in the destination mailbox name as previously existed in the source mailbox name.
Gondwana Standards Track [Page 2]
RFC 8474 IMAP ObjectID September 2018
The result is that the client that copies (or moves [RFC6851]) messages or renames a mailbox can update its local cache, but any other client connected to the same store cannot know with certainty that the messages are identical, so it will redownload everything.
This extension adds new properties to a message (EMAILID) and mailbox (MAILBOXID). These properties allow a client to quickly identify messages or mailboxes that have been renamed by another client.
This extension also adds an optional thread identifier (THREADID) to messages, which can be used by the server to indicate messages that it has identified to be related. A server that does not implement threading will return NIL to all requests for THREADID.
In examples, "C:" indicates lines sent by a client that is connected to a server. "S:" indicates lines sent by the server to the client.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The MAILBOXID is a server-allocated unique identifier for each mailbox.
The server MUST return the same MAILBOXID for a mailbox with the same name and UIDVALIDITY.
The server MUST NOT report the same MAILBOXID for two mailboxes at the same time.
The server MUST NOT reuse the same MAILBOXID for a mailbox that does not obey all the invariants that [RFC3501] defines for a mailbox that does not change name or UIDVALIDITY.
The server MUST keep the same MAILBOXID for the source and destination when renaming a mailbox in a way that keeps the same messages (but see [RFC3501] for the special case regarding the renaming of INBOX, which is treated as creating a new mailbox and moving the messages).
This document extends the CREATE command to have the response code MAILBOXID on successful mailbox creation.
A server advertising the OBJECTID capability MUST include the MAILBOXID response code in the tagged OK response to all successful CREATE commands.
Syntax: "MAILBOXID" SP "(" objectid ")"
Response code in tagged OK response for successful CREATE command.
Example:
   C: 3 create foo
   S: 3 OK [MAILBOXID (F2212ea87-6097-4256-9d51-71338625)] Completed
   C: 4 create bar
   S: 4 OK [MAILBOXID (F6352ae03-b7f5-463c-896f-d8b48ee3)] Completed
   C: 5 create foo
   S: 5 NO Mailbox already exists
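As an illustration only, a client might extract the MAILBOXID response code from the tagged OK line roughly as follows. This is a minimal sketch, not part of the specification: the function name `mailboxid_from_create` is hypothetical, and it assumes the client already has the complete tagged response line as a string.

```python
import re

# objectid = 1*255(ALPHA / DIGIT / "_" / "-"); the keyword itself is
# matched case-insensitively, but the captured ID keeps its exact case.
_MAILBOXID_CODE = re.compile(
    r"\[MAILBOXID \(([A-Za-z0-9_-]{1,255})\)\]", re.IGNORECASE
)


def mailboxid_from_create(line, tag):
    """Return the MAILBOXID from a tagged OK line, or None if absent.

    Hypothetical helper: `line` is one complete tagged response and
    `tag` is the tag the client used for its CREATE command.
    """
    parts = line.split(" ", 2)
    if len(parts) < 3 or parts[0] != tag or parts[1].upper() != "OK":
        return None
    match = _MAILBOXID_CODE.match(parts[2])
    return match.group(1) if match else None
```

A NO or BAD response, or an OK response from a server that does not advertise OBJECTID, yields None in this sketch.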
4.2. New OK Untagged Response for SELECT and EXAMINE
This document adds a new untagged response code to the SELECT and EXAMINE commands.
A server advertising the OBJECTID capability MUST return an untagged OK response with the MAILBOXID response code on all successful SELECT and EXAMINE commands.
This document adds the MAILBOXID attribute to the STATUS command using the extended syntax defined in [RFC4466].
A server that advertises the OBJECTID capability MUST support the MAILBOXID status attribute.
Syntax: "MAILBOXID"
The attribute in the STATUS command.
Syntax: "MAILBOXID" SP "(" objectid ")"
The response item in the STATUS response contains the ObjectID assigned by the server for this mailbox.
Example:
   C: 6 status foo (mailboxid)
   S: * STATUS foo (MAILBOXID (F2212ea87-6097-4256-9d51-71338625))
   S: 6 OK Completed
   C: 7 status bar (mailboxid)
   S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d8b48ee3))
   S: 7 OK Completed
   C: 8 rename foo renamed
   S: * OK rename foo renamed
   S: 8 OK Completed
   C: 9 status renamed (mailboxid)
   S: * STATUS renamed (MAILBOXID (F2212ea87-6097-4256-9d51-71338625))
   S: 9 OK Completed
   C: 10 status bar (mailboxid)
   S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d8b48ee3))
   S: 10 OK Completed
When the LIST-STATUS IMAP capability defined in [RFC5819] is also available, the STATUS command can be combined with the LIST command.
Example:
   C: 11 list "" "*" return (status (mailboxid))
   S: * LIST (\HasNoChildren) "." INBOX
   S: * STATUS INBOX (MAILBOXID (Ff8e3ead4-9389-4aff-adb1-d8d89efd8cbf))
   S: * LIST (\HasNoChildren) "." bar
   S: * STATUS bar (MAILBOXID (F6352ae03-b7f5-463c-896f-d8b48ee3))
   S: * LIST (\HasNoChildren) "." renamed
   S: * STATUS renamed (MAILBOXID (F2212ea87-6097-4256-9d51-71338625))
   S: 11 OK Completed (0.001 secs 3 calls)
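The untagged STATUS responses in such an exchange could be collected into a name-to-MAILBOXID map along these lines. This is a sketch under simplifying assumptions: the helper `mailboxids_from_status` is hypothetical, and mailbox names are assumed to be plain atoms (a real client would also need to handle quoted strings and literals).

```python
import re

# Matches "* STATUS <name> (... MAILBOXID (<objectid>) ...)".
# Assumption: <name> is an unquoted atom without spaces.
_STATUS_MAILBOXID = re.compile(
    r"^\* STATUS (\S+) \(.*?MAILBOXID \(([A-Za-z0-9_-]{1,255})\)",
    re.IGNORECASE,
)


def mailboxids_from_status(lines):
    """Map mailbox name -> MAILBOXID for every STATUS line seen."""
    ids = {}
    for line in lines:
        match = _STATUS_MAILBOXID.match(line)
        if match:
            ids[match.group(1)] = match.group(2)
    return ids
```

LIST lines and the tagged completion line are simply skipped, so the whole server reply can be passed in unfiltered.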
5. EMAILID Object Identifier and THREADID Correlator
The EMAILID data item is an ObjectID that uniquely identifies the content of a single message. Anything that must remain immutable on a {name, uidvalidity, uid} triple must also be the same between messages with the same EMAILID.
The server MUST return the same EMAILID for the same triple; hence, EMAILID is immutable.
The server MUST return the same EMAILID as the source message for the matching destination message in the COPYUID pairing after a COPY or MOVE command [RFC6851].
The server MAY assign the same EMAILID as an existing message upon APPEND (e.g., if it detects that the new message has exactly identical content to that of an existing message).
NOTE: EMAILID only identifies the immutable content of the message. In particular, it is possible for different messages with the same EMAILID to have different keywords. This document does not specify a way to STORE by EMAILID.
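One way a client cache could reflect this split between immutable content and per-copy state is sketched below. The class and its field names are hypothetical. Keying per-copy state on (MAILBOXID, UID) is an assumption of this sketch, relying on the MAILBOXID invariants described earlier.

```python
from dataclasses import dataclass, field


@dataclass
class MessageCache:
    """Hypothetical client cache layout, for illustration only."""

    # Immutable content shared by every copy: EMAILID -> raw message.
    bodies: dict = field(default_factory=dict)
    # Per-copy mutable state: (MAILBOXID, UID) -> (EMAILID, keywords).
    copies: dict = field(default_factory=dict)

    def needs_download(self, emailid):
        """True if no copy of this content has been cached yet."""
        return emailid not in self.bodies

    def store_body(self, emailid, body):
        self.bodies[emailid] = body

    def record_copy(self, mailboxid, uid, emailid, keywords):
        # Keywords are stored per copy because two messages with the
        # same EMAILID may carry different keywords.
        self.copies[(mailboxid, uid)] = (emailid, frozenset(keywords))
```

With this layout, a message that another client copied or moved shows up as a new (MAILBOXID, UID) entry pointing at an EMAILID whose content is already cached.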
The THREADID data item is an ObjectID that uniquely identifies a set of messages that the server believes should be grouped together when presented.
THREADID calculation is generally based on some combination of References, In-Reply-To, and Subject, but the exact logic is left up to the server implementation. [RFC5256] describes some algorithms that could be used; however, this specification does not mandate any particular strategy.
The server MUST return the same THREADID for all messages with the same EMAILID.
The server SHOULD return the same THREADID for related messages, even if they are in different mailboxes; for example, messages that would appear in the same thread if they were in the same mailbox SHOULD have the same THREADID, even if they are in different mailboxes.
The server MUST NOT change the THREADID of a message once reported.
THREADID is OPTIONAL; if the server doesn't support THREADID or is unable to calculate relationships between messages, it MUST return NIL to all FETCH responses for the THREADID data item, and a SEARCH for THREADID MUST NOT match any messages.
The server MUST NOT use the same ObjectID value for both EMAILIDs and THREADIDs. If they are stored with the same value internally, the server can generate prefixed values (as shown in the examples below with M and T prefixes) to avoid clashes.
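A server that keeps a single internal value for both identifiers might derive distinct ObjectIDs as sketched here. The function names and the choice of how to reject overlong values are assumptions of this example; only the M and T prefixes come from the text above.

```python
MAX_OBJECTID_LENGTH = 255  # upper bound from the objectid ABNF rule


def _prefixed(prefix, internal):
    """Prepend a one-letter namespace prefix to an internal value."""
    value = prefix + internal
    if len(value) > MAX_OBJECTID_LENGTH:
        raise ValueError("prefixed ObjectID would exceed 255 characters")
    return value


def emailid_for(internal):
    # "M" marks the EMAILID namespace.
    return _prefixed("M", internal)


def threadid_for(internal):
    # "T" marks the THREADID namespace, so the two can never clash
    # even when the internal value is identical.
    return _prefixed("T", internal)
```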
5.3. New Message Data Items in FETCH and UID FETCH Commands
This document defines two FETCH items:
Syntax: "EMAILID"
The EMAILID message data item causes the server to return EMAILID FETCH response data items.
Syntax: "THREADID"
The THREADID message data item causes the server to return THREADID FETCH response data items.
This document defines the following responses:
Syntax: "EMAILID" SP "(" objectid ")"
The EMAILID response data item contains the server-assigned ObjectID for each message.
Syntax: "THREADID" SP "(" objectid ")"
The THREADID response data item contains the server-assigned ObjectID for the set of related messages to which this message belongs.
Syntax: "THREADID" SP nil
The NIL value is returned for the THREADID response data item when the server mailbox does not support THREADID calculation.
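A client could read both data items out of a FETCH response line roughly as follows. This is a simplified sketch: `object_ids_from_fetch` is a hypothetical helper, the sample IDs in the usage note are made up, and searching the raw line with regular expressions assumes the response carries no literals or quoted strings that happen to contain these keywords.

```python
import re

_EMAILID = re.compile(r"\bEMAILID \(([A-Za-z0-9_-]{1,255})\)", re.IGNORECASE)
_THREADID = re.compile(
    r"\bTHREADID (?:\(([A-Za-z0-9_-]{1,255})\)|NIL)", re.IGNORECASE
)


def object_ids_from_fetch(line):
    """Return (emailid, threadid) from one FETCH response line.

    Either element is None when the item is missing; threadid is also
    None when the server answered THREADID NIL.
    """
    email = _EMAILID.search(line)
    thread = _THREADID.search(line)
    return (
        email.group(1) if email else None,
        thread.group(1) if thread else None,
    )
```

For example, `object_ids_from_fetch("* 2 FETCH (EMAILID (M00000001) THREADID NIL)")` gives `("M00000001", None)`, which the client should treat as "no thread information" rather than as a thread named NIL.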
The following syntax specification uses the Augmented Backus-Naur Form (ABNF) [RFC5234] notation. Elements not defined here can be found in the formal syntax of the ABNF [RFC5234], IMAP [RFC3501], and IMAP ABNF extensions [RFC4466] specifications.
Except as noted otherwise, all alphabetic characters are case insensitive. The use of uppercase or lowercase characters to define token strings is for editorial clarity only. Implementations MUST accept these strings in a case-insensitive fashion.
Please note specifically that ObjectID values are case sensitive.
capability =/ "OBJECTID"
fetch-att =/ "EMAILID" / "THREADID"
fetch-emailid-resp = "EMAILID" SP "(" objectid ")"
        ; follows tagged-ext production from [RFC4466]

fetch-threadid-resp = "THREADID" SP ( "(" objectid ")" / nil )
        ; follows tagged-ext production from [RFC4466]

objectid = 1*255(ALPHA / DIGIT / "_" / "-")
        ; characters in object identifiers are case
        ; significant

resp-text-code =/ "MAILBOXID" SP "(" objectid ")"
        ; incorporated before the expansion rule of
        ; atom [SP 1*<any TEXT-CHAR except "]">]
        ; that appears in [RFC3501]
In the interest of reducing the possibilities of encoding mistakes, ObjectIDs are restricted to a safe subset of possible byte values; in order to allow clients to allocate storage, they are restricted in length.
An ObjectID is a string of 1 to 255 characters from the following set of 64 codepoints: a-z, A-Z, 0-9, _, -. These characters are safe to use in almost any context (e.g., filesystems, URIs, IMAP atoms). These are the same characters defined as base64url in [RFC4648].
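The length and character restrictions translate directly into a small validity check. The sketch below is illustrative; the function name `is_valid_objectid` is not defined by this document.

```python
import re

# 1 to 255 characters drawn from a-z, A-Z, 0-9, "_" and "-".
_OBJECTID = re.compile(r"[A-Za-z0-9_-]{1,255}")


def is_valid_objectid(value):
    """True if `value` satisfies the objectid ABNF rule."""
    return _OBJECTID.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` matters here: it rejects a value that is valid only up to some trailing character, such as a stray newline.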
For maximum safety, servers should also follow defensive allocation strategies to avoid creating risks where glob completion or data type detection may be present (e.g., on filesystems or in spreadsheets). In particular, it is wise to avoid:
o IDs starting with a dash
o IDs starting with digits
o IDs that contain only digits
o IDs that differ only by ASCII case (for example, A vs. a)
o the specific sequence of three characters NIL in any case (because this sequence can be confused with the IMAP protocol expression of the null value)
A good solution to these issues is to prefix every ID with a single alphabetical character.
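The advice in the list above can be expressed as a check that a server runs over candidate IDs before allocating them. This is a sketch only: the function name and warning strings are invented for illustration, the input is assumed to already satisfy the objectid syntax, and the NIL rule is interpreted here as "the whole ID equals NIL in any letter case".

```python
def allocation_warnings(objectid, existing=()):
    """List the defensive-allocation advice that a candidate ID ignores.

    `existing` is an iterable of IDs already allocated by the server.
    An empty list means the candidate follows all of the advice.
    """
    warnings = []
    if objectid.startswith("-"):
        warnings.append("starts with a dash")
    if objectid[:1].isdigit():
        warnings.append("starts with a digit")
    if objectid.isdigit():
        warnings.append("contains only digits")
    if objectid.upper() == "NIL":
        warnings.append("is NIL in some letter case")
    # Compare case-folded forms, ignoring an exact duplicate of itself.
    others = {other.lower() for other in existing if other != objectid}
    if objectid.lower() in others:
        warnings.append("differs from an existing ID only by ASCII case")
    return warnings
```

An ID such as `M1234`, which carries a single alphabetical prefix, passes every check, whereas the bare value `1234` triggers both digit warnings.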
The case of RENAME INBOX may need special handling because it has special behavior, as defined in [RFC3501], Section 6.3.5.
It is advisable (though not required) to have MAILBOXID be globally unique, but it is only required to be unique within messages offered to a single client login to a single server hostname. For example, a proxy that aggregates multiple independent servers MUST NOT advertise the OBJECTID capability unless it can guarantee that different objects will never use the same identifiers, even if backend object identifiers collide.
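One possible way for an aggregating proxy to keep identifiers distinct is to namespace each backend's IDs, as sketched below. The scheme, the `b<index>-` prefix format, and the function name are all assumptions made for this example; the document does not prescribe any particular mechanism.

```python
def proxy_objectid(backend_index, backend_id):
    """Namespace a backend ObjectID with the backend's index.

    Hypothetical scheme: the index digits end at the first "-", so two
    different (index, id) pairs can never produce the same result.
    """
    if backend_index < 0:
        raise ValueError("backend index must be non-negative")
    combined = f"b{backend_index}-{backend_id}"
    if len(combined) > 255:
        # The prefix can push a long backend ID past the syntax limit;
        # such an ID cannot be exposed under this scheme.
        raise ValueError("combined ObjectID would exceed 255 characters")
    return combined
```

The length check illustrates a real constraint of any prefixing approach: a proxy that cannot fit every backend ID within 255 characters cannot give the required guarantee this way.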
Servers that implement both RFC 6154 and this specification should optimize their execution of commands like UID SEARCH OR EMAILID 1234 EMAILID 4321.
Clients can assume that searching the all-mail mailbox using OR/EMAILID or OR/THREADID is a fast way to find messages again if some other client has moved them out of the mailbox where they were previously seen.
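Because the IMAP OR search key takes exactly two operands, looking up more than two identifiers requires nesting. A client might build such a command as follows; the helper name `or_search` is hypothetical, and the IDs are assumed to be valid ObjectIDs already.

```python
def or_search(key, ids):
    """Build a UID SEARCH command matching any of the given ObjectIDs.

    `key` is "EMAILID" or "THREADID". OR is binary in IMAP, so the
    terms are folded from the right: OR a OR b c.
    """
    ids = list(ids)
    if not ids:
        raise ValueError("at least one ObjectID is required")
    criteria = f"{key} {ids[-1]}"
    for objectid in reversed(ids[:-1]):
        criteria = f"OR {key} {objectid} {criteria}"
    return "UID SEARCH " + criteria
```

With the two IDs from the text above, `or_search("EMAILID", ["1234", "4321"])` produces `UID SEARCH OR EMAILID 1234 EMAILID 4321`.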
Clients that cache data offline should fetch the EMAILID of all new messages to avoid redownloading already-cached message details.
Clients should fetch the MAILBOXID for any new mailboxes before discarding cache data for any mailbox that is no longer present on the server so that they can detect renames and avoid redownloading data.
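The comparison a client performs before discarding cached mailboxes can be sketched as follows. The function name and the shape of its inputs (plain dictionaries mapping mailbox name to MAILBOXID) are assumptions of this example.

```python
def plan_mailbox_sync(cached, server):
    """Classify mailboxes by comparing cached and current MAILBOXIDs.

    Both arguments map mailbox name -> MAILBOXID. Returns a tuple
    (renamed, created, deleted), where `renamed` maps old name -> new
    name and the other two are lists of mailbox names.
    """
    cached_by_id = {mailboxid: name for name, mailboxid in cached.items()}
    server_ids = set(server.values())
    renamed, created = {}, []
    for name, mailboxid in server.items():
        old_name = cached_by_id.get(mailboxid)
        if old_name is None:
            created.append(name)          # never seen before
        elif old_name != name:
            renamed[old_name] = name      # same mailbox, new name
    # Only mailboxes whose ID has vanished are safe to discard.
    deleted = [n for n, mailboxid in cached.items() if mailboxid not in server_ids]
    return renamed, created, deleted
```

A mailbox that disappeared by name but whose MAILBOXID is still present lands in `renamed`, so its cached messages can be kept under the new name instead of being downloaded again.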
In cases of server failure and disaster recovery, or misbehaving servers, it is possible that a client will be sent invalid information, e.g., identical ObjectIDs or ObjectIDs that have changed where they MUST NOT change according to this document.
In a case where a client detects inconsistent ObjectID responses from a server, it SHOULD fall back to relying on the guarantees of RFC 3501. For simplicity, a client MAY instead choose to discard its entire cache and resync all state from the server.
Client authors protecting against server misbehavior MUST ensure that their design cannot get into an infinite loop of discarding cache and fetching the same data repeatedly without user interaction.
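One simple way to meet this requirement is to cap the number of automatic full resyncs within a time window. The sketch below is one possible design, not a mandated one; the class name, the default limits, and the injectable clock are all choices made for this example.

```python
import time


class ResyncGuard:
    """Limit automatic full-cache resyncs to avoid an endless loop."""

    def __init__(self, max_resyncs=3, window_seconds=3600.0,
                 clock=time.monotonic):
        self._max = max_resyncs
        self._window = window_seconds
        self._clock = clock
        self._stamps = []  # start times of recent automatic resyncs

    def allow(self):
        """Return True and record the attempt if a resync may proceed."""
        now = self._clock()
        # Forget attempts that have aged out of the window.
        self._stamps = [t for t in self._stamps if now - t < self._window]
        if len(self._stamps) >= self._max:
            return False  # stop and wait for user interaction instead
        self._stamps.append(now)
        return True
```

When `allow()` returns False, the client would keep relying on the RFC 3501 guarantees or surface the problem to the user rather than discarding its cache again.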
It is strongly advised that servers generate ObjectIDs that are safe to use as filesystem names and unlikely to be autodetected as numbers. See implementation considerations.
If a digest is used for ID generation, it must have a collision-resistant property, so server implementations are advised to monitor current security research and choose secure digests. As the IDs are generated by the server, it will be possible to migrate to a new hash by just using the new algorithm when creating new IDs. This is particularly true if a prefix is used on each ID, which can be changed when the algorithm changes.
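A digest-based generator with a changeable prefix might look like the sketch below. SHA-256, the hex encoding, and the function name are choices made for this example and are not requirements of this document. Lowercase hex is used so that no two IDs differ only by ASCII case.

```python
import hashlib


def digest_objectid(content, prefix="M"):
    """Derive an ObjectID from message bytes using SHA-256.

    The prefix identifies both the ID type and, implicitly, the digest
    algorithm; a server migrating to another hash would pick a new
    prefix for newly created IDs.
    """
    if len(prefix) != 1 or not prefix.isalpha() or not prefix.isascii():
        raise ValueError("prefix must be a single ASCII letter")
    # 64 lowercase hex characters, well inside the 255-character limit.
    return prefix + hashlib.sha256(content).hexdigest()
```

Identical content always yields the same ID under this scheme, which fits the permission to reuse an EMAILID on APPEND of an exact duplicate. It also means an ID reveals that the server has seen those exact bytes.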
The use of a digest for ID generation may be used as proof that a particular sequence of bytes was seen by the server. However, this is only a risk if IDs are leaked to clients who don't have permission to fetch the data directly. Servers that are expected to handle highly sensitive data should consider this when choosing how to create IDs.
See also the security considerations in [RFC3501], Section 11.
o Server-assigned sequence number (guaranteed not to be reused)
There is a need to index and look up reference/in-reply-to data at message creation to efficiently find matching messages for threading. Threading may be either across mailboxes or within each mailbox only. The server has significant leeway here.
Acknowledgments
The author would like to thank the EXTRA working group at IETF for feedback and advice -- in particular, Arnt Gulbrandsen, Brandon Long, Chris Newman, and Josef Sipek.
This document drew inspiration from the Gmail X-GM-THRID and X-GM-MSGID implementations as currently defined at <https://developers.google.com/gmail/imap/imap-extensions>, as well as the X-GUID implementation in the Dovecot server.
Author's Address
   Bron Gondwana (editor)
   FastMail
   Level 2, 114 William St
   Melbourne VIC 3000
   Australia