Network Working Group R. Housley Request for Comments: 3686 Vigil Security Category: Standards Track January 2004
Using Advanced Encryption Standard (AES) Counter Mode With IPsec Encapsulating Security Payload (ESP)
Status of this Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract
This document describes the use of Advanced Encryption Standard (AES) Counter Mode, with an explicit initialization vector, as an IPsec Encapsulating Security Payload (ESP) confidentiality mechanism.
The National Institute of Standards and Technology (NIST) recently selected the Advanced Encryption Standard (AES) [AES], also known as Rijndael. The AES is a block cipher, and it can be used in many different modes. This document describes the use of AES Counter Mode (AES-CTR), with an explicit initialization vector (IV), as an IPsec Encapsulating Security Payload (ESP) [ESP] confidentiality mechanism.
This document does not provide an overview of IPsec. However, information about the various components of IPsec and the way in which they collectively provide security services is available in [ARCH] and [ROADMAP].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [STDWORDS].
NIST has defined five modes of operation for AES and other FIPS-approved block ciphers [MODES]. Each of these modes has different characteristics. The five modes are: ECB (Electronic Code Book), CBC (Cipher Block Chaining), CFB (Cipher FeedBack), OFB (Output FeedBack), and CTR (Counter).
Only AES Counter mode (AES-CTR) is discussed in this specification. AES-CTR requires the encryptor to generate a unique per-packet value, and communicate this value to the decryptor. This specification calls this per-packet value an initialization vector (IV). The same IV and key combination MUST NOT be used more than once. The encryptor can generate the IV in any manner that ensures uniqueness. Common approaches to IV generation include incrementing a counter for each packet and linear feedback shift registers (LFSRs).
This specification calls for the use of a nonce for additional protection against precomputation attacks. The nonce value need not be secret. However, the nonce MUST be unpredictable prior to the establishment of the IPsec security association that is making use of AES-CTR.
AES-CTR has many properties that make it an attractive encryption algorithm for high-speed networking. AES-CTR uses the AES block cipher to create a stream cipher. Data is encrypted and decrypted by XORing with the key stream produced by AES encrypting sequential counter block values. AES-CTR is easy to implement, and AES-CTR can be pipelined and parallelized. AES-CTR also supports key stream precomputation.
Pipelining is possible because AES has multiple rounds (see section 2.2). A hardware implementation (and some software implementations) can create a pipeline by unwinding the loop implied by this round structure. For example, after a 16-octet block has been input, one round later another 16-octet block can be input, and so on. In AES-CTR, these inputs are the sequential counter block values used to generate the key stream.
Multiple independent AES encrypt implementations can also be used to improve performance. For example, one could use two AES encrypt implementations in parallel, to process a sequence of counter block values, doubling the effective throughput.
The sender can precompute the key stream. Since the key stream does not depend on any data in the packet, the key stream can be precomputed once the nonce and IV are assigned. This precomputation can reduce packet latency. The receiver cannot perform similar precomputation because the IV will not be known before the packet arrives.
AES-CTR uses only the AES encrypt operation (for both encryption and decryption), making AES-CTR implementations smaller than implementations of many other AES modes.
When used correctly, AES-CTR provides a high level of confidentiality. Unfortunately, AES-CTR is easy to use incorrectly. Because AES-CTR is a stream cipher, any reuse of the per-packet value, called the IV, with the same nonce and key is catastrophic. An IV collision immediately leaks information about the plaintext in both packets. For this reason, it is inappropriate to use this mode of operation with static keys. Extraordinary measures would be needed to prevent reuse of an IV value with the static key across power cycles. To be safe, implementations MUST use fresh keys with AES-CTR. The Internet Key Exchange (IKE) [IKE] protocol can be used to establish fresh keys. IKE can also provide the nonce value.
With AES-CTR, it is trivial to use a valid ciphertext to forge other (valid to the decryptor) ciphertexts. Thus, it is equally catastrophic to use AES-CTR without a companion authentication function. Implementations MUST use AES-CTR in conjunction with an authentication function, such as HMAC-SHA-1-96 [HMAC-SHA].
To encrypt a payload with AES-CTR, the encryptor partitions the plaintext, PT, into 128-bit blocks. The final block need not be 128 bits; it can be less.
PT = PT[1] PT[2] ... PT[n]
Each PT block is XORed with a block of the key stream to generate the ciphertext, CT. The AES encryption of each counter block results in 128 bits of key stream. The most significant 96 bits of the counter block are set to the nonce value, which is 32 bits, followed by the per-packet IV value, which is 64 bits. The least significant 32 bits of the counter block are initially set to one. This counter value is incremented by one to generate subsequent counter blocks, each resulting in another 128 bits of key stream. The encryption of n plaintext blocks can be summarized as:
   CTRBLK := NONCE || IV || ONE
   FOR i := 1 to n-1 DO
     CT[i] := PT[i] XOR AES(CTRBLK)
     CTRBLK := CTRBLK + 1
   END
   CT[n] := PT[n] XOR TRUNC(AES(CTRBLK))
The AES() function performs AES encryption with the fresh key.
The TRUNC() function truncates the output of the AES encrypt operation to the same length as the final plaintext block, returning the most significant bits.
Decryption is similar. The decryption of n ciphertext blocks can be summarized as:
   CTRBLK := NONCE || IV || ONE
   FOR i := 1 to n-1 DO
     PT[i] := CT[i] XOR AES(CTRBLK)
     CTRBLK := CTRBLK + 1
   END
   PT[n] := CT[n] XOR TRUNC(AES(CTRBLK))
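The encryption and decryption procedures above can be sketched in Python as a single function, since both are the same XOR with the key stream. This is a minimal illustration, not a reference implementation. The names ctr_transform and standin_block_encrypt are hypothetical, and the stand-in block function is built from SHA-256 only so that the sketch runs with the standard library; it is NOT AES and must be replaced by a real AES block encryption in practice.

```python
import hashlib
import struct

BLOCK_SIZE = 16  # AES block size in octets


def standin_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical placeholder for AES(key, block).

    This is NOT AES and offers no security; it only lets the sketch run
    with the standard library. Substitute a real AES block encryption.
    """
    return hashlib.sha256(key + block).digest()[:BLOCK_SIZE]


def ctr_transform(key: bytes, nonce: bytes, iv: bytes, data: bytes,
                  block_encrypt=standin_block_encrypt) -> bytes:
    """Encrypt or decrypt data; the two operations are identical in CTR."""
    if len(nonce) != 4 or len(iv) != 8:
        raise ValueError("nonce must be 4 octets and IV must be 8 octets")
    out = bytearray()
    counter = 1  # the block counter starts at one
    for offset in range(0, len(data), BLOCK_SIZE):
        # CTRBLK := NONCE || IV || 32-bit big-endian block counter
        ctrblk = nonce + iv + struct.pack(">I", counter)
        keystream = block_encrypt(key, ctrblk)
        chunk = data[offset:offset + BLOCK_SIZE]
        # zip() stops at the shorter input, which plays the role of TRUNC()
        out.extend(c ^ k for c, k in zip(chunk, keystream))
        counter += 1
    return bytes(out)


# Round trip: applying the transform twice restores the plaintext.
key = bytes(range(16))
nonce = bytes.fromhex("00000030")
iv = bytes(8)
plaintext = b"Single block msg, then a short tail"
ciphertext = ctr_transform(key, nonce, iv, plaintext)
recovered = ctr_transform(key, nonce, iv, ciphertext)
```

The sketch increments only the 32-bit block counter field rather than the whole counter block; the two are equivalent as long as the counter does not overflow, and struct.pack raises an error if it would.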
AES supports three key sizes: 128 bits, 192 bits, and 256 bits. The default key size is 128 bits, and all implementations MUST support this key size. Implementations MAY also support key sizes of 192 bits and 256 bits.
AES uses a different number of rounds for each of the defined key sizes. When a 128-bit key is used, implementations MUST use 10 rounds. When a 192-bit key is used, implementations MUST use 12 rounds. When a 256-bit key is used, implementations MUST use 14 rounds.
The AES has a block size of 128 bits (16 octets). As such, when using AES-CTR, each AES encrypt operation generates 128 bits of key stream. AES-CTR encryption is the XOR of the key stream with the plaintext. AES-CTR decryption is the XOR of the key stream with the ciphertext. If the generated key stream is longer than the plaintext or ciphertext, the extra key stream bits are simply discarded. For this reason, AES-CTR does not require the plaintext to be padded to a multiple of the block size. However, to provide limited traffic flow confidentiality, padding MAY be included, as specified in [ESP].
The AES-CTR IV field MUST be eight octets. The IV MUST be chosen by the encryptor in a manner that ensures that the same IV value is used only once for a given key. The encryptor can generate the IV in any manner that ensures uniqueness. Common approaches to IV generation include incrementing a counter for each packet and linear feedback shift registers (LFSRs).
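The two IV generation approaches named above can be sketched as Python generators. This is an illustration under stated assumptions: the function names are hypothetical, and the LFSR feedback taps (64, 63, 61, 60) are a commonly tabulated choice for a maximal-length 64-bit register that an implementer should verify independently. In either case the caller remains responsible for rekeying before the sequence could repeat.

```python
import itertools

LFSR_MASK = 0xD800000000000000  # assumed taps 64, 63, 61, 60 (Galois form)


def counter_ivs(start: int = 0):
    """Yield 8-octet IVs from an incrementing 64-bit counter."""
    for value in range(start, 1 << 64):
        yield value.to_bytes(8, "big")


def lfsr_ivs(seed: int = 1):
    """Yield 8-octet IVs from a 64-bit Galois LFSR (seed must be nonzero)."""
    if not 0 < seed < (1 << 64):
        raise ValueError("seed must be a nonzero 64-bit value")
    state = seed
    while True:
        yield state.to_bytes(8, "big")
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= LFSR_MASK


first_counter_ivs = list(itertools.islice(counter_ivs(), 3))
first_lfsr_ivs = list(itertools.islice(lfsr_ivs(), 1000))
```

Neither generator survives a restart on its own; persisting the state (or establishing a fresh key) across power cycles is outside this sketch.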
Including the IV in each packet ensures that the decryptor can generate the key stream needed for decryption, even when some packets are lost or reordered.
AES-CTR mode does not require plaintext padding. However, ESP does require padding to 32-bit word-align the authentication data. The padding, Pad Length, and the Next Header MUST be concatenated with the plaintext before performing encryption, as described in [ESP].
Since it is trivial to construct a forged AES-CTR ciphertext from a valid AES-CTR ciphertext, AES-CTR implementations MUST employ a non-NULL ESP authentication method. HMAC-SHA-1-96 [HMAC-SHA] is a likely choice.
Each packet conveys the IV that is necessary to construct the sequence of counter blocks used to generate the key stream necessary to decrypt the payload. The AES counter block is 128 bits. Figure 2 shows the format of the counter block.
The components of the counter block are as follows:
Nonce
   The Nonce field is 32 bits. As the name implies, the nonce is a single use value. That is, a fresh nonce value MUST be assigned for each security association. It MUST be assigned at the beginning of the security association. The nonce value need not be secret, but it MUST be unpredictable prior to the beginning of the security association.

Initialization Vector
   The IV field is 64 bits. As described in section 3.1, the IV MUST be chosen by the encryptor in a manner that ensures that the same IV value is used only once for a given key.

Block Counter
   The block counter field is the least significant 32 bits of the counter block. The block counter begins with the value of one, and it is incremented to generate subsequent portions of the key stream. The block counter is a 32-bit big-endian integer value.
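The three fields described above can be packed into, and recovered from, a 16-octet counter block as in the following sketch. The function names are hypothetical and the example field values are illustrative only.

```python
import struct


def build_counter_block(nonce: bytes, iv: bytes, block_counter: int = 1) -> bytes:
    """Return NONCE (4 octets) || IV (8 octets) || block counter (4 octets)."""
    if len(nonce) != 4:
        raise ValueError("nonce must be 4 octets")
    if len(iv) != 8:
        raise ValueError("IV must be 8 octets")
    if not 1 <= block_counter <= 0xFFFFFFFF:
        raise ValueError("block counter must be in the range 1..2^32-1")
    # ">I" encodes the counter as a 32-bit big-endian unsigned integer.
    return nonce + iv + struct.pack(">I", block_counter)


def parse_counter_block(block: bytes):
    """Split a 16-octet counter block into (nonce, iv, block_counter)."""
    return struct.unpack(">4s8sI", block)


example_block = build_counter_block(bytes.fromhex("00000030"), bytes(8))
```

With a nonce of 00 00 00 30 and an all-zero IV, the first counter block in this sketch is 00000030 00000000 00000000 00000001 in hexadecimal.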
Using the encryption process described in section 2.1, this construction permits each packet to consist of up to (2^32)-1 blocks, that is, 4,294,967,295 blocks or 68,719,476,720 octets.
This section describes the conventions used to generate keying material and nonces for use with AES-CTR using the Internet Key Exchange (IKE) [IKE] protocol. The identifiers and attributes needed to negotiate a security association which uses AES-CTR are also defined.
As described in section 2.1, implementations MUST use fresh keys with AES-CTR. IKE can be used to establish fresh keys. This section describes the conventions for obtaining the unpredictable nonce value from IKE. Note that this convention provides a nonce value that is secret as well as unpredictable.
IKE makes use of a pseudo-random function (PRF) to derive keying material. The PRF is used iteratively to derive keying material of arbitrary size, called KEYMAT. Keying material is extracted from the output string without regard to boundaries.
The size of the requested KEYMAT MUST be four octets longer than is needed for the associated AES key. The keying material is used as follows:
AES-CTR with a 128 bit key
   The KEYMAT requested for each AES-CTR key is 20 octets. The first 16 octets are the 128-bit AES key, and the remaining four octets are used as the nonce value in the counter block.

AES-CTR with a 192 bit key
   The KEYMAT requested for each AES-CTR key is 28 octets. The first 24 octets are the 192-bit AES key, and the remaining four octets are used as the nonce value in the counter block.

AES-CTR with a 256 bit key
   The KEYMAT requested for each AES-CTR key is 36 octets. The first 32 octets are the 256-bit AES key, and the remaining four octets are used as the nonce value in the counter block.
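The KEYMAT layout above amounts to splitting the requested keying material into the AES key followed by a four-octet nonce. A minimal sketch follows; the function name split_keymat is hypothetical, and the KEYMAT bytes in the example are placeholders rather than real IKE output.

```python
NONCE_LEN = 4
VALID_KEY_LENS = (16, 24, 32)  # 128-, 192-, and 256-bit AES keys


def split_keymat(keymat: bytes):
    """Split KEYMAT into (aes_key, nonce): the key first, then four octets."""
    key_len = len(keymat) - NONCE_LEN
    if key_len not in VALID_KEY_LENS:
        raise ValueError("KEYMAT must be 20, 28, or 36 octets")
    return keymat[:key_len], keymat[key_len:]


# Placeholder KEYMAT for a 128-bit key: 20 octets in total.
aes_key, nonce = split_keymat(bytes(range(20)))
```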
This document does not specify the conventions for using AES-CTR for IKE Phase 1 negotiations. For AES-CTR to be used in this manner, a separate specification is needed, and an Encryption Algorithm Identifier needs to be assigned.
Since the AES supports three key lengths, the Key Length attribute MUST be specified in the IKE Phase 2 exchange [DOI]. The Key Length attribute MUST have a value of 128, 192, or 256.
This section contains nine test vectors, which can be used to confirm that an implementation has correctly implemented AES-CTR. The first three test vectors use AES with a 128 bit key; the next three test vectors use AES with a 192 bit key; and the last three test vectors use AES with a 256 bit key.
When used properly, AES-CTR mode provides strong confidentiality. Bellare, Desai, Jokipii, and Rogaway show in [BDJR] that the privacy guarantees provided by counter mode are at least as strong as those for CBC mode when using the same block cipher.
Unfortunately, it is very easy to misuse this counter mode. If counter block values are ever used for more than one packet with the same key, then the same key stream will be used to encrypt both packets, and the confidentiality guarantees are voided.
What happens if the encryptor XORs the same key stream with two different plaintexts? Suppose two plaintext byte sequences P1, P2, P3 and Q1, Q2, Q3 are both encrypted with key stream K1, K2, K3. The two corresponding ciphertexts are:
(P1 XOR K1), (P2 XOR K2), (P3 XOR K3)
(Q1 XOR K1), (Q2 XOR K2), (Q3 XOR K3)
If both of these two ciphertext streams are exposed to an attacker, then a catastrophic failure of confidentiality results, since:
(P1 XOR K1) XOR (Q1 XOR K1) = P1 XOR Q1
(P2 XOR K2) XOR (Q2 XOR K2) = P2 XOR Q2
(P3 XOR K3) XOR (Q3 XOR K3) = P3 XOR Q3
Once the attacker obtains the two plaintexts XORed together, it is relatively straightforward to separate them. Thus, using any stream cipher, including AES-CTR, to encrypt two plaintexts under the same key stream leaks the plaintext.
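The effect of key stream reuse can be reproduced in a few lines of Python. The sketch below uses a random byte string as a stand-in for the key stream, so no cipher is involved; the plaintext values and variable names are illustrative assumptions.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
keystream = secrets.token_bytes(len(p1))  # reused for both packets

c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The key stream cancels out, leaving the XOR of the two plaintexts.
leak = xor_bytes(c1, c2)

# An attacker who knows or guesses p1 recovers p2 without the key.
recovered_p2 = xor_bytes(leak, p1)
```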
Therefore, stream ciphers, including AES-CTR, should not be used with static keys. It is inappropriate to use AES-CTR with static keys. Extraordinary measures would be needed to prevent reuse of a counter block value with the static key across power cycles. To be safe, ESP implementations MUST use fresh keys with AES-CTR. The Internet Key Exchange (IKE) protocol [IKE] can be used to establish fresh keys. IKE can also be used to establish the nonce at the beginning of the security association.
When IKE is used to establish fresh keys between two peer entities, separate keys are established for the two traffic flows. When a mechanism other than IKE is used to establish fresh keys, and that mechanism establishes only a single key to encrypt packets, then there is a high probability that the peers will select the same IV values for some packets. Thus, to avoid counter block collisions, ESP implementations that permit use of the same key for encrypting outbound traffic and decrypting incoming traffic with the same peer MUST ensure that the two peers assign different Nonce values to the security association.
Data forgery is trivial with CTR mode. The demonstration of this attack is similar to the key stream reuse discussion above. If a known plaintext byte sequence P1, P2, P3 is encrypted with key stream K1, K2, K3, then the attacker can replace the plaintext with one of his own choosing. The ciphertext is:
(P1 XOR K1), (P2 XOR K2), (P3 XOR K3)
The attacker simply XORs a selected sequence Q1, Q2, Q3 with the ciphertext to obtain:
(Q1 XOR (P1 XOR K1)), (Q2 XOR (P2 XOR K2)), (Q3 XOR (P3 XOR K3))
Decryption of the attacker-generated ciphertext will yield exactly what the attacker intended:
(Q1 XOR P1), (Q2 XOR P2), (Q3 XOR P3)
Accordingly, ESP implementations MUST use AES-CTR in conjunction with ESP authentication.
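The forgery described above can be demonstrated without any cipher by treating a random byte string as the key stream. In this sketch the attacker knows the original plaintext and chooses Q as the XOR of the known plaintext and the desired plaintext; the message contents and variable names are illustrative assumptions.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


known_plaintext = b"PAY 0100 USD"
desired_plaintext = b"PAY 9999 USD"

keystream = secrets.token_bytes(len(known_plaintext))  # unknown to the attacker
ciphertext = xor_bytes(known_plaintext, keystream)

# The attacker computes Q from public knowledge only, then alters the packet.
q = xor_bytes(known_plaintext, desired_plaintext)
forged_ciphertext = xor_bytes(ciphertext, q)

# The receiver decrypts the forged packet with the legitimate key stream.
decrypted = xor_bytes(forged_ciphertext, keystream)
```

The decrypted result equals Q XOR P, which here is the attacker's desired plaintext. Only an integrity check over the ciphertext would let the receiver reject the altered packet.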
Additionally, since AES has a 128-bit block size, regardless of the mode employed, the ciphertext generated by AES encryption becomes distinguishable from random values after 2^64 blocks are encrypted with a single key. Since ESP with Enhanced Sequence Numbers allows for up to 2^64 packets in a single security association, there is real potential for more than 2^64 blocks to be encrypted with one key. Therefore, implementations SHOULD generate a fresh key before 2^64 blocks are encrypted with the same key. Note that ESP with 32-bit Sequence Numbers will not exceed 2^64 blocks even if all of the packets are maximum-length IPv6 jumbograms [JUMBO].
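The arithmetic behind the last sentence can be checked directly. The sketch below assumes a maximum jumbogram payload of 2^32 - 1 octets and treats the whole payload as encrypted data, which slightly overestimates the block count; the variable names are illustrative.

```python
AES_BLOCK_OCTETS = 16
MAX_JUMBOGRAM_OCTETS = 2**32 - 1   # assumed maximum jumbogram payload length
BLOCK_LIMIT = 2**64                # rekey before this many blocks

# Ceiling division: blocks needed to cover one maximum-length packet.
blocks_per_packet = -(-MAX_JUMBOGRAM_OCTETS // AES_BLOCK_OCTETS)

# 32-bit sequence numbers allow at most 2^32 packets per security association.
total_blocks_32bit_seq = 2**32 * blocks_per_packet

# With 64-bit sequence numbers, even one block per packet reaches the limit.
total_blocks_64bit_seq_min = 2**64 * 1
```

Under these assumptions a security association with 32-bit sequence numbers encrypts at most 2^60 blocks, which is below the 2^64 limit, whereas 64-bit sequence numbers can reach the limit even with single-block packets.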
There are fairly generic precomputation attacks against all block cipher modes that allow a meet-in-the-middle attack against the key. These attacks require the creation and searching of huge tables of ciphertext associated with known plaintext and known keys. Assuming that the memory and processor resources are available for a precomputation attack, then the theoretical strength of AES-CTR (and any other block cipher mode) is limited to 2^(n/2) bits, where n is the number of bits in the key. The use of long keys is the best countermeasure to precomputation attacks. Therefore, implementations that employ 128-bit AES keys should take precautions to make the precomputation attacks more difficult. The unpredictable nonce value in the counter block significantly increases the size of the table that the attacker must compute to mount a successful attack.
In the development of this specification, the use of the ESP sequence number field instead of an explicit IV field was considered. This selection is not a cryptographic security issue, as either approach will prevent counter block collisions.
In a very conservative model of encryption security, at most 2^64 blocks ought to be encrypted with AES-CTR under a single key. Under this constraint, no more than 64 bits are needed to identify each packet within a security association. Since the ESP extended sequence number is 64 bits, it is an obvious candidate for use as an implicit IV. This would dictate a single method for the assignment of the per-packet value in the counter block. The use of an explicit IV does not dictate such a method, which is desirable for several reasons.
1. Only the encryptor can ensure that the value is not used for more than one packet, so there is no advantage to selecting a mechanism that allows the decryptor to determine whether counter block values collide. Damage from the collision is done, whether the decryptor detects it or not.
2. Allows adders, LFSRs, and any other technique that meets the time budget of the encryptor, so long as the technique results in a unique value for each packet. Adders are simple and straightforward to implement, but due to carries, they do not execute in constant time. LFSRs offer an alternative that executes in constant time.
3. Complexity is in control of the implementer. Further, the decision made by the implementer of the encryptor does not make the decryptor more (or less) complex.
4. When the encryptor has more than one cryptographic hardware device, an IV prefix can be assigned to each device, ensuring that collisions will not occur. Yet, since the decryptor does not need to examine IV structure, the decryptor is unaffected by the IV structure selected by the encryptor. One cannot make use of the same technique with the ESP sequence numbers, because the semantics for them require sequential value generation.
5. Assurance boundaries are very important to implementations that will be evaluated against the FIPS Pub 140-1 or FIPS Pub 140-2 [SECRQMTS]. The assignment of the per-packet counter block value needs to be inside the assurance boundary. Some implementations assign the sequence number inside the assurance boundary, but others do not. A sequence number collision does not have dire consequences, but, as described in section 6, a collision in counter block values has disastrous consequences.
6. Coupling with the sequence number is possible in those architectures where the sequence number assignment is performed within the assurance boundary. In this situation, the sequence number and the IV field will contain the same value.
7. Decoupling from the sequence number is possible in those architectures where the sequence number assignment is performed outside the assurance boundary.
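The IV prefix technique mentioned in item 4 can be sketched as follows. The split of the 64-bit IV into an 8-bit device prefix and a 56-bit per-device counter is an assumption chosen for illustration, as is the function name device_ivs; the specification leaves the IV structure to the encryptor.

```python
import itertools


def device_ivs(device_id: int, prefix_bits: int = 8):
    """Yield 8-octet IVs whose top prefix_bits identify one hardware device."""
    counter_bits = 64 - prefix_bits
    if not 0 <= device_id < (1 << prefix_bits):
        raise ValueError("device_id does not fit in the prefix")
    for count in range(1 << counter_bits):
        # Prefix in the most significant bits, per-device counter below it.
        yield ((device_id << counter_bits) | count).to_bytes(8, "big")


ivs_device_1 = list(itertools.islice(device_ivs(1), 100))
ivs_device_2 = list(itertools.islice(device_ivs(2), 100))
```

Because the prefixes differ, the two devices can never emit the same IV, and the decryptor needs no knowledge of this structure.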
The use of an explicit IV field directly follows from the decoupling of the sequence number and the per-packet counter block value. The overhead associated with 64 bits for the IV field is acceptable. This overhead is significantly less than the overhead associated with Cipher Block Chaining (CBC) mode. As normally employed, CBC requires a full block for the IV and, on average, half of a block for padding. AES-CTR with an explicit IV has about one-third of the overhead of AES-CBC, and the overhead is constant for each packet.
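The one-third figure follows from the accounting in the paragraph above, which counts only the IV and the average cipher padding. A short check, with illustrative variable names:

```python
from fractions import Fraction

AES_BLOCK_OCTETS = 16

# AES-CBC: a full block for the IV plus, on average, half a block of padding.
cbc_overhead = AES_BLOCK_OCTETS + AES_BLOCK_OCTETS // 2

# AES-CTR: the explicit 64-bit IV only.
ctr_overhead = 8

overhead_ratio = Fraction(ctr_overhead, cbc_overhead)
```

This gives 8 octets against 24 octets per packet, a ratio of exactly one-third under these assumptions.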
The inclusion of the nonce provides a weak countermeasure against precomputation attacks. For this countermeasure to be effective, the attacker must not be able to predict the value of the nonce well in advance of security association establishment. The use of long keys provides a strong countermeasure to precomputation attacks, and AES offers key sizes that thwart these attacks for many decades to come.
A 28-bit block counter value is sufficient for the generation of a key stream to encrypt the largest possible IPv6 jumbogram [JUMBO]; however, a 32-bit field is used. This size is convenient for both hardware and software implementations.
The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.
This document is the result of extensive discussions and compromises. While not all of the participants are completely satisfied with the outcome, the document is better for their contributions.
The author thanks the members of the IPsec working group for their contributions to the design, with special mention of the efforts of (in alphabetical order) Steve Bellovin, David Black, Niels Ferguson, Charlie Kaufman, Steve Kent, Tero Kivinen, Paul Koning, David McGrew, Robert Moskowitz, Jesse Walker, and Doug Whiting.
The author thanks Alireza Hodjat, John Viega, and Doug Whiting for assistance with the test vectors.
[ARCH] Kent, S. and R. Atkinson, "Security Architecture for the Internet Protocol", RFC 2401, November 1998.
[BDJR] Bellare, M., Desai, A., Jokipii, E. and P. Rogaway, "A Concrete Security Treatment of Symmetric Encryption: Analysis of the DES Modes of Operation", Proceedings 38th Annual Symposium on Foundations of Computer Science, 1997.
[HMAC-SHA] Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96 within ESP and AH", RFC 2404, November 1998.
[IKE] Harkins, D. and D. Carrel, "The Internet Key Exchange (IKE)", RFC 2409, November 1998.
[JUMBO] Borman, D., Deering, S. and R. Hinden, "IPv6 Jumbograms", RFC 2675, August 1999.
[ROADMAP] Thayer, R., Doraswamy, N. and R. Glenn, "IP Security Document Roadmap", RFC 2411, November 1998.
[SECRQMTS] National Institute of Standards and Technology. FIPS Pub 140-1: Security Requirements for Cryptographic Modules. 11 January 1994.
National Institute of Standards and Technology. FIPS Pub 140-2: Security Requirements for Cryptographic Modules. 25 May 2001. [Supercedes FIPS Pub 140-1]
Copyright (C) The Internet Society (2004). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the Internet Society.