Internet Engineering Task Force (IETF) S. Jiang Request for Comments: 7819 Huawei Technologies Co., Ltd Category: Informational S. Krishnan ISSN: 2070-1721 Ericsson T. Mrugalski ISC April 2016
Privacy Considerations for DHCP
Abstract
DHCP is a protocol that is used to provide addressing and configuration information to IPv4 hosts. This document discusses the various identifiers used by DHCP and the potential privacy issues.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7819.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The Dynamic Host Configuration Protocol (DHCP) [RFC2131] is used to provide addressing and configuration information to IPv4 hosts. DHCP uses several identifiers that could become a source for gleaning information about the IPv4 host. This information may include device type, operating system information, location(s) that the device may have previously visited, etc. This document discusses the various identifiers used by DHCP and the potential privacy issues [RFC6973]. In particular, it takes into consideration the problem of pervasive monitoring [RFC7258].
Future works may propose protocol changes to fix the privacy issues that have been analyzed in this document. Those changes are out of scope for this document.
The primary focus of this document is around privacy considerations for clients to support client mobility and connection to random networks. The privacy of DHCP servers and relay agents is considered less important as they are typically open for public services. And, it is generally assumed that communication from relay agent to server is protected from casual snooping, as that communication occurs in the provider's backbone. Nevertheless, the topics involving relay agents and servers are explored to some degree. However, future work may want to further explore the privacy of DHCP servers and relay agents.
Naming conventions from [RFC2131] and related documents are used throughout this document.
In addition, the following terminology is used:
Stable identifier - Any property disclosed by a DHCP client that does not change over time or changes very infrequently and is unique for said client in a given context. Examples include MAC address, client-id, and a hostname. Some identifiers may be considered stable only under certain conditions; for example, one client implementation may keep its client-id stored in stable storage, while another may generate it on the fly and use a different one after each boot. Stable identifiers may or may not be globally unique.
In DHCP, there are a few options that contain identification information or that can be used to extract identification information about the client. This section enumerates various options and the identifiers that they convey and that can be used to disclose client identification. They are targets of various attacks that are analyzed in Section 5.
The Client Identifier option [RFC2131] is used to pass an explicit client identifier to a DHCP server.
The client identifier is an opaque key that must be unique to that client within the subnet to which the client is attached. It typically remains stable after it has been initially generated. It may contain a hardware address, identical to the contents of the 'chaddr' field, or another type of identifier, such as a DNS name. Section 9.2 of [RFC3315] specifies DUID-LLT (Link-layer plus time) as the recommended DUID (DHCP Unique Identifier) type in DHCPv6. Section 6.1 of [RFC4361] introduces this concept to DHCP. Those two documents recommend that client identifiers be generated by using the permanent link-layer address of the network interface that the client is trying to configure. [RFC4361] updates the recommendation for a Client Identifier as follows: "[it] consists of a type field whose value is normally 255, followed by a four-byte IA_ID field, followed by the DUID for the client as defined in RFC 3315, section 9". This does not change the lifecycle of client identifiers. Clients are expected to generate their client identifiers once (during first operation) and store them in non-volatile storage or use the same deterministic algorithm to generate the same client identifier values again.
This means that typically an implementation will use the available link-layer address during its first boot. Even if the administrator enables link-layer address randomization, it is likely that it was not yet enabled during the first device boot. Hence the original, unobfuscated link-layer address will likely end up being announced as the client identifier, even if the link-layer address has changed (or even if it is being changed on a periodic basis). The exposure of the original link-layer address in the client identifier will also undermine other privacy extensions such as [RFC4941].
The 'yiaddr' field [RFC2131] in a DHCP message is used to convey an allocated address from the server to the client.
Jiang, et al. Informational [Page 4]
RFC 7819 DHCP Privacy Considerations April 2016
The DHCP specification [RFC2131] provides a way to specify the client link-layer address in the DHCP message header. A DHCP message header has 'htype' and 'chaddr' fields to specify the client link-layer address type and the link-layer address, respectively. The 'chaddr' field is used both as a hardware address for transmission of reply messages and as a client identifier.
The 'requested IP address' option [RFC2131] is used by a client to suggest that a particular IP address be assigned.
The Client Fully Qualified Domain Name (FQDN) option [RFC4702] is used by DHCP clients and servers to exchange information about the client's FQDN and about who has the responsibility for updating the DNS with the associated A and PTR RRs.
A client can use this option to convey all or part of its domain name to a DHCP server for the IP-address-to-FQDN mapping. In most cases, a client sends its hostname as a hint for the server. The DHCP server may be configured to modify the supplied name or to substitute a different name. The server should send its notion of the complete FQDN for the client in the Domain Name field.
The Parameter Request List option [RFC2131] is used to inform the server about options the client wants the server to send to the client. The contents of a Parameter Request List option are the option codes of the options requested by the client.
3.5. Vendor Class and Vendor-Identifying Vendor Class Options
The Vendor Class option [RFC2131], the Vendor-Identifying Vendor Class option, and the Vendor-Identifying Vendor Information option [RFC3925] are used by the DHCP client to identify the vendor that manufactured the hardware on which the client is running.
The information contained in the data area of this option is contained in one or more opaque fields that identify the details of the hardware configuration of the host on which the client is running or of industry consortium compliance -- for example, the version of the operating system the client is running or the amount of memory installed on the client.
DHCP servers use the Civic Location Option [RFC4776] to deliver location information (the civic and postal addresses) to DHCP clients. It may refer to three locations: the location of the DHCP server, the location of the network element believed to be closest to the client, or the location of the client, identified by the "what" element within the option.
The GeoConf and GeoLoc options [RFC6225] are used by a DHCP server to provide coordinate-based geographic location information to DHCP clients. They enable a DHCP client to obtain its geographic location.
The Client System Architecture Type Option [RFC4578] is used by a DHCP client to send a list of supported architecture types to the DHCP server. It is used by clients that must be booted using the network rather than from local storage, so the server can decide which boot file should be provided to the client.
3.9. Relay Agent Information Option and Suboptions
A DHCP relay agent includes a Relay Agent Information option[RFC3046] to identify the remote host end of the circuit. It contains a "circuit ID" suboption for the incoming circuit, which is an agent- local identifier of the circuit from which a DHCP client-to-server packet was received, and a "remote ID" suboption that provides a trusted identifier for the remote high-speed modem.
Possible encoding of the "circuit ID" suboption includes: router interface number, switching hub port number, remote access server port number, frame relay Data Link Connection Identifier (DLCI), ATM virtual circuit number, cable data virtual circuit number, etc.
Possible encoding of the "remote ID" suboption includes: a "caller ID" telephone number for dial-up connection, a "user name" prompted for by a remote access server, a remote caller's ATM address, a "modem ID" of a cable data modem, the remote IP address of a point- to-point link, a remote X.25 address for X.25 connections, etc.
The link-selection suboption [RFC3527] is used by any DHCP relay agent that desires to specify a subnet/link for a DHCP client request that it is relaying but needs the subnet/link specification to be different from the IP address the DHCP server should use when
Jiang, et al. Informational [Page 6]
RFC 7819 DHCP Privacy Considerations April 2016
communicating with the relay agent. It contains an IP address that can identify the client's subnet/link. Also, assuming there is knowledge of the network topology, it also reveals client location.
A DHCP relay includes a Subscriber-ID option [RFC3993] to associate some provider-specific information with clients' DHCP messages that is independent of the physical network configuration through which the subscriber is connected. The "subscriber-id" assigned by the provider is intended to be stable as customers connect through different paths and as network changes occur. The Subscriber-ID is an ASCII string that is assigned and configured by the network provider.
The Client FQDN (Fully Qualified Domain Name) Option [RFC4702] used along with DNS Updates [RFC2136] defines a mechanism that allows both clients and server to insert into the DNS domain information about clients. Both forward (A) and reverse (PTR) resource records can be updated. This allows other nodes to conveniently refer to a host, despite the fact that its IP address may be changing.
This mechanism exposes two important pieces of information: current address (which can be mapped to current location) and client's hostname. The stable hostname can then be used to correlate the client across different network attachments even when its IP addresses keep changing.
A DHCP server running in typical, stateful mode is given a task of managing one or more pools of IP addresses. When a client requests an address, the server must pick an address out of a configured pool. Depending on the server's implementation, various allocation strategies are possible. Choices in this regard may have privacy implications. Note that the constraints in DHCP and DHCPv6 are radically different, but servers that allow allocation strategy configuration may allow configuring them in both DHCP and DHCPv6. Not every allocation strategy is equally suitable for DHCP and for DHCPv6.
Jiang, et al. Informational [Page 7]
RFC 7819 DHCP Privacy Considerations April 2016
Iterative allocation: A server may choose to allocate addresses one by one. That strategy has the benefit of being very fast, thus being favored in deployments that prefer performance. However, it makes the allocated addresses very predictable. Also, since the addresses allocated tend to be clustered at the beginning of an available pool, it makes scanning attacks much easier.
Identifier-based allocation: Some server implementations may choose to allocate an address that is based on one of the available identifiers, e.g., client identifier or MAC address. It is also convenient, as a returning client is very likely to get the same address. Those properties are convenient for system administrators, so DHCP server implementers are often requested to implement it. The downside of such an allocation is that the client has a very stable IP address. That means that correlation of activities over time, location tracking, address scanning, and OS/vendor discovery apply. This is certainly an issue in DHCPv6, but due to a much smaller address space it is almost never a problem in DHCP.
Hash allocation: This is an extension of identifier-based allocation. Instead of using the identifier directly, it is hashed first. If the hash is implemented correctly, it removes the flaw of disclosing the identifier, a property that eliminates susceptibility to address scanning and OS/vendor discovery. If the hash is poorly implemented (e.g., it can be reversed), it introduces no improvement over identifier-based allocation.
Random allocation: A server can pick a resource randomly out of an available pool. This allocation scheme essentially prevents returning clients from getting the same address again. On the other hand, it is beneficial from a privacy perspective as addresses generated that way are not susceptible to correlation attacks, OS/vendor discovery attacks, or identity discovery attacks. Note that even though the address itself may be resilient to a given attack, the client may still be susceptible if additional information is disclosed in another way, e.g., the client's address may be randomized, but it still can leak its MAC address in the Client Identifier option.
Other allocation strategies may be implemented.
Given the limited size of most IPv4 public address pools, allocation mechanisms in IPv4 may not provide much privacy protection or leak much useful information, if misused.
The type of device used by the client can be guessed by the attacker using the Vendor Class Option, the 'chaddr' field, and by parsing the Client ID Option. All of those options may contain an Organizationally Unique Identifier (OUI) that represents the device's vendor. That knowledge can be used for device-specific vulnerability exploitation attacks.
The operating system running on a client can be guessed using the Vendor Class option, the Client System Architecture Type option, or by using fingerprinting techniques on the combination of options requested using the Parameter Request List option.
The location information can be obtained by the attacker by many means. The most direct way to obtain this information is by looking into a message originating from the server that contains the Civic Location, GeoConf, or GeoLoc options. It can also be indirectly inferred using the Relay Agent Information option, with the remote ID suboption, the circuit ID option (e.g., if an access circuit on an Access Node corresponds to a civic location), or the Subscriber ID Option (if the attacker has access to subscriber information).
When DHCP clients connect to a network, they attempt to obtain the same address they had used before they attached to the network. They do this by putting the previously assigned address in the requested IP address option. By observing these addresses, an attacker can identify the network the client had previously visited.
An attacker might use a stable identity gleaned from DHCP messages to correlate activities of a given client on unrelated networks. The Client FQDN option, the Subscriber ID option, and the Client ID option can serve as long-lived identifiers of DHCP clients. The Client FQDN option can also provide an identity that can easily be correlated with web server activity logs.
Pervasive monitoring [RFC7258] is widespread (and often covert) surveillance through intrusive gathering of protocol artifacts, including application content, or protocol metadata such as headers. An operator who controls a nontrivial number of access points or network segments may use obtained information about a single client and observe the client's habits. Although users may not expect true privacy from their operators, the information that is set up to be monitored by users' service operators may also be gathered by an adversary who monitors a wide range of networks and develops correlations from that information.
Many DHCP deployments use DNS Updates [RFC4702] that put a client's information (current IP address, client's hostname) into the DNS, where it is easily accessible by anyone interested. Client ID is also disclosed, albeit not in an easily accessible form (SHA-256 digest of the client-id). As SHA-256 is considered irreversible, DHCP client ID can't be converted back to client-id. However, SHA-256 digest can be used as a unique identifier that is accessible by any host.
As with other identifiers, an IP address can be used to correlate the activities of a host for at least as long as the lifetime of the address. If that address was generated from some other, stable identifier and that generation scheme can be deduced by an attacker, the duration of the correlation attack extends to that of the identifier. In many cases, its lifetime is equal to the lifetime of the device itself.
If a stable identifier is used for assigning an address and such mapping is discovered by an attacker, it can be used for tracking a user. In particular, both passive (a service that the client connects to can log the client's address and draw conclusions regarding its location and movement patterns based on the addresses it is connecting from) and active (an attacker can send ICMP echo requests or other probe packets to networks of suspected client locations) methods can be used. To give a specific example, by accessing a social portal from tomek-laptop.coffee.somecity.com.example, tomek-laptop.mycompany.com.example, and tomek-laptop.myisp.example.com, the portal administrator can draw
Jiang, et al. Informational [Page 10]
RFC 7819 DHCP Privacy Considerations April 2016
conclusions about tomek-laptop's owner's current location and his habits.
Attackers may pretend to be an access concentrator, either as a DHCP relay agent or as a DHCP client, to obtain location information directly from the DHCP server(s) using the DHCP leasequery [RFC4388] mechanism.
Location information is information needed by the access concentrator to forward traffic to a broadband-accessible host. This information includes knowledge of the host hardware address, the port or virtual circuit that leads to the host, and/or the hardware address of the intervening subscriber modem.
Furthermore, the attackers may use the DHCP bulk leasequery [RFC6926] mechanism to obtain bulk information about DHCP bindings, even without knowing the target bindings.
Additionally, active leasequery [RFC7724] is a mechanism for subscribing to DHCP lease update changes in near real-time. The intent of this mechanism is to update an operator's database; however, if the mechanism is misused, an attacker could defeat the server's authentication mechanisms and subscribe to all updates. He then could continue receiving updates, without any need for local presence.
In current practice, the client privacy and client authentication are mutually exclusive. The client authentication procedure reveals additional client information in the certificates and identifiers. Full privacy for the clients may mean the clients are also anonymous to the server and the network.
[RFC6225] Polk, J., Linsner, M., Thomson, M., and B. Aboba, Ed., "Dynamic Host Configuration Protocol Options for Coordinate-Based Location Configuration Information", RFC 6225, DOI 10.17487/RFC6225, July 2011, <http://www.rfc-editor.org/info/rfc6225>.
The authors would like to thank the valuable comments made by Stephen Farrell, Ted Lemon, Ines Robles, Russ White, Christian Huitema, Bernie Volz, Jinmei Tatuya, Marcin Siodelski, Christian Schaefer, Robert Sparks, Peter Yee, and other members of DHC WG.