Internet Engineering Task Force (IETF)                           L. Zhou
Request for Comments: 7485                                       N. Kong
Category: Informational                                          S. Shen
ISSN: 2070-1721                                                    CNNIC
                                                                S. Sheng
                                                                   ICANN
                                                               A. Servin
                                                                  LACNIC
                                                              March 2015
Inventory and Analysis of WHOIS Registration Objects
Abstract
WHOIS output objects from registries, including both Regional Internet Registries (RIRs) and Domain Name Registries (DNRs), were collected and analyzed. This document describes the process and results of the statistical analysis of existing WHOIS information. The purpose of this document is to build an object inventory to facilitate discussions of data objects included in Registration Data Access Protocol (RDAP) responses.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7485.
Zhou, et al. Informational [Page 1]
RFC 7485 Inventory of WHOIS Reg. Objects March 2015
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Regional Internet Registries (RIRs) and Domain Name Registries (DNRs) have historically maintained a lookup service to permit public access to some portion of the registry database. Most registries offer the service via the WHOIS protocol [RFC3912], with additional services being offered via World Wide Web pages, bulk downloads, and other services, such as Routing Policy Specification Language (RPSL) [RFC2622].
Although the WHOIS protocol is widely adopted and supported, it has several shortcomings that limit its usefulness to the evolving needs of the Internet community. Specifically:
o It defines no standard query and response format.
o It does not support user authentication or access control for differentiated access.
o It has not been internationalized and thus does not consistently support Internationalized Domain Names (IDNs) as described in [RFC5890].
This document records an inventory of registry data objects to facilitate discussions of registration data objects. The Registration Data Access Protocol (RDAP) ([RFC7480], [RFC7482], [RFC7483], and [RFC7484]) was developed using this inventory as input.
In the number space, there were altogether five RIRs. Although all RIRs provided information about IP addresses, Autonomous System Numbers (ASNs), and contacts, the data model used was different for each RIR. In the domain name space, there were over 200 country code Top-Level Domains (ccTLDs) and over 400 generic Top-Level Domains (gTLDs) when this document was published. Different Domain Name Registries may have different WHOIS response objects and formats. A common understanding of all these data formats was critical to construct a single data model for each object.
This document describes the WHOIS data collection procedures and gives an inventory analysis of data objects based on the collected data from the five RIRs, 106 ccTLDs, and 18 gTLDs from DNRs. The RIR data objects are classified by the five RIRs into IP address, ASN, person or contact, and the organization that held the resource. According to SPECIFICATION 4 ("SPECIFICATION FOR REGISTRATION DATA PUBLICATION SERVICES") of the new gTLD applicant guidebook [ICANN.AGB-201206] and the Extensible Provisioning Protocol (EPP) ([RFC5730], [RFC5731], [RFC5732], and [RFC5733]), the DNR data objects are classified by whether they relate to the domain, contact, nameserver, or registrar. Objects that do not belong to the categories above are viewed as privately specified objects. In this document, there is no intent to analyze all the query and response types that exist in RIRs and DNRs. The most common query objects are discussed, but other objects such as RPSL data structures used by Internet Routing Registries (IRRs) can be documented later if the community feels it is necessary.
WHOIS information, including port 43 response and web response data, was collected between July 9, 2012, and July 20, 2012, following the procedures described below.
(1) First, find the RIR WHOIS servers of the five RIRs, which are AFRINIC, APNIC, ARIN, LACNIC, and RIPE NCC. All the RIRs provide information about IP addresses, ASNs, and contacts.
(2) Query the corresponding IP addresses, ASNs, contacts, and organizations registered in the five RIRs. Then, make a comparative analysis of the response data.
(3) Group together the data elements that have the same meaning but use different labels.
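The grouping in step (3) can be sketched in a few lines of Python. This is only an illustration: the three rows are copied from Table 2, while the function and variable names are hypothetical and do not come from the original survey tooling.

```python
# Illustrative sketch: group RIR-specific contact labels under a common
# data element name. Rows are copied from Table 2; "NA" marks a label
# that the RIR did not return. All identifiers here are hypothetical.
RIRS = ["AFRINIC", "APNIC", "ARIN", "LACNIC", "RIPE NCC"]

TABLE_2_SAMPLE = {
    "Name":       ["person", "person", "Name", "person", "person"],
    "Fax Number": ["fax-no", "fax-no", "Fax", "NA", "fax-no"],
    "ID":         ["nic-hdl", "nic-hdl", "Handle", "nic-hdl", "nic-hdl"],
}


def build_label_index(table):
    """Map (rir, label) -> common data element, skipping "NA" cells."""
    index = {}
    for element, labels in table.items():
        for rir, label in zip(RIRS, labels):
            if label != "NA":
                index[(rir, label)] = element
    return index


def supporting_rirs(table, element):
    """Return the RIRs that return some label for the given element."""
    return [rir for rir, label in zip(RIRS, table[element]) if label != "NA"]


LABEL_INDEX = build_label_index(TABLE_2_SAMPLE)
```

A plain dictionary like this does not work for every row. In Table 2, for example, the RIPE NCC label "address" covers several data elements, so a fuller mapping would need to allow one label to map to more than one element.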
DNR object collection process:
(1) A programming script was applied to collect port 43 response data from 294 ccTLDs. "nic.ccTLD" was used as the query string, which is usually registered in a domain registry. Responses for 106 ccTLDs were received. 18 gTLDs' port 43 response data was collected from their contracts with ICANN. Thus, the sample size of port 43 WHOIS response data is 124 registries in total.
(2) WHOIS data from the web was collected manually from the 124 registries that send port 43 WHOIS responses.
(3) Some of the responses that were collected by the program did not seem to be correct, so data for the top 10 ccTLD registries, like .de, .eu, and .uk, was re-verified by querying domain names other than "nic.ccTLD".
(4) In accordance with SPECIFICATION 4 of the new gTLD applicant guidebook [ICANN.AGB-201206] and EPP ([RFC5730], [RFC5731], [RFC5732] and [RFC5733]), the response data objects are classified into public and other data objects. Public data objects are those that are defined in the above references. Other objects are those that are privately specified data elements or objects in different registries.
(5) Data elements with the same meaning, but using different labels, were grouped together. The number of registries that support each data element is shown in the "No. of TLDs" column.
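The port 43 collection in step (1) could look roughly like the following Python sketch. It is not the script used for the survey. It assumes an ASCII ccTLD label and a WHOIS server name supplied by the caller (this document does not say how server names were obtained), and it follows the WHOIS protocol [RFC3912]: send the query terminated by CRLF, then read until the server closes the connection.

```python
import socket


def build_query(cc_tld):
    """Build the query bytes for "nic.<ccTLD>", terminated by CRLF."""
    # Assumption: the ccTLD is an ASCII label; IDN ccTLDs are not handled.
    return ("nic." + cc_tld.strip(".").lower() + "\r\n").encode("ascii")


def whois_query(server, cc_tld, timeout=10.0):
    """Send one query to a WHOIS server on TCP port 43 and return the reply.

    The server closes the connection when the reply is complete, so the
    loop reads until recv() returns no more data.
    """
    chunks = []
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(build_query(cc_tld))
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Encodings vary between registries; undecodable bytes are replaced.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A caller would invoke whois_query() once per ccTLD with that registry's server name and record which registries return a usable reply.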
Table 2 shows the contact objects of the five RIRs.
+--------------+---------+---------+------------+---------+---------+
| Data Element | AFRINIC | APNIC   | ARIN       | LACNIC  | RIPE    |
|              |         |         |            |         | NCC     |
+--------------+---------+---------+------------+---------+---------+
| Name         | person  | person  | Name       | person  | person  |
+--------------+---------+---------+------------+---------+---------+
| Company      | NA      | NA      | Company    | NA      | NA      |
+--------------+---------+---------+------------+---------+---------+
| Postal       | address | address | Address    | address | address |
| Address      |         |         |            |         |         |
+--------------+---------+---------+------------+---------+---------+
| City         | NA      | NA      | City       | NA      | address |
+--------------+---------+---------+------------+---------+---------+
| State        | NA      | NA      | StateProv  | NA      | address |
+--------------+---------+---------+------------+---------+---------+
| Postal Code  | NA      | NA      | PostalCode | NA      | address |
+--------------+---------+---------+------------+---------+---------+
| Country      | NA      | country | Country    | country | NA      |
+--------------+---------+---------+------------+---------+---------+
| Phone        | phone   | phone   | Mobile     | phone   | phone   |
+--------------+---------+---------+------------+---------+---------+
| Fax Number   | fax-no  | fax-no  | Fax        | NA      | fax-no  |
+--------------+---------+---------+------------+---------+---------+
| Email        | e-mail  | e-mail  | Email      | e-mail  | NA      |
+--------------+---------+---------+------------+---------+---------+
| ID           | nic-hdl | nic-hdl | Handle     | nic-hdl | nic-hdl |
+--------------+---------+---------+------------+---------+---------+
| Remarks      | remarks | remarks | Remarks    | NA      | remarks |
+--------------+---------+---------+------------+---------+---------+
| Notify       | notify  | notify  | NA         | NA      | notify  |
+--------------+---------+---------+------------+---------+---------+
| ID of        | mnt-by  | mnt-by  | NA         | NA      | mnt-by  |
| maintainer   |         |         |            |         |         |
+--------------+---------+---------+------------+---------+---------+
| Registration | changed | NA      | RegDate    | created | changed |
| Date         |         |         |            |         |         |
+--------------+---------+---------+------------+---------+---------+
| Registration | changed | changed | Updated    | changed | changed |
| update       |         |         |            |         |         |
+--------------+---------+---------+------------+---------+---------+
| Source       | source  | source  | NA         | NA      | source  |
+--------------+---------+---------+------------+---------+---------+
| Reference    | NA      | NA      | Ref        | NA      | NA      |
+--------------+---------+---------+------------+---------+---------+
Table 4 shows the IP address objects of the five RIRs.
Note: Due to the 72-character limit on line length, strings in some cells of the table are split into two or more parts, which are placed on separate lines within the same cell. A hyphen in the final position of a string indicates that the string has been split due to the length limit.
As can be observed, some data elements were not supported by all RIRs, and some were given different labels by different RIRs. Also, there were identical labels used for different data elements by different RIRs. In order to construct a single data model for each object, a selection of the most common and useful fields was made. That initial selection was the starting point for [RFC7483].
WHOIS data was collected from 124 registries, including 106 ccTLDs and 18 gTLDs. All 124 registries support domain queries. Among the 124 registries, eight ccTLDs and 15 gTLDs support queries for specific contact persons or roles. Ten ccTLDs and 18 gTLDs support queries by nameserver. Four ccTLDs and 18 gTLDs support registrar queries. Domain WHOIS data contain 68 data elements that use a total of 550 labels. There is a total of 392 other objects for domain WHOIS data.
As mentioned above, public objects are those data elements selected according to the new gTLD applicant guidebook and EPP. They are generally classified into four categories by whether they are related to the domain, contact, nameserver, or registrar.
WHOIS replies about domains include "Domain Name", "Creation Date", "Domain Status", "Expiration Date", "Updated Date", "Domain ID", "DNSSEC", and "Last Transferred Date". Table 6 gives the element name, most popular label, and the corresponding numbers of TLDs and labels.
+-------------------+-------------------+------------+--------------+
| Data Element      | Most Popular      | No. of     | No. of       |
|                   | Label             | TLDs       | Labels       |
+-------------------+-------------------+------------+--------------+
| Domain Name       | Domain Name       | 118        | 6            |
+-------------------+-------------------+------------+--------------+
| Creation Date     | Created           | 106        | 24           |
+-------------------+-------------------+------------+--------------+
| Domain Status     | Status            | 95         | 8            |
+-------------------+-------------------+------------+--------------+
| Expiration Date   | Expiration Date   | 81         | 21           |
+-------------------+-------------------+------------+--------------+
| Updated Date      | Modified          | 70         | 20           |
+-------------------+-------------------+------------+--------------+
| Domain ID         | Domain ID         | 34         | 5            |
+-------------------+-------------------+------------+--------------+
| DNSSEC            | DNSSEC            | 14         | 4            |
+-------------------+-------------------+------------+--------------+
| Last Transferred  | Last Transferred  | 4          | 3            |
| Date              | Date              |            |              |
+-------------------+-------------------+------------+--------------+
Table 6. WHOIS Data for Domains
Several statistical conclusions obtained from the above data are:
o 95.16% of the 124 registries support a "Domain Name" data element.
o 85.48% of the 124 registries support a "Creation Date" data element.
o 76.61% of the 124 registries support a "Domain Status" data element.
o On the other hand, some elements such as "DNSSEC" and "Last Transferred Date" are only supported by 11.29% and 3.23% of the registries, respectively.
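The percentages above follow directly from the "No. of TLDs" column of Table 6 and the sample size of 124 registries. A minimal Python sketch of that arithmetic, with hypothetical names, is:

```python
# Support counts copied from Table 6; 124 is the sample size stated in
# the text. Names are illustrative only.
TOTAL_REGISTRIES = 124

TABLE_6_TLDS = {
    "Domain Name": 118,
    "Creation Date": 106,
    "Domain Status": 95,
    "DNSSEC": 14,
    "Last Transferred Date": 4,
}


def support_percentage(tld_count, total=TOTAL_REGISTRIES):
    """Return the share of registries as a percentage, two decimals."""
    return round(100.0 * tld_count / total, 2)


PERCENTAGES = {name: support_percentage(n) for name, n in TABLE_6_TLDS.items()}
```

The same function applies to any other "No. of TLDs" value in this document, since every table uses the same 124-registry sample.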
Among all the data elements, only "Registrant Name" is supported by more than one half of registries. Those supported by more than one third of registries are: "Registrant Name", "Registrant Email", "Registrant ID", "Registrant Phone", "Registrant Fax", "Registrant Organization", and "Registrant Country Code".
Among all the data elements, only "Admin Street" is supported by more than one half of registries. Those supported by more than one third of registries are: "Admin Street", "Admin Name", "Admin Email", "Admin ID", "Admin Fax", "Admin Phone", "Admin Organization", and "Admin Country Code".
Among all the data elements, there are no elements supported by more than one half of registries. Those supported by more than one third of registries are: "Tech Email", "Tech ID", "Tech Name", "Tech Fax", "Tech Phone", and "Tech Country Code".
Among all the data elements, there are no elements supported by more than one half of registries. Those supported by more than one third of registries are "Billing Name", "Billing Fax", and "Billing Email".
114 registries (about 92% of the 124 registries) have the "nameserver" data element in their WHOIS responses. However, there are 63 different labels for this element, as shown in Table 11. The top three labels for this element are "Name Server" (which is supported by 25% of the registries), "Name Servers" (which is supported by 16% of the registries), and "nserver" (which is supported by 12% of the registries).
+--------------+--------------------+-------------+---------------+
| Data Element | Most Popular Label | No. of TLDs | No. of Labels |
+--------------+--------------------+-------------+---------------+
| NameServer   | Name Server        | 114         | 63            |
+--------------+--------------------+-------------+---------------+
Table 11. WHOIS Data for Nameservers
Some registries have nameserver elements such as "nameserver 1", "nameserver 2", through "nameserver n". Thus, there are more labels for this element than for other data elements.
There are three data elements about registrar information.
+-------------------+---------------------+-----------+-------------+
| Data Element      | Most Popular Label  | No. of    | No. of      |
|                   |                     | TLDs      | Labels      |
+-------------------+---------------------+-----------+-------------+
| Sponsoring        | Registrar           | 84        | 6           |
| Registrar         |                     |           |             |
+-------------------+---------------------+-----------+-------------+
| Created by        | Created by          | 14        | 3           |
| Registrar         |                     |           |             |
+-------------------+---------------------+-----------+-------------+
| Updated by        | Last Updated by     | 11        | 3           |
| Registrar         | Registrar           |           |             |
+-------------------+---------------------+-----------+-------------+
Table 12. WHOIS Data for Registrars
67.7% of the registries have the "Sponsoring Registrar" data element. The elements "Created by Registrar" and "Updated by Registrar" are supported by 11.3% and 8.9% of the registries, respectively.
So-called "other objects" are those data elements that are privately specified or are difficult to classify. There are 392 other objects altogether. Table 13 lists the top 50 other objects found during data collection.
+----------------------------------------+-------------+
| hold                                   | 6           |
+----------------------------------------+-------------+
| nsl-id                                 | 6           |
+----------------------------------------+-------------+
| obsoleted                              | 6           |
+----------------------------------------+-------------+
| Customer Service Contact               | 5           |
+----------------------------------------+-------------+
| Customer Service Email                 | 4           |
+----------------------------------------+-------------+
| Registrar ID                           | 4           |
+----------------------------------------+-------------+
| org                                    | 4           |
+----------------------------------------+-------------+
| person                                 | 4           |
+----------------------------------------+-------------+
| Maintainer                             | 4           |
+----------------------------------------+-------------+
| Nombre                                 | 3           |
+----------------------------------------+-------------+
| Sponsoring Registrar IANA ID           | 3           |
+----------------------------------------+-------------+
| Trademark Number                       | 3           |
+----------------------------------------+-------------+
| Trademark Country                      | 3           |
+----------------------------------------+-------------+
| descr                                  | 3           |
+----------------------------------------+-------------+
| url                                    | 3           |
+----------------------------------------+-------------+
| Postal address                         | 3           |
+----------------------------------------+-------------+
| Registrar URL                          | 3           |
+----------------------------------------+-------------+
| International Name                     | 3           |
+----------------------------------------+-------------+
| International Address                  | 3           |
+----------------------------------------+-------------+
| Admin Contacts                         | 2           |
+----------------------------------------+-------------+
| Contractual Language                   | 2           |
+----------------------------------------+-------------+
| Date Trademark Registered              | 2           |
+----------------------------------------+-------------+
| Date Trademark Applied For             | 2           |
+----------------------------------------+-------------+
| IP Address                             | 2           |
+----------------------------------------+-------------+
| Keys                                   | 2           |
+----------------------------------------+-------------+
| Language                               | 2           |
+----------------------------------------+-------------+
| NIC handle                             | 2           |
+----------------------------------------+-------------+
| Record maintained by                   | 2           |
+----------------------------------------+-------------+
| Registration Service Provider          | 2           |
+----------------------------------------+-------------+
| Registration Service Provided By       | 2           |
+----------------------------------------+-------------+
| Registrar URL (registration services)  | 2           |
+----------------------------------------+-------------+
Table 13. The Top 50 Other Objects
Some registries returned things that looked like labels, but were not. For example, in this reply:
   Registrant:
      Name:
      Email:
      ...
"Name" and "Email" appeared to be data elements, but "Registrant" did not. The inventory work proceeded on that assumption, i.e., there were two data elements to be recorded in this example.
Some other data elements, like "Remarks", "anniversary", and "Customer Service Contact", are designed by individual registries for their own particular purposes.
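The handling of the "Registrant:" reply described above can be sketched in Python. The rule used here is an assumption made for illustration, not the procedure the inventory followed: a line of the form "Label:" with no value, directly followed by a more deeply indented line, is treated as a heading and not counted as a data element. The indentation in SAMPLE_REPLY is likewise assumed.

```python
def data_elements_in_reply(reply):
    """Return labels treated as data elements, skipping headings.

    Assumption: a "Label:" line with no value that is directly followed
    by a more deeply indented line is a heading, not a data element.
    """
    lines = [ln for ln in reply.splitlines() if ln.strip()]
    elements = []
    for i, line in enumerate(lines):
        label, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a "label: value" line
        indent = len(line) - len(line.lstrip())
        next_indent = -1
        if i + 1 < len(lines):
            nxt = lines[i + 1]
            next_indent = len(nxt) - len(nxt.lstrip())
        is_heading = not value.strip() and next_indent > indent
        if not is_heading:
            elements.append(label)
    return elements


# Hypothetical layout of the reply quoted in the text.
SAMPLE_REPLY = "Registrant:\n   Name:\n   Email:\n"
```

Applied to SAMPLE_REPLY, the function keeps "Name" and "Email" and drops "Registrant", which matches the two data elements recorded for this example.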
Some preliminary conclusions could be drawn from the raw data.
o All of the 124 domain registries have the object names in their responses, although they are in various formats.
o Of the 118 WHOIS services contacted, 65 registries show their registrant contact. About half of the registries (60 registries) support admin contact information. There are 47 registries, which is about one third of the total number, that have technical and billing contact information. Only seven of the 124 registries give their abuse email in a "remarks" section. No explicit abuse contact information is provided.
o There are mainly two presentation formats. One is key-value; the other is data block format. Example of key-value format:
   Domain Information
   Query: nic.example.com
   Status: Delegated
   Created: 17 Apr 2004
   Modified: 14 Nov 2010
   Expires: 31 Dec 9999
   Name Servers:
           ns.example.net
           ns1.na.example.net
           ns2.na.example.net
           ...
Example of data block format:
   WHOIS database
   domain nic.example.org

   Domain Name nic.example.org
        Registered 1998-09-02
        Expiry 2012-09-02

        Resource Records

        a                   198.51.100.1
        mx                  10 test.example.net
        www       a         198.51.100.10

        Contact details

        Registrant,
        Technical Contact,
        Billing Contact,
        Admin. Contact      AdamsNames Reserved Domains (i)
                            These domains are not available for
                            registration
                            United Kingdom
                            Identifier: test123
   Servidor WHOIS de NIC-Example

   Este servidor contiene informacion autoritativa exclusivamente
   de dominios nic.example.org
   Cualquier consulta sobre este servicio, puede hacerla al correo
   electronico whois@nic.example.org

   Titular:
   John (nic.example.org)           john@nic.example.org
   NIC Example
   Av. Veracruz con calle Cali, Edif Aguila, Urb. Las Mercedes
   Caracas, Distrito Capital  VE
   0212-1234567 (FAX) +582123456789
o 11 registries give local script responses. The WHOIS information of the other registries is all represented in English.
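For the key-value presentation format mentioned in the list above, a minimal Python parsing sketch follows. It is illustrative only and makes two assumptions: that the key-value example is laid out one label per line, and that a line without a colon continues the most recent label (as with the "Name Servers:" list). Values that themselves contain a colon and stand on their own line, such as IPv6 addresses, would be misread by this simple rule.

```python
def parse_key_value(reply):
    """Parse a key-value WHOIS reply into {label: [values]}.

    Lines without a colon are treated as further values of the most
    recent label; lines before the first label, such as a title, are
    ignored.
    """
    record = {}
    current = None
    for raw in reply.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue
        label, sep, value = line.partition(":")
        if sep:
            current = label.strip()
            record.setdefault(current, [])
            if value.strip():
                record[current].append(value.strip())
        elif current is not None:
            record[current].append(line)
    return record


# Assumed line layout of part of the key-value example.
SAMPLE = """\
Domain Information
Query: nic.example.com
Status: Delegated
Created: 17 Apr 2004
Name Servers:
    ns.example.net
    ns1.na.example.net
"""

PARSED = parse_key_value(SAMPLE)
```

The data block format has no comparable line-level rule, which is one reason a common response format is easier to process than per-registry layouts.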
+----------------------+-------------+
| Data Element         | No. of TLDs |
+----------------------+-------------+
| Domain Name          | 118         |
+----------------------+-------------+
| Name Server          | 114         |
+----------------------+-------------+
| Creation Date        | 106         |
+----------------------+-------------+
| Domain Status        | 95          |
+----------------------+-------------+
| Sponsoring Registrar | 84          |
+----------------------+-------------+
| Expiration Date      | 81          |
+----------------------+-------------+
| Updated Date         | 70          |
+----------------------+-------------+
| Registrant Name      | 65          |
+----------------------+-------------+
| Admin Street         | 64          |
+----------------------+-------------+
| Admin Name           | 60          |
+----------------------+-------------+
Table 14. The Top 10 Data Elements
Most of the domain-related WHOIS information is included in the top 10 data elements. Other information like name server and registrar name is also supported by most registries.
A cumulative distribution analysis of all the data elements was done.
(1) About 5% of the data elements discovered by the inventory work are supported by 111 registries (i.e., 90%).
(2) About 30% of the data elements discovered by the inventory work are supported by 44 registries (i.e., 35%).
(3) About 60% of the data elements discovered by the inventory work are supported by 32 registries (i.e., 26%).
(4) About 90% of the data elements discovered by the inventory work are supported by 14 registries (i.e., 11%).
From the above result, it is clear that only a few registries support all the public objects; most of the registries support just some of the objects.
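The cumulative distribution can be computed from the per-element support counts. The Python sketch below reads "supported by N registries" as "supported by at least N registries", which is an interpretation and not something the text states. The input is only the ten counts listed in Table 14, not the full set of data elements, so its output does not reproduce the percentages above.

```python
def share_of_elements_with_support(tld_counts, min_registries):
    """Fraction of data elements supported by at least min_registries."""
    if not tld_counts:
        raise ValueError("tld_counts must not be empty")
    hits = sum(1 for n in tld_counts if n >= min_registries)
    return hits / len(tld_counts)


# Partial input only: the ten counts listed in Table 14.
TABLE_14_COUNTS = [118, 114, 106, 95, 84, 81, 70, 65, 64, 60]
```

Running the same function over the counts for all data elements, at thresholds of 111, 44, 32, and 14 registries, would give the four points of the distribution.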
The top 10 labels of different data elements are listed in Table 15.
+-------------------+---------------+
| Labels            | No. of Labels |
+-------------------+---------------+
| Name Server       | 63            |
+-------------------+---------------+
| Creation Date     | 24            |
+-------------------+---------------+
| Expiration Date   | 21            |
+-------------------+---------------+
| Updated Date      | 20            |
+-------------------+---------------+
| Admin Street      | 19            |
+-------------------+---------------+
| Tech ID           | 18            |
+-------------------+---------------+
| Registrant Street | 16            |
+-------------------+---------------+
| Admin ID          | 16            |
+-------------------+---------------+
| Tech Street       | 16            |
+-------------------+---------------+
| Billing Street    | 13            |
+-------------------+---------------+
Table 15. The Top 10 Labels
As explained above, the "Name Server" label is a unique example because many registries define the name server elements from "nameserver 1" through "nameserver n". Thus, the count of labels for name servers is much higher than for other elements. Data elements representing dates and street addresses were also common.
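The effect of numbered labels on the label count can be seen with a small normalization sketch in Python. The inventory counted such labels separately; the hypothetical helper below only shows how a consumer of WHOIS data might collapse them by lower-casing each label and dropping a trailing numeric index.

```python
import re

# Matches optional whitespace followed by digits at the end of a label.
_TRAILING_INDEX = re.compile(r"\s*\d+$")


def normalize_label(label):
    """Lower-case a label and drop a trailing numeric index."""
    return _TRAILING_INDEX.sub("", label.strip().lower())


def distinct_labels(labels):
    """Return the set of labels that remain after normalization."""
    return {normalize_label(label) for label in labels}
```

With this rule, "nameserver 1", "nameserver 2", and "Nameserver 3" all reduce to the single label "nameserver".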
A cumulative distribution analysis of label numbers was done. About 90% of data elements have more than two labels. It is therefore necessary to specify a standard and unified format for object names in a WHOIS response.
The results indicate that there are 392 other data objects in total that are not easy to classify or are privately defined by various registries. The top 50 other objects are listed in Table 13 in Section 5.3. It is clear that different objects are designed for particular purposes. In order to ensure uniqueness of JSON names used in the RDAP service, establishment of an IANA registry is advised.
This section lists the limitations of the survey and some assumptions that were made in the execution of this work.
o The input "nic.ccTLD" may not be a good choice, for the term "nic" is often specially used by the corresponding ccTLD, so the collected WHOIS data may be customized and different from the common data.
o Since the programming script queried the "nic.ccTLD" in an anonymous way, only the public WHOIS data from WHOIS servers having nic.ccTLD were collected. So, the private WHOIS data were not covered by this document.
o 11 registries did not provide responses in English. The classification of data elements within their responses may not be accurate.
o The extension data elements are used inconsistently by different registries, which makes statistical analysis difficult.
o Sample sizes of contact, name server, and registrar queries are small.
* Only WHOIS queries for contact ID, nameserver, and registrar were used.
* Some registries may not support contact, name server, or registrar queries.
* Some may not support querying contacts by ID.
* Contact information of some registries may be protected.
There are some objects that are included in the existing WHOIS system but not mentioned in [RFC7483]. This document is intended to give a list of reference extension objects for discussion.
The following objects are selected from the top 50 other objects in Section 5.3 that are supported by more than five registries. These objects are considered as possible extension objects.
o zone-c - The identifier of a 'role' object with authority over a zone.
o maintainer - authentication information that identifies who can modify the contents of this object.
o Registration URL - typically the website address of a registry.
o anonymous - whether the registration information is anonymous or not.
o hold - whether the domain is "on hold" or not.
o nsl-id - nameserver list ID.
o obsoleted - whether a domain is obsoleted or not.
[RFC2622] Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens, D., Meyer, D., Bates, T., Karrenberg, D., and M. Terpstra, "Routing Policy Specification Language (RPSL)", RFC 2622, June 1999, <http://www.rfc-editor.org/info/rfc2622>.
This document is the work product of the IETF's WEIRDS working group, of which Olaf Kolkman and Murray Kucherawy were chairs.
The authors especially thank the following individuals who gave their suggestions and contributions to this document: Guangqing Deng, Frederico A C Neves, Ray Bellis, Edward Shryane, Kaveh Ranjbar, Murray Kucherawy, Edward Lewis, Pete Resnick, Juergen Schoenwaelder, Ben Campbell, and Claudio Allocchio.
Authors' Addresses
Linlin Zhou
CNNIC
4 South 4th Street, Zhongguancun, Haidian District
Beijing 100190
China