Independent Submission                                  D. Zisiadis, Ed.
Request for Comments: 6137                              S. Kopsidas, Ed.
Category: Experimental                                    M. Tsavli, Ed.
ISSN: 2070-1721                                                    CERTH
                                                        G. Cessieux, Ed.
                                                                    CNRS
                                                           February 2011
The Network Trouble Ticket Data Model (NTTDM)
Abstract
Handling multiple sets of network trouble tickets (TTs) originating from different participants' inter-connected network environments poses a series of challenges for the involved institutions. A Grid is a good example of such a multi-domain project. Each of the participants follows different procedures for handling trouble in its domain, according to the local technical and linguistic profile. The TT systems of the participants collect, represent, and disseminate TT information in different formats.
As a result, management of the daily workload by a central Network Operation Centre (NOC) is a challenge on its own. Normalization of TTs to a common format at the central NOC can ease presentation, storing, and handling of the TTs. In the present document, we provide a model for automating the collection and normalization of the TTs received from the multiple networks forming the Grid. Each participant uses its home TT system within its domain for handling trouble incidents, whereas the central NOC gathers the tickets in the normalized format for storage and handling. XML is used as the common representation language. The model was defined and used as part of the networking support activity of the EGEE (Enabling Grids for E-sciencE) project.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation.
This document defines an Experimental Protocol for the Internet community. This is a contribution to the RFC Series, independently of any other RFC stream. The RFC Editor has chosen to publish this document at its discretion and makes no statement about its value for implementation or deployment. Documents approved for publication by the RFC Editor are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Zisiadis, et al. Experimental [Page 1]
RFC 6137 NTTDM February 2011
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6137.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Table of Contents
   1. Introduction
      1.1. Terminology
      1.2. Notations
      1.3. About the Network Trouble Ticket Data Model
      1.4. About the Network Trouble Ticket Implementation
      1.5. Future Plans
   2. NTTDM Types and Definitions
      2.1. Types and Definitions for the TYPE Attribute
         2.1.1. Defined
         2.1.2. Free
         2.1.3. Multiple
         2.1.4. List
      2.2. Types and Definitions for the VALID FORMAT Attributes
         2.2.1. Predefined String
            2.2.1.1. Definitions of the Predefined Values
         2.2.2. String
         2.2.3. Datetime
   3. NTTDM
      3.1. NTTDM Components
         3.1.1. NTTDM Attributes
      3.2. NTTDM Aggregate Classes
         3.2.1. NTTDM-Document Class
         3.2.2. Ticket Class
         3.2.3. Ticket Origin Information
            3.2.3.1. PARTNER_ID
            3.2.3.2. ORIGINAL_ID
         3.2.4. Ticket Information
            3.2.4.1. TT_ID
            3.2.4.2. TT_TITLE
            3.2.4.3. TT_TYPE
Problem-impact assessment, reporting, identification, and handling, as well as dissemination of trouble information and delegation of authority, are some of the main tasks that have to be implemented by the members of a Grid in order to successfully manage the network and maintain operational efficiency of the services offered to their users.
Each network domain uses a different TT system and delivers TTs in a different format, while the TT load grows in proportion to network size and the number of users served.
We hereby define a data model for TT normalization -- the Network Trouble Ticket Data Model (NTTDM) -- initially targeted for network providers serving EGEE [8]. The model is designed in accordance with RFC 1297 [11] and meets requirements of the multiple TT systems used.
The NTTDM
o is both effective and comprehensive, as it covers the core activities of the Network Operation Centres (NOCs). It is also dynamic, allowing additional options to be included in the future, according to demand.
o provides an XML representation for conveying incident information across administrative domains between parties that have an operational responsibility of remediation or a "watch-and-warn" policy over a defined constituency.
o encodes information about hosts, networks, and the services running on these systems; attack methodology and associated forensic evidence; impact of the activity; and limited approaches for documenting workflow.
o aims to simplify TT exchange within the boundaries of a Grid and to enhance the functional cooperation of every NOC and of the Grid Operation Centre (GOC). Community adoption of the NTTDM enhances trouble resolution within the Grid framework and imparts network status cognizance by modeling collaboration and information exchange among operators.
o provides a common format that allows GOCs as well as all participating NOCs to store, exchange, manage, and analyze TTs (assessment of TT impact).
o provides increased automation in handling a TT, since the network operators have a common view of the incident.
The model was designed and used as part of the networking support activity of the EGEE project; one of the subtasks of this support activity was to enhance the ENOC (EGEE Network Operation Centre) [9] procedures for better overall network coordination of the Grid.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1].
The NTTDM uses specific keywords to describe the various data components. These keywords are:
The NTTDM is specified in two ways: as an abstract data model and as an XML Schema. Section 3 provides a Unified Modeling Language (UML) [10] model describing the individual classes and their relationship with each other. The semantics of each class are discussed and their attributes explained. In Section 6, this UML model is converted into an XML Schema [2] [3] [4] [5]. A specific namespace [6] is also defined.
The term "XML document" refers to any instance of an XML Document. The term "NTTDM document" refers to specific elements and attributes of the NTTDM Schema. Finally, the terms "class" and "element" are used interchangeably to reference either a given UML class in the data model or its corresponding Schema implementation.
The NTTDM is a data representation that provides a framework for normalizing and sharing information among network operators and the GOC regarding troubles within the Grid boundaries. Several considerations shaped the design of the data model:
o The data model serves as a common storage and exchange format.
o Every NOC still uses its home TT system for network management within its area of control.
o As there is no universally adopted definition for a trouble, in the NTTDM definition, the term is used with a comprehensive meaning to cover all NOCs.
o Handling every possible definition of a trouble incident would call for an extremely expanded and complex data model. Therefore, the NTTDM's purpose is to serve as the basis for normalizing and exchanging TTs. It is flexible and expressive in order to ensure that specific NOC requirements are met. Specific NOC information is kept outside the NTTDM, and external databases can be used to feed it.
o The domain of managing the information is not fully standardized and must rely on free-form textual descriptions. The NTTDM attempts to strike a balance between supporting this free-form content and allowing automated processing of incident information.
The NTTDM is only one of several feasible TT data representations. The goal of this design was to be as effective and comprehensive as these other representations and to account for the management of a general Grid environment. The TT formats already in use influenced the design of the NTTDM.
1.4. About the Network Trouble Ticket Implementation
Here we describe an example of a typical use case.
The Grid project EGEE manages its infrastructure as a network overlay over the European National Research and Educational Networks (NRENs) and wants to be able to warn EGEE sites of the unavailability of the network. Thanks to collaboration with its network provider, the EGEE NOC receives a high volume of TTs (800 tickets/month, 2500 emails/month) from 20 NRENs and should always be able to cope with such a heavy load. Thanks to the NTTDM, the EGEE NOC can automate the TT workflow:
o The TT is filtered, sorted, and stored in a local database (DB).
o The TT's impact on the Grid is assessed.
o The TT is pushed to an ENOC dashboard application and other tools (EGEE TT system, statistics, etc.).
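The first step of the workflow above (filter, sort, store) can be sketched in Python as follows. This is only an illustration under stated assumptions: the table layout, the `store_tickets` helper, and the policy of keeping only Operational tickets are hypothetical and are not part of the NTTDM. Impact assessment and the push to the dashboard are not covered.

```python
import sqlite3
from datetime import datetime


def store_tickets(conn, tickets):
    """Filter, sort, and store normalized tickets (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tickets ("
        "tt_id TEXT PRIMARY KEY, tt_type TEXT, opened TEXT, title TEXT)"
    )
    # Filter: keep only Operational tickets (an assumed local policy).
    kept = [t for t in tickets if t["TT_TYPE"] == "Operational"]
    # Sort: oldest first, comparing real instants rather than raw strings,
    # because partners may report times with different UTC offsets.
    kept.sort(key=lambda t: datetime.fromisoformat(t["TT_OPEN_DATETIME"]))
    # Store: insert a new row or replace the existing one with the same TT_ID.
    conn.executemany(
        "INSERT OR REPLACE INTO tickets VALUES (?, ?, ?, ?)",
        [
            (t["TT_ID"], t["TT_TYPE"], t["TT_OPEN_DATETIME"], t["TT_TITLE"])
            for t in kept
        ],
    )
    conn.commit()
    return [t["TT_ID"] for t in kept]
```

Keying the table on TT_ID means that a ticket received again after an update overwrites its earlier copy instead of creating a duplicate.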
Since this is an Experimental document, operational experience will be used to expand the subsections of Section 3.2.3, "Ticket Origin Information", below. The current specification is already used within EGEE. Other Grids are free to use it and report comments to the authors. After enough experimentation, we would like to advance it to the Standards Track.
The various data elements of the TT data model are typed. This section discusses these data types. When possible, native Schema data types were adopted, but for more complicated formats, regular expressions or external standards were used.
TT_TYPE
o Operational: for network incident and maintenance only.
o Informational: information about the TT system or the exchange interface (maintenance, upgrade).
o Administrative: information about the access to the TT system (credentials) or the exchange interface.
o Test: to test the TT system or the exchange interface, etc.
TYPE
o Scheduled: the incident was scheduled to happen.
o Unscheduled: the incident was unscheduled.
TT_PRIORITY
o Low: the TT priority is low.
o Medium: the TT priority is medium.
o High: the TT priority is high.
TT_SHORT_DESCRIPTION
o Core Line Fault: malfunction of a high-bandwidth core line.
o Access Line Fault: malfunction of a medium-bandwidth access line.
o Degraded Service.
o Router Hardware Fault: malfunction of the router hardware.
o Router Software Fault: malfunction of the router software.
o Routing Problem: incident regarding the routing service.
o Undefined Problem: nature of the problem not identified.
o Network Congestion: problem due to traffic at the network (blocked).
o Client Upgrade: incidents regarding client/services upgrade.
o IPv6: incident regarding the IPv6 network.
o QoS: incident regarding the Quality of Service (QoS) of the network.
o VoIP: incident regarding Voice over IP (VoIP).
o Other: non-listed incident.
TT_IMPACT_ASSESSMENT
o No impact: the incident does not cause any impacts.
o Reduced redundancy: the incident reduces network redundancy.
o Minor performance impact: the incident causes a minor performance impact.
o Severe performance impact: the incident causes a severe performance impact.
o No connectivity: the incident causes connectivity failure.
o On backup: the incident causes a malfunction of backup services.
o At risk: the incident should not have any impact but could possibly cause some trouble.
o Unknown: the nature of the impact is not identified.
TT_STATUS
o Opened: the ticket is opened.
o Closed: the ticket is closed.
o Updated: the ticket's contents have been updated.
o Cancelled: the ticket has been opened twice; one of the tickets is cancelled, and a relationship between them is defined via the RELATED_ACTIVITY field.
o Solved: the incident is solved, but the team prefers to monitor/check for future issues.
o Opened/Closed: the ticket was opened only to report an incident that has already been solved.
o Inactive: the ticket is under the responsibility of an external domain and is no longer under the reporting domain's control.
o Reopened: the ticket was closed by error, or the problem was erroneously declared to be solved. Data in the History field are very important in this case.
o Superseded: the ticket has been superseded by another one (for example, a bigger problem that had resulted in many tickets was later merged into a single incident/ticket). The RELATED_ACTIVITY field SHOULD include the master ticket reference.
Allowed transitions for TT_STATUS are only those indicated in Figure 2. Possible final states are indicated with (X).
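A receiver can check a Predefined String field against the values listed above before accepting a ticket. The following Python sketch shows one way to do so. The names `PREDEFINED_VALUES` and `invalid_predefined` are hypothetical, exact case-sensitive matching is an assumption, and the Operational/Informational/Administrative/Test list is treated as the values of TT_TYPE. The TT_STATUS transitions are deliberately not encoded here.

```python
# Predefined values copied from the lists above (assumed case-sensitive).
PREDEFINED_VALUES = {
    "TT_TYPE": {"Operational", "Informational", "Administrative", "Test"},
    "TYPE": {"Scheduled", "Unscheduled"},
    "TT_PRIORITY": {"Low", "Medium", "High"},
    "TT_SHORT_DESCRIPTION": {
        "Core Line Fault", "Access Line Fault", "Degraded Service",
        "Router Hardware Fault", "Router Software Fault", "Routing Problem",
        "Undefined Problem", "Network Congestion", "Client Upgrade",
        "IPv6", "QoS", "VoIP", "Other",
    },
    "TT_IMPACT_ASSESSMENT": {
        "No impact", "Reduced redundancy", "Minor performance impact",
        "Severe performance impact", "No connectivity", "On backup",
        "At risk", "Unknown",
    },
    "TT_STATUS": {
        "Opened", "Closed", "Updated", "Cancelled", "Solved",
        "Opened/Closed", "Inactive", "Reopened", "Superseded",
    },
}


def invalid_predefined(ticket):
    """Return {field: value} for every predefined field with an unknown value.

    Fields that are not Predefined Strings are ignored.
    """
    return {
        name: value
        for name, value in ticket.items()
        if name in PREDEFINED_VALUES and value not in PREDEFINED_VALUES[name]
    }
```

An empty result means every Predefined String field in the ticket carries one of the listed values.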
In this section, the individual components of the NTTDM will be discussed in detail. This class provides a standardized representation for commonly exchanged Field Name data.
The Field Name class has four attributes. Each attribute provides information about a Field Name instance. The attributes that characterize one instance constitute all the information required to form the data model.
DESCRIPTION
This field contains a short description of the Field Name.
TYPE
The TYPE attribute contains information about the type of the Field Name it depends on. The values that it may contain are:
Defined, Free, Multiple, and List.
VALID FORMAT
This attribute contains information about the format of each field. The values that it may contain are:
Predefined String, String, and Datetime.
MANDATORY
This attribute indicates whether the information of each field is required or optional. If the information is required, the MANDATORY field contains the word "YES". If the information is optional, the MANDATORY field contains the word "NO".
The Field Names are the Aggregate Classes that constitute the NTTDM, and each of them is an element that is characterized by a quadruple (DESCRIPTION, TYPE, VALID FORMAT, MANDATORY).
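The quadruple described above maps naturally onto a small record type. The Python sketch below is one possible in-memory representation. The class `FieldSpec`, the helper `missing_mandatory`, and the use of a boolean for MANDATORY are assumptions made for illustration and are not defined by the NTTDM.

```python
from dataclasses import dataclass

# Allowed attribute values, taken from the TYPE and VALID FORMAT lists above.
TYPES = {"Defined", "Free", "Multiple", "List"}
VALID_FORMATS = {"Predefined String", "String", "Datetime"}


@dataclass(frozen=True)
class FieldSpec:
    """One Field Name described by its four attributes (hypothetical helper)."""
    name: str
    description: str
    type: str
    valid_format: str
    mandatory: bool  # True stands for "YES", False for "NO"

    def __post_init__(self):
        if self.type not in TYPES:
            raise ValueError(f"unknown TYPE: {self.type!r}")
        if self.valid_format not in VALID_FORMATS:
            raise ValueError(f"unknown VALID FORMAT: {self.valid_format!r}")


def missing_mandatory(specs, ticket):
    """Return the names of mandatory fields that are absent or empty."""
    return [s.name for s in specs if s.mandatory and not ticket.get(s.name)]


# Two fields transcribed from this document as sample data.
SPECS = [
    FieldSpec("ORIGINAL_ID", "The TT ID that was assigned by the party.",
              "Free", "String", True),
    FieldSpec("TT_LONG_DESCRIPTION",
              "The detailed description of the incident/maintenance "
              "reported in the TT.",
              "Free", "String", False),
]
```

For instance, `missing_mandatory(SPECS, {"TT_LONG_DESCRIPTION": "x"})` reports ORIGINAL_ID as missing, while a ticket that carries ORIGINAL_ID passes even without the optional long description.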
+--------------+
| PARTNER_ID   |
+--------------+
| DESCRIPTION  | The unique ID of the TT source partner.
| TYPE         | Multiple.
| VALID FORMAT | String.
| MANDATORY    | Yes.
+--------------+

+--------------+
| ORIGINAL_ID  |
+--------------+
| DESCRIPTION  | The TT ID that was assigned by the party.
| TYPE         | Free.
| VALID FORMAT | String.
| MANDATORY    | Yes.
+--------------+

+--------------+
| TT_ID        |
+--------------+
| DESCRIPTION  | The unique ID of the TT.
| TYPE         | As defined below.
| VALID FORMAT | String.
| MANDATORY    | Yes.
+--------------+
Figure 7. TT_ID Class
The TT_ID value is constructed as "PARTNER_ID"_"ORIGINAL_ID". PARTNER_ID and ORIGINAL_ID therefore MUST NOT contain an underscore character.
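The construction rule above can be sketched in Python as follows. The function names `make_tt_id` and `split_tt_id` are hypothetical, and rejecting empty components is an added assumption (both fields are mandatory). Because neither component may contain an underscore, the identifier can be split back unambiguously at its single underscore.

```python
def make_tt_id(partner_id: str, original_id: str) -> str:
    """Join PARTNER_ID and ORIGINAL_ID with a single underscore."""
    for label, value in (("PARTNER_ID", partner_id),
                         ("ORIGINAL_ID", original_id)):
        if not value:
            # Assumption: an empty component is treated as an error.
            raise ValueError(f"{label} must not be empty")
        if "_" in value:
            raise ValueError(f"{label} must not contain an underscore")
    return f"{partner_id}_{original_id}"


def split_tt_id(tt_id: str):
    """Recover (PARTNER_ID, ORIGINAL_ID) from a TT_ID."""
    partner_id, sep, original_id = tt_id.partition("_")
    if not sep or not partner_id or not original_id or "_" in original_id:
        raise ValueError(f"malformed TT_ID: {tt_id!r}")
    return partner_id, original_id
```

With PARTNER_ID "01" and ORIGINAL_ID "5985", `make_tt_id` yields "01_5985", and `split_tt_id` returns the two components again.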
+---------------+
| TT_TITLE      |
+---------------+
| DESCRIPTION   | The title of the TT.
| TYPE          | Defined.
| VALID FORMAT  | String.
| MANDATORY     | Yes.
+---------------+

+---------------+
| TT_TYPE       |
+---------------+
| DESCRIPTION   | The type of the TT.
| TYPE          | Multiple.
| VALID FORMAT  | Predefined String.
| MANDATORY     | Yes.
+---------------+

+------------------+
| TT_OPEN_DATETIME |
+------------------+
| DESCRIPTION      | The date and time when the TT was opened.
| TYPE             | Multiple.
| VALID FORMAT     | Datetime.
| MANDATORY        | Yes.
+------------------+

+-------------------+
| TT_CLOSE_DATETIME |
+-------------------+
| DESCRIPTION       | The date and time when the TT was closed.
| TYPE              | Multiple.
| VALID FORMAT      | Datetime.
| MANDATORY         | Yes.
+-------------------+

+----------------------+
| TT_SHORT_DESCRIPTION |
+----------------------+
| DESCRIPTION          | The short description of the trouble.
| TYPE                 | Multiple.
| VALID FORMAT         | Predefined String.
| MANDATORY            | Yes.
+----------------------+

+---------------------+
| TT_LONG_DESCRIPTION |
+---------------------+
| DESCRIPTION         | The detailed description of the
|                     | incident/maintenance reported in the TT.
| TYPE                | Free.
| VALID FORMAT        | String.
| MANDATORY           | No.
+---------------------+

+--------------+
| TYPE         |
+--------------+
| DESCRIPTION  | The type of the trouble.
| TYPE         | Multiple.
| VALID FORMAT | Predefined String.
| MANDATORY    | Yes.
+--------------+

+----------------+
| START_DATETIME |
+----------------+
| DESCRIPTION    | The date and time that the
|                | incident/maintenance started.
| TYPE           | Multiple.
| VALID FORMAT   | Datetime.
| MANDATORY      | Yes.
+----------------+

+-------------------+
| DETECT_DATETIME   |
+-------------------+
| DESCRIPTION       | The date and time when the incident
|                   | was detected.
| TYPE              | Multiple.
| VALID FORMAT      | Datetime.
| MANDATORY         | No.
+-------------------+

+-----------------+
| REPORT_DATETIME |
+-----------------+
| DESCRIPTION     | The date and time when the incident
|                 | was reported.
| TYPE            | Multiple.
| VALID FORMAT    | Datetime.
| MANDATORY       | No.
+-----------------+

+--------------+
| END_DATETIME |
+--------------+
| DESCRIPTION  | The date and time when the incident/maintenance
|              | ended.
| TYPE         | Multiple.
| VALID FORMAT | Datetime.
| MANDATORY    | Yes.
+--------------+

+---------------------+
| TT_LAST_UPDATE_TIME |
+---------------------+
| DESCRIPTION         | The last date and time when the TT was
|                     | updated.
| TYPE                | Multiple.
| VALID FORMAT        | Datetime.
| MANDATORY           | Yes.
+---------------------+

+-------------------+
| TIME_WINDOW_START |
+-------------------+
| DESCRIPTION       | The window start time in which planned
|                   | maintenance may occur.
| TYPE              | Multiple.
| VALID FORMAT      | Datetime.
| MANDATORY         | No, unless TYPE is "Scheduled".
+-------------------+

+-----------------+
| TIME_WINDOW_END |
+-----------------+
| DESCRIPTION     | The window end time in which planned
|                 | maintenance may occur.
| TYPE            | Multiple.
| VALID FORMAT    | Datetime.
| MANDATORY       | No, unless TYPE is "Scheduled".
+-----------------+

+--------------------------+
| WORK_PLAN_START_DATETIME |
+--------------------------+
| DESCRIPTION              | Work planned (expected): start time
|                          | in case of maintenance.
| TYPE                     | Multiple.
| VALID FORMAT             | Datetime.
| MANDATORY                | No.
+--------------------------+

+------------------------+
| WORK_PLAN_END_DATETIME |
+------------------------+
| DESCRIPTION            | Work planned (expected): end time
|                        | in case of maintenance.
| TYPE                   | Multiple.
| VALID FORMAT           | Datetime.
| MANDATORY              | No.
+------------------------+
Figure 27. Work_Plan_End_Datetime Class
The period delimited by WORK_PLAN_START_DATETIME and WORK_PLAN_END_DATETIME MUST be included in the period delimited by TIME_WINDOW_START and TIME_WINDOW_END, and duplicated with {START, END}_DATETIME, even in case of maintenance.
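The containment requirement above, together with the rule that TIME_WINDOW_START and TIME_WINDOW_END become mandatory when TYPE is "Scheduled", can be checked mechanically. The Python sketch below assumes that a ticket is held as a dictionary keyed by field name and that every Datetime value carries a UTC offset in the form accepted by `datetime.fromisoformat`. The function name `check_maintenance_times` is hypothetical, and the duplication with {START, END}_DATETIME is not checked.

```python
from datetime import datetime


def _parse(ticket, name):
    """Parse one Datetime field, or return None when it is absent or empty."""
    value = ticket.get(name)
    return datetime.fromisoformat(value) if value else None


def check_maintenance_times(ticket):
    """Return a list of problems; the list is empty when the ticket is consistent."""
    problems = []
    win_start = _parse(ticket, "TIME_WINDOW_START")
    win_end = _parse(ticket, "TIME_WINDOW_END")
    plan_start = _parse(ticket, "WORK_PLAN_START_DATETIME")
    plan_end = _parse(ticket, "WORK_PLAN_END_DATETIME")

    # The time window is mandatory only for Scheduled tickets.
    if ticket.get("TYPE") == "Scheduled" and (
        win_start is None or win_end is None
    ):
        problems.append(
            "TIME_WINDOW_START and TIME_WINDOW_END are mandatory "
            "when TYPE is Scheduled"
        )

    # Containment is checked only when all four values are present.
    values = (win_start, win_end, plan_start, plan_end)
    if all(v is not None for v in values):
        if not (win_start <= plan_start <= plan_end <= win_end):
            problems.append("work plan period is not inside the time window")
    return problems
```

Comparing parsed, offset-aware values rather than raw strings keeps the check correct when the partners report times in different time zones.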
+--------------------------+
| RELATED_EXTERNAL_TICKETS |
+--------------------------+
| DESCRIPTION              | The NOC entity related to the incident.
| TYPE                     | List.
| VALID FORMAT             | String.
| MANDATORY                | No.
+--------------------------+

+------------------+
| RELATED_ACTIVITY |
+------------------+
| DESCRIPTION      | The TT IDs of the related incidents.
| TYPE             | Multiple.
| VALID FORMAT     | String.
| MANDATORY        | No.
+------------------+

+--------------------+
| AFFECTED_COMMUNITY |
+--------------------+
| DESCRIPTION        | Information about the community that was
|                    | affected by the incident.
| TYPE               | Free.
| VALID FORMAT       | String.
| MANDATORY          | No.
+--------------------+

+------------------+
| AFFECTED_SERVICE |
+------------------+
| DESCRIPTION      | The service that was affected by the
|                  | incident.
| TYPE             | Multiple.
| VALID FORMAT     | String.
| MANDATORY        | No.
+------------------+

+--------------+
| NETWORK_NODE |
+--------------+
| DESCRIPTION  | The NOC network node related to the incident.
| TYPE         | List.
| VALID FORMAT | String.
| MANDATORY    | No.
+--------------+

+----------------------+
| NETWORK_LINK_CIRCUIT |
+----------------------+
| DESCRIPTION          | The name of the network line related
|                      | to the incident.
| TYPE                 | List.
| VALID FORMAT         | String.
| MANDATORY            | No.
+----------------------+

+---------------+
| OPEN_ENGINEER |
+---------------+
| DESCRIPTION   | The engineer that opened the ticket.
| TYPE          | Multiple.
| VALID FORMAT  | String.
| MANDATORY     | No.
+---------------+
The collected and processed TTs received from multiple telecommunications networks are adjusted in a normalized NTTDM. Figure 43 shows the representation of this normalized data model. The "DESCRIPTION" attribute is implied.
Internationalization and localization are of specific concern to the NTTDM, since it is only through collaboration, often across language barriers, that certain incidents can be resolved. The NTTDM supports this goal by depending on XML constructs, and through explicit design choices in the data model.
The main advantage of the model is that it provides a normalized data type that is implemented fully in the English language and can be used conveniently. It also supports free-form text that can be written in any language. In the future, it will provide translation services for all such free-form text.
In this section, an example of network TTs exchanged using the proposed format is provided. This is an actual GRNet ticket normalized according to the NTTDM. Fields that were not included in the ticket are left blank.
<?xml version="1.0" encoding="UTF-8"?>
<!-- This example describes a link failure that was detected -->
<NTTDM-Document version="1.00" lang="el"
                xmlns="urn:ietf:params:xml:ns:nttdm-1.0">
  <Ticket>
    <Original_ID>5985</Original_ID>
    <Partner_ID>01</Partner_ID>
    <TT_ID>01_5985</TT_ID>
    <TT_Title>Forth Link Failure</TT_Title>
    <TT_Type>Operational</TT_Type>
    <TT_Status>Closed</TT_Status>
    <TT_Open_Datetime>2008-12-16T10:01:15+02:00</TT_Open_Datetime>
    <TT_Short_Description>Core Line Fault</TT_Short_Description>
    <TT_Long_Description>Forth Link Failure</TT_Long_Description>
    <Type>Unscheduled</Type>
    <TT_Impact_Assessment>No connectivity</TT_Impact_Assessment>
    <Start_Datetime>2008-12-16T09:55:00+02:00</Start_Datetime>
    <TT_Last_Update_Time>2008-12-16T15:00:34+02:00</TT_Last_Update_Time>
    <Location>HERAKLION</Location>
    <History>Optical transmitter was changed</History>
    <TT_Close_Datetime>2008-12-16T15:05:00+02:00</TT_Close_Datetime>
    <End_Datetime>2008-12-16T15:01:21+02:00</End_Datetime>
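A consumer of such a document might read the ticket fields as sketched below in Python, using the standard library XML parser. `SAMPLE` is a shortened copy of the ticket above; its closing tags are supplied here as an assumption so that the fragment is well-formed. The helper name `ticket_fields` is hypothetical, and no Schema validation is performed.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the example document.
NS = "urn:ietf:params:xml:ns:nttdm-1.0"

# Shortened copy of the example; closing tags are added as an assumption.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<NTTDM-Document version="1.00" lang="el"
                xmlns="urn:ietf:params:xml:ns:nttdm-1.0">
  <Ticket>
    <Original_ID>5985</Original_ID>
    <Partner_ID>01</Partner_ID>
    <TT_ID>01_5985</TT_ID>
    <TT_Title>Forth Link Failure</TT_Title>
    <TT_Type>Operational</TT_Type>
    <TT_Status>Closed</TT_Status>
    <TT_Open_Datetime>2008-12-16T10:01:15+02:00</TT_Open_Datetime>
  </Ticket>
</NTTDM-Document>
"""


def ticket_fields(xml_bytes):
    """Return the child elements of the first Ticket as a {name: text} dict."""
    root = ET.fromstring(xml_bytes)
    ticket = root.find(f"{{{NS}}}Ticket")
    if ticket is None:
        raise ValueError("no Ticket element found")
    # ElementTree reports tags as "{namespace}Name"; keep only the local name.
    return {
        child.tag.rpartition("}")[2]: (child.text or "")
        for child in ticket
    }


fields = ticket_fields(SAMPLE.encode("utf-8"))
```

The resulting dictionary can then be passed to whatever field-level checks the receiving NOC applies.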
The NTTDM defines a data model and the relevant XML Schema for trouble ticket normalization; as such, the NTTDM itself does not raise any security concerns. However, some security issues SHOULD be considered, as network TTs could carry sensitive information (IP addresses, contact details, authentication details, commercial providers involved, etc.) about flagship institutions (military, health centres, etc.).
The security considerations MAY involve measures during the exchange as well as during processing of the information.
The HASH field is intended to provide an integrity assurance attribute within the exchanged tickets; however, it alone does not ensure integrity.
Confidentiality MAY be ensured by encrypting whole tickets or only some parts of them. This could permit meaningful tickets to be disclosed, while only sensitive information would be protected.
Peer entity authentication SHOULD be provided in order to establish a session with data origin authentication, regardless of the form in which the TTs are exchanged, whether through email, web forms, or a Simple Object Access Protocol (SOAP) service. SOAP is considered the better choice; the model itself, though, does not specify the communications requirements.
The underlying communications service MUST provide guarantees to properly address integrity, confidentiality, and peer entity authentication. The selection of the enforcing mechanisms is not in the scope of this document, and the choice is up to the implementers.
For data processing security, each participating organization MAY use its own privacy policy, as part of its own data processing system. This approach avoids any interoperability issues and does not pose any extra burden for the adoption of the current scheme into the operational procedures of the NOCs. Unauthorized and inappropriate usage MUST be avoided.
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] World Wide Web Consortium, "Extensible Markup Language (XML) 1.0 (Fifth Edition)", W3C Recommendation, 26 November 2008, <http://www.w3.org/TR/2008/REC-xml-20081126>.
[4] World Wide Web Consortium, "XML Schema Part 1: Structures Second Edition", W3C Recommendation, 28 October 2004, <http://www.w3.org/TR/xmlschema-1/>.
[5] World Wide Web Consortium, "XML Schema Part 2: Datatypes Second Edition", W3C Recommendation, 28 October 2004, <http://www.w3.org/TR/xmlschema-2/>.