Internet Engineering Task Force (IETF)                   R. Ravindranath
Request for Comments: 8068                           Cisco Systems, Inc.
Category: Informational                                     P. Ravindran
ISSN: 2070-1721                                           Nokia Networks
                                                              P. Kyzivat
                                                                  Huawei
                                                           February 2017
Session recording is a critical requirement in many communications environments, such as call centers and financial trading organizations. In some of these environments, all calls must be recorded for regulatory, compliance, and consumer-protection reasons. The recording of a session is typically performed by sending a copy of a media stream to a recording device. This document lists call flows with metadata snapshots sent from a Session Recording Client (SRC) to a Session Recording Server (SRS).
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc8068.
Ravindranath, et al. Informational [Page 1]
RFC 8068 SIP Recording Call Flows February 2017
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
   1. Overview
   2. Terminology
   3. Metadata XML Instances
      3.1. Sample Call Flow
      3.2. Call Scenarios with SRC Recording Streams without Mixing
         3.2.1. Example 1: Basic Call
         3.2.2. Example 2: Hold/Resume
         3.2.3. Example 3: Call Transfer (RE-INVITE and REFER Based)
         3.2.4. Example 4: Call Disconnect
      3.3. Call Scenarios with SRC Recording Streams by Mixing
         3.3.1. Example 1: Basic Call with SRC Mixing Streams
         3.3.2. Example 2: Hold/Resume with SRC Recording by Mixing Streams
         3.3.3. Example 3: Metadata Snapshot of Joining/Dropping of a Participant to a Session
         3.3.4. Example 4: Call Disconnect
      3.4. Call Scenarios with Persistent RS between SRC and SRS
         3.4.1. Example 1: Metadata Snapshot during CS Disconnect with Persistent RS between SRC and SRS
      3.5. Turret-Case: Multiple CS into Single RS with Mixed Stream
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   Acknowledgements
   Authors' Addresses
1. Overview
Session recording is a critical requirement in many communications environments, such as call centers and financial trading organizations. In some of these environments, all calls must be recorded for regulatory, compliance, and consumer-protection reasons. The recording of a session is typically performed by sending a copy of a media stream to a recording device. [RFC7865] focuses on the recording metadata that describes the Communication Session (CS). This document lists a few examples and shows the snapshots of metadata sent from a Session Recording Client (SRC) to a Session Recording Server (SRS). For the sake of simplicity, complete Session Initiation Protocol (SIP) [RFC3261] messages are not shown; only snippets of the SIP and Session Description Protocol (SDP) [RFC4566] messages and the XML snapshots of metadata are shown.
3. Metadata XML Instances
3.1. Sample Call Flow
The following is a sample call flow that shows the SRC establishing a Recording Session (RS) towards the SRS. In this example, the SRC could be part of any one of the architectures described in Section 3 of [RFC7245].
For the sake of simplicity, ACKs to RE-INVITEs and BYEs are not shown. The subsequent sections describe the snapshot of metadata sent from the SRC to the SRS for each of the above transactions (F1 ... Fn-1). There may be multiple UPDATEs/RE-INVITEs mid-call to indicate snapshots of different CS changes. Depending on the architecture described in Section 3 of [RFC7245], an SRC may be an endpoint, a B2BUA, part of the MEDIACTRL architecture, or the conference focus. The subsequent sections in this document list example metadata snapshots for three major categories.
o The SRC recording streams unmixed to the SRS. This includes cases where the SRC is a SIP UA or B2BUA.
o The SRC recording mixed streams to the SRS. This includes cases where the SRC is part of the SIP conference model, as explained in [RFC4353].
o The SRC having a persistent RS with the SRS.
o Special flows like turret flows (used on financial trading floors to manage call activity). A trading turret is a specialized telephony key system that has a highly distributed switching architecture enabling parallel processing of calls. Figure 6 in Section 4 of [RFC6341] has the turret use case.
Note that, in each category, only those examples where the metadata changes are listed. For some of the call flows, the snapshots may be the same (as in the case of an endpoint or a B2BUA acting as the SRC); where this applies, it is mentioned in the text preceding the example.
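As a rough illustration of how the SRC could carry a metadata snapshot in the INVITE that establishes the RS, the following Python sketch assembles a multipart body holding the SDP offer and the metadata XML. The "siprec" option tag, the "+sip.src" feature tag, the "application/rs-metadata+xml" media type, and the "recording-session" disposition come from [RFC7866]; the function name, URIs, tag, boundary, and SDP values are assumptions made for this example. Like the snippets in this document, it shows only a subset of the SIP header fields.

```python
def build_rs_invite(sdp: str, metadata_xml: str,
                    boundary: str = "rs-boundary") -> str:
    """Assemble an RS INVITE snippet with SDP and metadata in one body."""

    def crlf(text: str) -> str:
        # SIP bodies use CRLF line endings.
        return "\r\n".join(text.strip().splitlines())

    body = "\r\n".join([
        f"--{boundary}",
        "Content-Type: application/sdp",
        "",
        crlf(sdp),
        f"--{boundary}",
        "Content-Type: application/rs-metadata+xml",
        "Content-Disposition: recording-session",
        "",
        crlf(metadata_xml),
        f"--{boundary}--",
        "",
    ])
    headers = [
        "INVITE sip:recorder@example.com SIP/2.0",   # hypothetical SRS URI
        "From: <sip:src@example.com>;tag=35e195d2",  # hypothetical SRC URI
        "To: <sip:recorder@example.com>",
        "Contact: <sip:src@example.com>;+sip.src",   # marks the UA as an SRC
        "Require: siprec",
        f'Content-Type: multipart/mixed;boundary="{boundary}"',
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


# Hypothetical one-stream SDP offer; a=label ties the stream to the metadata.
example_sdp = """v=0
o=SRC 2890844526 2890844526 IN IP4 198.51.100.1
s=-
c=IN IP4 198.51.100.1
t=0 0
m=audio 12240 RTP/AVP 0
a=sendonly
a=label:96
"""
invite = build_rs_invite(
    example_sdp,
    "<recording xmlns='urn:ietf:params:xml:ns:recording:1'/>",
)
```

Mid-call snapshots sent in an UPDATE or RE-INVITE could reuse the same body layout with a partial metadata document.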
3.2. Call Scenarios with SRC Recording Streams without Mixing
This section describes example flows where the SRC can be a SIP UA or a B2BUA, as described in Section 3 of [RFC7245]. The SRS here can be a SIP UA or an entity that is part of the MEDIACTRL architecture described in Section 3 of [RFC7245].
3.2.1. Example 1: Basic Call
This example shows a basic call between two participants, Alice and Bob, who are part of the same CS. In this use case, each participant sends two media streams (audio and video), and the media streams sent by each participant are received by the other participant. The SRC is a B2BUA in the path between Alice and Bob, as described in Section 3.1.1 of [RFC7245]. Below is the initial snapshot sent by the SRC in the INVITE to the SRS. This snapshot has the complete metadata. For the sake of simplicity, only snippets of SIP/SDP are shown. In this example, the SRC records the streams of each participant to the SRS without mixing.
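The shape of such a complete snapshot can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration under stated assumptions, not the snapshot from the original example: the helper names (q, child, new_id, basic_call_snapshot), the example.com AoRs, the label values, and the timestamp are hypothetical, the IDs are freshly generated base64-encoded UUIDs, and the output is not validated against the [RFC7865] schema.

```python
import base64
import uuid
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def new_id():
    """Return a base64-encoded UUID for use as a metadata object ID."""
    return base64.b64encode(uuid.uuid4().bytes).decode("ascii")


def basic_call_snapshot(start_time):
    """Complete snapshot: two participants, four unmixed RS streams."""
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "complete")
    group_id, session_id = new_id(), new_id()
    group = child(rec, "group", group_id=group_id)
    child(group, "associate-time", start_time)
    session = child(rec, "session", session_id=session_id)
    child(session, "group-ref", group_id)
    child(session, "start-time", start_time)

    # Hypothetical AoRs; each participant sends audio and video.
    aors = {"Alice": "sip:alice@example.com", "Bob": "sip:bob@example.com"}
    pids = {name: new_id() for name in aors}
    sent = {name: [new_id(), new_id()] for name in aors}

    for name, aor in aors.items():
        participant = child(rec, "participant", participant_id=pids[name])
        name_id = child(participant, "nameID", aor=aor)
        child(name_id, "name", name)

    label = 96  # hypothetical SDP a=label values, one per RS stream
    for name in aors:
        for stream_id in sent[name]:
            stream = child(rec, "stream", stream_id=stream_id,
                           session_id=session_id)
            child(stream, "label", str(label))
            label += 1

    assoc = child(rec, "sessionrecordingassoc", session_id=session_id)
    child(assoc, "associate-time", start_time)
    for name in aors:
        psa = child(rec, "participantsessionassoc",
                    participant_id=pids[name], session_id=session_id)
        child(psa, "associate-time", start_time)
    for name in aors:
        other = "Bob" if name == "Alice" else "Alice"
        pstream = child(rec, "participantstreamassoc",
                        participant_id=pids[name])
        for stream_id in sent[name]:
            child(pstream, "send", stream_id)
        for stream_id in sent[other]:
            child(pstream, "recv", stream_id)
    return rec


snapshot = basic_call_snapshot("2010-12-16T23:41:07Z")
xml_text = ET.tostring(snapshot, encoding="unicode")
```

Each participant's 'send' list is the other participant's 'recv' list, which is how the sketch expresses that the two parties exchange audio and video without mixing.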
3.2.2. Example 2: Hold/Resume
A call between two participants, Alice and Bob, is established and an RS is created for recording, as in example 1. Bob puts Alice on hold and then resumes as part of the same CS. The 'send' and 'recv' XML elements of a 'participantstreamassoc' XML element are used to indicate whether or not a participant is contributing to a media stream. The SRC sends a snapshot with only the changed XML elements.
In the above snippet, Alice with participant_id srfBElmCRp2QB23b7Mpk0w== only receives media streams and does not send any media. The same is indicated by the absence of a 'send' XML element. On the other hand, Bob (participant_id zSfPoSvdSDCmU3A3TRDxAw==) would be sending media, but he does not receive any media from Alice; therefore, the 'recv' XML element is absent in this instance.
During resume
The snapshot now has 'send' and 'recv' XML elements for both Alice and Bob, indicating that both are receiving and sending media.
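The hold and resume cases can be sketched as two partial snapshots that differ only in which 'send' and 'recv' elements are present. The participant IDs below are the ones quoted in the text above; the function name, the helpers, and the stream IDs are placeholders assumed for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)

ALICE = "srfBElmCRp2QB23b7Mpk0w=="  # participant_id quoted in the text
BOB = "zSfPoSvdSDCmU3A3TRDxAw=="    # participant_id quoted in the text


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def stream_assoc_snapshot(assocs):
    """Build a partial snapshot from {participant_id: {"send": [...], "recv": [...]}}."""
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "partial")
    for participant_id, directions in assocs.items():
        assoc = child(rec, "participantstreamassoc",
                      participant_id=participant_id)
        for direction in ("send", "recv"):
            for stream_id in directions.get(direction, []):
                child(assoc, direction, stream_id)
    return rec


# Placeholder stream IDs; real ones would be base64-encoded UUIDs.
alice_streams = ["alice-audio-id", "alice-video-id"]
bob_streams = ["bob-audio-id", "bob-video-id"]

# During hold: Alice only receives, Bob only sends.
hold = stream_assoc_snapshot({
    ALICE: {"recv": bob_streams},
    BOB: {"send": bob_streams},
})

# During resume: both participants send and receive again.
resume = stream_assoc_snapshot({
    ALICE: {"send": alice_streams, "recv": bob_streams},
    BOB: {"send": bob_streams, "recv": alice_streams},
})
```

Because the datamode is "partial", only the 'participantstreamassoc' elements that changed need to be carried; the rest of the metadata from the initial snapshot is left untouched.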
3.2.3. Example 3: Call Transfer (RE-INVITE and REFER Based)
A basic call between two participants, Alice and Bob, is connected, and SRC (a B2BUA acting as SRC as per Section 3.1.1 of [RFC7245]) has sent a snapshot as described in example 1. Transfer is initiated by one of the participants (Alice). After the transfer is completed, the SRC sends a snapshot of the participant changes to the SRS. In this transfer scenario, Alice drops out after transfer is completed, Bob and Carol get connected, and recording of media between Bob and Carol is done by the SRC. There are two flows that can happen here as described below.
Transfer within the same session (e.g., a RE-INVITE-based transfer): Alice drops out and Carol is added to the same session. No change to the session/group element is made. A 'participantsessionassoc' XML element indicating that Alice has disassociated from the CS will be present in the snapshot. A new 'participant' XML element representing Carol, mapped to the same RS SDP stream that was used earlier for Alice's stream, is sent in the snapshot. A new 'sipSessionID' XML element that has Universally Unique Identifier (UUID) tuples and that corresponds to Bob and Carol is sent in the snapshot from the SRC to the SRS. Note that one half of the session ID, that which corresponds to Bob, remains the same.
Metadata snapshot for INVITE-based transfer in CS:
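A partial snapshot with the elements listed above can be sketched as follows. This is an assumption-laden illustration rather than the original snapshot: the function and parameter names, the sample AoR, and the choice of placing Bob's UUID first in the 'sipSessionID' value are hypothetical, and the session_id is reused unchanged because the transfer stays within the same session.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def reinvite_transfer_snapshot(session_id, alice, carol, bob_uuid,
                               carol_uuid, when):
    """Partial snapshot: Alice leaves, Carol takes over Alice's streams.

    alice: {"id", "send", "recv"}; carol: {"id", "aor"} (hypothetical shape).
    """
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "partial")

    # Same session_id as before; only the Session-ID UUID pair changes.
    session = child(rec, "session", session_id=session_id)
    child(session, "sipSessionID", f"{bob_uuid};remote={carol_uuid}")

    participant = child(rec, "participant", participant_id=carol["id"])
    name_id = child(participant, "nameID", aor=carol["aor"])
    child(name_id, "name", "Carol")

    left = child(rec, "participantsessionassoc",
                 participant_id=alice["id"], session_id=session_id)
    child(left, "disassociate-time", when)
    joined = child(rec, "participantsessionassoc",
                   participant_id=carol["id"], session_id=session_id)
    child(joined, "associate-time", when)

    # Carol is mapped to the RS streams previously used for Alice.
    assoc = child(rec, "participantstreamassoc", participant_id=carol["id"])
    for stream_id in alice["send"]:
        child(assoc, "send", stream_id)
    for stream_id in alice["recv"]:
        child(assoc, "recv", stream_id)
    return rec
```

Reusing Alice's stream IDs for Carol means the RS SDP does not need to change when the transfer completes.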
Transfer with a new session (e.g., REFER-based transfer): in this case, a new session (CS) is created and shall be part of the same CS-group (done by the SRC).
The SRC first sends an *optional* snapshot indicating disassociation of the participant from the old CS. An SRC may choose to just send an INVITE with a new 'session' XML element to implicitly indicate that the participants are now part of a different CS without sending disassociation from the old CS. In this example, the SRC uses the same RS. In case the SRC wishes to use a new RS, it will tear down the current RS using normal SIP procedures (BYE) with metadata, as in example 4.
In the above snapshot, the 'participantsessionassoc' XML element is optional, because a 'session' XML element carrying a 'stop-time' XML element implicitly means that all the participants associated with that session have been disassociated.
The SRC sends another snapshot to indicate the participant change (due to REFER) and new session information after transfer. In this example, it is assumed that the SRC uses the same RS to continue recording the call. The 'sipSessionID' XML element in the metadata snapshot now indicates Bob and Carol in the (local, remote) UUID pair.
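The two-step REFER-based flow can be sketched as a pair of partial snapshots: one that stops the old CS and one that announces the new CS in the same CS-group. The function and parameter names are assumptions for this illustration, and the participant and stream elements for Carol are omitted to keep the sketch short.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def refer_transfer_snapshots(group_id, old_session_id, new_session_id,
                             sip_session_id, when):
    """Return (stop_old, start_new) partial snapshots."""
    # Optional first snapshot: a stop-time on the old session implies
    # that its participants have been disassociated.
    stop_old = ET.Element(q("recording"))
    child(stop_old, "datamode", "partial")
    old = child(stop_old, "session", session_id=old_session_id)
    child(old, "stop-time", when)

    # Second snapshot: a new session that refers to the same group.
    start_new = ET.Element(q("recording"))
    child(start_new, "datamode", "partial")
    new = child(start_new, "session", session_id=new_session_id)
    child(new, "sipSessionID", sip_session_id)
    child(new, "group-ref", group_id)
    child(new, "start-time", when)
    assoc = child(start_new, "sessionrecordingassoc",
                  session_id=new_session_id)
    child(assoc, "associate-time", when)
    return stop_old, start_new
```

An SRC that skips the optional first snapshot would send only the second element of the returned pair.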
3.3. Call Scenarios with SRC Recording Streams by Mixing
This section describes a few example call flows where the SRC may be part of the conference model, either as the focus or as a participant in the conference, as explained in Section 3.1.5 of [RFC7245]. The SRS here can be a SIP User Agent (UA) or an entity that is part of the MEDIACTRL architecture. Note that the disconnect case is not shown since the metadata snapshot will be the same as for a non-mixing case.
3.3.1. Example 1: Basic Call with SRC Mixing Streams
This example shows a basic call between two participants, Alice and Bob, who are part of one CS. In this use case, each participant calls into a conference server (say, a Multipoint Control Unit (MCU)) to attend one of many conferences hosted on or managed by that server. Media streams sent by each participant are received by all the other participants in the conference. Below is the initial snapshot sent by the SRC in the INVITE to the SRS that has the complete metadata. For the sake of simplicity, only snippets of SIP/SDP are shown. In this example, the SRC records the streams of each participant to the SRS by mixing. The SRC here is part of the conference model described in Section 3 of [RFC7245]; it acts as the focus and does the mixing. The SRC is not a participant itself and hence does not contribute to media.
Metadata snapshot with the SRC mixing streams to the SRS:
In the above example, there are two participants, Alice and Bob, in the conference. Among other things, the SRC sends Session-ID in the metadata snapshot. There are two Session-IDs here: one that corresponds to the SIP session between Alice and the Conference focus and the other for the SIP session between Bob and the Conference focus. In this use case, since Alice and Bob call into the conference, these Session-IDs are different.
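One way to picture the mixed case is a single RS stream shared by all participants, with one 'sipSessionID' per conference leg. The sketch below is an assumption about that shape, not a copy of the original snapshot: the function name, the "legs" tuple layout, and the choice to list the mixed stream under both 'send' and 'recv' for each participant are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def mixed_snapshot(session_id, mixed_stream_id, label, legs, start_time):
    """Complete snapshot for a conference recorded as one mixed stream.

    legs: list of (participant_id, aor, sip_session_id), one per SIP
    session between a participant and the conference focus.
    """
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "complete")
    session = child(rec, "session", session_id=session_id)
    for _, _, sip_session_id in legs:
        # One Session-ID per participant-to-focus SIP session.
        child(session, "sipSessionID", sip_session_id)
    child(session, "start-time", start_time)

    for participant_id, aor, _ in legs:
        participant = child(rec, "participant",
                            participant_id=participant_id)
        child(participant, "nameID", aor=aor)

    # A single RS stream carries the mix of all participants' media.
    stream = child(rec, "stream", stream_id=mixed_stream_id,
                   session_id=session_id)
    child(stream, "label", label)

    for participant_id, _, _ in legs:
        assoc = child(rec, "participantstreamassoc",
                      participant_id=participant_id)
        child(assoc, "send", mixed_stream_id)
        child(assoc, "recv", mixed_stream_id)
    return rec
```

Compared with the unmixed case, the SRS cannot separate the participants by stream here, so the metadata is the only record of who contributed to the mix.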
3.3.2. Example 2: Hold/Resume with SRC Recording by Mixing Streams
This is the continuation of example 1 (basic call with SRC mixing streams). A call between two participants, Alice and Bob, is established and an RS is created for recording, as in example 1. One of the participants, Bob, puts Alice on hold and then resumes as part of the same CS. The 'send' and 'recv' XML elements of a 'participantstreamassoc' XML element are used to indicate whether or not a participant is contributing to a media stream. The metadata snapshot is represented below:
During hold
Metadata snapshot when a CS participant goes on hold and the SRC is mixing the streams:
3.3.3. Example 3: Metadata Snapshot of Joining/Dropping of a Participant to a Session
In a conference model, participants can join and drop a session at any time during the session. Below is a snapshot sent from the SRC to the SRS in this case. Note that the SRC here can be a focus or a participant in the conference. In the case where the SRC is a participant, it may learn the information required for metadata by subscribing to a conference event package [RFC4575]. Assume Alice and Bob were in the conference and a third participant (Carol) joins; the SRC then sends the snapshot below, indicating the new participant.
Metadata snapshot for a new participant joining CS:
3.3.4. Example 4: Call Disconnect
When a CS is disconnected, the SRC sends a BYE with a snapshot of metadata having a session stop time and participant disassociation times. The snapshot looks the same as listed in Section 3.2.4.
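A minimal sketch of such a disconnect snapshot, assuming a partial datamode and hypothetical function and parameter names, could look like this:

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def disconnect_snapshot(session_id, participant_ids, stop_time):
    """Partial snapshot carrying the stop and disassociation times."""
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "partial")
    session = child(rec, "session", session_id=session_id)
    child(session, "stop-time", stop_time)
    for participant_id in participant_ids:
        assoc = child(rec, "participantsessionassoc",
                      participant_id=participant_id, session_id=session_id)
        child(assoc, "disassociate-time", stop_time)
    return rec
```

The sketch uses the session stop time as every participant's disassociation time; an SRC that tracks when each party actually left could supply per-participant times instead.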
3.4. Call Scenarios with Persistent RS between SRC and SRS
This section shows the snapshots of metadata for the cases where a persistent RS exists between the SRC and the SRS. An SRC here may be a SIP UA or a B2BUA, or it may be part of a conference model as either the focus or a participant in a conference. The SRS here could be a SIP UA or an entity that is part of the MEDIACTRL architecture. Except in the disconnect case, the snapshot remains the same as described in the previous sections.
3.4.1. Example 1: Metadata Snapshot during CS Disconnect with Persistent RS between SRC and SRS
Metadata snapshot for a CS disconnect with a persistent RS:
3.5. Turret-Case: Multiple CS into Single RS with Mixed Stream
In trading-floor environments, in order to minimize storage and recording system resources, it may be preferable to mix multiple concurrent calls (each call is one CS) on different handsets/speakers on the same turret into a single RS. This would mean that media in each CS is mixed and recorded as part of a single media stream, and multiple such CSs are recorded in one RS from an SRC to an SRS.
Taking an example where there are two CSs (CS1 and CS2): assume mixing is done in each of these CSs and both CSs are recorded as part of a single RS from a single SRC, which is part of both CSs. There are three possibilities here:
o CS1 and CS2 use the same focus for mixing, and that focus is also acting as SRC in each of the CSs.
o In one CS (e.g., CS1), the SRC is the focus, and in the other CS (e.g., CS2), the SRC is just one of the participants in the conference.
o In both CS1 and CS2, the SRC is just a participant in the conference.
The following example shows the first possibility where CS1 and CS2 use the same focus for mixing, and that focus is also acting as SRC in each of the CSs.
Metadata snapshot with two CSs recorded as part of the same RS:
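The general shape of a snapshot that describes two CSs inside one RS can be sketched as below. This is an illustration under assumptions rather than the original snapshot: the function name and the layout of the "calls" input are hypothetical, and the sketch models one mixed stream per CS, each tagged with the session_id of the CS it belongs to.

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:recording:1"  # metadata namespace, RFC 7865
ET.register_namespace("", NS)


def q(tag):
    """Qualify a tag name with the recording-metadata namespace."""
    return f"{{{NS}}}{tag}"


def child(parent, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text."""
    el = ET.SubElement(parent, q(tag), attrs)
    el.text = text
    return el


def turret_snapshot(calls, start_time):
    """Complete snapshot for several CSs recorded in a single RS.

    calls: list of dicts with keys "session_id", "stream_id", "label",
    and "participant_ids" (a hypothetical input layout).
    """
    rec = ET.Element(q("recording"))
    child(rec, "datamode", "complete")
    for call in calls:
        session = child(rec, "session", session_id=call["session_id"])
        child(session, "start-time", start_time)

    # A participant present in more than one CS is listed only once.
    seen = []
    for call in calls:
        for participant_id in call["participant_ids"]:
            if participant_id not in seen:
                seen.append(participant_id)
                child(rec, "participant", participant_id=participant_id)

    for call in calls:
        stream = child(rec, "stream", stream_id=call["stream_id"],
                       session_id=call["session_id"])
        child(stream, "label", call["label"])
    for call in calls:
        assoc = child(rec, "sessionrecordingassoc",
                      session_id=call["session_id"])
        child(assoc, "associate-time", start_time)
    for call in calls:
        for participant_id in call["participant_ids"]:
            psa = child(rec, "participantsessionassoc",
                        participant_id=participant_id,
                        session_id=call["session_id"])
            child(psa, "associate-time", start_time)
    return rec
```

A turret user who is on both calls appears as one 'participant' element with two 'participantsessionassoc' elements, one per CS.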
4. Security Considerations
Security and privacy considerations mentioned in [RFC7865] and [RFC7866] have to be followed by the SRC and the SRS for setting up RS SIP dialogs and sending metadata.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, DOI 10.17487/RFC3261, June 2002, <http://www.rfc-editor.org/info/rfc3261>.
Acknowledgements
Thanks to Ofir Rath, Charles Eckel, Yaron Pdut, Dmitry Andreyev, and Charles Armitage for their review comments.
Thanks to Alissa Cooper, Stephen Farrell, Kathleen Moriarty, Suresh Krishnan, Benoit Claise, Carlos Pignataro, Dan Romascanu, and Derek Atkins for their feedback and comments during IESG reviews.
Authors' Addresses
Ram Mohan Ravindranath Cisco Systems, Inc. Cessna Business Park, Kadabeesanahalli Village, Varthur Hobli, Sarjapur-Marathahalli Outer Ring Road Bangalore, Karnataka 560103 India
Email: rmohanr@cisco.com
Parthasarathi Ravindran Nokia Networks Bangalore, Karnataka India
Email: partha@parthasarathi.co.in
Paul Kyzivat Huawei Hudson, MA United States of America