Subject: Multimedia Communication and Content Security (MZKO), Department of Telecommunications, Faculty of Electrical Engineering and Computer Science, VSB-TUO.

Name: Bc. Kryštof Šara (SAR0130)

Task syllabus:

  • master key exchange (symmetric cryptography)
  • SRTP-SDES exchange, SIP VoIP, SDP session descriptor, RTP stream description (codecs, media type, ports, SRTP master key in base64) in SIP signalling
  • key distribution problem (MitM-prone)
  • ZRTP and Diffie-Hellman (DH) algorithm (MitM-prone, and the problem of DH implementation in old HW)
  • SRTP-DTLS session, WebRTC over DTLS channel, media encryption
  • simulation

introduction

In a world with a continuous need for communication (preferably in real time), it is vital for media stream transport to be reliable (uninterrupted), secure (end-to-end encrypted), and fast (UDP/IP, low jitter and RTT).

Internet telephony (also Voice over Internet Protocol, VoIP) mainly provides a connection between the Plain Old Telephone Service (POTS), represented for example by a plain analogue telephone, and a generic computer network built on the Internet Protocol (IP). This mutual integration is usually called a converged network. [1]

symmetric vs. asymmetric encryption

The main purpose of encryption is to ensure the confidentiality of data. Encryption can also be used for authentication. [7] [8]

General types of encryption: [7] [8]

  • symmetric, using a shared secret key: DES (3DES), AES, Kerberos
  • asymmetric, using a private-public key pair: Diffie-Hellman, RSA, digital signatures and certificates

Asymmetric cryptography builds on a so-called trapdoor one-way function: it is easy to compute in one direction, but can be inverted efficiently only with a secret key. [8]

Another key difference is the computational speed. While symmetric encryption is very fast, asymmetric encryption is considerably slower because of the more complex algorithms involved. This can be worked around by encrypting the exchanged data symmetrically, while the asymmetric scheme is used only to encrypt and distribute the symmetric key securely. [7] [8]

the VoIP protocol stack

To ensure reliable transport, VoIP systems implement a set of protocols. Probably the most widely used VoIP signalling protocol is the Session Initiation Protocol (SIP). It defines the interface both sides implement to create a media transport session. [1]

Once the session is initiated, the media stream can be set up as well. Under SIP, the stream itself is usually carried by the Real-time Transport Protocol (RTP). The encrypted version of RTP is called SRTP (Secure RTP). [1]

To negotiate the session’s parameters, SIP uses the Session Description Protocol (SDP), which is carried in the first message sent by one counterpart (INVITE with an SDP body). [1]

summary

protocol abbreviation    protocol full name                        protocol type
SIP                      Session Initiation Protocol               signalling protocol
SDP                      Session Description Protocol              media session management protocol
(S)RTP                   (Secure) Real-time Transport Protocol     real-time media transportation protocol

WebRTC

Web Real-Time Communication (WebRTC) is a technology that lets web applications capture and stream multimedia (audio, video) content; arbitrary binary data can be exchanged between peers as well. The main advantage is that peers don’t need any plug-ins or software besides a supported web browser. The connection between peers can often be direct, which means that no intermediary servers are needed. WebRTC also supports interoperability with PSTN networks. [13]

RTP stream security

As RTP streams often carry sensitive media data such as business conference calls, it is very important to ensure the security of such streams. For this purpose, the Secure RTP (SRTP) protocol was introduced.

SRTP, SDES and SDP

SRTP is to RTP what HTTPS is to HTTP: RTP streams can be encrypted using SRTP. Enabling encryption is, however, not always as easy as intended, because the encryption techniques and protocols in use (and their combinations) may not be supported on remote devices. Encryption has to be enabled on both sides to allow session initiation. [14]

The encryption keys can be exchanged via various channels and using multiple technologies. The original method was SDES (Session Description Protocol Security Descriptions) through the signalling channel, for example SIP. Those keys, however, can easily be captured on the SIP proxy and used to decrypt the SRTP stream. The streams can also be tampered with in a man-in-the-middle (MitM) attack: a stream could be decrypted, recorded, changed and retransmitted to the original peer. [14] [15]

Fig. 1: Example of SDP media attributes defined by SDES, including the crypto part carrying the master key. A Wireshark listing of a SIP/SDP packet message body.

The Master Key Identifier (MKI) is an optional SRTP field that identifies the master key. The master key is used to derive the symmetric session keys that encrypt the VoIP media data. The keying material is negotiated at the beginning of the call and is often included in the SDP packet body. [11]

The main problem with this SDP-based exchange is that the master key is transported in plaintext, meaning it is prone to a MitM attack: it can easily be sniffed when the SIP session is initiated with the INVITE packet. [11]
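To illustrate how little effort such sniffing takes, here is a minimal Python sketch that scans an SDP body for a=crypto attributes. The SDP text and the function name find_crypto_lines are hypothetical and made up for illustration; only the crypto line reuses the value from the capture shown later in this document.

```python
import re

# Hypothetical SDP body as it might appear inside a SIP INVITE or 200 OK.
SDP_BODY = """v=0
o=- 0 0 IN IP4 10.4.5.131
s=call
c=IN IP4 10.4.5.131
t=0 0
m=audio 50806 RTP/SAVP 0 8
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu
a=rtpmap:0 PCMU/8000
"""

# a=crypto:<tag> <crypto-suite> inline:<base64 key and salt>
CRYPTO_RE = re.compile(r"^a=crypto:(\d+)\s+(\S+)\s+inline:([A-Za-z0-9+/=]+)", re.M)


def find_crypto_lines(sdp: str):
    """Return (tag, suite, base64 key/salt) for every SDES crypto attribute."""
    return [(int(tag), suite, key) for tag, suite, key in CRYPTO_RE.findall(sdp)]


for tag, suite, key in find_crypto_lines(SDP_BODY):
    print(f"tag={tag} suite={suite} key/salt={key}")
```

Anyone who can read the signalling (a proxy, or an attacker on the path when SIP is not protected by TLS) can do the same with a few lines of code.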

ZRTP (Zimmermann RTP) and SAS

ZRTP is an enhancement of SRTP. Its main advantage is that the master key is agreed on using the Diffie-Hellman (DH) mechanism. After the master key has been exchanged successfully, the session switches back to SRTP. Until the exchange is completed, a client with ZRTP enabled (and encryption enabled implicitly) usually informs the user that the call is not encrypted yet. [11] [14]
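The idea behind the DH exchange can be sketched in a few lines of Python. This is a toy illustration only, not the ZRTP handshake: the 64-bit prime P, the generator G and the helper names are assumptions chosen for readability, whereas real implementations use standardised groups of at least 2048 bits or elliptic curves.

```python
import secrets

# Toy parameters for illustration only (assumed, not taken from ZRTP).
P = 0xFFFFFFFFFFFFFFC5  # 2**64 - 59, a 64-bit prime
G = 5


def make_keypair():
    """Pick a random private exponent and compute the public value G^x mod P."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our private exponent with the peer's public value."""
    return pow(peer_public, own_private, P)


alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()

# Only alice_pub and bob_pub cross the network; the private values never do.
alice_secret = shared_secret(alice_priv, bob_pub)
bob_secret = shared_secret(bob_priv, alice_pub)
print(alice_secret == bob_secret)
```

Both sides arrive at the same value without ever transmitting it, which is what removes the plaintext master key from the signalling channel.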

Besides the DH exchange, the Short Authentication String (SAS) is presented to both sides as another layer of call integrity. Both sides can read the SAS aloud to verify that the call is encrypted and secured. This technique, however, relies on recognising the speaker’s voice. If the other peer is a complete stranger, we have no certainty of the call’s authenticity, that is, that the person speaking is not an attacker. [14]
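A simplified Python sketch of why comparing a short string detects an attacker in the middle follows. The function short_auth_string and the secret values are hypothetical; real ZRTP derives and renders the SAS through its own key derivation and encoding, which this sketch replaces with a plain SHA-256 hash.

```python
import hashlib


def short_auth_string(shared_secret: int, digits: int = 4) -> str:
    """Hypothetical SAS: the first hex digits of SHA-256 over the shared secret."""
    length = (shared_secret.bit_length() + 7) // 8 or 1
    data = shared_secret.to_bytes(length, "big")
    return hashlib.sha256(data).hexdigest()[:digits]


# No attacker: both peers hold the same secret, so the strings match.
print(short_auth_string(123456789) == short_auth_string(123456789))

# MitM: the attacker negotiates a different secret with each peer, so the
# strings the two callers read out almost certainly differ.
secret_alice_attacker = 1111
secret_attacker_bob = 2222
print(short_auth_string(secret_alice_attacker), short_auth_string(secret_attacker_bob))
```

The check only helps if the callers actually compare the strings and can trust the voice reading them, which is the limitation described above.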

Another problem of ZRTP is its support across VoIP applications and hardware. Most importantly, ZRTP is currently not supported in WebRTC, or its support is very limited. [14]

SRTP-DTLS

Another security alternative for SRTP is SRTP-DTLS. Its security relies on the Datagram Transport Layer Security (DTLS) protocol, which is used to secure UDP traffic. DTLS is based on the stream-oriented TLS protocol. This technique is widely used by web browsers and by WebRTC calls. [12] [14]

SIP proxy

A Session Initiation Protocol proxy is a special type of software that allows SIP-based VoIP call packets to traverse various network firewalls. The SIP proxy can also provide address translation in order to direct calls to the VoIP call peers. It supports authentication, authorization, accounting (AAA), and encryption. [10]

Kamailio

Fig. 2: Official Kamailio logo.

Kamailio is a free and open-source SIP server. It can handle up to thousands of call setups per second. Kamailio can be used to build WebRTC conference applications, presence detection systems and instant messaging applications. [9]

As far as the protocol stack is concerned, Kamailio can run on TCP and UDP, securely using TLS for VoIP, and it can use WebSockets for WebRTC. Both IPv4 and IPv6 are supported at the network layer. Moreover, it has built-in support for various backend systems like MySQL, Postgres, LDAP, Redis, MongoDB or SNMP. [9]

demonstration

To show the master-key delivery vulnerability of SRTP with SDES, a simple VoIP call is captured and analysed in this section. SIP is used as the signalling protocol, with a Kamailio SIP server acting as the SIP proxy. Two clients are registered to the server so that they can place and receive a call. The call itself is captured on an egress interface of one client. Finally, the captured packets are filtered, and the filtered UDP stream is analysed and possibly decrypted.

Demonstration syllabus:

  • SIP proxy (Kamailio), signalling processing, direct RTP stream
  • Kamailio + MySQL for SIP phone registration
  • capture SDP packets/stream and get SRTP master key
  • capture SRTP stream
  • insert SRTP master key into decryptor(s)
  • decipher SRTP stream and play the media
  • include scripts listings

used software and hardware

software

  • Kamailio v5.7.1 SIP (proxy) server
  • MariaDB v11.1.2 RMDB
  • Docker engine v24.0.5
  • Fedora 38
  • Raspbian GNU/Linux 12 (bookworm)
  • iOS 17.1.1
  • Wireshark v4.0.8
  • tcpdump
  • Jami Qt 6.4.2 for Fedora 38
  • Jami for iPhone v3.52

hardware

  • Raspberry Pi 4B 8 GB RAM
  • iPhone SE (2nd edition)

kamailio configuration

To run the Kamailio SIP server, we are going to use Docker engine. Although the Alpine image is much smaller than the Debian-based one, it lacks support for the MySQL client and for TLS (and these cannot be installed with the apk package manager). Therefore we are going to use the xenial Docker image.

Kamailio Official Docker Image

docker pull ghcr.io/kamailio/kamailio:5.7.1-xenial

For an easier setup, I am going to introduce a functional docker-compose file, which includes the services, networks, and volumes settings all together. The compose file was written from scratch.

The kamailio service has to wait for the database engine to start and initialize properly, so a healthcheck is implemented for the mariadb service, and the kamailio service starts only after the database container reports the “healthy” condition. [5][6]

kamailio-compose.yml:

version: '3.9'

networks:
  kamailio-net:
    name: kamailio-net

volumes:
  mariadb-data:
    
services:
  kamailio:
    image: ghcr.io/kamailio/kamailio:5.7.1-xenial
    #image: ghcr.io/kamailio/kamailio-ci
    container_name: kamailio-mzko
    volumes:
      - "./kamailio:/etc/kamailio"
      - "./kamailio/kamailio.default:/etc/default/kamailio"
    mem_limit: 64M
    mem_reservation: 8M
    depends_on:
      # depends_on refers to the service name (mariadb), not the container name
      mariadb:
        condition: service_healthy
    networks:
      - kamailio-net
    ports:
      - target: 5060
        published: 5060
        host_ip: 10.4.5.131
        mode: host
        protocol: udp

      - target: 5060
        published: 5060
        host_ip: 10.4.5.131
        mode: host
        protocol: tcp

      - target: 5061
        published: 5061
        host_ip: 10.4.5.131
        mode: host
        protocol: tcp

  mariadb:
    image: mariadb:11.1.2
    container_name: mariadb-mzko
    hostname: mariadb-mzko
    volumes:
      - "mariadb-data:/var/lib/mysql"
    networks:
      - kamailio-net
    environment:
      MYSQL_ROOT_PASSWORD: mzko
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      timeout: 10s
      retries: 5

The default config file directory structure can be obtained using this recommended procedure [2]:

docker create --name kamailio kamailio/kamailio-ci
docker cp kamailio:/etc/kamailio ./kamailio
docker rm kamailio

We want to use MySQL (MariaDB) as a database engine, so we have to tweak the Kamailio configuration a bit. [3]

/etc/kamailio/kamailio.cfg [3]:

#!define WITH_DEBUG
#!define WITH_MYSQL
#!define WITH_AUTH
#!define WITH_TLS

listen=udp:10.4.5.137:5060
listen=tls:10.4.5.137:5061

modparam("tls", "config", "/etc/kamailio/tls.cfg")
modparam("tls", "tls_force_run", 11)

Disable certificate checking for both server and client in /etc/kamailio/tls.cfg [3]:

[server:default]
method = TLSv1.2+
verify_certificate = no
require_certificate = no
private_key = /etc/kamailio/kamailio-key.pem
certificate = /etc/kamailio/kamailio-cert.pem

[client:default]
verify_certificate = no
require_certificate = no

/etc/kamailio/kamctlrc [3]:

SIP_DOMAIN=IP_address_of_the_Kamailio
DBENGINE=MYSQL
DBHOST=localhost
DBNAME=kamailio
DBRWUSER=kamailio
DBRWPW="kamailiorw"
DBROUSER=kamailioro
DBROPW=kamailioro
DBROOTUSER="root"
VERBOSE=1
PID_FILE=/var/run/kamailio.pid

Generate self-signed certificate and key using the openssl library [4]:

openssl req -x509 \
    -newkey rsa:4096 -keyout kamailio/kamailio-key.pem \
    -out kamailio/kamailio-cert.pem \
    -sha256 -days 3650 -nodes -subj \
    "/C=CZ/ST=CzechRepublic/L=Brno/O=savla.dev/OU=telecom/CN=kamailio.savla.net"

database configuration

For the kamailio container to start, the database has to be configured manually — at least one has to create user kamailio identified by password and a kamailio database. We can start only the mariadb container at first, as the kamailio container would fail and stop anyway.

docker compose --file kamailio-compose.yml up mariadb --detach

Then we can use the mariadb client within the server’s container and log in using the root credentials (the password is specified by the environment variable in the docker-compose YAML file).

docker exec -it mariadb-mzko mariadb -u root -p

create database kamailio;

create user 'kamailio'@'%' identified by 'kamailiomzko';
create user 'kamailioro'@'%' identified by 'kamailioromzko';

grant all on kamailio.* to 'kamailio'@'%';
grant SELECT on kamailio.* to 'kamailioro'@'%';

Run the docker compose stack using:

docker compose --file kamailio-compose.yml up --detach --force-recreate

Then we can reinitialize our database and its tables using:

docker exec -it kamailio-mzko kamdbctl reinit

SIP clients registration

At first, we need to create SIP accounts using the kamctl command against the docker container:

docker exec -i kamailio-mzko kamctl add 1000 1234mzko
docker exec -i kamailio-mzko kamctl add 2000 1234mzko

We can then examine an existing account using kamctl show <SIP account number>:

docker exec -i kamailio-mzko kamctl show 2000

which prints:

database engine 'MYSQL' loaded
Control engine 'RPCFIFO' loaded
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
      id: 2
username: 2000
  domain: kamailio.savla.net
password: 1234mzko
     ha1: 818dd4c1344c0c4080692bafebe7ef86
    ha1b: f3e321820e79ad7c5db3de18703d6a8a
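The ha1 and ha1b columns in the listing above are precomputed digest-authentication hashes. As a minimal sketch, assuming the usual digest definition HA1 = MD5(username:realm:password) and Kamailio's ha1b variant that uses username@domain as the user part, they can be recomputed in Python. The function names are illustrative, and the assumption that the realm equals the domain kamailio.savla.net is mine, so the printed values are not guaranteed to match the table.

```python
import hashlib


def ha1(username: str, realm: str, password: str) -> str:
    """HA1 as used by HTTP/SIP digest authentication: MD5(user:realm:password)."""
    return hashlib.md5(f"{username}:{realm}:{password}".encode()).hexdigest()


def ha1b(username: str, domain: str, realm: str, password: str) -> str:
    """ha1b variant where the user part is 'username@domain' (assumed layout)."""
    return hashlib.md5(f"{username}@{domain}:{realm}:{password}".encode()).hexdigest()


# Assumption: the digest realm is the same as the SIP domain.
print(ha1("2000", "kamailio.savla.net", "1234mzko"))
print(ha1b("2000", "kamailio.savla.net", "kamailio.savla.net", "1234mzko"))
```

Note that the table also stores the password itself in plaintext, so anyone with read access to the database does not even need the hashes.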

Then, we can add created accounts into Jami SIP client for desktop (account 1000), and for iOS (account 2000).

Fig. 3: SIP account registration in Jami desktop application.

Fig. 4: SIP account registration in Jami iOS mobile application.

capturing the secured call

To be able to access all network interfaces, we need to start Wireshark with root’s privileges:

sudo wireshark

Now, execute the call from one SIP account to another in Jami application.

Fig. 5: A call initiation from Jami iOS mobile application.

After the call has ended, we can stop the packet capture in Wireshark. Now we can start extracting the captured information: the SDES master key/salt pair from the SIP/SDP packets, then the UDP VoIP stream to be extracted as an RTP stream, which is finally decoded/decrypted and loaded back into Wireshark.

extracting the information and single stream

To get the key/salt pair, simply filter out SIP/SDP packets in Wireshark:

sdp

Fig. 6: Filtering out SDP packets and exploring the media attributes to find the SDES key/salt crypto pair.

Now we can click on the crypto media attribute and choose Copy » Value to get base64-encoded key/salt pair. Note that the source there is 10.4.5.131 (receiver) and the SIP response is 200 OK. Also, we will extract hexadecimal value of the key/salt pair (Key and Salt field, Copy » …as a Hex stream). [16]

crypto:1 AES_CM_128_HMAC_SHA1_80 inline:S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu

# base64 key/salt pair
S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu

# key/salt pair as a hexadecimal
533447635856594a6b4f58676b68734d7345493047794e6161677053647855737341435631377a75
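As a cross-check of the two representations, the following minimal Python sketch decodes the inline value. It assumes the layout defined for AES_CM_128_HMAC_SHA1_80 in SDES, where the base64 string encodes a 16-byte master key followed by a 14-byte master salt; the variable names are illustrative.

```python
import base64

SDES_INLINE = "S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu"

# Decode the base64 text into the raw keying material.
raw = base64.b64decode(SDES_INLINE)
master_key, master_salt = raw[:16], raw[16:]

print(len(raw), len(master_key), len(master_salt))
print("key/salt bytes as hex:", raw.hex())

# For comparison: hex encoding of the base64 characters themselves.
print("base64 text as hex:   ", SDES_INLINE.encode("ascii").hex())
```

The last line reproduces the hexadecimal string listed above, which therefore appears to encode the 40 base64 characters rather than the 30 decoded bytes. A tool that expects the raw key/salt in hexadecimal form would presumably need the decoded variant printed on the second line.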

Next we need to find the raw UDP stream (heuristically, by typing udp into the filter bar and looking at the source/destination ports). Note that we need to choose the same source IP address. Also, we don’t want any ICMP or STUN packets mixed into our UDP stream.

udp.srcport == 50806 && !icmp && !stun && ip.src == 10.4.5.131

The next step is to decode the UDP stream as an RTP stream: click on any UDP packet of the stream, choose Decode As…, double-click the Current column in the UDP row, and choose RTP from the options.

Fig. 7: Decoding the UDP stream into an RTP stream.
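What Wireshark does when a UDP payload is decoded as RTP can be sketched by parsing the 12-byte fixed RTP header, which SRTP leaves unencrypted (only the payload is protected). The function parse_rtp_header and the sample packet are hypothetical and serve only as an illustration.

```python
import struct


def parse_rtp_header(payload: bytes) -> dict:
    """Parse the 12-byte fixed RTP header found at the start of a UDP payload."""
    if len(payload) < 12:
        raise ValueError("an RTP packet has at least a 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", payload[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }


# Hypothetical packet: version 2, payload type 0 (PCMU), sequence 1, timestamp 160.
sample = (
    bytes([0x80, 0x00, 0x00, 0x01])
    + (160).to_bytes(4, "big")
    + (0x1234ABCD).to_bytes(4, "big")
    + b"\x00" * 160
)
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence"])
```

A version field of 2 together with steadily increasing sequence numbers and a constant SSRC is a reasonable hint that a UDP stream really carries (S)RTP.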

When we go to Telephony » RTP » RTP Stream Analysis » Graph, we can see how the timing of the RTP stream fluctuates.

Fig. 8: RTP Stream Analysis, millisecond delay in time.
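The kind of timing figure such an analysis reports can be approximated with the interarrival jitter estimator from RFC 3550 (section 6.4.1). The sketch below is a minimal Python version working in milliseconds; the function name, the sample arrival times and the 8 kHz clock rate are assumptions for illustration and are not taken from the capture.

```python
def interarrival_jitter(arrivals_ms, rtp_timestamps, clock_rate=8000):
    """Running jitter estimate in milliseconds, following RFC 3550 section 6.4.1."""
    jitter = 0.0
    for i in range(1, len(arrivals_ms)):
        # Spacing of the packets on arrival vs. spacing when they were sent.
        arrival_delta = arrivals_ms[i] - arrivals_ms[i - 1]
        send_delta = (rtp_timestamps[i] - rtp_timestamps[i - 1]) * 1000.0 / clock_rate
        d = abs(arrival_delta - send_delta)
        # Exponential smoothing with gain 1/16, as in the RFC.
        jitter += (d - jitter) / 16.0
    return jitter


# Hypothetical 20 ms packets (160 samples at 8 kHz).
timestamps = [0, 160, 320, 480]
print(interarrival_jitter([0.0, 20.0, 40.0, 60.0], timestamps))  # evenly paced
print(interarrival_jitter([0.0, 25.0, 40.0, 62.0], timestamps))  # uneven arrivals
```

Since the RTP timestamps are part of the unencrypted header, this kind of analysis works on the SRTP stream even before it is decrypted.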

As we now can analyse the stream partially, we can export specified packets into a pcap file — File » Export Specified Packets… » Save as wireshark/tcpdump pcap file (and name it rtp_single_stream.pcap for example).

decryption using srtp-decrypt

There is a C program, srtp-decrypt, that allows us to decrypt the extracted single (one-way only) RTP stream into a hexadecimal stream (another UDP/RTP stream). [16]

Before compiling srtp-decrypt, we need to install its dependencies:

# for Debian-based systems
sudo apt install libpcap-dev libgcrypt-dev

# or for RHEL-based systems
sudo dnf install libpcap-devel libgcrypt-devel

Then we can clone the repository and build the project:

git clone https://github.com/gteissier/srtp-decrypt
cd srtp-decrypt

# compile the project
make

Execute the decryptor with SDES key by piping the pcap exported packets file into the executable [16] [18]:

srtp-decrypt/srtp-decrypt -k "S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu" \
    < rtp_single_stream.pcap \
    > rtp_single_stream.txt

If you encounter errors like this:

frame x dropped: decoding failed ‘Permission denied’

be sure to provide the correct key/salt pair. Basically it means that the decryption process failed, the program has not been able to decrypt the given packet/frame. [16]

The process itself should be fairly quick. Now we can import the hexadecimal stream from decryptor back into Wireshark to examine it further:

  • File » Import From Hex Dump,
  • Offsets as Hexadecimal,
  • Encapsulation as UDP,
  • ports are not important in this step (we can make them up),
  • click Import.

Now, decode the UDP packet stream back into an RTP stream by clicking on any packet and choosing Decode As… » Current: RTP again. [16]

The stream should now be playable: Telephony » RTP » RTP Player.

decryption using libsrtp’s rtp_decoder

Alternatively, we can use Cisco’s C library called libsrtp, which introduces a program called rtp_decoder. [17]

To fetch the project and build it, simply run this:

git clone https://github.com/cisco/libsrtp
cd libsrtp

# configure and compile the project
./configure
make

# the rtp_decoder binary should now be present in the test directory
cd test
ls -l rtp_decoder

Here, we can use the hexadecimal stream key/salt string we got earlier. Perform the decoding process [17]:

# hexadecimal key/salt pair variant
# (-e 128 matches the 128-bit key of the AES_CM_128_HMAC_SHA1_80 suite)
./rtp_decoder -a -t 10 -e 128 \
    -k "533447635856594a6b4f58676b68734d7345493047794e6161677053647855737341435631377a75" \
    < ../../rtp_single_stream.pcap \
    > ../../rtp_single_stream2.txt

# base64 key/salt pair variant
./rtp_decoder -a -t 10 -e 128 \
    -b "S4GcXVYJkOXgkhsMsEI0GyNaagpSdxUssACV17zu" \
    < ../../rtp_single_stream.pcap \
    > ../../rtp_single_stream3.txt

conclusion

Decoding/decrypting is a fairly complicated process; in particular, it is not easy to extract a single RTP stream from the full pcap capture file. It requires some Wireshark and GNU/Linux knowledge, as well as some experience with fetching, configuring and building software from source. In the end, however, it is a nice exercise in debugging and analysing network communication.

Last but not least, I should note that the decryption process was not fully successful: the final stream was not playable and, for some reason, had a duration of zero seconds.


references

[1] ŠILHAVÝ, Pavel. Telekomunikační a informační systémy. Brno: Vysoké učení technické v Brně, 2014. s. 1-140. ISBN: 978-80-214-5027-1. (CZ)
[2] https://github.com/kamailio/kamailio-ci/pkgs/container/kamailio-ci
[3] https://lms.vsb.cz/pluginfile.php/2062871/mod_resource/content/9/exercise_5_v1.pdf
[4] https://stackoverflow.com/a/10176685
[5] https://stackoverflow.com/a/41854997
[6] https://mariadb.com/kb/en/using-healthcheck-sh/
[7] PUŽMANOVÁ, Rita. TCP/IP v kostce. 2., upr. a rozš. vyd. České Budějovice: Kopp, 2009. ISBN 978-80-7232-388-3. (CZ)
[8] https://lms.vsb.cz/pluginfile.php/2063923/mod_resource/content/4/Opory_pro_p%C5%99edm%C4%9Bty_Kybernetick%C3%A1_bezpe%C4%8Dnost_I_a_II.pdf
[9] https://www.kamailio.org/w/
[10] https://www.pcmag.com/encyclopedia/term/sip-proxy
[11] https://is.muni.cz/el/1433/jaro2015/PV235/33168763/46420026/3_IP_Protokoly_prenosu_dat_2014.pdf
[12] https://techdocs.audiocodes.com/session-border-controller-sbc/mediant-software-sbc/user-manual/version-740/content/um/SRTP%20using%20DTLS%20Protocol.htm
[13] https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API
[14] https://rtcquickstart.org/guide/multi/optimal-connectivity-rtp-encryption.html
[15] https://vocal.com/secure-communication/sdes/
[16] https://www.acritelli.com/blog/hacking-voip-decrypting-sdes-protected-srtp-phone-calls/
[17] https://www.zoiper.com/en/support/home/article/162/How%20to%20decode%20SIP%20over%20TLS%20with%20Wireshark%20and%20Decrypting%20SDES%20Protected%20SRTP%20Stream
[18] https://github.com/alexcme/srtp-decrypt/blob/master/README.md