WebRTC - Bringing Real Time Communications to the Web Natively

Would you believe me if I said that, with just a few lines of JavaScript and HTML5, you could turn the Photo Booth app (available on Mac OS X) into a cool web-based application, overlay real-time audio and video onto your favorite WebGL-based 3D game canvas, or build a plugin-less version of WebEx?

In this blog post, I will take you on a journey into WebRTC, the latest disruptive Web standard. My goal is to give readers some background and then dive a bit deeper into what WebRTC has to offer from both the standards and the application developer perspectives.

Before I jump in, let me introduce Cisco’s WebRTC crew –
Cullen Jennings, Ethan Hugg, Enda Mannion, Suhas Nandakumar (that’s me :)).

Background

The Web is evolving faster than ever before. The last few years have seen tremendous innovation in Web technologies, applications, infrastructure and services. The advent of HTML5 has redefined the way Web applications work by bringing the capabilities and richness of native applications to the Web platform.

HTML5 technologies such as Web Workers, Browser-Native Media, Web Sockets and the like are redefining the roles and capabilities of the browser and the Web, and creating experiences that rival native applications.

Building along similar lines, the WebRTC/RTCWeb standards are joining the HTML5 standards basket. They are concerned with bringing rich, real-time, interactive communications natively to the browser.

Real-time communications applications such as softphones and conferencing tools are not new to the Web. Applications such as WebEx, JabberWeb and Skype already provide ways for people to communicate and collaborate on the Web today.
These applications do come with limitations, however:

A plugin must be installed to get things working.
Plugins bring compatibility challenges on the host platform.
Plugins bring security issues, since the application no longer runs in the sandboxed environment of the browser.
Plugins bring maintenance issues: newer versions of the browser or of the standards might break existing installations.
Applications based on plugins lack flexible access to native browser resources due to privilege restrictions, which in turn limits the innovation possible with these applications.
Proprietary solutions bring interoperability issues.

WebRTC – What is it?

The RTCWeb (owned by the IETF) and WebRTC (owned by the W3C) standards are an evolving proposal to bring “Rich Interactive Secure Peer to Peer Communications” to the Web in a “Plugin-less Fashion”. Together, these standards bodies are responsible for defining the following aspects of enabling real-time communications as an inherent part of the Web infrastructure:

APIs and access rules for end-user devices such as microphones, cameras etc.
End-to-end security architecture and protocol.
NAT traversal techniques for peer connectivity.
Signaling mechanisms for setting up, updating and tearing down the sessions.
Support for different media types.
Media transport requirements.
Quality of Service, congestion control and reliability requirements for the session over the Best-Effort Internet.
Identity architecture and mechanisms for peer identification.
Codecs for audio and video compression.
Last but not least, HTML and JavaScript APIs enabling application developers.

With such a detailed charter, WebRTC/RTCWeb has the potential to change the way people communicate on the Web. Given the tremendous growth in browser usage and the always-available nature of the Web, the combination of “Browser and the Web” could revolutionize real-time communications on the one hand and pose challenges to today’s legacy solutions on the other.

The picture below captures various outcomes from the IETF and W3C standard bodies:

The GetUserMedia API specification defines the requirements for a Web application to access the end user’s media sources, such as the camera and microphone.
The PeerConnection API specifies SDP-based session description APIs and the state machine for session setup, update and tear-down between peers.
The Data Channel API will enable peer-to-peer exchange of arbitrary data, with low latency and high throughput.
Under the hood, the browser is responsible for:

Ensuring end-to-end security for media and data sessions via DTLS.
Performing NAT traversal procedures for connection setup based on Interactive Connectivity Establishment (ICE).
Establishing media transport based on RTP and UDP.
Setting up data-channel transport based on SCTP and UDP.
Enabling feedback reports for the session based on RTCP.
Encoding and decoding audio and video streams.

These requirements may evolve over time till all the aspects of the standards are frozen.
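
The three APIs listed above fit together roughly as follows. This is a minimal sketch, not the code behind any demo: it uses today’s standardized, promise-based names (`navigator.mediaDevices.getUserMedia`, `RTCPeerConnection`), whereas early builds exposed vendor-prefixed variants. `startCall`, `env` and `sendToPeer` are hypothetical names; `sendToPeer` stands in for whatever signaling path the application provides.

```javascript
// Sketch only. `env` is a hypothetical bundle of browser objects:
//   { mediaDevices: navigator.mediaDevices, RTCPeerConnection }
// `sendToPeer` is a hypothetical application-supplied signaling function.
async function startCall(env, sendToPeer) {
  // 1. GetUserMedia: ask the user for camera and microphone access.
  const stream = await env.mediaDevices.getUserMedia({ audio: true, video: true });

  // 2. PeerConnection: the browser handles ICE, DTLS and RTP underneath.
  const pc = new env.RTCPeerConnection();
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }

  // 3. Data Channel: arbitrary application data alongside the media.
  const channel = pc.createDataChannel("chat");

  // Forward each ICE candidate to the other side as it is discovered.
  pc.onicecandidate = (event) => {
    if (event.candidate) sendToPeer({ candidate: event.candidate });
  };

  // Create the SDP offer and hand it to the signaling path.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer({ sdp: pc.localDescription });

  return { pc, channel, stream };
}
```

In a browser this would be called as `startCall({ mediaDevices: navigator.mediaDevices, RTCPeerConnection }, send)`. Passing the browser objects in as `env` is only a convenience that keeps the sketch easy to exercise outside a browser.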

Use-Cases and Architecture Preview

Shifting gears, let me introduce a few sample use-cases that are quite easily achievable with WebRTC.

1. Seamless Conferencing:

This use-case represents a Web conferencing scenario built with lightweight HTML5 components and WebRTC APIs, with no plug-in installation. Such an application can allow plug-and-play of components such as chat, file transfer and screen sharing with a few lines of JavaScript code.

2. Personal Shopper/Instant Customer Care:

This use-case captures a consumer-to-business scenario where a Web application provider such as Amazon might offer a “Click to Call” service to its customers with a few WebRTC APIs. Such a service would turn a mundane search into a rich two-way audio and video interaction with a customer care representative, which could mean higher transaction conversion ratios.

3. Multimedia based Rich 3D Games:

This scenario brings audio, video and data streams into gaming environments with WebRTC, HTML5 and WebGL APIs. Such a combination provides options for combining real-time media with a WebGL canvas in innovative ways.

Architecturally, a WebRTC-based system falls into one of the following broad categories:

Browser <-> Browser
Browser <-> VoIP End Point

The web server can be any application server that provides, at a minimum, the required identity and authentication procedures for the end users.

In either case, secure media flows directly between the peers. In the VoIP scenario, an intermediate gateway is required to handle signaling and any translations made necessary by the limitations of the VoIP endpoint, such as “unable to perform ICE checks” or “no support for secure RTP”. A detailed analysis of the architectural solutions and the potential differences between these systems is outside the scope of this blog.
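
To make the gateway’s job a little more concrete, here is a minimal sketch of how it might decide which translations a VoIP endpoint needs by looking at the endpoint’s SDP. The function name `gatewayTasks` and the task strings are hypothetical; the attributes checked (`a=ice-ufrag`, `a=fingerprint`, `a=crypto`) are the standard SDP markers for ICE, DTLS-SRTP and SDES-SRTP respectively. A real gateway would inspect far more than this.

```javascript
// Hypothetical helper: inspect a VoIP endpoint's SDP and list what a
// gateway would have to do on the endpoint's behalf.
function gatewayTasks(sdp) {
  const lines = sdp.split(/\r?\n/);
  const has = (prefix) => lines.some((line) => line.startsWith(prefix));
  const tasks = [];

  // No ICE credentials offered: the endpoint cannot answer ICE checks.
  if (!has("a=ice-ufrag:")) {
    tasks.push("terminate ICE for the endpoint");
  }

  // No DTLS fingerprint offered: the endpoint cannot do DTLS-SRTP.
  if (!has("a=fingerprint:")) {
    tasks.push(
      has("a=crypto:")
        ? "bridge SDES-SRTP to DTLS-SRTP" // endpoint keys SRTP via SDES
        : "encrypt plain RTP as DTLS-SRTP" // endpoint has no secure RTP
    );
  }
  return tasks;
}
```

For an endpoint whose SDP carries `a=crypto:` lines but no ICE or fingerprint attributes, the sketch returns both the ICE task and the SDES-to-DTLS bridging task. An SDP that already carries `a=ice-ufrag:` and `a=fingerprint:` yields an empty list.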

Cisco’s Involvement with the RTCWeb

Cisco has been actively participating in both standards development and implementation.
With respect to standards participation, Cisco has taken leadership roles in helping shape the requirements from both the IETF and W3C perspectives.

At the IETF, a working group (WG) called RTCWeb is responsible for driving the “on-the-wire” standards. Cullen Jennings from Cisco co-chairs this WG. He is also a co-author of the W3C specification that defines the browser API requirements. Aside from these, Cisco has established a lot of thought leadership across several areas of the standards, such as QoS, codecs, API development and signaling.

On the implementation front, Cisco open-sourced the VoIP code base from its softphones, with the following components:

1. RFC3261 Compliant SIP stack.
2. RFC4566 Compliant SDP Engine.
3. Call Control Application Logic for Soft-phone Application.

The open source project can be found on GitHub under the name Ikran.
The Cisco team is working with Mozilla on a joint implementation of the WebRTC standards in Firefox. For this purpose, components (2) and (3) from Ikran are being reused to implement the session control and session description aspects of the PeerConnection object.

WebRTC in Action – Getting Hands Dirty

It’s time to get our hands dirty and try a few demos in action. WebRTC for desktop is now in the Firefox Nightly and Firefox Aurora releases. The difference is that Nightly builds have the latest fixes, while Aurora, being a pre-beta build, is slightly older but more stable.
For the purposes of this blog, let us use the Firefox Aurora build; the setup instructions below apply to Firefox Nightly as well.
The demo page lets you try out the following:
– GetUserMedia based audio capture, video capture and picture snapshot.
– PeerConnection based 2-way audio/video call.
– DataChannel based session.

Let’s get started ….
Step1: Getting Firefox Aurora

Step2: Configure Aurora
Currently the code is behind preference settings. To enable the WebRTC code, browse to “about:config” and do the following:
2a. Set media.navigator.enabled to true to enable calls to GetUserMedia only.
2b. Set media.navigator.permission.disabled to true to automatically gain permission to access the camera/microphone.
2c. Set media.peerconnection.enabled to true to enable PeerConnection functionality.
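
For repeatable test setups, the same three preferences could also be written to a user.js file in the Firefox profile directory instead of being toggled by hand. This is a minimal sketch, assuming the preference names listed above exist in the build you are running; they belong to these early builds and may differ elsewhere.

```javascript
// user.js, placed in the Firefox profile directory and read at startup.
// Preference names are the ones from Step 2; treat them as assumptions
// for any build other than the one described in this post.
user_pref("media.navigator.enabled", true);             // 2a: allow GetUserMedia calls
user_pref("media.navigator.permission.disabled", true); // 2b: skip the camera/microphone prompt (testing only)
user_pref("media.peerconnection.enabled", true);        // 2c: enable PeerConnection
```

Since 2b removes the permission prompt, this is best kept to a throwaway test profile.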

Step3: Running the demos.
On your Aurora build, browse to the WebRTC Demo Page and try out the demos listed above.

Interested in Learning More?

1. Cullen Jennings has provided a detailed explanation about everything here.
2. Justin Uberti from Google explains the WebRTC implementation in Chrome here.
3. IETF Standards Page
4. W3C Standards Page
5. Mozilla Wiki and Blog Pages

If interested, I am more than happy to discuss further with anyone who wants to hear the gory details.

Thanks for reading and enjoy the WebRTC revolution.

Sam Dutton says:

November 12, 2012 at 10:55 am

Nice article!

Shameless plug, but HTML5 Rocks also has a fairly detailed WebRTC article, from a JavaScript perspective: http://www.html5rocks.com/en/tutorials/webrtc/basics/.
Cullen Jennings says:

November 12, 2012 at 11:56 am

I’d like to point out another great resource is the book by Alan Johnston and Dan Burnett.

You can find it at

http://webrtcbook.com/
Gonzalo Salgueiro says:

November 12, 2012 at 3:55 pm

Terrific article. Great overview for those of us on the fringes of WebRTC. Thanks Suhas!!
Gonzalo Gasca says:

November 19, 2012 at 12:37 am

Great Article Suhas,
This helps a lot as we are working in a prototype for WebRTC, and would like to use Ikran now. I assume this FireFox implementation uses H.264 + SRTP (SDES) correct?
- Suhas Nandakumar says:
  
  November 19, 2012 at 1:03 pm
  
  Hi Gonzalo .
  
  Thanks. The Firefox implementation mentioned above supports VP8 and DTLS/SRTP instead of H.264 and SDES/SRTP, for the following reasons:
  1. DTLS/SRTP is a mandatory IETF requirement.
  2. No decision has been made yet at the IETF on the mandatory-to-implement (MTI) video codec. VP8 was open-sourced as part of the webrtc.org project, and it is supported in Firefox as part of the webrtc.org tree that was pulled in.
  
  On the other hand, Ikran has not been maintained since we pulled it into the Firefox code base. But it does provide H.264 support.
  
  Hope this answers your questions.
Gonzalo Gasca says:

November 20, 2012 at 3:12 am

Hi Suhas,

Thanks, that answered my question, as you mentioned I see H.264 functionality support for FF, our main interest is the ability to do the following with FF:

1. Obtain local media
2. Setup a connection between the browser and the peer (In this case we have a TelePresence Server 8710 MCU – H.264 SRTP(SDES)/DTLS)
3. Attach the media channels.

We have the Websockets server (registrar) + B2BUA (connection to SIP proxy and MCU) in place, but at this point we lack code in FF to place the call via H.264.

FF — ws –> (Registrar + B2BUA) — sip –> Proxy –> MCU

We used Ericsson Bowser and webrtc4all plugin with no luck (webrtc4all may work by using SDP Editor) but we would like to know if there is something similar project to Ikran so by just opening my Firefox browser and register the endpoint we can start placing calls.

Thanks
Ramsundar Kandasamy says:

December 25, 2012 at 5:38 am

WebRTC will be much more interesting when it comes to mobile.

Don’t forget to try WebRTC on Ericsson Bowser (android and ios). It supports G711 and H264 codecs for voice and video respectively.

Here is what we (at Ericsson) are doing in this space.

https://labs.ericsson.com/blog/bowser-the-world-s-first-webrtc-enabled-mobile-browser

Comments are closed.

Open at Cisco

WebRTC – Bringing Real Time Communications to the Web Natively

8 Comments
