What you really need to know about Video Conferencing Systems

How do I choose a Video Conferencing system?

Well, you could just buy the same as the person you want to have a conference with. However, this might not be the best solution. There are essentially two types of systems, proprietary and standards based. If the person you want to have a conference with uses a proprietary system, then you must buy the same as them, or persuade them to buy something different.

There are many questions that must be answered as you steer your way towards identifying which is the best Video Conferencing system that meets your needs. This document is about standards based systems and is intended as a guide to their selection. It lists the issues that should be considered and gives an overview of the various networks used by video conferencing systems.

The following technical papers are also available to provide more information:

International Telecommunication Union & The Internet Engineering Task Force.

Global telecommunications standards are set by the International Telecommunication Union (ITU), a United Nations agency, and by the Internet Engineering Task Force (IETF). Products that adhere to these standards allow users to participate in a conference, regardless of their platform. These standards for video conferencing ensure compatibility on a worldwide basis. The ITU has developed the H, G and T Series of standards whilst the IETF has developed the Real-time Transport Protocol (RTP) & Resource Reservation Protocol (RSVP). These standards apply to different transport media.

Content and topics covered in this paper.

The list below shows the media networks, technologies and topics covered within this paper.

  • Update on where we are today
  • ISDN
  • 802.11 a/b/g/n Wireless
  • ADSL, SDSL & Internet
  • Location of Participants
  • Expectation Levels - HD or SD
  • History of Video Compressions
  • System Type - Personal or Room
  • System Management
  • Network, Infrastructure & Devices
  • LAN, WAN, VPN & Intranet
  • POTS
  • 3G/4G mobile
  • Number of Participants
  • Available Bandwidth
  • Acceptable Quality
  • Which Network - IP or ISDN?
  • Installation & Usage Costs

Update on where we are today.

The speed and worldwide availability of ADSL and the Internet, together with the withdrawal of support by the national telephone companies, have virtually ended the demand for, availability of and use of both POTS and, more recently, ISDN as a direct means of connecting video conferencing systems. In their place we now have Fast ADSL (Fibre), Cable (Fibre) and the media-enabled 3G/4G smartphones and tablets, as well as the next generation of Codecs and Gateways to transcode the new protocols.

You also need to be aware of new and emerging standards that might have an impact on what you purchase. The latest video compression used by video conferencing systems is H.264 and its derivatives H.264 High-Profile and H.264 SVC. As a guideline, basic H.264 offers twice the quality over its predecessor H.263 at the same bandwidth, or the same quality at half the bandwidth. H.264 High-Profile has even higher performance and the latest H.264 SVC is scalable and more flexible across networks. So if you are restricted in the available bandwidth, take a look at systems that support the latest video compressions.

However, H.264 SVC is still essentially proprietary with vendors such as Polycom, Radvision and Vidyo each having their own flavour of H.264 SVC. Hence for interoperability, the typical fall-back is to use H.264 High-Profile.

Interestingly, Yealink are now supporting H.265/HEVC (High Efficiency Video Coding), the latest ITU-T standard and one of several potential successors to H.264 AVC. By comparison to H.264 AVC, H.265/HEVC has approximately double the data compression ratio, hence offering a considerable improvement in video quality at the same bitrate. H.265/HEVC also supports up to 8192x4320 resolution.

Furthermore, Microsoft have published a document that defines UCConfig Modes which relate to the various scalable layers found in H.264 SVC. In the latest Skype for Business, Microsoft is using the H.264 SVC technology developed by Polycom.

As a side note, Microsoft's proprietary RTV video codec used in Lync 2010 and Lync 2013 has now given way to Microsoft's H.264 SVC in the latest Skype for Business.

There have also been changes in the way in which H.323 endpoints support data sharing, with the development of the H.239 (Dual Video) standard and 'data-showing' being favoured over, and replacing, the old T.120 'data collaboration' standard. H.239 defines how additional media channels are used and managed by video conferencing systems. It introduces the concept of 'data-showing', whereby the PC desktop is digitised and converted into a separate video stream and transmitted in parallel with the main 'talking heads' video stream - hence the term Dual Video.

Endpoints that support H.239 will receive the dual streams and display the desktop graphics and far-end video in separate windows. Endpoints that don't support H.239 will display the shared desktop graphics instead of the far-end video whilst data is shared and revert back to displaying the far-end video when sharing is stopped.

Generic SIP endpoints typically use BFCP, which like H.239, effectively sends the shared desktop or application as a second video stream that the receiving endpoint then displays in either a second window or in place of the 'talking heads' video whilst the sharing is active.

We use the term Generic SIP to identify devices that adhere to the SIP standard as opposed to devices that use Microsoft's version of SIP, which for clarity we call (MS-SIP).

When two endpoints establish a BFCP connection, they must determine which endpoint will act as a floor control server, then the other will act as a floor control client for that specific stream. If there are two streams, then again one endpoint must act as the floor control server, but it does not have to be the same endpoint for each stream.

In the meantime, as mentioned above, Microsoft have developed their own version of SIP, MS-SIP, and use it as the main signalling protocol within Lync 2013 and Skype for Business.

Microsoft have also developed their proprietary Remote Desktop Protocol - RDP and use it within Lync 2013 and Skype for Business for Application Sharing between clients. RDP is an extension of the ITU-T T.128 standard for Multipoint Applications Sharing that sits under the T.120 umbrella standard that was originally used by Microsoft's NetMeeting application.

The main limitations of using RDP are low frame rate and high bandwidth consumption. To address these, Microsoft has recently developed Video based Screen Sharing - VbSS as an alternative method for Desktop Sharing. VbSS is supported by the latest Skype for Business 2016 client from version 16.0.4266.1003 onwards, found within Office 2016.

As other SIP and H.323 standards based videoconferencing endpoints typically DO NOT understand RDP or VbSS, this presents many challenges, especially if you want to share the Skype for Business client's applications with other SIP or H.323 users.

Sharing the SIP or H.323 endpoint's Desktop or Applications with Skype for Business or Lync 2013 clients is a lesser issue as these endpoints typically use BFCP or H.239, which effectively sends the Desktop or Application as a second video stream that the Skype for Business or Lync client can understand and hence display.

Network, Infrastructure & Devices.

Before you start, it is useful to understand what types of networks are available and used by video conferencing systems. The following are popular media transport networks used in video conferencing:

  • ISDN - Integrated Services Digital Network
  • LAN - Local Area Network
  • WAN - Wide Area Network
  • VPN - Virtual Private Network
  • 802.11 a/b/g/n - Wireless Network
  • POTS - Plain Old Telephone Service
  • ADSL - Asymmetric Digital Subscriber Line
  • SDSL - Symmetric Digital Subscriber Line
  • 3G/4G - Mobile Network

They all have strengths and weaknesses that should be considered carefully before deciding upon which to use. Please take a look at the diagram below that shows the networks, infrastructure and devices used in H.323, SIP and H.320 standards based videoconferencing.

Choose the endpoints that represent your requirements and follow the links that join them; this should give an indication of the infrastructure devices and their functionality that you may need.

Integrated Services Digital Network (ISDN).

As mentioned earlier, many national telephone companies have virtually stopped, or will soon stop, supporting ISDN. Hence, we will only provide a quick overview about ISDN and how it was used by endpoints within the H.320 video conferencing standard.

There are two available ISDN connections, Basic Rate Interface (BRI) and Primary Rate Interface (PRI). Essentially, a BRI provides two 64kbps B-channels and one 16kbps D-channel. In Europe, a PRI provides 30 x 64kbps B-channels, one 64kbps D-channel and one 64kbps framing channel, giving a total of 2048kbps; whilst in North America a PRI provides 23 x 64kbps B-channels and one 64kbps D-channel plus 8kbps of framing, giving a total of 1544kbps.

ISDN connections usually aggregate the BRI and share the same number for both B-channels. Known as ISDN-2, this provides a line speed of 128kbps and is typically used by desktop video conferencing systems over ISDN. For increased bandwidth, ISDN-6 aggregates three BRIs to provide a line speed of 384kbps and is typically used by group or room-based video conferencing systems over ISDN. With ISDN-6, the sequence in which the lines are aggregated must be known and adhered to! Furthermore, if the connection is going to use some form of 'switch', this must be configured to pass both voice and data!

The ISDN connection is usually made directly into the video conferencing system, which uses the H.221 framing protocol and adheres to the H.320 standard. Less common is to use an ISDN Dial-Up modem that effectively transmits IP over ISDN, in which case the video conferencing system would have to follow the H.323 standard.
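
The channel arithmetic in this section can be checked with a short Python sketch. The function name and parameters are illustrative assumptions, and the European PRI is modelled here as 30 B-channels plus one D-channel and one 64kbps framing timeslot, with the North American PRI carrying 8kbps of framing.

```python
B_CHANNEL_KBPS = 64  # every ISDN bearer (B) channel carries 64 kbps


def line_rate_kbps(b_channels, d_channel_kbps=0, framing_kbps=0):
    """Total rate: bearer channels plus any signalling and framing overhead."""
    return b_channels * B_CHANNEL_KBPS + d_channel_kbps + framing_kbps


# Full line rates, including signalling (D-channel) and framing
bri_total = line_rate_kbps(2, d_channel_kbps=16)
pri_europe = line_rate_kbps(30, d_channel_kbps=64, framing_kbps=64)
pri_north_america = line_rate_kbps(23, d_channel_kbps=64, framing_kbps=8)

# Usable conference bandwidth counts the B-channels only
isdn2_payload = line_rate_kbps(2)  # one BRI
isdn6_payload = line_rate_kbps(6)  # three BRIs aggregated

print(bri_total, pri_europe, pri_north_america)  # 144 2048 1544
print(isdn2_payload, isdn6_payload)              # 128 384
```

Note that only the B-channels carry audio and video, which is why ISDN-2 is quoted as 128kbps even though the BRI line itself runs at 144kbps.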

In the past, most H.320 conferences would have been between just two participants as ISDN is essentially a point-to-point connection. However, multipoint technology now makes it possible for groups of people to participate in a conference and share information. To hold a multipoint conference over ISDN, participants must use either a dedicated ISDN Multipoint Control Unit - MCU that connects and manages all the ISDN lines, or an endpoint with an embedded H.320 multipoint capability such as the now discontinued Polycom HDX 8000 or Lifesize Team 220 with optional ISDN connectivity.

H.320 is the ITU standard for ISDN conferencing and includes H.264, H.263, H.261 video; G.711, G.722, G.722.1, G.728 audio; H.239, T.120 data and H.221, H.231, H.242, H.243 control.

LAN, WAN, VPN and Intranet.

100 Mbps LANs with switches and routers are used in most companies today and these have enough bandwidth to support desktop conferences. With a LAN offering significantly more bandwidth than ISDN, the video quality within a conference is much higher and can approach that of HD television. Technology has also helped: we now have communications advancements such as Gigabit Ethernet (1000 Mbps), faster switches, Fibre Asymmetric Digital Subscriber Lines (ADSL), Symmetric Digital Subscriber Lines (SDSL), 802.11 b/g/n wireless and 4G Mobile networks that have increased the available bandwidth, whilst IP Multicasting (routers permitting) has reduced network loading in conferences involving more than two endpoints.

Unlike ISDN networks, LANs, WANs, VPNs, the Intranet, the Internet, ADSL, SDSL, Wireless, 3G/4G Mobile networks and Cloud solutions all use the TCP/IP protocol suite, and the H.323 (and SIP) standards define how to assemble the audio, video, data and control (AVDC) information into an IP packet. Most companies use DHCP and allocate dynamic IP addresses to PCs. Therefore, in order to correctly identify a user, the H.323 endpoints are usually registered with a Gatekeeper and 'called' into a conference by their H.323 ID or alias. The Gatekeeper translates the H.323 ID or alias into the corresponding IP address. Another method of identifying H.323 users is for them to register their presence using Lightweight Directory Access Protocol (LDAP) with a Directory Service such as Microsoft's Windows Active Directory or the freely available OpenLDAP.
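
The alias-to-address translation performed by a Gatekeeper can be pictured with the minimal Python sketch below. It is a conceptual model only: the class, method names, alias and addresses are hypothetical, and a real Gatekeeper exchanges H.225.0 RAS messages with the endpoints rather than being called like this.

```python
class Gatekeeper:
    """Toy registry mapping an H.323 ID or alias to an endpoint's current IP address."""

    def __init__(self):
        self._registrations = {}

    def register(self, alias, ip_address):
        # Re-registering the same alias overwrites the old address, which is
        # what lets a DHCP-assigned endpoint stay reachable by name.
        self._registrations[alias] = ip_address

    def resolve(self, alias):
        # Returns None when the alias is not currently registered.
        return self._registrations.get(alias)


gatekeeper = Gatekeeper()
gatekeeper.register("boardroom", "192.0.2.10")
gatekeeper.register("boardroom", "192.0.2.57")  # address changed after a DHCP renewal

print(gatekeeper.resolve("boardroom"))  # 192.0.2.57
print(gatekeeper.resolve("unknown"))    # None
```

The caller only ever needs to know the alias; the changing IP address stays hidden behind the registration.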

To hold a multipoint conference over a TCP/IP network, H.323 (and SIP) systems require a Multipoint Conference Server (MCS). This is also referred to as an H.323 Multipoint Control Unit (H.323 MCU). This is not the same as an H.320 MCU; hence it is important to be clear about what you mean when using the term MCU.

To hold a multipoint conference over IP, participants must either use a dedicated Multipoint Control Unit such as the ClearOne Collaborate VCB connected to the IP network, use an endpoint with an embedded H.323 multipoint capability such as the Polycom RealPresence Group 500, or subscribe to the Lifesize Cloud service that includes a multipoint capability.

H.323 is the ITU standard for LAN conferencing and includes H.264, H.263, H.261 video; G.711, G.722, G.722.1, G.722.1C, G.723.1, G.728, G.729, MPEG-4 AAC-LD audio; H.239, T.120 data and H.225, H.245 control.

Wireless 802.11 a/b/g/n networks.

Standards based 802.11 a/b/g/n wireless networks are readily available forms of transport media for home, company and travelling users. With 802.11g routers giving nominal transmission speeds of up to 54 Mbps and 802.11n routers going well beyond 100 Mbps, there is sufficient bandwidth available to support audio, video and data sharing across wireless networks, especially when used in conjunction with the latest compression techniques and technologies.

As with LANs above, H.323 is the ITU standard for conferencing across wireless networks.

Plain Old Telephone Service (POTS).

Twenty five years ago, the standard telephone system was probably the most readily available form of transport media for home users. With V.92 dial-up modems giving transmission speeds of up to 56kbps, there was just about sufficient bandwidth available to support audio, video and data sharing with this media, especially when used in conjunction with the latest CPUs etc.

However, British Telecommunications (BT), like other national carriers, have ceased supporting access via such dial-up TCP/IP connections and hence POTS for video conferencing is neither desirable nor available.

H.324 was the ITU standard for POTS conferencing and includes H.263 video; G.723.1 audio; T.120 data and H.223, H.245 control.

ADSL, SDSL & Internet (including Cloud).

With its ever increasing popularity, people have sought to use the Internet in more ways than just a means of sending email or browsing interesting sites.

Like LANs, ADSL and SDSL are other forms of TCP/IP networks for accessing the Internet and hence can be used as a transport media in video conferencing systems. Both ADSL (including Fibre ADSL) and SDSL use a modem and router (or a router with built-in modem) in order to gain access to the Internet. What each user should do is get their Internet Service Provider (ISP) to provide them with a fixed Public IP address.

Alternatively, users could register their presence with a Dynamic DNS Service Provider such as DynDNS.org. But after rebooting, a modem that is allocated a dynamic IP address could then be allocated a different IP address and any changes will take time to propagate through the DNS Service before these changes are recognised.

This is how you know or determine the address of the endpoint that you want to conference with.

Obviously, the overall speed of the conference is limited to that of the slowest link. Basic ADSL, whilst being faster than ISDN, is only as fast as the slowest uplink when used for video conferencing. Although a basic ADSL may be quoted as being 2.0 Mbps, this is usually the download speed and in reality, you can only achieve an upload speed of something in the 384-512 kbps range because the service is asymmetric and subject to contention. Hence SDSL is much better as the uplink speed matches the downlink. However, if available in your region, Fibre ADSL offers download speeds of 38.0 Mbps and upload speeds in the 6-8 Mbps range, which is better still and easily fast enough for any video conferencing.

Again, users should get their DSL Service Provider to provide them with a fixed Public IP address for their video conferencing system, which should be located either behind their Firewall or Proxy using an internal IP address, within their Firewall's DMZ (De-Militarised Zone) or outside on the Internet. Otherwise, too many Firewall Ports will have to be opened in order to provide access, which defeats the objectives of having a Firewall.

H.323 is the ITU standard used by ADSL & SDSL for Internet conferencing and includes H.264 SVC, H.264 High Profile, H.264, H.263 video; G.711, G.722, G.722.1, G.722.1C, G.723.1, G.728, G.729, MPEG-4 AAC-LD audio; H.239, T.120 data and H.225, H.245 control.
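
The 'slowest link' rule can be sketched in a few lines of Python. The function name is hypothetical and the link speeds are illustrative figures taken from the ranges quoted above; the sketch assumes a two-way call that runs at the same rate in both directions.

```python
def usable_call_rate_kbps(a_up, a_down, b_up, b_down):
    """Rate a symmetric two-way call can sustain between sites A and B.

    Each direction is capped by the sender's uplink and the receiver's
    downlink; the call as a whole is capped by the slower direction.
    """
    a_to_b = min(a_up, b_down)
    b_to_a = min(b_up, a_down)
    return min(a_to_b, b_to_a)


# Illustrative speeds in kbps (assumed values within the quoted ranges)
BASIC_ADSL = {"up": 448, "down": 2000}
FIBRE_ADSL = {"up": 7000, "down": 38000}

both_basic = usable_call_rate_kbps(BASIC_ADSL["up"], BASIC_ADSL["down"],
                                   BASIC_ADSL["up"], BASIC_ADSL["down"])
mixed = usable_call_rate_kbps(FIBRE_ADSL["up"], FIBRE_ADSL["down"],
                              BASIC_ADSL["up"], BASIC_ADSL["down"])
both_fibre = usable_call_rate_kbps(FIBRE_ADSL["up"], FIBRE_ADSL["down"],
                                   FIBRE_ADSL["up"], FIBRE_ADSL["down"])

print(both_basic, mixed, both_fibre)  # 448 448 7000
```

Upgrading only one end to fibre does not help in this model: the call is still held back by the basic ADSL uplink at the other site.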

3G/4G mobile networks (and Public WiFi Hotspots).

The 3G/4G cellular mobile data networks (and Public WiFi Hotspots) are a readily available form of wireless delivery and with the media-enabled Smartphones and Tablets, there is sufficient bandwidth to enable IP-based multipoint audio and video conferencing to existing H.323 video conferencing systems when used in conjunction with the latest Gateways and MCUs that also support these new protocols.

With greater coverage, the faster 3G and even faster 4G data networks have made 3G-324M, an extension by the 3rd Generation Partnership Project (3GPP) and 3rd Generation Partnership Project 2 (3GPP2) to the ITU H.324M standard for 3G mobile phone conferencing, defunct.

As with LAN and 802.11 wireless, H.323 is the ITU standard for conferencing across 3G/4G mobile data networks.

Location of Participants.

H.323 (IP) and H.320 (ISDN) systems can interoperate with the use of an H.320 Gateway. Essentially, the H.320 Gateway provides translation and transcoding between different circuit-switched networks (ISDN) and packet-based networks (LAN, ADSL, SDSL, Wireless, 3G/4G), enabling the endpoints to communicate. Most H.320 Gateways have multiple BRI or PRI connections and can support several conferences simultaneously. For example, the latest Polycom ISDN Gateway can simultaneously support multiple calls.

Most H.320 Gateways work in conjunction with or require a basic H.323 Gatekeeper.

The first question that starts the process of identifying your video conferencing system is concerned with who and where are the people that you want to conference with. It is either a network or security issue that determines how the participating platforms are going to be connected together and hence which is the applicable standard that you should consider following.

As indicated above, there are two main standards used in video conferencing, H.323 (IP based) or H.320 (ISDN). However, the decision might not be yours. It may simply be the case that you have to use H.320 based ISDN for security reasons. But this is very rare and you are most likely to use some form of H.323 (IP based) system.

After you have decided which standard you want to adhere to, you can start looking at the platform, price and performance equation.

Do you want to conference just within your organisation, with suppliers or with the world?

If it's just within your organisation, do you have an existing LAN, WAN, VPN or Intranet? If so, then you already have a network in place that can be used for VC. Look at the section titled H.323 based Video Conferencing systems. Otherwise you need to create a corporate LAN, WAN or VPN, or look at alternative network types. Look at the sections titled ADSL, SDSL & Internet (including Cloud) and Wireless 802.11 a/b/g/n networks.

If it's to suppliers, are they on your corporate LAN, WAN or VPN? If so, then you already have a network that can be used. Look at the section titled H.323 based Video Conferencing systems. If not, then you need to extend the corporate LAN, WAN or VPN to include your suppliers or look at adding alternative network types. Look at the sections titled ADSL, SDSL & Internet (including Cloud) and 3G/4G mobile networks (and Public WiFi Hotspots).

If it's a complete mixture of different networks, including anybody anywhere with 3G/4G Smartphones or Tablets, then you should consider subscribing to the Lifesize Cloud.

As mentioned, if you are using IP and another participant is using ISDN, then you will need a Gateway and maybe an MCU that supports these devices.

Another very important consideration if you are IP conferencing between sites is security and the need to traverse any firewalls. Depending on the number of endpoints behind a firewall, you may need additional infrastructure and an appropriate Firewall Traversal solution.

Number of Participants.

To host a conference with three or more participants, H.323 systems require either a dedicated Multipoint Control Unit (H.323 MCU) or an endpoint with a built-in multipoint capability. It really depends on the number of participants, as most endpoints with a built-in multipoint capability are limited to 6-8 participants.

The H.323 MCU's basic function is to maintain all the audio, video, data and control streams between all the participants in the conference and hence most H.323 MCUs use proprietary or dedicated hardware. ClearOne's Video Conference Bridge, Collaborate VCB, is an all-in-one solution that includes an embedded ClearOne Collaborate Central Gatekeeper and a high-definition MCU capable of allowing Ad-Hoc Conferencing in either Continuous Presence or Voice-Activated Switching modes.

Alternatively, small groups who need a multipoint capability could use an endpoint with an embedded MCU capability. The Polycom RealPresence Group 500 has a 6-way embedded multipoint option that supports itself and up to 5 other sites in either a Voice-Activated or Continuous Presence session.

But realistically you still need to use an H.323 Gatekeeper when there are multiple H.323 endpoints, as it allows you to use just one public IP address to the Internet, adds an extra layer of security by authenticating the users and allowing you to hide the endpoints on the internal network, and helps with NAT/Firewall Traversal.

In general, a dedicated MCU will support several simultaneous sessions, more participants, higher bitrates, more screen layout options and more features than an embedded MCU option found in some endpoints.

Expectation Levels.

A crucial area in choosing a system is to discuss and then set the expectation levels of the users so that they are attainable. A realistic frame rate and window size for the available bandwidth will most probably be much lower than that expected by the users. However, for the users to get the most out of a video conferencing system, they must be content when using it. Their expectation levels must be aligned to what is realistic from the available systems.

Don't expect full 1080p high-definition video via a basic ADSL connection; the numbers just don't add up!

Required versus Available Bandwidth.

Video conferencing is a form of communications involving the transfer of information between two or more locations. The connection between these locations is the communications channel and is called the network. It is the network loading in terms of required bandwidth that needs to be considered. Bandwidth is the resource of a network. It is the term given to the rate of transfer of information, usually in kilobits per second (kbps); it is like the speed limit of the network and cannot be exceeded.

Analogies can be made between speed and bandwidth. If you want to know how long it would take to travel 1000 km when you are limited to 50 km/h, then a simple calculation of 1000 divided by 50 shows that it would take 20 hours. Likewise, if you wanted to transfer a megabit of data across a network with an available bandwidth of 128 kbps, it would take about 8 seconds.
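
The speed and bandwidth analogy reduces to the same division in both cases, as this small Python sketch shows. The function names are illustrative, and a megabit is assumed here to be 1000 kilobits (taking it as 1024 kilobits gives exactly 8 seconds instead).

```python
def travel_time_hours(distance_km, speed_kmh):
    """Time to cover a distance at a fixed speed limit."""
    return distance_km / speed_kmh


def transfer_time_seconds(data_kilobits, bandwidth_kbps):
    """Time to move a given amount of data at a fixed bandwidth."""
    return data_kilobits / bandwidth_kbps


journey = travel_time_hours(1000, 50)
one_megabit = transfer_time_seconds(1000, 128)

print(journey)      # 20.0
print(one_megabit)  # 7.8125
```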

Available bandwidth is a limiting factor with conferencing and sending video creates lots of data. Consider a typical 720p video image size of 1280x720 pixels; then this represents 921600 pixels of information per single frame. Now consider how the colour of each pixel is represented; this typically uses the 4:2:0 format. Finally, consider how many frames per second that you want to see; this is called frame rate. The human eye perceives 25 frames per second as continuous motion, but HD 720p video typically uses 60 frames per second.

The diagram below gives an indication of just how much more data is required to be sent (bigger area) per frame depending upon the resolution of the video image. 

Bearing these figures in mind, it's easy to see how a raw (uncompressed) HD 720p60 video stream can be around 1.0 gigabits per second. It is clear from this example that video can place enormous demands on the network and hence why available bandwidth is a limiting factor with video conferencing systems.
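
As a rough check on the order of magnitude, the raw bitrate is simply pixels per frame multiplied by bits per pixel and frames per second. The sketch below assumes 8 bits per sample, so that 4:2:0 sampling averages 12 bits per pixel and unsubsampled colour uses 24 bits per pixel; these bit depths are assumptions, not figures from this paper.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bitrate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


pixels_per_frame = 1280 * 720

# 4:2:0 at 8 bits per sample averages 12 bits per pixel (assumption)
yuv420_bps = raw_bitrate_bps(1280, 720, 12, 60)
# Full colour with no chroma subsampling at 24 bits per pixel (assumption)
full_colour_bps = raw_bitrate_bps(1280, 720, 24, 60)

print(pixels_per_frame)               # 921600
print(yuv420_bps / 1_000_000_000)     # 0.663552
print(full_colour_bps / 1_000_000_000)  # 1.327104
```

Depending on the colour format assumed, the raw 720p60 stream therefore falls somewhere between roughly 0.66 and 1.33 gigabits per second, which brackets the figure of around 1.0 gigabits quoted above.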

There are essentially two ways of reducing the impact of this limitation. One is to use a faster method of communications that increases the available bandwidth to the conference (eg. Fast Fibre ADSL); the other is to utilise methods that reduce the amount of data required to be transmitted by using systems that support the latest video compression technologies.

By applying H.264 Baseline Profile compression to an HD 720p60 video stream, the 1.0 gigabits can now be reduced to about 1024 kilobits per second (kbps). But by applying the very latest H.264 High Profile compression instead, the video stream can be reduced to an impressive 512 kbps - a 2000:1 reduction in the raw stream.
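
The reduction ratios follow directly from the figures in the paragraph above. This sketch takes the raw stream as 1.0 gigabit per second, expressed as 1,000,000 kbps; the function name is illustrative.

```python
def compression_ratio(raw_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the raw stream."""
    return raw_kbps / compressed_kbps


RAW_720P60_KBPS = 1_000_000  # the text's round figure of 1.0 gigabit per second

baseline_ratio = compression_ratio(RAW_720P60_KBPS, 1024)
high_profile_ratio = compression_ratio(RAW_720P60_KBPS, 512)

print(baseline_ratio)      # 976.5625
print(high_profile_ratio)  # 1953.125
```

So the Baseline Profile figure corresponds to roughly a 1000:1 reduction and the High Profile figure to roughly 2000:1.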

History of Video Compressions.

H.261 was the original ITU-T developed standard used in Video Conferencing. This was quickly followed by H.263 in 1995. After this, the ITU-T Video Coding Experts Group (VCEG) started work on a short-term effort to add extra features (H.263 v2) and a long-term effort to develop a new standard that offers higher video compression efficiency and better resilience to packet and data loss: a standard that would significantly outperform H.263, with more features and support for higher quality at low bitrates.

In 2001, the ISO Moving Picture Experts Group (MPEG) recognised the potential of this ITU-T development and formed the Joint Video Team (JVT) that included people from MPEG and VCEG. The result is two identical standards: ISO MPEG-4 Part 10 and ITU-T H.264, with the official name Advanced Video Coding (AVC).

The H.261, H.263 and H.264 algorithms are all designed to use and incorporate motion prediction as well as lossy compression techniques to further reduce the amount of information to be transmitted. Whilst H.261 and H.263 images are also limited to CIF and QCIF sizes, H.264 can support full 1080p high-definition images and graphics at WXGA resolution when used in H.239 data streaming.

The basic technique of motion prediction works by sending a full frame followed by a sequence of frames that only contain the parts of the image that have changed. Full frames are also known as 'key frames' or 'I-frames' and the predicted frames are known as 'P-frames'. Since a lost or dropped frame can cause the sequence of frames sent after it to be undecodable, new 'I-frames' are sent after a predetermined number of 'P-frames'. It is the combination of both lossy compression and motion prediction that allows H.261, H.263 and H.264 systems to achieve the required reduction in data whilst still providing an acceptable image quality.
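
The I-frame and P-frame idea can be illustrated with a deliberately simplified Python sketch. It treats a frame as a flat list of pixel values and a P-frame as the set of pixels that changed since the previous frame; real codecs predict motion-compensated blocks and add lossy compression, neither of which is modelled here. The function names and the key-frame interval are assumptions.

```python
def encode(frames, key_interval=4):
    """Emit a full I-frame every key_interval frames, P-frames otherwise."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if previous is None or index % key_interval == 0:
            encoded.append(("I", list(frame)))  # complete picture
        else:
            # Only the positions whose value differs from the previous frame
            changes = {
                position: value
                for position, (old, value) in enumerate(zip(previous, frame))
                if old != value
            }
            encoded.append(("P", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame by applying P-frame changes to the running picture."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for position, value in payload.items():
                current[position] = value
        frames.append(current)
    return frames


video = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 2, 0], [0, 1, 2, 0], [5, 5, 5, 5]]
stream = encode(video)

print([kind for kind, _ in stream])  # ['I', 'P', 'P', 'P', 'I']
print(stream[1])                     # ('P', {1: 1})
print(decode(stream) == video)       # True
```

If a P-frame were lost from this stream, every later frame would be wrong until the next I-frame arrived, which is why key frames are repeated at intervals.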

There is little functional difference between the elements of H.264 and those of the earlier H.261 and H.263 standards. The changes that do make the difference lie mainly in the detail within each element, how well the algorithm is implemented and whether it is performed in hardware or software.

With hundreds of experts involved in creating H.264, there were many options. Some were simpler and immediately implemented, whilst others were much more complex but still included. Hence H.264 was organised into four profiles: Baseline, Extended, Main and High. Baseline is the simplest and uses 4:2:0 chrominance sampling and splits the picture into 4x4 pixel blocks, processing each block separately. Baseline uses Universal Variable Length Coding (UVLC) and Context Adaptive Variable Length Coding (CAVLC) techniques which have a big impact on the network bandwidth. Virtually all vendors support H.264 Baseline and some are now also supporting H.264 High Profile.

H.264 High Profile is the most powerful and efficient. This is achieved by using Context Adaptive Binary Arithmetic Coding (CABAC) encoding. High Profile also uses adaptive transformations to decide 'on-the-fly' how to split the picture into blocks - 4x4 or 8x8 pixels. Areas of the picture with little detail use 8x8 blocks whilst more complex and detailed areas use 4x4 blocks. 

Vendors are now introducing H.264 SVC (Scalable Video Coding) into their products. H.264 SVC is the latest adaptive technology that delivers high quality video across networks with varying amounts of available bandwidth. Defined in Annex G of H.264, SVC promises to increase the scalability of video networks.

The above diagram clearly shows that in stark contrast to other H.264 AVC family members (including H.264 High Profile) with which video endpoints send one stream for every resolution, frame rate and quality, H.264 SVC enabled video endpoints send just one stream that contains multiple layers of all the resolutions (spatial), frame rates (temporal) and quality depending upon what the endpoints and network can support. This approach allows for 'scalability' as each endpoint can determine which layers of video it needs without any additional encoding or decoding. The choice of video layers is independent and does not affect other endpoints. It also allows each endpoint to gracefully degrade the video quality when it or the network gets busy.
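
How an endpoint might pick its layers from a single SVC stream can be sketched as follows. The layer ladder, bitrates and selection rule are hypothetical values chosen for illustration; they are not taken from the H.264 SVC specification or from any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    height: int           # spatial resolution (lines)
    fps: int              # temporal resolution
    cumulative_kbps: int  # bandwidth needed for this layer plus all layers below it


# Hypothetical ladder carried inside one SVC stream
LAYERS = [
    Layer("base", 180, 15, 128),
    Layer("enhancement-1", 360, 30, 384),
    Layer("enhancement-2", 720, 30, 1024),
    Layer("enhancement-3", 720, 60, 1536),
]


def select_layer(layers, available_kbps, max_height):
    """Pick the richest layer that fits both the bandwidth and the display."""
    usable = [
        layer for layer in layers
        if layer.cumulative_kbps <= available_kbps and layer.height <= max_height
    ]
    if not usable:
        return None
    return max(usable, key=lambda layer: layer.cumulative_kbps)


print(select_layer(LAYERS, 2000, 720).name)  # enhancement-3
print(select_layer(LAYERS, 500, 1080).name)  # enhancement-1
print(select_layer(LAYERS, 2000, 360).name)  # enhancement-1
print(select_layer(LAYERS, 100, 720))        # None
```

Each endpoint runs this choice for itself, so a busy mobile client dropping to a lower layer has no effect on what a well-connected room system receives.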

However, the H.264 SVC codec is only part of the interoperability equation as it also involves networking components such as signalling and error correction, which are not currently included in the standard. Hence, H.264 SVC is still essentially proprietary with vendors such as Polycom, Radvision and Vidyo each having their own flavour of SVC. Eventually, a complete standardised version of H.264 SVC will emerge that will offer true interoperability. But until then, you need to stick with the same vendor across the endpoints.

Acceptable Quality.

There are several steps that can be taken to reduce the amount of data that has to be transmitted when conferencing. The obvious combination is to use the smallest acceptable resolution (window size) with the minimum acceptable frame rate. This determines the 'raw' volume of data before applying any compression to further reduce the overall amount of data; but these have a crucial effect on quality.

Limiting the actual video resolution (number of pixels) has a direct impact on the graininess and hence quality of the video. And reducing the number of frames per second has a direct effect on the smoothness and hence quality of the video. So there are compromises to be had when setting the minimum resolution and minimum frame rate to be used. Once these are set, it is the efficiency of the video compression algorithm used that will determine the required bandwidth. But which compression algorithm is used is determined by the video conferencing systems when they do the initial 'handshake' or Capability Exchange. You might have the latest and greatest system that supports H.264 High Profile, but if the other system only supports H.264 Baseline, then they will use H.264 Baseline.
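
The outcome of the Capability Exchange can be modelled as choosing the most efficient codec that both systems support. The Python sketch below is a simplification under that assumption: the preference order and function name are illustrative, and the real negotiation in H.323 is carried out with H.245 messages rather than a simple list comparison.

```python
# Codecs ordered from most to least efficient, following this paper
PREFERENCE = ["H.264 High Profile", "H.264 Baseline", "H.263", "H.261"]


def negotiate_codec(local_caps, remote_caps, preference=PREFERENCE):
    """Return the most preferred codec supported by both ends, or None."""
    common = set(local_caps) & set(remote_caps)
    for codec in preference:
        if codec in common:
            return codec
    return None


new_system = ["H.264 High Profile", "H.264 Baseline", "H.263", "H.261"]
older_system = ["H.264 Baseline", "H.263", "H.261"]
legacy_system = ["H.261"]

print(negotiate_codec(new_system, older_system))   # H.264 Baseline
print(negotiate_codec(new_system, legacy_system))  # H.261
print(negotiate_codec(new_system, new_system))     # H.264 High Profile
```

The call always settles on what the less capable endpoint can handle, so the bandwidth saving of a newer codec is only realised when both ends support it.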

System Type - Personal or Room.

There is a major difference in the usage concept between Personal and Group video conferencing systems. Group systems are usually in a specific room that has to be reserved in order to schedule when they can be used. This can be restrictive and takes away the spontaneity of using video conferencing. Furthermore, Group systems usually have a PTZ (Pan, Tilt & Zoom) camera with remote controller and their own specific Graphical User Interface (GUI) that needs to be learnt and navigated, hence they tend to be used by only a small number of people.

On the other hand, Personal video conferencing systems are either PC based and use the familiar Windows® GUI, or available as 'apps' for Smartphones and Tablets. However, neither is usually left running in the background in an always-on mode. If there is going to be a large uptake of video conferencing, then it must be always available and easy to use. You need to exploit the spontaneity of the occasion in order to get the most from video conferencing. The concept is like that of using the telephone: it's always there and easy to use. Likewise, video conferencing systems should be configured to be 'always-on' and answerable. The availability of video conferencing 'apps' on Smartphones and Tablets should help on a personal level in BYOD (bring your own device) situations, but companies need to roll out video conferencing to the desktop for mass deployment.

Which Network?

The decision on which network to use is essentially a trade-off between essential requirements, quality, cost and topology. At the high performance, medium cost end there is Fibre ADSL, whilst at the low performance, low cost end there is the 3G/4G mobile option. A combination of both is also possible, but it will require either On-Premise infrastructure or a Cloud solution.

The big question here is 'What is an acceptable window size and frame rate?' If it is 5-10 fps at QCIF, then the 3G/4G mobile option with a Smartphone, Tablet or low cost H.323 system will provide a solution. However, most professional people will demand stable images at much higher HD resolution and frame rates, and as such Personal or Group systems are the solution. These systems can now achieve HD resolution (1280x720) images at 25 fps at low bitrates, especially if they use the latest H.264 video compression.

You might be able to use Fast ADSL, especially if you are within range of a fibre enabled exchange. With ADSL, you share the service with other users on the exchange and there may be some contention, but a Fast Fibre ADSL would typically provide 20-25 Mbps download and 5-6 Mbps upload depending upon the distance from the exchange. Remember that you will need a fixed IP address and will need to overcome any security or firewall issues.
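Because a video call sends as well as receives, the smaller upload figure is the tighter limit on an asymmetric line. The sketch below estimates how many simultaneous calls fit within a given upload speed. The call rate of 1024 kbps and the 20% allowance for IP packet overhead are assumptions chosen for illustration only, and contention on the line is ignored.

```python
def concurrent_calls(upload_mbps, call_kbps, ip_overhead=0.20):
    """Estimate how many calls fit in the available upload bandwidth.

    ip_overhead is an assumed allowance for packet headers on top of
    the nominal call rate (hypothetical value for illustration).
    """
    per_call_kbps = call_kbps * (1 + ip_overhead)
    return int((upload_mbps * 1000) // per_call_kbps)


# Lower and upper ends of the 5-6 Mbps upload range quoted above,
# with an assumed 1024 kbps call rate.
for upload in (5, 6):
    print(f"{upload} Mbps upload: {concurrent_calls(upload, 1024)} calls")
```

An estimate like this is only a starting point; real throughput on a shared service varies with contention and distance from the exchange.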

System Management.

Although the H.323 standard describes the Gatekeeper as an optional component, it is in practice an essential tool for defining and controlling how voice and video communications are managed over the IP network. Gatekeepers are responsible for providing address translation between LAN aliases and IP addresses, call control and routing services to H.323 endpoints, system management and security policies. The services provided by the Gatekeeper in communicating between H.323 endpoints are defined in RAS (Registration/Admission/Status).

Gatekeepers provide the intelligence for delivering new IP services and applications. They allow network administrators to configure, monitor and manage the activities of registered endpoints, set policies and control network resources such as bandwidth usage within their H.323 zone. Registered endpoints can be H.323 Systems, H.320 Gateways or Multipoint Control Units (MCUs).
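The two core Gatekeeper roles described above, address translation and bandwidth control within a zone, can be sketched as a toy model. The class and method names below are hypothetical and the code does not implement the RAS protocol itself; it only illustrates the idea of registering an alias, resolving it to an IP address and refusing a call when the zone's bandwidth budget is used up. The IP addresses are from a range reserved for documentation.

```python
class ToyGatekeeper:
    """Toy model of one H.323 zone: alias lookup plus a bandwidth budget."""

    def __init__(self, zone_bandwidth_kbps):
        self.zone_bandwidth_kbps = zone_bandwidth_kbps
        self.in_use_kbps = 0
        self.registry = {}  # alias -> IP address

    def register(self, alias, ip_address):
        """Record an endpoint's alias (loosely, the 'Registration' in RAS)."""
        self.registry[alias] = ip_address

    def admit(self, callee_alias, call_kbps):
        """Admit a call if the callee is registered and bandwidth remains.

        Returns the callee's IP address on success, or None if refused.
        """
        ip_address = self.registry.get(callee_alias)
        if ip_address is None:
            return None  # unknown alias
        if self.in_use_kbps + call_kbps > self.zone_bandwidth_kbps:
            return None  # zone bandwidth budget exhausted
        self.in_use_kbps += call_kbps
        return ip_address

    def release(self, call_kbps):
        """Return a finished call's bandwidth to the zone budget."""
        self.in_use_kbps = max(0, self.in_use_kbps - call_kbps)


gk = ToyGatekeeper(zone_bandwidth_kbps=2048)
gk.register("boardroom", "192.0.2.10")
gk.register("5001", "192.0.2.11")

print(gk.admit("boardroom", 1024))  # admitted, alias resolved
print(gk.admit("5001", 1024))       # admitted, budget now fully used
print(gk.admit("boardroom", 512))   # refused, no bandwidth left
print(gk.admit("unknown", 384))     # refused, alias not registered
```

Keeping the registry and the bandwidth budget in one place mirrors the point that a zone is managed by a single Gatekeeper.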

Only one Gatekeeper can manage an H.323 zone, but this zone could include several Gateways and MCUs. Since a zone is defined and managed by only one Gatekeeper, endpoints within a zone that also contain a built-in Gatekeeper (e.g. a Gateway or MCU) must provide a means of disabling this functionality. This ensures that multiple H.323 endpoints can all be configured into the same zone and be controlled by a more powerful Gatekeeper, such as that within the Polycom DMA, or utilise the PBX-like features of ClearOne's comprehensive Collaborate Central (previously known as Media Xchange Manager, MXM).

Installations and Usage Costs.

For H.323 based systems, the installation costs need to cover any upgrades to the network infrastructure such as faster switches and better routers.

You should consider network security and whether a NAT/Firewall Traversal solution such as the Edgewater EdgeProtect is required. This single box solution provides security services and features such as SIP/H.460 far end NAT Firewall Traversal, H.323 Gatekeeper, SIP Registrar, User Authentication and Provisioning for Polycom RealPresence Group, Desktop & Mobile clients.

Finally, you should consider who will manage the network, endpoints and infrastructure, and how. This needs to be done in-house if you deploy your own On-Premise systems. Alternatively, if you have no internal IT Support, then you may want to consider a subscription based solution such as Lifesize Cloud that includes User Authentication, firewall traversal, business-class security and data encryption.

C21 Video can advise, supply and support whichever system is appropriate to your needs. For more information and help in choosing the right system, please email: info@c21video.com