Read my words – 3500 concurrent channels with Asterisk!

One of the biggest questions in the world of Asterisk is: “How many concurrent channels can be sustained with an Asterisk server?” – while many have tried answering the question, the definitive answer still eludes us. Even the title of this post, “3500 concurrent channels with Asterisk”, doesn’t really say much about what really happened. In order to understand what “concurrent channels” really means in the Asterisk world, let us take a look at some tests that were done in the past.

Asterisk as a Signalling Only Switch

This scenario is one of the most common scenarios in the testing world, and relies upon the basic principle of allowing media (RTP) to traverse from one end-point to the other, while Asterisk is out of the loop regarding anything relating to media processing (RTP). Examine the following diagram from one of the publicly available OpenSER manuals:

Direct Media Path between phones via a SIP Proxy

As you can see from the above, the media path is established between our 2 SIP endpoints.

This classic scenario has been tested in multiple cases, with varying codec negotiations, varying server hardware, varying endpoints and varying versions of Asterisk – no matter what the case was, the results were more or less the same. Transnexus reported being able to sustain over 1,200 concurrent channels in this scenario, which makes perfect sense.

Why does it make sense? Very simple: as Asterisk doesn’t manage or mangle RTP packets, it performs less work and the server consumes fewer resources.

Asterisk as a Media Gateway

Another test that people have done numerous times is to utilize Asterisk as a Media Gateway. People used it as a SIP to PSTN gateway, a SIP to IAX2 gateway, even as a SIP to SIP transcoding gateway. The performance here varied immensely from one configuration to another; however, all of these tests relied on a simple call routing mechanism of routing calls between endpoints, with Asterisk handling media proxy tasks and/or codec translation tasks.

Depending on the tested codec, I’ve seen reports of over 300 sustained concurrent channels of media on a single server, while others claim around the 140 concurrent channels mark – this again mostly depends on the various hardware/software/network configurations – so there is nothing new there.

These tests tell us nothing

While these tests are really nice on the theoretical plane, they don’t really help us in the design and implementation of an Asterisk system – no matter if it is an IVR system, a PBX system or a time entry phone system for that matter – they simply don’t provide that kind of information.

The Amazon EC2 performance test

In my previous post, Rock Solid Clouded Asterisk, I discussed the mathematics involved in calculating the RoI factors of utilizing Cloud computing. One thing the article didn’t really tell us: did it really work?

Well, here are some of the test results that we managed to validate:

  • Total number of Asterisk based Amazon EC2 instances used: 24
  • Total number of concurrent channels sustained per instance (including media and logic): 80
  • Average length of call: 45 seconds
  • Total number of calls served: 2.84 Million dials
  • Test length: approximately 36 hours

According to the above data, each server was required to place approximately 3300 dials every hour. So, let’s run the math again:

  • 3300 Dials per hour
  • 55 Dials per minute
  • As each call is an average of 45 seconds, this means that each gateway generates 20 calls
    per second, and within 4 seconds fills the 80 channels limit per server.
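
The per-server arithmetic above can be re-derived from the published totals. The following is only a back-of-the-envelope sketch: the function names are made up for illustration, the 20 calls per second burst rate is taken as quoted, and the average-concurrency estimate relies on Little's law (arrival rate times average call length) under the assumption that every dial turns into a 45 second call, which the test itself did not report.

```python
def dial_rates(total_dials, instances, test_hours):
    """Return (dials per hour, dials per minute) for a single instance."""
    per_hour = total_dials / (instances * test_hours)
    return per_hour, per_hour / 60


def average_concurrency(dials_per_minute, avg_call_seconds):
    """Little's law: calls in progress = arrival rate x average duration."""
    return (dials_per_minute / 60) * avg_call_seconds


def seconds_to_fill(channel_limit, calls_per_second):
    """Time needed to occupy every channel at a given burst dial rate."""
    return channel_limit / calls_per_second


# Figures quoted in the test results above.
per_hour, per_minute = dial_rates(2_840_000, instances=24, test_hours=36)
print(f"{per_hour:.0f} dials/hour, {per_minute:.1f} dials/minute per instance")
print(f"~{average_concurrency(per_minute, 45):.0f} calls in progress on average")
print(f"{seconds_to_fill(80, 20):.0f} s to fill 80 channels at 20 calls/s")
```

Under that assumption the long-run average comes out at roughly half of the 80-channel ceiling, so the ceiling describes the peak each instance was allowed to reach rather than its average load.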

According to the numbers we’ve measured, each of the Amazon EC2 instances was utilized to about 50% of its CPU power, with a load average of 2.4, which was mostly caused by I/O utilization for SIP and RTP handling.

Conclusion

When asking for the maximum performance of Asterisk, the question is incorrect. The correct question should be: “What is the maximum performance of Asterisk, utilizing X as the application layout?” – where X is the key factor for the performance. Asterisk application performance can vary immensely from one application to another, even when both appear to be doing the exact same thing.

When asking your consultant or integrator for the top performance, be sure to include the business logic and application logic you intend to run on the Asterisk server, so that they may be able to better answer your question. Asterisk as Asterisk is just a tool; asking for its performance is like asking how many steaks a butcher’s knife can cut – it’s a question of what kind of steaks you intend on cutting.

Sangoma USBfxo: too little, too late…

Sangoma recently introduced a new FXO product, the USBfxo. The USBfxo is a dual FXO port device, connected to your Asterisk server via a USB connection. Now, while I do admire the way Sangoma keeps trying to kick it up a notch with new products, isn’t Sangoma a little late to jump on the USB train?

Sangoma USBfxo Device

Xorcom has been in this business for 4 years now and I see no reason why the Sangoma product would be any better than the Xorcom product. In addition, if Sangoma is targeting their product at the very low-end PBX systems, in my book, they actually missed the product line. In my view, if Sangoma wants to put a proper USB device on the market, it should have a minimum of 4 ports on it, 3 FXO and 1 FXS. You are probably wondering why I’m proposing such a weird combo. Well, the reason is simple – fax machines, the yet to be improved Asterisk FAX capabilities, and the fact that people still use an FXS port for physical fax machines. I’m one of the biggest Asterisk and VoIP promoters I know, and even I use a physical fax machine at some points in time. True, I use Hylafax and IAXmodem to receive most of my fax transmissions, but when it comes to sending faxes, nothing beats a physical machine.

So, as I started saying, Sorry Sangoma, too little, too late … better luck next time!

Rock Solid Clouded Asterisk

This post is somewhat of a combination of previous posts, mainly the posts about virtualization and my latest posts about the utilization of Amazon EC2. As some of you may know, a part of what I do at GreenfieldTech is develop various APIs for the Asterisk Open Source PBX system. Two of these APIs are the IVR API and the Dialer API. This post is called “Rock Solid Clouded Asterisk” as it describes the latest production environment that I’ve implemented, using these APIs and the Amazon EC2 virtualization framework.

The network diagram

Our implementation consisted of the following general schematic:

Network Diagram

The application logic was based upon a Java based web-service, implementing the XML-RPC server side of the IVR API, and a dialer management system that controlled the Dialer API on the remotely located dialers – hosted on Amazon EC2 instances. For simplicity, and we were very much aware this would reduce the overall capacity, we’ve located both the dialer framework and the IVR API execution on each of the servers, while allowing the servers to communicate internally.
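
To make the split between a central logic service and the telephony nodes more concrete, here is a minimal sketch of that pattern using Python's standard `xmlrpc` modules. It is not the IVR API itself: the production service was written in Java, and the method name `ivr.next_action`, its arguments and its return values are hypothetical, invented purely for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def next_ivr_action(call_id, digits):
    """Hypothetical business-logic call: decide what the IVR does next."""
    if digits == "1":
        return {"action": "dial", "target": "sales"}
    return {"action": "playback", "prompt": "main-menu"}


def start_logic_server(host="127.0.0.1", port=0):
    """Run the XML-RPC service in a background thread (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(next_ivr_action, "ivr.next_action")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask_logic(server_url, call_id, digits):
    """What a telephony node would do: ask the central service for the next step."""
    with ServerProxy(server_url) as proxy:
        return proxy.ivr.next_action(call_id, digits)


# Loopback demonstration: one server, one request, then a clean shutdown.
server = start_logic_server()
host, port = server.server_address
print(ask_logic(f"http://{host}:{port}/", "call-0001", "1"))
server.shutdown()
server.server_close()
```

The point of the pattern is that the telephony nodes carry no application state of their own, so adding capacity only means starting more nodes that point at the same service URL.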

Some constraints

As much as we wanted to run many Amazon AMI instances, we were limited to 5 elastic IPs per Amazon AWS account. As a result, we’ve registered 5 accounts, and executed a total of 24 AMI instances with 24 elastic IPs.

An additional constraint we were aware of, but had no way of quantifying in advance, was the actual number of concurrent calls per server. Initially, we’ve reached the following numbers and configuration on a physical server:

  • Intel Quad Core XEON
  • 2GB RAM
  • 1GB Network Uplink
  • CentOS 5.2 64bit
  • Total capacity: 120 concurrent calls of Dialer+IVR on a single server

Per our theory, if we managed to reach a similar capacity using Amazon c1.medium instances, we would be very happy.

The results

After conducting a test utilizing a single AMI instance, we’ve reached the following results:

  • Dual Core instance (c1.medium)
  • 180GB Disk Storage
  • 8GB of RAM
  • Fedora Core 8 32bit
  • Total capacity: 80 concurrent calls of Dialer+IVR on a single instance

That is a decrease of 33% in comparison to the performance observed on a physical server. Ok, so we weren’t all that happy with these results, until we started doing the financial math, realising that using Amazon EC2 with that Dialer+IVR framework would yield a savings of almost 80% in operational costs.

Doing the math

The normal co-located option

Our aim was to reach a capacity of around 2800 concurrent channels. Per the normal physical setup, our hardware requirements would be to use at least 24 servers. At a price of 1500$ per server, that sums up to a total of 36,000$. Adding the time required to install 24 servers, the overall expense for 24 servers would be around the 42,000$ mark. To sustain a total of 2800 concurrent calls, using the g711 codec, we would be required to carry a total of 300Mbps internet uplink – basically talking about 10,000$ of bandwidth.
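
As a sanity check on the uplink figure, the bandwidth of a G.711 call can be estimated from the codec bit rate plus per-packet headers. The sketch below assumes 20 ms packetization over IPv4 and plain Ethernet framing without VLAN tags; these are common defaults, not values stated for this deployment, and the function names are illustrative.

```python
G711_BITRATE_BPS = 64_000        # G.711 codec bit rate
PACKET_INTERVAL_MS = 20          # assumed packetization interval
RTP_UDP_IP_BYTES = 12 + 8 + 20   # RTP + UDP + IPv4 headers
ETHERNET_BYTES = 18              # Ethernet header + FCS (assumption: no VLAN tag)


def g711_stream_bps(include_ethernet=True):
    """Bandwidth of a single one-way G.711 RTP stream, in bits per second."""
    payload_bytes = G711_BITRATE_BPS * PACKET_INTERVAL_MS // 1000 // 8  # 160 bytes
    packets_per_second = 1000 // PACKET_INTERVAL_MS                    # 50 packets
    overhead = RTP_UDP_IP_BYTES + (ETHERNET_BYTES if include_ethernet else 0)
    return (payload_bytes + overhead) * 8 * packets_per_second


def uplink_mbps(concurrent_calls):
    """Aggregate bandwidth in one direction for the given number of calls."""
    return concurrent_calls * g711_stream_bps() / 1_000_000


print(f"{g711_stream_bps() / 1000:.1f} kbps per stream")
print(f"{uplink_mbps(2800):.0f} Mbps in each direction for 2800 calls")
```

Under these assumptions 2800 calls need a little under 250 Mbps in each direction, so a 300Mbps uplink leaves some headroom for signalling and bursts.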

So, taking all of the above into consideration, we will need a total of 52,000$ just to maintain the hardware installation and operational cost. Taking into consideration that the system would be used at full for no more than a period of 30 hours, we end up with a total of: 1733$ per hour.

The Amazon EC2 option

Now, let’s calculate for Amazon EC2:

2800 concurrent channels translates into 35 instances. Price per c1.medium instance per hour is 0.2$. So, rack that up and you get: 210$ for operating 35 instances for 30 hours.

Elastic IP costs are 0.01$ per hour per server – a total of 10.5$ for 30 hours.

Bandwidth costs are 0.17$ per GB. Assuming 300Mbps for 30 hours, and taking each minute of a call as roughly 5MB of data, 2800 concurrent channels for 30 hours give 25,200,000 MB, or roughly 25TB of traffic. According to Amazon, the first 10TB are at 0.17$ per GB, and then the price goes down. So, let’s take a worst case of 0.17$ per GB: a total of 4284$ for operating 30 hours.

A total of 4,504.50 US Dollars, with the price per hour calculated at roughly 150$.

The savings

Per the task at hand, the utilization of Amazon EC2 yielded a savings of roughly 91%.
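
The whole comparison can be recomputed from the figures quoted in this post. The sketch below is only that, a recomputation: the function and parameter names are made up, the 6,000$ installation cost is simply the difference between the 42,000$ and 36,000$ figures, the 5MB per call-minute is used as quoted, and 1GB is treated as 1000MB to match the 25TB figure.

```python
import math

HOURS = 30               # period during which the system runs at full load
TARGET_CHANNELS = 2800   # required concurrent channels


def colo_cost(channels_per_server=120, server_price=1500,
              install_cost=6000, bandwidth_cost=10_000):
    """Total cost of the co-located option, in US dollars."""
    servers = math.ceil(TARGET_CHANNELS / channels_per_server)      # 24 servers
    return servers * server_price + install_cost + bandwidth_cost


def ec2_cost(channels_per_instance=80, instance_hourly=0.20,
             elastic_ip_hourly=0.01, mb_per_channel_minute=5,
             price_per_gb=0.17):
    """Total cost of the Amazon EC2 option, in US dollars."""
    instances = math.ceil(TARGET_CHANNELS / channels_per_instance)  # 35 instances
    compute = instances * instance_hourly * HOURS
    elastic_ips = instances * elastic_ip_hourly * HOURS
    traffic_gb = TARGET_CHANNELS * HOURS * 60 * mb_per_channel_minute / 1000
    return compute + elastic_ips + traffic_gb * price_per_gb


colo, ec2 = colo_cost(), ec2_cost()
print(f"Co-located: {colo:,.0f}$ total, {colo / HOURS:,.0f}$ per hour")
print(f"Amazon EC2: {ec2:,.2f}$ total, {ec2 / HOURS:,.0f}$ per hour")
print(f"Savings: {100 * (1 - ec2 / colo):.1f}%")
```

Changing `HOURS` shows why the result only holds for short bursts: the EC2 total grows with every hour of use, while the co-located hardware is paid for once.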

So, is Amazon EC2 good for any usage?

The answer is a definite NO! If your requirement is for a system that works 24×7, like a PBX system or a call center, then your cost of using Amazon EC2 would be about the same as leasing a co-located server at any of the worldwide co-location providers. If your application is of a sporadic nature, or is utilized for short bursts of time, Amazon EC2 is a wonderful tool for lowering your overall expenses. Sure, it will require some work to get running, but the overall savings are more than worthwhile.

Virtualizing Asterisk – Digium Asterisk World, Feb 2008

Well, I just got back from the ITExpo show in Miami, Florida. I have to admit that I really enjoyed the venue, although I didn’t really have time to walk the floor. The main reason that I was unable to walk the floor was due to the fact that I gave a talk, as part of the Digium Asterisk World venue, which was co-located with TMCnet’s ITExpo.

My talk was about the possibilities and incentives for Virtualizing Asterisk using VMWARE and Amazon EC2. Following below is the presentation that I gave.

Virtualizing Asterisk – Presented at Digium Asterisk World, Feb 2008, Miami, Florida

So long SigValue – Hello Asterisk + EC2!

As some of you may know, I’ll be attending the ITExpo in Miami Beach, Florida. The subject I’ll be lecturing about is “Virtualizing Asterisk”. However, I have to be honest, I really need to change the subject to be called “Asterisk in the Cloud“.

Ever since the introduction of Amazon EC2, people have been trying to get Asterisk to run properly inside an EC2 instance. While installing a vanilla Asterisk on any of the Fedora/RedHat variant instances in EC2 isn’t much of a hassle, getting the funky stuff to work is a little more tricky.

One of these tricky bits (which I haven’t yet found a solution for) is the issue of supplying a timer for Asterisk’s MeetMe application. In the old days (prior to Asterisk 1.6), Asterisk required a virtual timer driver, provided by Zaptel in the past and now by the DAHDI framework. While you are fully capable of compiling and installing DAHDI on an Amazon EC2 instance, the problem starts once you want to use it.

A few words about Amazon EC2

For those not familiar with Amazon EC2, its general infrastructure is based upon the XEN virtualization project. XEN is a para-virtualization framework, meaning that it performs some of the work utilizing the underlying Operating System kernel and some of the work with a special Kernel in the virtualized Operating System instance. This poses an interesting issue with every type of application that relies on hardware resources and their emulation.

To learn more about the XEN project, go to http://www.xen.org.

So, where’s the big deal?

So, if you can compile your code and run it in an instance, as long as you have the kernel headers and kernel source packages – you should be just fine – right? WRONG!

Amazon EC2 deploys its own Kernel binary image upon bootstrap, causing whatever compilation you may have done to the Kernel to go away (unless you’re creating a machine from real scratch). Another issue is a version skew between the installed Operating System kernel modules, the actual kernel and the installed compiler. For example, the instance that I was using had the XEN capable kernel compiled with gcc version 4.0.X, while the installed operating system shipped gcc version 4.1.X – so, no matter what I did to compile my kernel modules or binary kernel, I would always end up in a situation where loading the newly compiled kernel modules generated an error.
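
A quick way to spot this kind of skew before spending a night on module builds is to compare the compiler version recorded in `/proc/version` with the one reported by `gcc --version`. The sketch below only parses the two strings; the helper names are made up, and the sample strings are illustrative stand-ins rather than output captured from a real EC2 instance.

```python
import re

# Matches the first "gcc ... X.Y" in /proc/version or in `gcc --version` output.
_GCC_VERSION = re.compile(r"\bgcc\b.*?(\d+)\.(\d+)", re.IGNORECASE)


def gcc_major_minor(text):
    """Return (major, minor) of the first gcc version found, or None."""
    match = _GCC_VERSION.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_compiler_skew(proc_version_text, gcc_version_text):
    """True when the kernel was built with a different gcc major.minor."""
    kernel_gcc = gcc_major_minor(proc_version_text)
    installed_gcc = gcc_major_minor(gcc_version_text)
    if kernel_gcc is None or installed_gcc is None:
        raise ValueError("could not find a gcc version in the given text")
    return kernel_gcc != installed_gcc


# Illustrative inputs only; on a real machine read /proc/version and the
# first line printed by `gcc --version`.
SAMPLE_PROC_VERSION = "Linux version 2.6.16-xenU (builder@example) (gcc version 4.0.1) #1 SMP"
SAMPLE_GCC_VERSION = "gcc (GCC) 4.1.2"

print(has_compiler_skew(SAMPLE_PROC_VERSION, SAMPLE_GCC_VERSION))
```

A mismatch here does not prove that a module will fail to load, but it is a cheap early warning of the situation described above.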

Did I manage to solve it? – NOT YET. I’m still working on it, and I have to admit, that considering the fact that I have over 10 years of Linux experience and had compiled kernels from scratch many times, this one has gotten me a little baffled – I guess I’ll just need a few more nights and a case of Red-Bull to crack this one open.

So, what can we do with EC2?

In my view, EC2 + Asterisk is the ultimate IN/NGN services environment – and I have proof of that. A recent lab test that I did with one of my customers showed a viable commercial alternative to Sigvalue when using Asterisk and EC2 structures. The main reason for our belief in using EC2 was the following graph:

IN/NGN usage over 24 hours

What we’ve noticed was that while our IN/NGN system was generating traffic, its general usage showed peak usage for a period of 2.5 hours, with a gradual increase and decrease over a period of almost 10 hours. Immediately that led us to a question: “Can we use Amazon EC2 to provide an automated scaling facility for the IN/NGN system, allowing the system to reduce its size as required?”

To do this, we’ve devised the following IN/NGN system:

Amazon EC2 Enabled IN/NGN Platform

Our softswitch would have a static definition of routing calls to all our Asterisk servers, including our EC2 instances, which had static Elastic IP addresses assigned to them. The EC2 Controller server was in charge of initiating the EC2 instances at pre-defined times, mainly 30 minutes prior to the projected increase in traffic. Once the controller reaches its due time, it automatically launches the EC2 instances required to sustain the inbound traffic.
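
The scheduling side of such a controller can be sketched in a few lines. This is an illustration of the idea only, not the controller we ran: the forecast format, the baseline capacity and the function names are hypothetical, the per-instance capacity of 80 channels is the load level used in our tests, and the actual call that launches EC2 instances is deliberately left out.

```python
import math
from datetime import datetime, timedelta

LEAD_TIME = timedelta(minutes=30)   # launch this long before the projected increase
CHANNELS_PER_INSTANCE = 80          # assumed capacity of one EC2 instance


def instances_needed(extra_channels, channels_per_instance=CHANNELS_PER_INSTANCE):
    """Number of instances required to absorb the extra channels."""
    return math.ceil(extra_channels / channels_per_instance)


def launch_plan(forecast, baseline_channels, lead_time=LEAD_TIME):
    """Turn a traffic forecast into (launch_time, instance_count) entries.

    forecast is a list of (start_time, projected_channels) tuples; only the
    traffic above what the permanent servers can carry needs EC2 instances.
    """
    plan = []
    for start_time, projected_channels in forecast:
        overflow = projected_channels - baseline_channels
        if overflow > 0:
            plan.append((start_time - lead_time, instances_needed(overflow)))
    return plan


# Made-up forecast for one evening peak.
forecast = [
    (datetime(2009, 1, 1, 17, 0), 300),
    (datetime(2009, 1, 1, 18, 0), 600),
]
for launch_time, count in launch_plan(forecast, baseline_channels=200):
    print(f"{launch_time:%H:%M} -> run {count} instance(s)")
```

Scaling back down works the same way in reverse: once the projected traffic drops below the baseline again, the extra instances can be terminated.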

For our tests, we’ve initiated 5 AMI instances, using the EC2 c1.medium instance. This instance basically includes 2 cores of an AMD Opteron, about 8GB of RAM and about 160GB of hard drive – more than enough. Initially, we’ve started spreading the load evenly across the servers, reaching about 80 concurrent channels per instance, and all was working just fine. We managed to reach a point where we were able to sustain a total of about 110 concurrent channels per instance, including the media handling – which is not too bad, considering that we are running inside a XEN instance. The one thing that made the entire environment extremely lightweight is the GTx Suite of APIs for Asterisk. Thanks to the GTx Suite of APIs, scalability is fairly simple, as all application-layer logic is controlled from a central business logic engine, serving the Asterisk servers via an XML-RPC based web service. Thanks to Amazon’s practically infinite bandwidth allocation, the latency of the connections from the Asterisk servers to the US based central business logic came in at just 25mSec, thus, there was no visible delay to the end user.

It is clear that the utilization of Asterisk and EC2 operational constructs can allow a carrier to establish their own IN/NGN environment. However, how these are designed, implemented and operated is in the hands of the carrier – and not a specific vendor. Whether carriers around the world will take to this approach, only time will tell. As a recent survey stated that 18% of the US PBX market is currently held by Open Source solutions, with Digium holding 85% of that 18% (~15% of the market), I’m confident that we will see this combination of solutions in the near future.