I love the feeling of unboxing a brand new IP phone – especially when it’s one that comes from Digium. Yes, I’m a little biased, I admit it – but I’ll do my best to refrain from dancing in the rain with this post.

So, during ITExpo 2017 (Ft. Lauderdale, Florida), Digium unveiled their new D65 Color Screen IP phone. Malcolm Davenport and the good people at Digium were kind enough to send me a couple of phones for testing, which I was more than happy to do – specifically because of the addition of the Opus codec to the hardware.

If you are not familiar with Opus – you have most probably been living under a rock for the past 3-4 years. Opus is the codec that makes tools like Skype, Hangouts and others work so well. Unlike the traditional G.7xx codecs, Opus is a variable bit rate codec, provides HD voice capabilities, handles poor network conditions far better (via FEC) and, all in all, is a far better codec for any VoIP platform. You’re probably asking: what is FEC? I’ll explain later.

Consistency and simplicity are a must – and Digium phones deliver both. One of the things I really like about Digium phones is that they are simple to configure, even without DPMA. The configuration screens are identical to those of previous models and are so tightly laid out that getting a phone up and running takes no longer than a few seconds.

Minor disappointment – the phones were shipped with a firmware that didn’t include the Opus codec, so I had to upgrade the firmware. Ok, no big deal there – but a minor nuisance.

So, I proceeded to get the phone configured to work with our Cloudonix.io platform. What is Cloudonix.io? Cloudonix is our home-grown Real Time Communications cloud platform – but that’s a different post altogether. The nice thing about Cloudonix is that it utilizes Opus to its full extent: dynamic jitter buffering, Forward Error Correction across the entire media stack, and variable bit rate and sample rate support (via the Cloudonix.io mobile SDK). In other words, if the Digium phone performs as well as the Cloudonix.io mobile SDK, we have a solid winner here.

So, I hooked the phone up and then proceeded to do some basic network-condition testing with Opus. All tests were conducted in the following manner (a sketch of the loss-injection setup follows the step list):

  • Step 1: Connectivity with no network impairments
  • Step 2: Introduction of 5% packet loss (using `netem`)
  • Step 3: Introduction of 10% packet loss (using `netem`)
  • Step 4: Introduction of 15% packet loss (using `netem`)
  • Step 5: Introduction of 20% packet loss (using `netem`)
  • Step 6: Introduction of 25% packet loss (using `netem`)
  • Step 7: Extreme condition of 40% packet loss (using `netem`)
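
For reference, here is a minimal sketch of how such a loss ladder can be scripted on a Linux box sitting in the media path, using `tc`/`netem` from iproute2 (requires root). The interface name and the interactive pacing are my own assumptions, not the exact rig used for these tests.

```python
#!/usr/bin/env python3
"""Sketch: step through the packet-loss conditions used in the tests below."""
import subprocess

IFACE = "eth0"  # assumption: the interface facing the phone's network
LOSS_STEPS = [0, 5, 10, 15, 20, 25, 40]  # percent loss, steps 1 through 7

def set_loss(iface: str, loss_pct: int) -> None:
    """(Re)apply a netem qdisc with the given packet-loss percentage."""
    # Drop any existing root qdisc; ignore the error if none is present.
    subprocess.run(["tc", "qdisc", "del", "dev", iface, "root"], check=False)
    if loss_pct > 0:
        subprocess.run(["tc", "qdisc", "add", "dev", iface, "root",
                        "netem", "loss", f"{loss_pct}%"], check=True)

if __name__ == "__main__":
    for step, loss in enumerate(LOSS_STEPS, start=1):
        set_loss(IFACE, loss)
        input(f"Step {step}: {loss}% loss applied. Run the test call, "
              "then press Enter to continue...")
    set_loss(IFACE, 0)  # restore the interface when done
```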

Test 1: Media Relay and server located under 150 ms away

  • Step 1: Audio was perfect, HD Voice was exhibited all the way
  • Step 2: Audio was perfect, HD Voice was exhibited all the way
  • Step 3: Audio was good, HD Voice was exhibited all the way, minor network reconditioning at the beginning, until FEC fully kicked in
  • Step 4: Audio was good, SD Voice was exhibited all the way, minor network reconditioning at the beginning, until FEC fully kicked in
  • Step 5: Audio was fair, SD Voice was exhibited all the way, moderate network reconditioning at the beginning, until FEC fully kicked in
  • Step 6: Audio was fair, SD Voice was exhibited all the way, major network reconditioning at the beginning, until FEC fully kicked in
  • Step 7: Audio was fair, SD Voice was exhibited all the way, extreme network reconditioning at the beginning, until FEC fully kicked in

Test 2: Media Relay and server located under 250 ms away

  • Step 1: Audio was perfect, HD Voice was exhibited all the way
  • Step 2: Audio was perfect, HD Voice was exhibited all the way
  • Step 3: Audio was good, SD Voice was exhibited all the way, minor network reconditioning at the beginning, until FEC fully kicked in
  • Step 4: Audio was good, SD Voice was exhibited all the way, moderate network reconditioning at the beginning, until FEC fully kicked in
  • Step 5: Audio was fair, SD Voice was exhibited all the way, major network reconditioning at the beginning, until FEC fully kicked in
  • Step 6: Audio was fair, SD Voice was exhibited all the way, major network reconditioning at the beginning, until FEC fully kicked in
  • Step 7: Audio was fair, SD Voice was exhibited all the way, extreme network reconditioning at the beginning, until FEC fully kicked in

Test 3: Media Relay and server located under 450 ms away

  • Step 1: Audio was perfect, SD Voice was exhibited all the way
  • Step 2: Audio was perfect, SD Voice was exhibited all the way
  • Step 3: Audio was good, SD Voice was exhibited all the way, minor network reconditioning at the beginning, until FEC fully kicked in
  • Step 4: Audio was good, SD Voice was exhibited all the way, major network reconditioning at the beginning, until FEC fully kicked in
  • Step 5: Audio was fair, SD Voice was exhibited all the way, major network reconditioning at the beginning, until FEC fully kicked in
  • Step 6: Audio was fair, SD Voice was exhibited all the way, extreme network reconditioning at the beginning, until FEC fully kicked in
  • Step 7: Audio was fair, SD Voice was exhibited all the way, extreme network reconditioning at the beginning, until FEC fully kicked in

Ok, the way I see it: if I’m able to carry a good audio call for almost 3-4 minutes while `netem` introduces a static 20% packet-loss condition – that sounds like a winner to me.

All in all, until I get my hands on the Digium D80 to test its Opus capabilities, the D65 is by far my “Go To Market” IP phone for desktop Opus support – 2 thumbs up!

A recent post on the Elastix community page yielded the following:

Guayaquil, Ecuador, May 4th 2015 – PaloSanto Solutions, the company behind the Elastix Project, has today established a Git repository for the development of a brand- and distro-agnostic GUI for unified communications servers. Additionally, it is conceived with multi-tenancy in mind.

The project will be independent and will start off with the multitenant GUI from Elastix MT. Currently this development is focused on using Asterisk as the telephony manager at its core; however, the project seeks to establish the basis for the inclusion of other projects that currently exist in the industry, like FreeSWITCH.

This project will be initially hosted and moderated by PaloSanto Solutions; it is an idea conceived together with the PBX in a Flash team.

“We believe that using the Elastix GUI to kick off the project will help in assimilating it faster”, said PBX in a Flash’s CEO, Ward Mundy. “PaloSanto Solutions’ decision to create a Git repo for development cooperation is also important because it will attract new actors to the project that will enhance its functionality”, he added.

“Since the release of Elastix MT we have believed that there is the possibility to continue revolutionizing unified communications, and to establish tools that had not been integrated before”, said Edgar Landivar, CEO of Elastix. “With the establishment of this repo we hope to establish a standard in the industry that will replace the concept of the VoIP PBX”, he concluded.

The project is available today at https://github.com/elastixmt/elastix-mt-gui, and rules are being set regarding contributions to the development so that all interested people can join immediately.

Well, I think I can understand where Elastix/PIAF are going with this. For a while now, I’ve seen various communications and rants roaming the community regarding the way the FreePBX GUI is dominating the “Distro Market” – so to speak.

In general, I see this as a good thing, as it means that people will start re-focusing on technology. On the other hand, we will end up – again – with the usual Open Source vs. Commercial debates (should we? shouldn’t we?).

I know that people are going to jump on me for what I’m about to say right now, but since the introduction of Asterisk 13 and ARI, the gaps between Asterisk and FreeSWITCH are rapidly closing in my book. I’ve already deployed several systems using ARI – and I don’t use a GUI at all. FreePBX, in my book, has become a large body of code, assembled partially by people who know what they are doing and partially by people who don’t. With the Sangoma buy-out, I suspect that we’ll see more and more “Sangoma Centric” modules and features – and that’s normal. The same will apply to the PaloSanto alternative, as it becomes more and more accepted by the market.

Is it truly the destiny of an Open Source product to become dominated by its creators? To become a mere vessel for its creators to market their solutions and services? A recent conversation I had with a prospective client went somewhat sour, with me at a loss for answers to a very simple question. The client asked me: “Is there a FreePBX high availability solution – one that is off the shelf?”, and my responses were these:

  • You can use the Schmoozecom High Availability solution
  • You can use the Digium High Availability solution
  • You can use the Xorcom High Availability solution

He replied – “What? isn’t there an Open Source alternative? something that has been tested and verified?” – my answer was: “We can always use Linux-HA, Heartbeat, mond and other Open Source tools to create the solution. But it won’t be as slick and neat as the commercial solutions, simply because that requires time.”

About 9 years ago, Digium had a product called “Asterisk Business Edition”. The Business Edition was a heavily regression-tested version of Asterisk, with slightly fewer features, aimed at being an Enterprise Grade product to work against. Digium realized fairly fast that the product had no place and abandoned it several years later. Open source is about choice: make your choice, aid others in making a choice – but don’t take away our freedom of choice. In my view, the distributions should aim to be “Market Aggregators”, not “Market Shapers” or “Market Makers” – this is not the stock market, nor a wolves’ den. The Asterisk Exchange is a good idea; if Digium turned it into something embedded into AsteriskNOW that lets people buy and install directly from the UI, that would be a winner.

The Apple App Store and Google Play became a success not because Apple or Google promoted them; they were made into a success by the developers and people who wrote new things. Asterisk, FreeSWITCH, Kamailio, OpenSIPS, FreePBX, the Asterisk GUI – at least as I see it – should be regarded as platforms for innovation, the things that make us go: “Hmmm…. interesting”, the immortal “Fascinating” comment made by Mr. Spock.

Just my $0.02 on the issue…

Managers! Project Managers, Sales Managers, Marketing Managers, Performance Managers – they are all obsessed! Why are they obsessed? Because we made them that way – it’s our fault! Since the invention of the electronic spreadsheet, managers have relied on the same tools for making decisions: charts, tables, graphs. Between us, I hate Excel or, for that matter, any other “spreadsheet” product. Managers rely on charts to translate the ever more complex world we live in into calculable, simple to understand, dry and boring numbers.
Just to give a rough idea, I have a friend who’s the CEO of a high-tech company in the “social media” sector. He knows how to calculate how every dollar he spends on Google AdWords translates back into sales. He is able to tell me exactly how much a new customer costs him, and he is very much capable of quoting these numbers just like that. When we last met he asked me: “Say, how do you determine your performance on your network? Is there a proper, agreed-upon metric you use?” – it got me thinking. I’ve been using ASR and ACD for years, but have we been using them wrong?

So, the question is: what is the proper way of calculating your ASR and ACD? And is MoS truly a reputable measure for assessing your service quality?

Calculating ACD and Why MoS Is So Biased

ACD stands for Average Call Duration (in most cases), meaning the average duration of answered calls. Normally, ACD is a factor used to determine whether the quality of your termination is good – of course, in a very empirical manner only. Ask anyone in the industry and they will say the following: “If the ACD is over 3.5 minutes, your general quality is good. If your ACD is under 1 minute, your quality is degraded or just shitty. Anything in between is a little hard to say.” So, in that respect, MoS comes to the rescue. MoS stands for Mean Opinion Score – in general terms it means: judging from one side of the call, how does that side rate the general quality of the line. MoS is presented as a float number, ranging from 1 to 5, where 1 is the absolute worst quality you can get (to be honest, I’ve never seen anything worse than 3.2) and 5 represents the best quality you can get (again, I’ve never seen anything go above 4.6).
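
To make that rule of thumb concrete, here is a tiny sketch of how the two measures might be combined; the thresholds follow the text above, while the 3.8 MoS cut-off is my own assumption, not an industry constant.

```python
def quality_verdict(acd_minutes: float, mos: float) -> str:
    """Rule-of-thumb verdict from the ACD thresholds above, falling back
    to MoS in the grey zone. The 3.8 MoS cut-off is an assumption."""
    if acd_minutes > 3.5:
        return "good"
    if acd_minutes < 1.0:
        return "degraded"
    return "good" if mos >= 3.8 else "suspect"  # the in-between zone
```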

So, this means that if our ACD is anything between 1 minute and 3.5 minutes, we should consult our MoS to see whether the quality is ok or not. But here is a tricky question: where do you monitor the quality? The client or the server? The connection into the network, or the connection going out of the network? Too many factors, too many places to check, too much statistical data to analyse – in other words, many graphs, many charts – and no real information provided.

If your statistical information isn’t capable of providing you with concise information, like: “The ACD in the past 15 minutes to Canada has dropped 15 points and is currently at 1.8 minutes per call – get this sorted!”, then all the graphs you may have are pointless.
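
As an illustration, here is a minimal sketch of that kind of windowed alert, assuming CDR records with hypothetical `destination`, `answer_time` and `billsec` fields; the 25% drop threshold is likewise an assumption.

```python
from datetime import timedelta

def acd_minutes(cdrs, destination, window_end, window=timedelta(minutes=15)):
    """ACD (minutes) for answered calls to one destination in the window."""
    durations = [c["billsec"] for c in cdrs
                 if c["destination"] == destination
                 and c["billsec"] > 0  # answered calls only
                 and window_end - window <= c["answer_time"] <= window_end]
    return sum(durations) / len(durations) / 60.0 if durations else None

def acd_alert(current, baseline, drop_ratio=0.25):
    """True when ACD fell more than drop_ratio below its baseline."""
    return (current is not None and baseline is not None
            and current < baseline * (1 - drop_ratio))
```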

Calculating ASR and the Release Cause Forest

While ISDN (Q.931) made the question of understanding your release causes fairly simple, VoIP turned that once fairly clear world into a mess. Why is that? Q.931 was very much preset for you at the network layer, while SIP makes it easy for an admin to set up his own release causes. For example, I have a friend who says: “I translate all 500 errors from my providers to a 486 error to my customers”. Why would he do that? Why in God’s name would somebody deliberately make his customers see a falsified view of their termination quality? Simple: SLAs and commitments. If my commitment to a customer is a 90% success service level, I would make sure that my release causes to him don’t include that many 5XX errors. A SIP 486 isn’t an error or an issue – the subscriber is simply busy. What more can you ask than that?

As I see it, ASR should be broken into 3 distinct numbers: SUCCESS, FAILURE and NOS (None Other Specified). NOS is very much similar to the old Q.931 release cause of “Normal, unspecified” – release cause 31. So what goes where, exactly?

SUCCESS has only one value in it – ANSWER, or Q.931 release cause 16 – Normal Call Clearing.

FAILURE will include anything in the range of 5XX errors: “Server failure”, “Congestion”, etc.

NOS will include the following: “No Answer”, “Busy (486)”, “Cancel (487)”, “Number not found (404)”, etc.
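
Here is a minimal sketch of that three-way split, assuming each call attempt is reduced to its final SIP response code (with 200 standing in for ANSWER); the exact code sets are illustrative rather than canonical.

```python
from collections import Counter

def classify(sip_code: int) -> str:
    """Map a final SIP response code to SUCCESS / FAILURE / NOS."""
    if sip_code == 200:
        return "SUCCESS"      # answered (Q.931 cause 16)
    if 500 <= sip_code <= 599:
        return "FAILURE"      # server failure, congestion, etc.
    return "NOS"              # busy, cancel, not found, no answer...

def asr_breakdown(final_codes):
    """SUCCESS/FAILURE/NOS as percentages of all call attempts."""
    counts = Counter(classify(c) for c in final_codes)
    total = len(final_codes) or 1
    return {b: 100.0 * counts[b] / total
            for b in ("SUCCESS", "FAILURE", "NOS")}
```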

Each one of these should get a proper percentage number. You will be amazed at your results. We’ve implemented such a methodology for several of our customers who were complaining that all their routes were performing badly. We were amazed to find out that their routes had 40% success, 15% failure and 45% NOS. Are we done? Not even close.

The NOS Drill Down

Now, NOS should be drilled down – but that analysis should not be part of the general ASR calculation. We should now re-calculate our NOS according to the following grouping (a code sketch follows the group definitions):

“BUSY GROUP” – Will include the number of busy release codes examined

“CANCEL GROUP” – Will include the number of cancelled calls examined

“NOT FOUND” – Will include any situation where the number wasn’t found (short number, ported number, wrong dialing code, etc.)

“ALL OTHERS” – Anything that doesn’t fall into the above categories
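
A minimal sketch of this re-grouping, again over final SIP response codes; the membership of each group below is my own assumption for illustration.

```python
NOS_GROUPS = {
    "BUSY GROUP": {486, 600},      # busy here / busy everywhere
    "CANCEL GROUP": {487},         # request terminated (caller hung up)
    "NOT FOUND": {404, 484, 604},  # not found / address incomplete
}

def nos_drilldown(nos_codes):
    """Re-bucket the NOS calls into the four groups described above."""
    result = {name: 0 for name in NOS_GROUPS}
    result["ALL OTHERS"] = 0
    for code in nos_codes:
        for name, members in NOS_GROUPS.items():
            if code in members:
                result[name] += 1
                break
        else:  # no group matched this code
            result["ALL OTHERS"] += 1
    return result
```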

This drill down can rapidly show any of the below scenarios:

  • BUSY GROUP is not proportional – normally indicates a large number of calls to similar destinations on your network. May indicate one of the following issues:
    • It’s holiday season and many people are on the phone – common
    • You have a large number of call center customers, targeting the same locations – common
    • One of your signalling gateways is being attacked – rare
    • One or more of your termination providers is returning the wrong release code – common
  • CANCEL GROUP is not proportional – normally indicates a large number of calls being canceled at the source, either a routed source or a direct source. May indicate one of the following issues:
    • You have severe latency issues in your network and your PDD (Post Dial Delay) has increased – rare
    • Your network is under attack, causing a higher PDD – common
    • You have a customer originating the annoying “Missed Call” dialing methodology – common
    • One of your termination providers has False Answer Supervision due to usage of SIM gateways – common when dialing Africa
  • NOT FOUND GROUP is not proportional – normally indicates a large number of calls being rejected by your carriers. May indicate one of the following issues:
    • One of your call center customers is using a shitty data list to generate calls – common
    • One of your call center customers is trying to phish numbers – common
    • One of your signalling gateways is under attack and you are currently being scanned – common
    • One of your upstream carriers is returning the wrong release code for error 503 – common

So, now the ball is in the hands of the tech teams to investigate the issue and understand the source. The most dangerous issues are the ones where your upstream carrier changes release causes, as these are the most problematic to analyse. If you do find a carrier that does this – just drop them completely. Don’t complain, just pay them their dues and walk away. Don’t expect to get your money’s worth out of them; the chances of that are very slim.

During this year’s Asterisk Developers’ Conference, one of the subjects I raised for Asterisk was: “Federating Multiple Asterisk Instances”. Now, for the seasoned Asterisk user/developer, the answer would be simple – use Kamailio/OpenSIPS for the scalability, and use Asterisk as a media gateway or application server.

But I ask the following: “What if we could federate Asterisk without the need for an external component? What if we could federate Asterisk in such a way that our users aren’t even aware of the federation process, and it’s fully autonomous? What would actually be required in order to do that?”

I’m normally confronted with these questions on a day to day basis, looking at the problem from different angles – thinking to myself: “Ok, I know the normal box here – but where are the outer limits? What can I do to make it more robust on the one hand, without truly making a mess out of it on the other?”

A federated database is defined as: “A federated database system is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized. Since the constituent database systems remain autonomous, a federated database system is a contrastable alternative to the (sometimes daunting) task of merging several disparate databases. A federated database, or virtual database, is a composite of all constituent databases in a federated database system. There is no actual data integration in the constituent disparate databases as a result of data federation.” – http://en.wikipedia.org/wiki/Federated_database_system

So, we would like to virtually create “map-reduce” functionality for Asterisk? Can we truly create map-reduce’ish functionality for Asterisk? Should it be internal? Should it be external?

In order to accomplish this, we are required to create a federator – a device capable of handling the information regarding each user, device, trunk, provider and any other SIP/IAX2 entity connected to our system. The federator, for all practical purposes, is a data store – be it a key-value store, a database, a shared memory environment or some other form of data distribution layer.
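
To make that concrete, here is a minimal sketch of a federator’s data layer, stubbed with an in-process dict where a real deployment would use a replicated store (Redis, Cassandra, etc.); all names here are illustrative assumptions.

```python
class Federator:
    """Shared location store mapping an address-of-record to its serving node."""

    def __init__(self):
        self._locations = {}  # AOR -> (node, contact URI)

    def register(self, aor: str, node: str, contact: str) -> None:
        """Record which Asterisk node currently serves this AOR."""
        self._locations[aor] = (node, contact)

    def locate(self, aor: str):
        """Resolve an AOR to (node, contact), wherever the lookup originates."""
        return self._locations.get(aor)

# Any node can route a call by consulting the shared layer:
fed = Federator()
fed.register("sip:alice@example.com", "ast-eu-1", "sip:alice@10.0.0.5:5060")
print(fed.locate("sip:alice@example.com"))
```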

Here are some key issues that true federation may be required to tackle:

  1. Geo-Position Agnostic – A truly federated system should render services identically across the board, regardless of where the user is located.
  2. Version Agnostic – A truly federated system doesn’t care if the user is connected to an Asterisk server running version 12 or 13; it should behave identically.
  3. Services Agnostic – A truly federated infrastructure can leverage older versions and even other software, without changing the underlying federation layer.
  4. Predictable Scalability – A truly federated infrastructure will allow for growth to be planned linearly, with discrete measurement methods.

So, you want a tip on how to start federating your systems? Here’s step number 1: there is no central registry, there is no SIP proxy, there is only the cloud and the services it renders. Start thinking from this point and see where you go.

As some of you know, over the past 9 months I’ve been heavily involved in the establishment of Humbug. For those who may not know, Humbug is a call analytics and fraud analysis SaaS. Now, differing from many of the current telephony SaaS projects, we are not based on Amazon EC2 or some other public cloud infrastructure – we built our own cloud environment. Why did we build our own cloud? Simple: we need to keep your data secured and confidential. At Humbug, we see ourselves as a cross between Google Analytics – in our ability to analyze and handle data – and Verisign – in our security and confidentiality requirements and methodologies.

The question may be asked: why do people trust Verisign to provide SSL certificates around the world? What makes Verisign’s CA better than a privately owned CA? The answer is simple: it’s a third party that 2 entities can trust at the same time. Humbug aims to provide the same level of trust, simply because we regard your data as sacred and valuable.

Starting about 2 months ago, we’ve been contacting various Asterisk integrators around the world, inviting them to evaluate Humbug’s services. Now, while some integrators and vendors were somewhat reluctant, others were more than happy to join. We now have over 250 monitored systems around the world, with systems being monitored and analyzed in Israel, the USA, the UK, Brazil and more.

The thing that amazed me regarding some of the integrators who decided not to participate was that they claimed: “we provide our customers our own brew of fraud analysis service, we don’t require your SaaS”. Now, while I can accept the fact that an integrator would offer such a SaaS as an in-house service, I can’t see why a customer would rely on these services. In my view, relying on your integrator to provide fraud analysis services is like relying on the integrator of your alarm system to provide hired guard services – it just doesn’t make any sense to me. Why doesn’t it make sense? In Hebrew we say: “Go prove you don’t have a sister”. Imagine that your PBX integrator offers you such a service and then, in some obscure manner, your PBX gets hijacked and you get slammed with $50K worth of phone calls to Somalia. Now, your integrator would say: “Hmmmmm… that’s odd, we didn’t even get those CDR events to our system… you really got hacked bad…” – sure, if you only rely on CDR records to do your analysis (which is what 99.9% of integrators do). There is much, much more to fraud analysis than just CDR analysis – if it all began and ended with CDR analysis, then Cvidya, Verint, NICE and many others would have been made redundant long ago.

Allowing your integrator to provide you with fraud analysis SaaS is like putting the fox in charge of guarding the hen house; when things get loused up (and they may), he’s the first one to bail out saying: “It’s not my fault”.

Humbug takes a totally different approach to fraud analysis, specifically in the way we regard the various PBX systems and integrators. We are vendor agnostic and integrator agnostic – we will provide you with the clear and concise information you require in order to make an educated decision as to how you were defrauded (if defrauded at all), and we provide faster alerting and response times. Our recent work has lowered our fraud alert response time from 60 minutes down to 14 minutes in some cases. Most fraud analysis systems carry a 24-36 hour turnaround time; by that time, you can be out $50K. Our aim is to lower that number to no more than $100 in the worst case. Ambitious? Yes. Downright crazy? Probably so, but we always say: “Aim for the moon, you’ll land on a star!” – so we know we’ll get there.