Managers! Project Managers, Sales Managers, Marketing Managers, Performance Managers – they are all obsessed! – Why are they obsessed, because we made them that way – it’s our fault! Since the invention of the electronic spreadsheet, managers relied on the same tools for making decisions – charts, tables, graphs – between us, I hate Excel or for that respect, any other “spreadsheet” product. Managers rely on charts to translate the ever complex world we live in, into calculable, simple to understand, dry and boring numbers.
Just to give a rough idea, I have a friend who’s a CEO of a high-tech company in the “social media” sector. He knows how to calculate how much every dollar he spent on Google adwords, is translated back into sales. He is able to tell me exactly how much a new customer cost him and he is very much capable of telling these number just like that. When we last met he asked me: “Say, how do you determine your performance on your network? is there proper and agreed upon metric you use?” – it got me thinking, I’ve been using ASR and ACD for years, but, have we been using it wrong?
So, the question is: What is the proper way of calculating your ASR and ACD? and is MoS a truly reputable measure for assessing your service quality.
Calculating ACD and Why is MoS so biased
ACD stands for Average Call Duration (in most cases), which means that it is the average call duration for answered calls. Normally, an ACD is a factor to determine if the quality of your termination is good – of course, in very much empirical manner only. Normally, if you ask anyone in the industry he will say the following: “If the ACD is over 3.5 minutes, your general quality is good. If your ACD is under 1 minute, your quality is degraded or just shitty. Anything in between, a little hard to say. So, in that respect, MoS comes to the rescue. MoS stands for Mean Opinion Score – in general terms in means, judging from one side of the call, how does that side see the general quality of the line. MoS is presented as a float number, ranging from 0 to 5. Where 0 is the absolute worst quality you can get (to be honest, I’ve never seen anything worse than 3.2) and 5 represents the best quality you can get (again, I’ve never seen anything go above 4.6).
So, this means that if our ACD is anything between 1 minute and 3.5 minutes, we should consult our MoS to see if the quality is ok or not. But here is a tricky question: “Where do you monitor the quality? – the client or the server? the connection into the network? or the connection going out of the network? in other words, too many factors, too many places to check, too much statistical data to analyse – in other words, many graphs, many charts – no real information provided.
If your statistical information isn’t able of providing you with concise information, like: “The ACD in the past 15 minutes to Canada had dropped 15 points and is currently at 1.8 minutes per call – get this sorted!”, then all the graphs you may have are pointless.
Calculating ASR and the Release Cause Forest
While ISDN (Q.931) made the question of understanding your release cause fairly simple, VoIP made the once fairly clear world into a mess. Why is that? Q.931 was very much preset for you at the network layer – SIP makes life easier for the admin to setup his own release causes. For example, I have a friend who says: “I translate all 500 errors from my providers to a 486 error to my customers” – Why would he do that? why in gods name would somebody deliberately make his customers see a falsified view of their termination quality – simple: SLA’s and commitments. If my commitment to a customer would be for a 90% success service level, I would make sure that my release causes to him won’t include 5XX errors that much. A SIP 486 isn’t an error or an issue, the subscriber is simply busy – what can you ask more than that?
As I see it, ASR should be calculated into 3 distinct numbers: SUCCESS, FAILURE and NOS (None Other Specified). NOS is very much similar to the old Q.931 release of “Normal, unspecified” – Release Cause 31. So what goes where exactly?
SUCCESS has only one value in to – ANSWER, or Q.931 Release cause 16 – Normal Call Clearing
FAILURE will include anything in the range of 5XX errors: “Server failure”, “Congestion”, etc.
NOS will include the following: “No Answer”, “Busy (486)”, “Cancel (487)”, “Number not found (404)”, etc
Each one of these should get a proper percentage number. You will be amazed at your results. We’ve implemented such a methodology for several of our customers, who were complaining that all their routes were performing badly. We were amazed to find out that their routes had 40% success, 15% failure and 45% NOS. Are we done? not even close.
The NOS Drill Down
Now, NOS should drilled down – but that analysis should not be part of the general ASR calculation. We should now re-calculate our NOS, according to the following grouping:
“BUSY GROUP” – Will include the number of busy release codes examined
“CANCEL GROUP” – Will include the number of cancelled calls examined
“NOT FOUND” – Will include any situation where the number wasn’t found (short number, ported, wrong dialing code, etc)
“ALL OTHERS” – Anything that doesn’t fall into the above categories
This drill down can rapidly show any of the below scenarios:
- BUSY GROUP is not proportional – Normally will indicate a large amount of calls to similar destinations on your network. Normally, may indicate one of the following issues:
- It’s holiday season and many people are on the phone – common
- You have a large number of call center customers, targeting the same locations – common
- One of your signalling gateway is being attacked – rare
- One or more of your termination providers is return the wrong release code – common
- CANCEL GROUP is not proportional – Normally will indicate a large number of calls are being canceled at the source, either a routed source of a direct source. Normally, may indicate one of the following issues:
- You have severe latency issues in your network and your PDD (Pre Dial Delay) had increased – rare
- Your network is under attack, causing a higher PDD – common
- You have a customer originating the annoying “Missed Call” dialing methodology – common
- One of your termination providers has False Answer Supervision due to usage of SIM gateways – common when dialing Africa
- NOT FOUND GROUP is not proportional – Normally will indicate a large number of calls are being rejected by your carriers. Normally, may indicate one of he following issues:
- One of your call center customers is using a shitty data list to generate calls – common
- One of your call center customers is trying to phish numbers – common
- One of your signalling gateways is under attack and you are currently being scanned – common
- One of your upstream carriers is returning the wrong release code for error 503 – common
So, now the ball is in the hands of the tech teams to investigate the issue and understand the source. The most dangerous issues are the ones where your upstream carrier will change release causes, as these are the most problematic to analyse. If you do find a carrier that does this – just drop them completely, don’t complain, just pay them their dues and walk away. Don’t expect to get your money’s worth out of them, the chances are very slim for that.