Telephony Fraud – Further Analysis

Following yesterday’s post, I’ve decided to take another set of data – this time following the start of the year, with a specific data profile. What is the profile? I will describe:

  1. The honeypot server in this case was a publically accessible Kamailio server
  2. The honeypot changed its location and IP every 48 hours, over a period of 2 weeks
  3. The honeypot was always located in the same Amazon AWS region – in this case N.California
  4. All calls were replied to with a 200 OK, followed by a playback from an Asterisk server

In this specific case, I wasn’t really interested in the attempted numbers, I was more interested to figure out where attacks are coming from. The results were fairly surprising:

The above table shows a list of attacking IP numbers, the number of attempts from each IP number – and the origin country. For some weird reason, 97% of potential attacks originated in Western Europe. In past years, most of the attempts were located in Eastern European countries and the Far-East, but now this is Mainland Europe (Germany, France, Great Britain).

Can we extrapolate from it a viable security recommendation? absolutely not, it doesn’t mean anything specific – but it could mean one of the following:

  1. The number of hijacked PBX systems in mainland Europe is growing?
  2. The number of hijacked Generic services in mainland Europe is growing?
  3. European VoIP PBX integrators are doing a lousy job at securing their PBX systems?
  4. European VPS providers pay less attention to security matters?

If you pay attention to the attempts originating in France, you would notice a highly similar IP range – down right to the final Class-C network, that is no coincidence, that is negligence.

Now, let’s dig deeper into France and see where they are attempting to dial:

So, on the face of it, these guys are trying to call the US. I wonder what are these numbers for?

Ok, that’s verizon… let’s dig deeper…

Global Crossing? that is interesting… What else is in there???

 

So, all these attempts go to Landlines – which means, these attempts are being dialed most probably into another hijacked system – in order to validate success of finding a newly hijacked system.

Well, if you can give me a different explanation – I’m all open for it. Also, if any of the above carriers are reading this, I suggest you investigate these numbers.

 

 

Telephony Fraud – Still going strong

Who would believe, in the age of Skype, Whatsapp and Facebook – telephony fraud, one of the most lucrative and cleanest form of theft – is still going strong. Applications of the social nature are believed to be harming the world wide carrier market – and carrier are surely complaining to regulators – and for a legitimate reason. But having said that, looking at some alarming fraud attempt statistics, thing will show you a fairly different story.

So, analysing fraud is one of my things, I enjoy dropping honeypots around the world, let them live for a few days and then collect my data. My rig is fairly simplistic:

  1. A have a Homer (www.sipcapture.org) server to capture all my traffic
  2. A have an amazon AWS cloudformation script that launches up instances of Asterisk, FreeSwitch and Kamailio
  3. All instances are pre-configured to report anything back to Homer
  4. Upon receiving a call – it will be rejected with a 403

Why is this a good honeypot scheme? simple – it gives the remote bot a response from the server, making it keep on hitting it with different combinations. In order to make the analysis juicy, I’ve decided to concentrate on the time period between 24.12.2016 till 25.12.2016 – in other words, Christmas.

I have to admit, the results were fairly surprising:

  1. A total of 2000 attacks were registered on the honeypot server
  2. The 2 dominant fraud destinations were: The palestinian authority and the UK
  3. All attacks originated from only 5 distinct IP numbers

Are you wondering what the actual numbers are? Here is the summary:

Row Labels 185.40.4.101 185.62.38.222 195.154.181.149 209.133.210.122 35.166.87.209 Grand Total
441224928354 19         19
441873770007       204   204
76264259990     1     1
17786514103         2 2
972592315527   1774       1774
Grand Total 19 1774 1 204 2 2000

As you can see, the number 972592315527 was dailed 1774 from a single IP – 185.62.38.222. I can only assume this is a botnet of some sort, but the mix of IP numbers intrigued me. So, a fast analysis revealed the following:

Amsterdam? I wonder if it’s a coffee shop or something. The thing that also intrigued me was the phone number, why would the bot continue hitting the same mobile phone number? I couldn’t find any documentation of this number anywhere. Also, the 97259 prefix automatically suggests a mobile number in the PA, so my only conclusion would be that this is a bot looking for a “IPRN” loop hole – which is again fraudulent.

So, if this what happens in 48 hours – you can imagine what happens over a month or a year.

DISCLAIMER:

The above post contains only partial information, from a specific server on a network of worldwide deployed honeypots. The information provided as-is and you may extrapolate or hypothesize what it means – as you see fit. I have only raised some points of discussion and interest.

Should you wish to join the lively discussion on HackerNews, please follow this link: https://news.ycombinator.com/item?id=13354693 for further discussion.

 

 

 

Goodbye Elastix – we will miss you

Last week marked a sad point in the history of Open Source, the highly acclaimed and established Asterisk distribution was taken down from the Internet, leaving all of its users, followers, eco-system, resellers, integrators and more with a gigantic void to be filled.

While the void will be filled at some point, I can’t but help but observe the joy and cheerfulness of the proprietary telecommunications industry, as 3CX had rapidly taken over the Elastix business in such brutal manner. According to the various discussions in the Open Source community, the entire thing was cause by, a so called “violation of copyright” or “violation of IP” of some sort, within the Open Source communities. In the past, as far as I know, when various distributions or projects violated each other’s copyright, they would notify one another – and would ask to rectify the situation. Apparently, this hadn’t happened here – or if it happened, it wasn’t published in an open manner – as you would expect.

One of the things that the community started shouting was: “Elastix had been trixboxed”. Honestly, I don’t see the similarity between the two cases. When fonality acquired trixbox, they had a clear indication of where they are going. This is not 3CX acquired Elastix, this is 3CX obliterated Elastix. This is something completely different – and with major personas in the open source community indicating that a certain, well known and renowned, Open Source persona was involved in this happening, I can only be highly offended by the everlasting stench of people’s own ambition and personal hatred towards things that are not their own.

I admit it, I never really used Elastix in my projects, I found it to be bloated, inflated with software that shouldn’t be there, too slow for my taste and with a lack of proper project leadership, patches went in and out like crazy. Yet, I can’t argue with their success and the acceptance of the product around the world. I remember being at VoIP2Today in Madrid only a few weeks ago, and there were Elastix boxes sitting on tables. Yes, Elastix wasn’t my first choice for an Office PBX, but yes, they were a choice – the idea of a commercial company coming in and removing that choice off the table – is just annoying and troubling at the same time.

My hope is that some Elastix developers will simply post the entire source code to Github or some other public repository, slapping a BSD/MIT license on their work – telling the world: “Here is our creation, the proprietary daemons decided it should die – but no one can kill an idea!” – and Elastix will keep on living in the Open Source like other projects. If the world will forget it, then so be its fate – but if the world needs it, let the world take it in two hands and raise it up to the sky and say: “You shall not die!”

 

Statisics, Analytics – stop whacking off

Managers! Project Managers, Sales Managers, Marketing Managers, Performance Managers – they are all obsessed! – Why are they obsessed, because we made them that way – it’s our fault! Since the invention of the electronic spreadsheet, managers relied on the same tools for making decisions – charts, tables, graphs – between us, I hate Excel or for that respect, any other “spreadsheet” product. Managers rely on charts to translate the ever complex world we live in, into calculable, simple to understand, dry and boring numbers.
Just to give a rough idea, I have a friend who’s a CEO of a high-tech company in the “social media” sector. He knows how to calculate how much every dollar he spent on Google adwords, is translated back into sales. He is able to tell me exactly how much a new customer cost him and he is very much capable of telling these number just like that. When we last met he asked me: “Say, how do you determine your performance on your network? is there proper and agreed upon metric you use?” – it got me thinking, I’ve been using ASR and ACD for years, but, have we been using it wrong?

So, the question is: What is the proper way of calculating your ASR and ACD? and is MoS a truly reputable measure for assessing your service quality.

Calculating ACD and Why is MoS so biased

ACD stands for Average Call Duration (in most cases), which means that it is the average call duration for answered calls. Normally, an ACD is a factor to determine if the quality of your termination is good – of course, in very much empirical manner only. Normally, if you ask anyone in the industry he will say the following: “If the ACD is over 3.5 minutes, your general quality is good. If your ACD is under 1 minute, your quality is degraded or just shitty. Anything in between, a little hard to say. So, in that respect, MoS comes to the rescue. MoS stands for Mean Opinion Score – in general terms in means, judging from one side of the call, how does that side see the general quality of the line. MoS is presented as a float number, ranging from 0 to 5. Where 0 is the absolute worst quality you can get (to be honest, I’ve never seen anything worse than 3.2) and 5 represents the best quality you can get (again, I’ve never seen anything go above 4.6).

So, this means that if our ACD is anything between 1 minute and 3.5 minutes, we should consult our MoS to see if the quality is ok or not. But here is a tricky question: “Where do you monitor the quality? – the client or the server? the connection into the network? or the connection going out of the network? in other words, too many factors, too many places to check, too much statistical data to analyse – in other words, many graphs, many charts – no real information provided.

If your statistical information isn’t able of providing you with concise information, like: “The ACD in the past 15 minutes to Canada had dropped 15 points and is currently at 1.8 minutes per call – get this sorted!”, then all the graphs you may have are pointless.

Calculating ASR and the Release Cause Forest

While ISDN (Q.931) made the question of understanding your release cause fairly simple, VoIP made the once fairly clear world into a mess. Why is that? Q.931 was very much preset for you at the network layer – SIP makes life easier for the admin to setup his own release causes. For example, I have a friend who says: “I translate all 500 errors from my providers to a 486 error to my customers” – Why would he do that? why in gods name would somebody deliberately make his customers see a falsified view of their termination quality – simple: SLA’s and commitments. If my commitment to a customer would be for a 90% success service level, I would make sure that my release causes to him won’t include 5XX errors that much. A SIP 486 isn’t an error or an issue, the subscriber is simply busy – what can you ask more than that?

As I see it, ASR should be calculated into 3 distinct numbers: SUCCESS, FAILURE and NOS (None Other Specified). NOS is very much similar to the old Q.931 release of “Normal, unspecified” – Release Cause 31. So what goes where exactly?

SUCCESS has only one value in to – ANSWER, or Q.931 Release cause 16 – Normal Call Clearing

FAILURE will include anything in the range of 5XX errors: “Server failure”, “Congestion”, etc.

NOS will include the following: “No Answer”, “Busy (486)”, “Cancel (487)”, “Number not found (404)”, etc

Each one of these should get a proper percentage number. You will be amazed at your results. We’ve implemented such a methodology for several of our customers, who were complaining that all their routes were performing badly. We were amazed to find out that their routes had 40% success, 15% failure and 45% NOS. Are we done? not even close.

The NOS Drill Down

Now, NOS should drilled down – but that analysis should not be part of the general ASR calculation. We should now re-calculate our NOS, according to the following grouping:

“BUSY GROUP” – Will include the number of busy release codes examined

“CANCEL GROUP” – Will include the number of cancelled calls examined

“NOT FOUND” – Will include any situation where the number wasn’t found (short number, ported, wrong dialing code, etc)

“ALL OTHERS” – Anything that doesn’t fall into the above categories

This drill down can rapidly show any of the below scenarios:

  • BUSY GROUP is not proportional – Normally will indicate a large amount of calls to similar destinations on your network. Normally, may indicate one of the following issues:
    • It’s holiday season and many people are on the phone – common
    • You have a large number of call center customers, targeting the same locations – common
    • One of your signalling gateway is being attacked – rare
    • One or more of your termination providers is return the wrong release code – common
  • CANCEL GROUP is not proportional – Normally will indicate a large number of calls are being canceled at the source, either a routed source of a direct source. Normally, may indicate one of the following issues:
    • You have severe latency issues in your network and your PDD (Pre Dial Delay) had increased – rare
    • Your network is under attack, causing a higher PDD – common
    • You have a customer originating the annoying “Missed Call” dialing methodology – common
    • One of your termination providers has False Answer Supervision due to usage of SIM gateways – common when dialing Africa
  • NOT FOUND GROUP is not proportional – Normally will indicate a large number of calls are being rejected by your carriers. Normally, may indicate one of he following issues:
    • One of your call center customers is using a shitty data list to generate calls – common
    • One of your call center customers is trying to phish numbers – common
    • One of your signalling gateways is under attack and you are currently being scanned – common
    • One of your upstream carriers is returning the wrong release code for error 503 – common

So, now the ball is in the hands of the tech teams to investigate the issue and understand the source. The most dangerous issues are the ones where your upstream carrier will change release causes, as these are the most problematic to analyse. If you do find a carrier that does this – just drop them completely, don’t complain, just pay them their dues and walk away. Don’t expect to get your money’s worth out of them, the chances are very slim for that.

 

 

The natural evolution… of people and startups

It has been a while since I’ve written, I guess I didn’t have much to write about. Ok, I admit it, I had a shit load of stuff to write about, however, I simply never got around to actually do it. A full time job, two babies – or as they call it – real life, has somewhat managed to creep up to me and take its toll on my writing time. Over the course of the past few months, I’ve been heavily thinking about the role of the founder in a startup. Many people regard the founder of a startup as the CEO or some other key role in the company, but it’s not always like that. The question that I asked myself, and I have been for quite some time, is: “as a founder of a startup, what is the most important thing I need to know?” – I recently realized that the most important thing, is also the hardest thing to do.

As I see it now, the most important thing a founder needs to know how to do is: “When to walk away!” – Yes, you heard me right, understanding that you as a founder have to walk away from your creation is the hardest thing you’ll ever do. In many cases, walking away may seem to you like bailing out or giving up, but this is not the case. Sometimes, walking away means: “I’m entrusting my creation with someone else, simply because I can’t progress it any more”. It is very easy to become infatuated with your startup, it’s just like a baby. Initially, it’s just an idea, then after a long incubation time, suddenly it takes form. You hire people, slowly afterwards, you have a proof of concept, then you start selling, then different things can happen – some good, some bad. At that point in time, that’s a critical path, this is the time where you might find yourself at an impasse with your partners, investors, customers, VC’s, etc. What should you do? stay and fight? or just pack up and leave? – well, it all depends on your situation. I won’t detail the situation when you need to fight and stand your ground, as those are fairly easy. The harder ones relate to when to leave.

I would say, that if any of the following are upon you, it’s time to leave:

  • Your investors are pushing toward an illogical track, although, they have no idea how the market nor the niche works.
  • You ran into financial issues and leaving is a logical step in order to continue the life of the company.
  • You are at an impasse with the CEO or other members of management, the issue can’t be bridged.
  • You have lost trust with your partners.

At any of those, you must pack up and leave. The reasons are simple enough, all of those relate to direct trust between people. When trust is no longer there, it’s better to pack it up and move forward, simply because it won’t go to good places.