Deepviz += python | Deepviz SDK lands on PyPI

Today is another wonderful day here at our office, as it’s finally time to release our Python SDK library, which gives users a quick way to integrate our Deepviz technologies into existing projects and platforms.

Since we went live in public beta with Deepviz last November, we have mainly focused on making the infrastructure stable and able to handle peak loads. We are now able to successfully process 150,000 samples per day, and the infrastructure is designed to scale as needed – the sky is the limit!

But this is not the only thing we worked on. We now have experimental support for 64-bit PE files! You can upload 64-bit PE files and they will be processed along with 32-bit PE files.

One more thing: we worked hard to clean up, refactor, and optimize our set of REST APIs, which allow everyone to quickly interact with Deepviz services and integrate our Threat Intelligence and Malware Analyzer services into existing platforms. Yet that is not enough: we want to make Deepviz integration as straightforward and painless as possible. Today we’re landing in the Python world with our python-deepviz library.

Python-deepviz is a free library released under the MIT license and hosted on the PyPI repository. The library is still in beta – more functionality will be added soon – but it already allows you to easily upload and download samples, retrieve analysis reports and, last but not least, play with our threat intelligence platform!

It is as simple as registering a free Deepviz account and installing the library using pip:

 

pip install python-deepviz

 

Once done, and once you have retrieved your API key from your account profile, you’re ready to go! While we’re in beta, all API keys have unlimited access to Threat Intel APIs, unlimited access to Sandbox analysis reports APIs, 500 sample submissions and 20 sample downloads per month.

Below are some examples of what you can do with our Python SDK.

  • Upload a sample and wait for the scan to complete, then retrieve the analysis report:

from deepviz import sandbox
import hashlib
import time

API = "0000000000000"
sbx = sandbox.Sandbox()

# Hash the sample so that its result can be looked up after the upload
with open("malware.exe", "rb") as sample:
    _hash = hashlib.md5(sample.read()).hexdigest()

sbx.upload_sample(path="malware.exe", api_key=API)
result = sbx.sample_result(md5=_hash, api_key=API)

if result.status != 'success':
    print result.msg
else:
    while "No result found" in result.msg:
        print "not ready"
        time.sleep(30)
        result = sbx.sample_result(md5=_hash, api_key=API)

    print "Detection: %s" % result.msg['classification']['result']
    print "Accuracy:  %s" % result.msg['classification']['accuracy']
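The polling loop above waits indefinitely if the analysis never completes. A minimal sketch of a bounded wait follows. It is written for Python 3 (unlike the Python 2 snippets in this post), and `wait_for_result`, its parameters and the `fetch` and `is_ready` callables are hypothetical names introduced for illustration; they are not part of python-deepviz.

```python
import time


def wait_for_result(fetch, is_ready, interval=30, timeout=600, sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or the timeout expires.

    fetch and is_ready are hypothetical callables supplied by the caller.
    Returns the ready result, or None if the timeout is reached first.
    """
    waited = 0
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if waited >= timeout:
            return None  # give up instead of looping forever
        sleep(interval)
        waited += interval
```

With the SDK calls shown above, `fetch` could be `lambda: sbx.sample_result(md5=_hash, api_key=API)` and `is_ready` could be `lambda r: "No result found" not in r.msg`, assuming an unfinished analysis is reported the way the example expects.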

 

  • Or you may want to retrieve our full scan report:

from deepviz import sandbox

API = "0000000000000"

sbx = sandbox.Sandbox()
result = sbx.sample_report(md5="00000000000000000000000000000000",
                           api_key=API)
f = open("report.txt", "wb")
f.write(str(result.msg))
f.close()

 

  • You can also download only parts of the report. In this case you can use our filters:
"network_ip",
"network_ip_tcp",
"network_ip_udp",
"rules",
"classification",
"created_process",
"hook_usermode",
"strings",
"created_files",
"hash",
"info",
"code_injection"

Then run a query like the following, which retrieves the matched rules, the network connections and the classification:


from deepviz import sandbox

API = "0000000000000"
sbx = sandbox.Sandbox()

_hash = "00000000000000000000000000000000"
result = sbx.sample_report(md5=_hash,
                           api_key=API,
                           filters=["rules", "network_ip", "classification"])
print result.msg

returning

{
	u'rules': [u'dropExe',
	u'badConnection',
	u'badIpUrlInStrings',
	u'dropDll',
	u'runDroppedExe',
	u'loadDll',
	u'loadImage',
	u'invalidPEChecksum',
	u'IESettings'],
	u'network_ip': {
		u'UDP': [],
		u'TCP': [u'183.61.19.194',
		u'42.120.226.92']
	},
	u'classification': {
		u'result': u'Malware',
		u'accuracy': 95.5
	}
}
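Once the filtered report is available as a dictionary, it can be reduced to a short summary. The sketch below is written for Python 3 and assumes only the `rules`, `network_ip` and `classification` layout shown in the sample output above; `summarize_report` is a hypothetical helper, not an SDK function.

```python
def summarize_report(report):
    """Build a one-line summary from a filtered report dictionary.

    Assumes the 'rules', 'network_ip' and 'classification' layout shown
    in the sample output; missing keys are treated as empty.
    """
    classification = report.get("classification", {})
    network = report.get("network_ip", {})
    contacted = network.get("TCP", []) + network.get("UDP", [])
    return "%s (%.1f%%): %d rules matched, %d IPs contacted" % (
        classification.get("result", "Unknown"),
        classification.get("accuracy", 0.0),
        len(report.get("rules", [])),
        len(contacted),
    )


# Trimmed copy of the sample output above
example = {
    "rules": ["dropExe", "badConnection", "dropDll"],
    "network_ip": {"UDP": [], "TCP": ["183.61.19.194", "42.120.226.92"]},
    "classification": {"result": "Malware", "accuracy": 95.5},
}
print(summarize_report(example))  # Malware (95.5%): 3 rules matched, 2 IPs contacted
```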

With our Threat Intelligence SDK you can query our database for domain and IP data, as well as run generic and more detailed searches.

  • Let’s retrieve details for a specific IP and a specific domain:

 


from deepviz import intel
ThreatIntel = intel.Intel()
result = ThreatIntel.ip_info(api_key=API, ip=["1.22.28.94", "1.23.214.1"])
print result.msg


from deepviz import intel
ThreatIntel = intel.Intel()
result = ThreatIntel.domain_info(api_key=API, domain=["google.com"])
print result.msg

 

  • You can also retrieve the new domains registered in a specific time window and used by malware (the last 3 days, in the following scenario):

from deepviz import intel
ThreatIntel = intel.Intel()
result = ThreatIntel.domain_info(api_key=API, time_delta="3d")

for domain in result.msg:
    print domain

giving back the following result:

avsystemcare.com
nlj.vc
ypx.uz
xgrhuccfyyyv.com
qbc.vc

 

  • You can also run a generic search based on strings, to retrieve all samples, IPs, domains related to a specific keyword:

from deepviz import intel
ThreatIntel = intel.Intel()
# Let's retrieve only 5 elements per category instead of all of them
result = ThreatIntel.search(api_key=API,
                            search_string="justfacebook.net",
                            start_offset=0, elements=5)
print result.msg

This results in:

{
	u'IP': [],
	u'MD5': [u'de14ac3e52078cc63d0cf565eda8e9ef',
	u'b1300a5a9967a36e33642be94a907679',
	u'4989f1a03b87890ac9378453cae841cb',
	u'8cc5eec871cf13ff815c9e6731b16874',
	u'a2a848fe914fd99b5d4da9cd5fed6b5f'],
	u'TLD': [u'net.net',
	u'thoughtarticle.net',
	u'justfacebook.net',
	u'memberarticle.net',
	u'stickagree.net']
}

 

  • Or let’s run an advanced search using parameters! Let’s search all samples connecting to the domain justfacebook[.]net and determined by our Malware Analyzer as malicious:

from deepviz import intel
ThreatIntel = intel.Intel()
result = ThreatIntel.advanced_search(api_key=API,
                                     domain=["justfacebook.net"],
                                     classification="M")
print result.msg

These are just quick examples of how you can use our Deepviz APIs. A more complete and complex example is the following:

  • Let’s retrieve all domains registered in the last 7 days, then for each of them retrieve the MD5s of all the samples connecting to it, and for each sample retrieve the matched behavioral rules:

from deepviz import intel, sandbox
API="0000000000"
ThreatIntel = intel.Intel()
ThreatSbx = sandbox.Sandbox()
result_domains = ThreatIntel.domain_info(api_key=API, time_delta="7d")
domains = result_domains.msg
for domain in domains.keys():
    result_listsamples = ThreatIntel.advanced_search(api_key=API, domain=[domain], classification="M")
    if isinstance(result_listsamples.msg, list):
        if len(domains[domain]['tag']):
            print "DOMAIN: %s ==> %s samples [TAG: %s]" % (domain, len(result_listsamples.msg), ", ".join((tag['key'] for tag in domains[domain]['tag'])))
        else:
            print "DOMAIN: %s ==> %s samples" % (domain, len(result_listsamples.msg))
        for sample in result_listsamples.msg:
            result_report = ThreatSbx.sample_report(md5=sample, api_key=API, filters=["rules"])
            print "%s => [%s]" % (sample, ", ".join((rule for rule in result_report.msg['rules'])))
    else:
        print "DOMAIN: %s ==> No samples found" % domain

this will return:

DOMAIN: ypx.uz ==> 1 samples [TAG: adware.downware, kazy]
a971afcbb74c30dd2d5832523c9d795f => [recentlyRegisteredDomainStrings, unknownHook, loadDll, dropDll, suspiciousSectionName, highEntropy, antiDebugging, invalidSizeOfCode, antiVM, invalidPEChecksum, sleep, IESettings]

DOMAIN: qbc.vc ==> 1 samples
2d711a1d8f25fb40120ceb38f4d41a98 => [recentlyRegisteredDomainStrings, suspiciousSectionName, highEntropy, antiDebugging, invalidSizeOfCode, antiVM, epLastSection, writeExeSections]

DOMAIN: avsystemcare.com ==> 8 samples [TAG: trojan.qhost, trojan.rbot, trojan.noupd]
000dde6029443950c8553469887eef9e => [badIpUrlInStrings, suspiciousSectionName, highEntropy, invalidSizeOfCode, invalidPEChecksum, writeExeSections]
aba074b2373e8ea5661fdafb159c263a => [epOutOfSections, badIpUrlInStrings, invalidSizeOfCode, invalidPEChecksum, epLastSection, writeExeSections]

With this blog post we wanted to give you an introduction to the library, so that you can start playing with it and make your Deepviz experience even more productive and efficient. Please note that the Python SDK and the APIs are still in beta, and we’re working hard to make them better and faster.

Now it’s up to you: try it out, play around and let us know your feedback through our Support Page. And, of course, keep an eye on our blog and our Twitter channel… more cool stuff is on its way 🙂

The basics of clustering behind Deepviz – part 2

In our previous blog post we introduced the basics of clustering and the first steps to follow when you want to start clustering data (malware in this case). In this blog post we want to cover the next steps, i.e. what needs to be done once you have selected the right attributes and the best measure to validate and compare attributes. In this follow-up we’ll discuss the basics of clustering algorithms.

The next logical step, once you have selected the attributes and calculated the distance between the elements, is grouping all elements into sets which share common characteristics, such as contacted IPs, URLs, imported APIs and whatever else the researcher selected as attributes.

Clustering algorithms group items based upon their mutual distance. To make it easier: the choice of including an item in a specific set is based on its distance from the set itself.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is one of the most widely used density-based clustering algorithms. It is based on the idea that items which form high-density spatial regions can be considered a cluster. How can this approach be applied to our malware analysis?

As shown in our previous blog post, we have computed the distance matrix of a set of items, thus we know the distance of each element from all the others.

The algorithm takes 3 parameters as input:

  • min_pts
  • eps – which, together with min_pts, defines our idea of density
  • distance_matrix

Consider the following picture:

cluster

 

  • density: the number of points inside the area described by a radius named eps.
  • core-points: points with a density of at least min_pts. Core points are always assigned to a cluster.
  • border-points: points with a density lower than min_pts that are still of interest because a core-point lies at a distance less than or equal to eps.
  • noise-points: points that are neither core-points nor border-points.
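The definitions above can be sketched as a small function that labels every point of a distance matrix. It is written for Python 3, and it assumes that a point's density counts the point itself and that a core point needs a density of at least min_pts (one common DBSCAN convention). `classify_points` and the toy matrix are hypothetical, made up for illustration.

```python
def classify_points(distance_matrix, eps, min_pts):
    """Label each point 'core', 'border' or 'noise' from a distance matrix.

    Density counts the point itself, which is an assumption of this sketch.
    """
    n = len(distance_matrix)
    # Neighbours of i: every point (i included) within eps of i
    neighbours = [
        [j for j in range(n) if distance_matrix[i][j] <= eps] for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbours[i]):
            labels.append("border")  # not dense itself, but near a core point
        else:
            labels.append("noise")
    return labels


# Toy matrix: points 0-2 are close together, 3 is near 2 only, 4 is isolated
toy = [
    [0.0, 0.1, 0.1, 0.6, 0.9],
    [0.1, 0.0, 0.1, 0.6, 0.9],
    [0.1, 0.1, 0.0, 0.3, 0.9],
    [0.6, 0.6, 0.3, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.9, 0.0],
]
print(classify_points(toy, eps=0.3, min_pts=3))
# ['core', 'core', 'core', 'border', 'noise']
```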

 

Based on the definition of a core-point, any cluster has at least min_pts points in it. The higher the eps value, the less restrictive the clustering will be: samples at a medium distance will be grouped in the same cluster, and noise points will be considered border points (or perhaps new core points). The min_pts and eps values can be changed in order to define our idea of a malware family.
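To make the effect of eps concrete, here is a minimal DBSCAN sketch over a precomputed distance matrix, written for Python 3. It is an illustration under stated assumptions, not the engine used by Deepviz: density counts the point itself, clusters are numbered from 0, and noise is labelled -1. The function name and the toy matrix are hypothetical.

```python
def dbscan(distance_matrix, eps, min_pts):
    """Minimal DBSCAN over a precomputed distance matrix.

    Returns one label per point: a cluster number starting at 0, or -1 for
    noise. Density counts the point itself (an assumption of this sketch).
    """
    n = len(distance_matrix)
    neighbours = [
        [j for j in range(n) if distance_matrix[i][j] <= eps] for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point, keep growing
    return labels


# Toy matrix: points 0-2 are close together, 3 is near 2 only, 4 is isolated
toy = [
    [0.0, 0.1, 0.1, 0.6, 0.9],
    [0.1, 0.0, 0.1, 0.6, 0.9],
    [0.1, 0.1, 0.0, 0.3, 0.9],
    [0.6, 0.6, 0.3, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.9, 0.0],
]
print(dbscan(toy, eps=0.5, min_pts=1))  # [0, 0, 0, 0, 1]
print(dbscan(toy, eps=0.2, min_pts=1))  # [0, 0, 0, 1, 2]
```

On this toy data the larger eps merges point 3 into the first cluster, while the smaller eps leaves it as a cluster of its own.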

As a side note, remember that a noise point isn’t less relevant than a cluster: it really depends on whether we want to focus on finding new variants or on well-known malware families.

A more detailed description of the algorithm can be found here: DBSCAN

Let’s now proceed with a more practical example.

In our previous blog post we based our malware distance matrix on the “contacted URLs” attribute. Now we want to cluster malware based on contacted IPs.

From our threat intel’s network webpage we have retrieved an interesting IP: 1.234.83.146

 

ip_1

 

Using our Threat Intelligence APIs, we have found that the IP 1.234.83.146 is contacted by 353 samples at the time of this blog post.
search_intel

 

As a first step, we compute the distance matrix using the Jaccard distance based on each sample’s list of contacted IPs.

As the Jaccard distance varies from 0 (two samples have the same list of contacted IPs) to 1 (two samples have no IPs in common), we can apply the DBSCAN algorithm with

  • eps: 0.5
  • min_pts: 1 (each noise-point will be considered a cluster of its own)

cluster1

DBSCAN found two different clusters:

Cluster 1
290f3104a53cc5776d3ad8b562291680
0124995e09a3f5be548c4e5cadc116a1
96695193ac9870f973b9267cfe6c7009
5b875a5570014cad5e657cba72b451e9
5a72a1a53720ef4501e87e7d1c82a9ba
bd132a4410580bfc065d9260c956a79f
4fa660009cba0b3401f71439b885e067
ae7038d91f0b0af1e9422f3d7fa9a013
aa86216fc7585878e10b714cf2149933
7a764dba191d2bf20206bf4499eb2917
[...]

Cluster 2
6368cc6d88c559bb27da31ef251a52a1
5b41df0eccd56c7a0a6c441b99c77d1f
23a79b803e870f4c21a8d697a788e1a1
1052b6252a07e91a3ff300028100b338
8f68c9a4a1769f57651a6a26b0ea2cf9

cluster1.docx (Full list)

 

Please note that DBSCAN is not applied to the points in the picture. The picture itself is just a spatial representation of the distance matrix.

Let’s try to use different input parameters:

  • eps: 0.2
  • min_pts: 1

cluster2

As expected, DBSCAN has identified more clusters.

 

Cluster 1
290f3104a53cc5776d3ad8b562291680
0124995e09a3f5be548c4e5cadc116a1
96695193ac9870f973b9267cfe6c7009
5b875a5570014cad5e657cba72b451e9
5a72a1a53720ef4501e87e7d1c82a9ba
bd132a4410580bfc065d9260c956a79f
ae7038d91f0b0af1e9422f3d7fa9a013
aa86216fc7585878e10b714cf2149933
7a764dba191d2bf20206bf4499eb2917
aa34f88764f54592afe967f1e983d8e3
[...]

Cluster 2
4fa660009cba0b3401f71439b885e067
64409a372ff880026ff44d27b5441f80

Cluster 3
6368cc6d88c559bb27da31ef251a52a1
5b41df0eccd56c7a0a6c441b99c77d1f
23a79b803e870f4c21a8d697a788e1a1
1052b6252a07e91a3ff300028100b338

Cluster 4
27bd99bf75491447fb3383d1f54f4e40
a8fb2f72e3afe57964200154692e5b9b
630e12b9a1731fcc9d9f793086090b60
053dacd3cfb45d9d7f69601a4fe06000

Cluster 5
c7b19f8250b70ae5bd46590749bf9660
b23826aefbbd36166c976df201fbdd2f

Cluster 6
8f68c9a4a1769f57651a6a26b0ea2cf9

cluster2.docx (Full list)

 

This is just a basic explanation, but it shows how you can leverage clustering algorithms to identify new, unidentified samples and/or well-known malware families once you have extracted the right data, that is, the data relevant for malware isolation and identification.

Below are some links related to the isolated samples as well as to our threat intelligence portal:

Deepviz Threat Intel

290f3104a53cc5776d3ad8b562291680

4fa660009cba0b3401f71439b885e067

6368cc6d88c559bb27da31ef251a52a1

27bd99bf75491447fb3383d1f54f4e40

c7b19f8250b70ae5bd46590749bf9660

8f68c9a4a1769f57651a6a26b0ea2cf9

The basics of clustering behind Deepviz – part 1

In recent years we’ve seen many AV companies turning to machine learning and artificial intelligence to help isolate new malware and emerging malware families quickly and effectively. The main problem is that many people look at machine learning as if it were a magic wand – you start using machine learning, feed the algorithm as many samples as you can and the magic is done! Not really.

When we started developing Deepviz we wanted to build not only a powerful and fully scalable automated malware analyzer infrastructure, but also a strong and effective threat intelligence platform, able to use all the data extracted by our malware analyzer and correlate it so as to get the best out of it.

Machine learning is the key behind how Deepviz is able to identify new malware and find similar samples, correlate them and spot bigger malware families. However, implementing an effective machine learning approach is not as easy as many people think – for instance, it doesn’t work like “give the intelligence as many details as possible and it will do the trick”.

In this first blog post we want to shed some light on our approach to machine learning and how it is greatly helping us identify new malware every day.

Clustering is an analysis technique which aims to identify significant groups in a given dataset. Clustering malware can be very useful as it allows us to find malware families or samples that share common characteristics. In order to cluster a set of data, all we need is a representation of the distance between the individual elements of the dataset.

Each element is represented by a feature set, a set of attributes that are in some way significant for the object itself. There are two main points that must be taken into consideration here:

  • the first one is understanding how to select attributes. As said above, people who think that extracting millions of attributes from a given object is what clustering needs don’t take into consideration that the more attributes are considered, the more computational time and effort are required to estimate similarity. Still, we need a sufficient number of attributes to isolate the specific behaviors of several malware families. For instance, running a clustering algorithm using only the entropy of PE files as a feature set will lead to wrong clusters. This is where our malware analyzer plays the biggest part, extracting the most significant attributes needed for malware analysis.
  • the second point is finding the best measure to validate and compare attributes of malware. Each malware sample can be represented by numeric (e.g. section entropy) or nominal (e.g. contacted IPs) attributes. In math, a similarity measure is a function that describes how similar two objects are. Euclidean distance is one of the best known and most used measures to compare numeric values. What about nominal values? Given two malware samples with a list of contacted IPs or URLs, how can we compare these two sets? How can we use them in the clustering process?
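For numeric attributes, the Euclidean distance mentioned above can be sketched in a few lines of Python 3. The function name and the entropy values are hypothetical, made up for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical per-section entropy values for two PE files
sample_a = [6.5, 7.5, 4.0]
sample_b = [6.5, 4.5, 0.0]
print(euclidean_distance(sample_a, sample_b))  # 5.0
```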

Let’s take an example. This picture shows all MD5s contacting the domain complifies.ru:

cluster-process

Our clustering algorithm grouped the samples into 4 different families. The picture above is just the last step of the whole process.

In order to find similar sets, we need to know how similar – or different – every element is to every other. These values can be represented through a distance matrix.

Based on what we have written so far, we need to:

  • Choose a feature set, composed of one or more attributes that can be used to evaluate the distance between each element of a given dataset
  • Choose an appropriate distance measure which will be applied to the attributes

Then the needed steps are:

  • Compute for each element the distance between itself and every other element in the dataset
  • Apply a clustering algorithm to group similar items. In our example the obtained sets are malicious families

When we wrote about calculating distance, we said that Euclidean distance is widely used for calculating the distance between numeric values. Now we want to introduce the Jaccard distance as a measure for nominal values.

Let’s give an example and consider a feature set composed of only one attribute: the list of contacted URLs for the samples 26414a9d627606c4974d8c3f372b0797 and 27f72541c93e206dcd5b2d4171e66f9a:

 

Sample: 26414a9d627606c4974d8c3f372b0797

http://google.com
http://mity.complifies.ru/api
http://www.indyproject.org/
mity.complifies.ru
http://www.w3.org/2000/10/XMLSchema
http://www.borland.com/namespaces/Types
www.dhtmlcentral.com
http://www.w3.org/2000/10/XMLSchema-instance
http://download.torrentex.ru/download.php
http://www.w3.org/1999/XMLSchema
y.aswu.willeave.ru
http://y.aswu.willeave.ru/api
download.torrentex.ru
http://www.w3.org/2001/XMLSchema-instance
http://schemas.xmlsoap.org/soap/encoding/
http://www.w3.org/1999/XMLSchema-instance
http://www.w3.org/2000/xmlns/
http://www.w3.org/XML/1998/namespace
http://www.borland.com/namespaces/Types-IAppServerSOAP

Sample: 27f72541c93e206dcd5b2d4171e66f9a

http://google.com
http://mity.complifies.ru/api
http://www.indyproject.org/
mity.complifies.ru
cuidu.sevential.ru
http://cuidu.sevential.ru/api
http://www.vmware.com/0
http://www.w3.org/2000/xmlns/
https://www.verisign.com/rpa0
https://www.verisign.com/cps0*
http://www.w3.org/2001/XMLSchema
http://www.w3.org/2001/XMLSchema-instance

The Jaccard similarity measure is one of the most used similarity measures for comparing sets of nominal values. It is defined as the size of the intersection divided by the size of the union of two given sets. If there are no intersecting elements, the Jaccard measure is 0. If the two sets share the same elements, then the Jaccard measure is 1. Here is the formula:

J(A, B) = |A ∩ B| / |A ∪ B|

Considering the above given examples:

  • set A is composed of 19 elements
  • set B is composed of 12 elements
  • the intersection set is composed of 4 elements

The union therefore contains 19 + 12 - 4 = 27 elements, and the Jaccard similarity between the two sets is 4 / 27 ≈ 0.15. Note that a similarity measure can be converted into a distance measure with the following formula:

distance = 1 - similarity
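Both formulas translate directly into code. The sketch below is written for Python 3 and uses small toy sets rather than the real samples listed above. Treating two empty sets as identical is a convention chosen here, not something stated in this post.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def jaccard_distance(a, b):
    """Distance derived from the similarity: 1 - similarity."""
    return 1.0 - jaccard_similarity(a, b)


# Toy URL sets, not the real samples listed above
urls_a = {"http://google.com", "mity.complifies.ru", "download.torrentex.ru"}
urls_b = {"http://google.com", "mity.complifies.ru", "cuidu.sevential.ru"}
print(jaccard_similarity(urls_a, urls_b))  # 0.5
print(jaccard_distance(urls_a, urls_b))    # 0.5
```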

 

If we calculate the Jaccard distance between all the samples in the set we obtain the distance matrix:

distance-matrix

The distance between A and B is 0.15, between A and C is 0.8 and so on. The distance between an element and itself is obviously 0. On a side note, take into consideration that the distance between A and B is the same as the distance between B and A: the distance measure must be symmetric. To reduce the time needed to compute the matrix, we can exploit this symmetry and compute only half of it.
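The symmetry optimization can be sketched as follows, in Python 3: only the upper triangle is computed and each value is mirrored. The host names are placeholders, and `distance_matrix` is a hypothetical helper rather than Deepviz code.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(feature_sets):
    """Build the full symmetric matrix while computing each pair only once."""
    n = len(feature_sets)
    matrix = [[0.0] * n for _ in range(n)]  # the diagonal stays 0
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only
            d = jaccard_distance(feature_sets[i], feature_sets[j])
            matrix[i][j] = d
            matrix[j][i] = d  # mirrored thanks to the symmetric property
    return matrix


# Toy feature sets (placeholder contacted hosts)
samples = [
    {"a.example", "b.example"},
    {"a.example", "b.example"},
    {"c.example"},
]
for row in distance_matrix(samples):
    print(row)
```

For n samples this evaluates n * (n - 1) / 2 pairs instead of n * n.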

Once computed, the distance matrix can be used as input of a clustering algorithm. Clearly, the more effective and significant attributes are extracted and used to build the distance matrix, the better the clustering algorithm will perform.

In the next blog post we will cover the basics of clustering algorithms. Stay tuned!

KeyBase stealing trojan from Deepviz perspective

Yesterday was another wonderful day here at our office, as we made another great improvement to our Deepviz threat intelligence platform. If you log into our Threat Intelligence service at intel.deepviz.com you will find another cool icon on the left sidebar. We’ve finally launched our live feed of URLs contacted by malware analyzed in the past hour, updated on an hourly basis and sorted by active hosts.

While this is great for tracking down active C&C servers and keeping an updated blacklist of domains, it is also a great tool for researchers. Last night, while going through the list of active URLs, we saw the following:

live_url

This definitely caught our attention, so we wanted to have a further look at it with our intelligence data: contactmike.com.ng

The domain was set up in June this year. Here is the WHOIS record:

Registrant name: Ikenna Ikediugwu
Registrant email: support@globalhosting247.com
Registrant info: Upperlink Limited
Creation date: 2015-06-10 16:05:55
Updated date: 2015-06-10 16:08:46
Expiration date: 2016-06-10 16:05:55

More interestingly, this website has been contacted by the MD5 b0a599da894c5f992949ea101c8b1520, automatically classified by our Deepviz Code Analyzer as malware with 99.2% confidence. Looking at the matched rules, it immediately looked like a password-stealing trojan.

Among the interesting things, we have found the following dumped strings and network connections:

http://contactmike.com.ng/kbpanel/
c:\users\support\documents\visual studio 2013\Projects\KeyBaseEx\KeyBaseEx\obj\Debug\KeyBaseEx.pdb

This looked like the KeyBase infostealing trojan, which was being sold on the black market earlier this year. This was confirmed when we tried to connect to that URL:

capture_login

We had a further look into our database, looking for all samples containing the string keybaseex.pdb and classified as malicious. We used our intel search API (api.deepviz.com):


POST https://api.deepviz.com/intel/search

{
"apikey": "xxxxxxxxxxxxxxxxxxx",
"strings": [
"KeyBaseEx.pdb"
],
"classification": [
"M"
]
}

which gave back the following results:

{
"status": "success",
"data": {
"Total": 131,
"MD5": [
"1e66e3ea4720c082137dbf2a6e6e5286",
"cdfa3e80617de07be84ac39aa0a097bd",
"5e1cdfada02fd0780830b26a95797858",
"4acd3f41ac883f2724dd3302ad63cb0c",
"b0a599da894c5f992949ea101c8b1520",
"a7f357e6ee1b3f3427fb0aba7402beda",
"584a7e75166efd8e3ab95a72beb9b15b",
"030f1e1dbbf7bb37bf37efa61d6309e5",
"3d9bcb112cd79d11bd4c762721e234e6",
"eb54feb9f24612ff735361238308e05c",
"d53b1b2107de1816355b2fcb8a3f1cb6",
"afc7c96de70bfd10644bf364414f3342",
"22d57aa9f49573d63362fa143509c90a",
"893d133458b7be98c8299f0ac83f8c52",
"4d489c000f483407c8f4e7fa7fcc6997",
"acb98693bbc689d8c27f0b43cb714020",
"53bfe9488479316d7a69aa5b27cfd404",
"a98a2d3563301541e60294f5a7cf76f6",
"82f321611101c3e445a9687acc4e72be",
"b09739ad9a60feab8b60c82a8e399fe1",
"c8a251354e694810dd696fa88594de4a",
"ecbc2527152135ccafb70471bcec6bb3",
"78db303c5f570d2f91747ecd7ea66bb4",
"a0a349709b2548d33261a117cfd0e36f",
"a196920ab5ddb90d09b138721c55627d",
"5856633b945b97132a1dc2b6dd695ee2",
"e546dcec370cf6e730c9f70393770710",
"1db2ff11512defeebbde73d0744dc231",
"8c8123e7640967762b3f188b6e9f8dc4",
"22dd0c433c98688d607677f622de8605",
"67a327a3782701ecc42764d99b1a4b11",
"4cd639b178815aa75635378a9a5fbe0c",
"c304f8128a6b9b8734b6d2ba05a3ec06",
"be48cf2d44c7b3588e3448d240bcfa32",
"f113fd15d735b6a534343d5ee1382c77",
"832de14e17809d9871755908ba331c9b",
"4731cfb4755961f4c56daa3dd5f4ecb9",
"8c37cd09e47d674f01f3203131f60c10",
"21981774f875778166f50150947c4b00",
"afb54a76e30b09edd51fa467b39af4a4",
"9d8f55c89ab47f28fa918bac022cd98f",
"e094e08400cfa13aa36afc6e6be22463",
"9cd73678aa6508a003b1b913883a82ab",
"97b77afa8dd9dda2d085494b6c4f1750",
"830fa047eee3f14292a89230254ce1eb",
"17cd9b0f9ba735c257cca8886e17f130",
"4589b66510e1666eabb28ae5282295f8",
"7d89b80f7ed67f5be33dec3058135e8d",
"516535dcabac9dcd81c1129b3716ef07",
"bdb864591f2909ade1879afeb98014ef",
"7a4a2dc396909bc517d976fbfba38a24",
"22de23d8005c70e7755ee83611f15ad2",
"8b40aeaa38e2a2ec4bc3bae03958d6ff",
"4d4bea91f653f7f2db7f7e59bae8da40",
"3a791ceb36b67b19c29c435819aa2f60",
"2c768a4e7b052382e40127131717bbd0",
"256a01593c256a566c4816fb4353b132",
"d28e260f0c7aa91fe74c57dcf5f98e71",
"fb8ebd556b3c68edf2ad6d8bba9b504b",
"8c14a95413c30f74651443f63964214a",
"bc78ea7135263ed0841b2922566f7fa1",
"c7e241d649295c0ade290453fef7ff73",
"d40299a4a300c3b2b946c96d104b6454",
"f15ffcf873ba5e914ff34fcfeea0f045",
"1c4738ea497c72742656d736dd08c37c",
"34568e35fdd8a33f9e624c81301c25de",
"c267a7a36d3f47efe5954b894a9bcdd4",
"d55c248c3c6c6022ecdcb1997444ddc8",
"499a9966da1569a5c7acc57a670cd695",
"8f93c16c041e727f47d8acfe7f259807",
"2f613e01631379d063574367f8fd2e2b",
"a7181bdfd04277f446199e07c278b92c",
"78f06c02557241568cfb83b0528f5085",
"1ee64c17e4735218adbe4f3725e8f25d",
"63ea0412db0ab8bccfe747d17f15ba61",
"7ce4e6332eab25cc1743eac68bde1294",
"d58f4c7f1dcfb2d8de80cefdba747916",
"de303f6f9e28031aeea625d6beb8e157",
"c2740c928062748b59e893019ddeeb6d",
"cdd3bc2276c2e5aac89a740319023e33",
"cdd7ddade4b23c45407414e7dfb34426",
"646058f43f6ebc11134bf6678307ba91",
"2b2b5a3ff9b018957c73bc832493e631",
"e8ff1214ee98cf8fae98c6d97fb6edf5",
"e2733d1418f2ca17c15542ddbe941bf0",
"83ad2f1a122c53634a8ed1699e61be11",
"3fc8c89c53552ec08c78d9cbaba871de",
"9c471bb67bebf12bff65e88b30c83a2d",
"287620614b7f0c02a7d201e992b537a0",
"839ca1346b7562ee727aaeb71870210f",
"7fa831c5d0b0c8a0c0d0ff5b1c807c87",
"225b4f95bee096649fd12c05d9d4ea21",
"8ab033f272a7cfd9c31134be1f307153",
"b3d0e732b9e1fcada499288fcdf0e4bf",
"5038e46a2597675d0b2df312fbb2c72f",
"1a12bc0917807ed57d04be033fae3377",
"a43586d3052e183d0fbafac9de454282",
"7b217c7752bc70c5230cced1e1f8e7bd",
"19b4a5a7e7a430b35b8f88eff3bda7f7",
"fae45bd217fe2dfe5eb44110d05158bb",
"af5780d42970f6cb3439db4946ad2076",
"c92b2956532b7a02b6c4c0214cd731c4",
"ef4ba9425ad151af066eccf2f0b9e2f2",
"8655fbce3933161193702efe27f40879",
"5bd7db8b24e593d6483bbb0f98f3698d",
"6bbd53304a2c9e1042b298501690d8d5",
"54a3c8a97edc941947195cb523510364",
"d366ba790a0990ea4a876109232e95de",
"abb787eccca4bd63754b21aa23024079",
"e3dbcabae789ef04be1edd6ece5f2b22",
"a984b5b7020176531f82f49918173196",
"951f44778a482dfb2213e91ed5593fe9",
"e039fb1b040da4021da859994f24fef9",
"b67148045d08522b990c32be11cc0bd7",
"00adac707c6bf4cbeea50616154206ec",
"63f9ba883959cd6fc2f211b2e2a9733c",
"60964fe841d950bce9f8c6c1e9dcd0be",
"c008d9377b5f5458b1b96227dcd798dd",
"83987e7299bb7f0d25f9672e67beed41",
"6eb8c3c8e4181604ccede8adad0e163f",
"3ba8fa430fec6948c95e8f477c38e54f",
"7a23a80e2c572aa769402dae1efdd2ae",
"157e2d1385d09b9fea81bb00f7d55faf",
"f36e33e8045fcd422c55088e7a877a9c",
"7e4e32b042565d5cede168a47a6bbf59",
"69da2988afeb94b390938d7f0d945a7a",
"cde07ac30a52f809f04c878ce56872c0",
"ab8bde759541921f360b2d96c21e6c7b",
"8c700b1ec985bf23136768f433cf86bf",
"813563eef969f691224c64769390ee92",
"4683b7b9c248a3804040256c8999de22"
]
}
}
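The request above can also be prepared from Python. The sketch below is written for Python 3 and uses only the standard library. It builds the request without sending it, the `Content-Type` header is an assumption (the post does not say which headers the API expects), and `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request

API_URL = "https://api.deepviz.com/intel/search"  # endpoint quoted above


def build_search_request(api_key, strings, classification):
    """Prepare (but do not send) the JSON POST request shown above."""
    payload = {
        "apikey": api_key,
        "strings": list(strings),
        "classification": list(classification),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_search_request("xxxxxxxxxxxxxxxxxxx", ["KeyBaseEx.pdb"], ["M"])
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response
```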

These are all the samples we have received and analyzed so far containing the string KeyBaseEx.pdb. Using our automated clustering engine we get the following result:

keybase_cluster

All of them are related to the same trojan, KeyBase. The next step was finding out whether their C&Cs are still up and running and, not surprisingly, we found that many of them are up and some are receiving live data – even today!

KeyBase Panel

While it’s scary that they have held captured data for months and are still collecting credentials, another critical issue is that the KeyBase C&C panel is poorly designed, allowing outsiders to see part of the logged data without administrative credentials. KeyBase features screen grabbing and keylogging functionality, and sadly the captured screenshots are open to the world if you know where to look: the panel doesn’t restrict access to the folder.

This is a list of active C&C servers we have found so far:

kreativewebsite.com/jss/web/
contactmike.com.ng/kbpanel/
2fastsms.com/kbpanel/
biz.karelia.pro/keybase/
filezilla.usa.cc/scardo-bros/
giimagemedia.com/server1/asc/
attecco.com/wp-note/php/
giimagemedia.com/server1/ne/
doncglobal.com/kalus/web/
sivaafi.net/images/l/kbpanel/
sidemlogistics.com/sys/dbb/kbpanel/
tehranmobaddel.com/php/
calibis.usa.cc/jayguy/
winpy.usa.cc/css/
usersmrt.sslsecurityencryption.com/logs/
azabideon.nut.cc/mar/isch/
userg.progadgetsystems.com/logs/
medilincinxq.eu/abebe/
phonesandtabletsfix.com/kbpanel/
nonso.usa.cc/teco/
southplannersuppliers.com/solid/
omniscientstraps.com/kbpanel/
calibis.usa.cc/2020/
future-furnitures.com/kbpanel/
calibis.usa.cc/solace/
creativelinkspk.com/php/kbpanel/

This is exactly where Deepviz could help you make the layered defenses of your infrastructure stronger. Deepviz Code Analyzer automatically analyzed and classified all KeyBase samples using our machine learning detection and similarity engine, and our Threat Intelligence Platform correlates the data extracted from the malware analysis to isolate and identify new members of the same malware family thanks to our clustering engine.

Below are some reports from our analyzer:

ecbc2527152135ccafb70471bcec6bb3

4731cfb4755961f4c56daa3dd5f4ecb9

b0a599da894c5f992949ea101c8b1520

 

November 2015 Intel Statistics

Our first month in public beta is over, and we’ve already received an impressive amount of feedback and interest from end users and companies. We want to sincerely thank everybody for your help and valuable feedback.

Without further ado, let’s dive into some statistics about the first month of data that we have processed and collected through our Deepviz Threat Intelligence platform.

The top ten countries hosting malware command and control servers see the United States in first position, followed by Ukraine and China:

  1. United States
  2. Ukraine
  3. China
  4. Japan
  5. Romania
  6. Russian Federation
  7. Korea, Republic of
  8. Taiwan
  9. India
  10. Germany

We also monitored all domains registered in November and contacted by malware, with the most prevalent here (you can safely click the links, they are linking to our threat intelligence service):

reliancepublications.co.in

kk8000.com

streetappear.net

electricappear.net

captainbright.net

winfixer.com

3468.in

3475.in

4745.in

4634.in

gladfell.net

equalcompe.net

equalfell.net

equalcount.net

groupfell.net

3463.in

gladcompe.net

mymgjzacbyx.com

Among the various malware families we’ve identified, we have seen an interesting trend of infostealers: malware able to steal stored browser data by intercepting network traffic and/or reading the browser’s config files.

Clusters of infostealers active in November 2015

Here are some of the MD5s representing the top 5 clusters. You can keep investigating by looking at similar samples on the Threat Intelligence webpage linked from each of them.

Cluster 1 (Zbot)

133a7e1442cfd2f1f224116adfeb1b06
2514e8969e902848dd1486b2a8e84a60
48bf955df062c656a80f208ec9e75400

Cluster 2 (Kelihos)

6532271e09bbd40838208d6bb292f23d
cfaf9eaf671061a7ccdb32cf5bb7c3d8
049d71f93a9536ee5eee8a44f94032fe

Cluster 3 (Dorkbot)

2a722adb4c58c54ec9a614253bd82c86
4046aee8908aacfb061525bd0b1105fb
2671ad6c7d3bbc8dc2c2e8a5c97aabc6

Cluster 4 (Vawtrak)

a12370b7d63426da992a1fe07ce31c6a
1a7416e792fc7f51ec7fcc97d3a12fb0
886b1a2f616e2e0b04f3235b8d629e24

Cluster 5 (Tinba)

d4ab4c8549be22098a037dce8d7afb8d
962df1c2505e62be69e89478889d0ab6
1adf5e5866ecb1261457003041d24831

November has been an interesting month not only for Deepviz as a company but also because of the identification and detection of the new Cryptowall 4.0 ransomware, the latest build of the Cryptowall family.

Here is a list of interesting IPs contacted by the malware:

184.168.221.53
103.21.59.9
184.168.221.59
143.95.52.38
37.140.192.166
93.186.202.54
184.168.47.225
199.83.129.153
160.153.66.46
143.95.248.187
173.237.190.55
101.99.75.11
64.247.179.218
103.224.22.13
103.27.61.200
52.91.146.127
195.208.1.153
198.20.114.210
176.114.1.110
66.7.210.114

While some of those IPs are unique to Cryptowall 4.0, what’s really interesting is that some of them have been used in the past for other malware campaigns spreading:

Trojan.Pushdo

Trojan.Fareit
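
As a rough sketch of the kind of overlap check described above, indicators from a current campaign can be intersected with indicators recorded for earlier ones. The function name, campaign labels and addresses below are all hypothetical (the addresses come from the RFC 5737 documentation ranges), and this is not how the Deepviz platform itself performs the correlation:

```python
def shared_indicators(current, historical):
    """Map each earlier campaign name to the IPs it shares with the current one."""
    current_set = set(current)
    overlaps = {}
    for campaign, ips in historical.items():
        common = current_set & set(ips)
        if common:
            overlaps[campaign] = sorted(common)
    return overlaps


# Placeholder addresses from the RFC 5737 documentation ranges, not real indicators.
current_ips = ["192.0.2.10", "192.0.2.11", "198.51.100.7"]
earlier_campaigns = {
    "campaign-a": ["192.0.2.10", "203.0.113.5"],
    "campaign-b": ["203.0.113.9"],
}

print(shared_indicators(current_ips, earlier_campaigns))
```

Campaigns with no address in common are simply left out of the result, so an empty dictionary means the current indicators look unique.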

 

We will be launching 64bit support for our sandbox very shortly – so stay tuned!

Intro to Threat Research Manager tool

We’ve been busy working on many enhancements which we will be releasing later this week. While we are doing that, we thought it would be good to blog about a tool we have written to help Threat Researchers improve the way they handle infections when hunting for malware.

Deepviz isn’t just a platform for malware analysis, nor is it just a huge database of threat intelligence data – we wanted to design it to be as flexible as possible.

  • Do you want to check whether the file you downloaded is a known malicious file? Deepviz Malware Analyzer with our automatic classifier will provide you with the correct answer.

  • Are you a malware researcher and do you want to better understand how a file is going to behave on your system? Deepviz Malware Analyzer will provide you with exact behavior so that you can review it and decide yourself.

  • Are you a malware researcher and/or an IT security company interested in more details regarding some indicators you might have collected from a security incident (e.g. hashes, web domains, IPs, suspicious strings and so on…)? Deepviz Threat Intelligence platform is there for you, with many different features like the ability to find similar samples, clusters of malware families connecting to specific domains and tons of other brilliant features we will show you in the next blog posts.

  • Are you a bigger IT security company and/or ISP, or anybody else interested in integrating our platform? We provide you with a flexible set of RESTful APIs you can easily implement and use.

 

However – following our past experience as malware researchers – we know that one specific part of malware research is absolutely painful: doing live malware research sessions while keeping track of what you have done and which samples have been dropped in your lab environment, then analyzing each one of them and preparing your research results.

So, while designing Deepviz, we thought that it could easily become a wonderful tool for addressing this problem, making your research sessions much easier and more straightforward and letting you keep your focus on your main goal.

That’s what Deepviz Threat Research Manager is about!

Threat Research Manager is an awesome free tool that will record your research sessions, capture all the dropped files and automatically upload them to our Deepviz Malware Analyzer, ready to be processed and classified. Once your research session has ended, you will find all your analyzed samples in your account panel, along with the completed analysis ready for you to review.

With Threat Research Manager you can create as many sessions as you want, rename them with something which will remind you about the session’s goal, delete old sessions, suspend them, etc.

There might be cases when you are in the middle of a malware research session and want to pause it because you need to start unrelated research on a new sample. You can put the first session on hold, start a new session, then restore the previous one when you’re done and simply pick up where you left off.

Here is a short video showing you how to use Deepviz Threat Research Manager.

Threat Research Manager is totally free and you can grab your copy from here. You will need to sign up for your own free account to get your personal API key.

We hope you’ll enjoy it and we would love to hear your feedback, to improve the tool and make it as useful as possible!

Deepviz enters public beta

Today is a big day for Deepviz!

We worked incredibly hard in the past weeks to make this happen, and I can officially say that starting now we are in public beta. What does this mean? I will go through some things that you need to know to quickly become friends with Deepviz.

Deepviz is a fully integrated threat intelligence platform, powered by a cloud-based automated malware analyzer environment and a fully scalable, cloud-based, threat intelligence database which processes and correlates the feed of data extracted from the malicious samples analyzed by the malware analyzer environment. The whole infrastructure is based on OpenStack (designed to be AWS compatible) and it has been implemented to quickly and easily scale as needed.

The malware analyzer platform – described here in more detail – can process up to 80.000 samples per day. The biggest thing here, though, is that we have designed the platform to scale horizontally in a matter of minutes, making it possible to add more processing nodes and increase the number of processed samples. The platform’s ability to extract relevant details and behaviors from malicious samples is simply awesome if you want to fully understand the malware behavior, both from a filesystem perspective and a network perspective, but we also wanted to give our users a short and quick answer to the final question: is the uploaded sample malware or not?

Sandbox stats

The malware analyzer platform is backed by a self-learning / machine learning classifier which is constantly retrained with both malicious and good samples. In our internal tests the classifier was able to successfully identify and block new malware which had already bypassed many antivirus solutions.

The threat intelligence platform instead is a fully scalable engine powered by a cluster of ElasticSearch nodes which are indexing the malware analyzer’s feed of data in realtime, doing automatic data correlation and aggregating results to spot new malware families and similarities between processed samples.

We are really proud of the threat intelligence platform because we built it to be as easy to use as possible but at the same time powerful enough to allow users to build their own queries and search rules.

This is why we built a threat intelligence UI that can be used without writing a single line of code, along with a set of REST APIs that you can call from your own code to use the threat intelligence as you prefer.

The Threat Intelligence UI can be reached at intel.deepviz.com, and it will be available for free to all users registered with a Deepviz account until the end of the public beta. After that, it will be available only with a subscription – but we will also have a dedicated free plan for people who help us by submitting malicious data.

We have prepared two simple dashboards, reachable from the left sidebar: network activity and malware overview.

In the network activity dashboard we show the recently registered domains contacted by successfully processed malware within a 6 hours / 24 hours / 3 days / 7 days time frame. The same goes for individual IPs contacted by malware.

Malware activity dashboard

By right-clicking on the domain or IP and clicking on “Search for samples” it is possible to retrieve the actual MD5s which tried to connect to it.

In the first chart of the malware overview, we highlight the samples we consider most interesting, sorted by identification score and by matched rules. The identification score is a threat score; the matched rules value is the number of rules matched by our malware classifier. Samples with a lower threat score and a higher number of matched rules are the most interesting ones because they could be new malware.
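
To make that ordering concrete, here is a minimal sketch of one way to rank samples so that low threat scores and high rule counts come first. The `Sample` class, the score scale, the tie-breaking rule and the sample names are assumptions made for illustration; the dashboard’s actual ordering may differ:

```python
from dataclasses import dataclass


@dataclass
class Sample:
    md5: str            # placeholder identifiers are used below, not real hashes
    threat_score: int   # identification score (scale assumed for this sketch)
    matched_rules: int  # number of classifier rules the sample matched


def most_interesting(samples, limit=10):
    """Lowest threat score first; ties are broken by the highest rule count."""
    ordered = sorted(samples, key=lambda s: (s.threat_score, -s.matched_rules))
    return ordered[:limit]


# Hypothetical data, for illustration only.
samples = [
    Sample("sample-a", threat_score=90, matched_rules=12),
    Sample("sample-b", threat_score=15, matched_rules=9),
    Sample("sample-c", threat_score=15, matched_rules=3),
    Sample("sample-d", threat_score=40, matched_rules=11),
]

for s in most_interesting(samples):
    print(s.md5, s.threat_score, s.matched_rules)
```

Sorting on a tuple keeps the two criteria separate; a weighted combination of both values would be an equally valid choice if one criterion should not strictly dominate the other.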

In the lower chart we correlate the data related to all the processed samples in a specific time frame to spot new malware families.

Malware overview

 

Of course the UI will allow you to search for specific MD5s, domains and IPs. More advanced search queries will be available in the coming days, when we release an advanced search form. However, if you can’t wait and already want to use our advanced threat intelligence search, you can call our APIs from your code and start using them immediately.

At api.deepviz.com you can find the list of available APIs along with examples of how to use them. All the APIs are reachable through a simple HTTPS JSON request and the implementation should be straightforward – if you run into any issue, please feel free to get in touch with us and we’ll assist you step by step.
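
For readers who have not built such a request before, the sketch below shows the general shape of an HTTPS JSON request using only the Python standard library. The endpoint path and the payload field names are hypothetical placeholders, not the documented Deepviz API; the real paths and parameters are the ones listed at api.deepviz.com:

```python
import json
import urllib.request

API_BASE = "https://api.deepviz.com"


def build_json_request(path, payload):
    """Build (but do not send) an HTTPS POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=API_BASE + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical path and field names, for illustration only.
request = build_json_request(
    "/example/endpoint",
    {"api_key": "YOUR-API-KEY", "md5": "ecbc2527152135ccafb70471bcec6bb3"},
)
print(request.get_method(), request.full_url)

# Sending it and decoding the JSON reply would look like:
#     with urllib.request.urlopen(request) as response:
#         result = json.load(response)
```

The request is only constructed here, so the example runs without network access or a valid API key.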

APIs

One last thing: please keep in mind that while we’re in public beta we’re still fixing many minor things, adding new features, changing things here and there. If you find any error, any discrepancy, or just some slowdown/downtime: it shouldn’t happen, but it does 🙂 Just let us know and we’ll do all we can to provide you with the best experience possible.

I don’t want to make this blog post any longer. What I just want to say is: feel free to register for an account at www.deepviz.com, play around with it, and contact us either publicly or privately through our support page to make suggestions, share your ideas or ask questions.

We will be there waiting for you, ready to build your powerful tool against cybercrime together with you.