Cyber Crime and the Deep Web

Cyber-risk Management

Article by David Gerlach
Applied Systems
Tuesday, September 20, 2016

There is a vast section of the Internet which is hidden and not accessible through regular search engines and web browsers.

This part of the internet is known as the Deep Web, and it is much larger than the size of the Web that we know.

Put simply, it is the part of the internet that is hidden from view.

  • Surface Web
    • 4% of WWW content
    • Also known as the ‘Visible Web’, it is content that can be found using search engines such as Google or Yahoo
    • Search engines like Google use pieces of software called ‘web crawlers’, whose primary purpose is to discover web pages on the Internet.
    • It is under constant surveillance by the government
  • Deep Web
    • 96% of WWW content
    • Also known as the ‘Invisible Web’, it is the content that isn’t indexed by search engines
    • It is not linked to pages on the Surface Web
    • It is hard to keep track of
  • The Deep Web is estimated to be at least 500x the size of the Surface Web

“Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed.” – MK Bergman

  • Traditional search engines create their indices by spidering or crawling surface Web pages
  • Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request
    • The Deep Web is the largest growing category of new information on the Internet
    • Deep Web content is highly relevant to every information need, market and domain
    • Deep Web sites tend to be narrower, with deeper content, than conventional surface sites
    • Total quality content of the Deep Web is 1,000 to 2,000 times greater than that of the Surface Web
    • More than half of the Deep Web content resides in topic-specific databases
    • A full ninety-five per cent of the Deep Web is publicly accessible information – not subject to fees or subscriptions
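
To make the crawling limitation concrete, here is a minimal sketch of how a basic spider discovers pages by following links. The site map, page names, and the `crawl` function are all hypothetical, made up for illustration; real crawlers are far more elaborate.

```python
from collections import deque

# Hypothetical toy site map: each page maps to the pages it links to.
LINKS = {
    "/home": ["/about", "/search"],
    "/about": ["/home"],
    "/search": [],          # a search form; results only exist per query
    "/orphan": ["/home"],   # exists, but no other page links to it
}

def crawl(start, links):
    """Breadth-first link following from a starting page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("/home", LINKS)
hidden = set(LINKS) - indexed  # pages the spider never reached
```

In this toy setup the spider never reaches `/orphan`, and a dynamically generated result page behind `/search` never appears in the link graph at all, which is roughly why database-backed content stays out of a search index.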

The term “Deep Web” has been introduced over the past few years to denote Internet content that search engines do not reach, particularly:

  • Dynamic Web Pages: Pages dynamically generated on the HTTP request
  • Blocked Sites: Sites that explicitly prohibit a crawler from retrieving their content by using, for instance, CAPTCHAs, pragma no-cache HTTP headers, or ROBOTS.TXT entries.
  • Unlinked Sites: Pages not linked to any other pages, preventing a Web crawler from potentially reaching them.
  • Private Sites: Pages that require registration and log-in/password authentication
  • Non-HTML/Contextual/Scripted content: Content that is encoded in a different format, accessed via JavaScript or Flash, or context-dependent (e.g. tied to a specific IP range or browsing history entry)
  • Limited-access networks: Content on sites that are not accessible from the public Internet infrastructure.
  • Tens of thousands of domains are associated with either the HTTP or HTTPS protocol
  • Many users are not aware there is more to the Deep Web than just standard Web sites
    • Other networking protocols are also used
      • IRC, IRCS
      • Gopher
      • FTP
      • Telnet
      • SMTP, Mailto, POP
      • IMAP
    • Hundreds of domains are associated with the IRC or IRCS protocol
      • Chat servers that can be used as rendezvous for malicious actors (aka hackers) to meet and exchange goods
      • Often used as a communication channel for botnets
  • Eighty-five percent of the Web users use search engines to find needed information
  • When using the Surface Web, you access data directly from the source with some form of Web browser (IE, Chrome, Firefox, Safari etc.)
  • This direct approach tracks the information downloaded, from where and when it was accessed, and your location
  • Normally, information on the Deep Web cannot be accessed directly
  • Data is not held on any single page, but rather in databases, which makes it difficult for search engines to index
  • Files are shared through any number of computers connected to the internet that hold the information you need, usually in an encrypted form.
  • This method of sharing encrypted data makes it difficult for your location, and the kind of information you access, to be traced or monitored.
  • Believe it or not, there is plenty of legal activity that goes on in the Deep Web, and you might not realize that the data being accessed actually resides there
    • Performing background checks
    • Databases that house adoption information
  • Who else uses it:
    • Academics
      • Academic libraries
      • Old versions of web pages
    • Journalists
      • The Arab Spring protests
      • Freedom Foundation
    • Whistleblowers
      • Edward Snowden
      • Julian Assange
    • There may be a wealth of information out there in the Deep Web, but you should be careful about what you look for. Just like Alice – the deeper you go, the more trouble you could find yourself in.
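
As one illustration of how a site can keep itself out of a search index, the ROBOTS.TXT mechanism mentioned above can be sketched with Python's standard `urllib.robotparser` module. The rules, the `ExampleBot` user agent, and the `example.com` URLs below are assumptions chosen for the example, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse in-memory rules, no network access

def may_crawl(url, agent="ExampleBot"):
    """Return True if a well-behaved crawler is allowed to fetch the URL."""
    return parser.can_fetch(agent, url)

public_ok = may_crawl("https://example.com/about")
members_ok = may_crawl("https://example.com/members/profile")
```

A compliant crawler would skip everything under `/members/` here. Note that this is a convention honored by well-behaved crawlers, not an access control; content behind a login or a CAPTCHA is blocked by other means.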

The Dark Side of the Deep Web

Due to its anonymity, the Deep Web has also become a popular nesting ground for criminal activity. This includes things such as:

  • Drugs
  • Weapons trading
  • Child exploitation
  • Hit men for hire*
  • Cyber Crime (SSNs, credit cards, other PII)

This is known to many as the “Dark Web”. The Dark Web is only a part of the Deep Web.

Almost all sites on the so-called Dark Web hide their identity using some form of encryption service.

*Though there are groups on the Deep Web claiming to offer this service, there has been no legitimate proof of their existence.

  • Among the different strategies in place to bypass search engine crawlers, the most efficient for malicious actors are so-called “darknets”
  • Darknets refer to a class of networks that aim to guarantee:
    • Anonymous and untraceable access to Web content
    • Anonymity for the site and site owner itself
  • Darknets are also generally identified by a non-standard domain name that requires the same software to be resolved to a routable endpoint
  • Darknet and alternative routing infrastructures (Limited-Access Networks):
    • Sites hosted on an infrastructure that requires specific software to reach the content provider
    • Examples of such systems: TOR’s hidden services or sites hosted on the Invisible Internet Project (I2P) network

Limited-Access networks cover all those resources and services that wouldn’t be normally accessible with a standard network configuration

  • As such, they offer interesting possibilities for malicious actors to act partially or totally undetected by law enforcers
  • Much of the public interest in the Deep Web lies in the activities that happen inside darknets

The Onion Router (TOR) is a free anonymity network, most commonly accessed through the Tor Browser, which is a variant of Firefox. You can run it on all the common platforms such as Windows, Mac OS X, and Linux.

  • Originally developed by the US Naval Research Laboratory and first introduced in 2002
    • Allows for anonymous communications by exploiting a network of volunteer nodes/relays (more than 3,000 to date) responsible for routing encrypted requests
    • Traffic can be concealed from network surveillance tools
  • To take advantage of the TOR network, a user needs to install software that acts as a SOCKS proxy
    • The TOR software conceals communications to a server by selecting a number of random relay nodes to form a circuit
    • As your packets go from relay to relay, each relay decrypts just enough data to know which TOR relay the packet came from and where the next hop is.

Reference: Deepweb and Cybercrime – It’s Not All About TOR – A Trend Micro Research Paper 2013

  • Adopting a multi-layered encryption mechanism has the following advantages
    • A server that receives a request coming from the TOR network will see it as being issued by the last node in the TOR circuit (i.e. the exit node) but there is no straightforward way to trace a request back to its origin
    • Every node within the circuit only knows the previous and next hop for a request but cannot decipher the content nor find out its final destination
    • The only TOR node that can view the unencrypted request is the exit node, but even this node does not know the origin of the request, only the previous hop in the circuit

    Reference: Deepweb and Cybercrime – It’s Not All About TOR – A Trend Micro Research Paper 2013
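
The layering idea described above can be sketched in a few lines. This is a toy model under stated assumptions, not how TOR is actually implemented: the XOR "cipher" is deliberately simplistic and not secure, and the relay names, keys, and the `wrap` and `peel` helpers are all hypothetical.

```python
import hashlib
import json

def keystream_xor(key, data):
    """Toy XOR stream cipher for illustration only (NOT secure).
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(payload, circuit, destination):
    """Client side: add one layer per relay, innermost (exit) layer first."""
    next_hops = [name for name, _ in circuit[1:]] + [destination]
    message = payload
    for (_, key), next_hop in zip(reversed(circuit), reversed(next_hops)):
        layer = json.dumps({"next": next_hop, "data": message.hex()}).encode()
        message = keystream_xor(key, layer)
    return message

def peel(key, message):
    """Relay side: remove one layer, learning only the next hop."""
    layer = json.loads(keystream_xor(key, message))
    return layer["next"], bytes.fromhex(layer["data"])

# Hypothetical three-relay circuit with made-up keys.
circuit = [("entry", b"key-entry"), ("middle", b"key-middle"), ("exit", b"key-exit")]
request = b"GET /index.html"
onion = wrap(request, circuit, "example.org")

route = []
message = onion
for _, key in circuit:
    next_hop, message = peel(key, message)
    route.append(next_hop)
```

Each relay in the sketch sees only the name of the next hop and an opaque blob; only the last relay recovers the original request, and nothing in that final layer names the sender.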

Digital Currency (Bitcoin)

  • A worldwide digital currency that is decentralised and not controlled by any government or institution
  • Bitcoin is sent using the Internet directly from person to person with no bank or intermediary
  • Bitcoin is a fast and anonymous way to send money
  • Anyone can set up a Bitcoin account (No Qualifications)
  • No fees, no chargebacks and no borders
  • Accepted by thousands of businesses in the US and an estimated 100,000 worldwide
  • Growing fast as more merchants accept it daily
  • There are no refunds, chargebacks or fees to accept bitcoin and the money is received instantly when sent
  • Anyone can purchase Bitcoin from a Bitcoin exchange using dozens of different currencies and payment methods to buy and sell it
  • Bitcoin is sent from person to person, similar to how PayPal works with an email address. All you need is a BTC address to send to and it will arrive instantly
  • Bitcoin relies on miners who verify transactions that are sent
  • Bitcoin is traded on the open market so there is always real value determined
  • Everyone who uses Bitcoin becomes part of the bank of Bitcoin
  • Miners use special software to solve math problems that verify all transactions, and they are rewarded with newly issued Bitcoin in exchange for using their computing power
  • As more miners come online, the network gets more anonymous and the math gets harder
  • Bitcoin would not work without miners
  • Bitcoin is a digital currency designed with anonymity in mind
  • Because of this ability it’s frequently used when purchasing illegal goods and services
  • Although Bitcoin transactions are anonymous, they are public
    • Every transaction in the Bitcoin blockchain is publicly available, meaning investigators can examine them
    • Tracking transactions as they move through the system is difficult, but not impossible
  • Interestingly, there has been an increase in Bitcoin laundering services to add further anonymity to the system
    • This is achieved by “mixing” your Bitcoins
    • Transferring them through a spidery network of micro-transactions before returning them
    • User ends up with the same amount of money (minus a small “handling fee”)
    • Transactions become even more difficult to track

Below the Surface: Exploring the Deep Web – A Trend Labs Research Paper 2016
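
The "math problems" that miners solve can be illustrated with a simplified proof-of-work sketch. It assumes a single SHA-256 hash and a "leading zeros" rule; real Bitcoin mining hashes a block header twice and compares against a numeric target, so treat the function names and the transaction string below as hypothetical.

```python
import hashlib

def block_hash(data, nonce):
    """Hash the transaction data together with a candidate nonce."""
    return hashlib.sha256(f"{data}|{nonce}".encode()).hexdigest()

def mine(data, difficulty):
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce

def verify(data, nonce, difficulty):
    """Checking a solution takes one hash; finding it takes many."""
    return block_hash(data, nonce).startswith("0" * difficulty)

transactions = "alice->bob:0.5;carol->dave:1.2"  # made-up transaction data
nonce = mine(transactions, difficulty=3)
```

Raising `difficulty` by one makes the search roughly sixteen times longer in this sketch, while `verify` stays a single hash. That asymmetry is the point: solutions are expensive to find and cheap for everyone else to check.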

Cyber Crime and the Deep Dark Web

  • Guns
  • Drugs
  • Counterfeit Currency
  • Fake IDs and Passports
  • All Sorts of Stolen Accounts
  • Rent a Hacker
  • Stolen Credit Cards
  • Counterfeit Credit Cards

Because of its anonymity, the Deep Web is perfectly suited for the malware trade

  • It’s perfect for hosting command-and-control (C&C) infrastructure
  • In the field of computer security, C&C infrastructure consists of servers and other technical infrastructure used to control malware in general, and, in particular, botnets
  • A botnet operator will infect computers by sending out viruses or worms through various infection vectors, such as email or compromised websites where the malicious application is the bot
    • The bot on the newly infected host will log into a command and control server and await commands
    • In many cases, the command and control server will be an IRC channel or a web server
    • A botnet user acquires access to the botnet from the botnet operator
    • Instructions are sent through the command and control channel to each bot in the botnet to execute actions, such as mass email spam, distributed denial of service (DDoS) attacks, or information theft
  • Unfortunately, given all of the benefits cybercriminals reap by hosting the more permanent parts of their infrastructures on TOR hidden services, more and more malware families are likely to shift to the Deep Web in the future

Cybercriminals often use these hidden services to host phishing kits, malware, or drive-by downloads. Additionally, they run shady marketplaces used to trade hacking tools, etc.

  • Early banking Trojans spread via phishing email. Malware communicates with a list of C&C servers
  • IP addresses are retrieved by downloading an encrypted file from some hard-coded TOR site
  • This provides the advantage of anonymising the location of a criminal server but not the users who access it
  • Newer variants spread via phishing, malvertising, and/or compromised web sites. The C&C list is now a list of domains calculated using an embedded formula
  • Infected computer goes down that domain list looking for a server that is still operational and responsive
  • By using TOR and an algorithm instead of hardcoded domains, automated attempts at mitigation are rendered inadequate
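
From a defender's point of view, the "embedded formula" idea is worth seeing in miniature. The sketch below is a hypothetical date-seeded formula, not the one used by any real malware family; the seed, the hash choice, and the reserved `.example` suffix are assumptions for illustration.

```python
import hashlib
from datetime import date

def candidate_domains(seed, day, count=5):
    """Derive a deterministic list of domain names from a seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        domains.append(digest[:12] + ".example")
    return domains

today_list = candidate_domains("demo-seed", date(2016, 9, 20))
tomorrow_list = candidate_domains("demo-seed", date(2016, 9, 21))
```

Because the list changes every day, blocking yesterday's domains does little. Because it is deterministic, analysts who recover the formula from a sample can compute upcoming domains ahead of time and feed them into blocklists.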

Another major malware family that uses the Deep Web is CryptoLocker

  • CryptoLocker refers to a ransomware variant that encrypts victims’ documents before redirecting them to a site where they can pay to regain access to their files.
  • Most variants leverage TOR to host payment sites. They require Bitcoin as a form of payment for the key to decrypt files
  • Can automatically adjust the payment page to account for victim’s local language
  • TOR helps protect the cybercriminals by making their environments more robust to possible takedowns
  • Doxing, in simple words, is a process that involves collecting someone’s private information using the internet
    • Such private information may include one’s name, location, email address, phone numbers, age and so on
    • Often hackers attempt to “unmask” a rival when there is a falling out amongst peers, essentially linking his/her real-world identity to his/her online one

“The phenomenon of doxing or exposing private information is by no means restricted to hackers versus hackers; it’s also quite common for hackers to target companies, celebrities, and other public figures.” Trend Micro

  • Often information leakage can be caused by a company insider, such as in the Snowden case with WikiLeaks
    • Cases like this involve the Deep Web and anonymous submission of new leaks to a form page
  • Cloudnine – lists possible dox information for public figures including:
    • Several FBI agents
    • Political figures like Barack and Michelle Obama, Bill and Hillary Clinton, Sarah Palin, US Senators and others
    • Celebrities like Angelina Jolie, Bill Gates, Tom Cruise, Lady Gaga, Beyonce, Dennis Rodman and more
  • It’s hard to know if these details are factual or not but often certain private information


The Deep Web is the largest part of the Internet, yet the majority of the population doesn’t even know about it, or even access it.

The Deep Web heavily uses protocols outside the standard HTTP/HTTPS, most commonly IRC, IRCS, Gopher, XMPP and FTP.

Darknets such as TOR represent a viable way for malicious actors to exchange goods, legally or illegally, in an anonymous fashion.

The cybercriminal underground definitely operates in the Deep Web

  • Stolen Accounts, passports, and identities of high profile personalities are sold in professional-looking forums with complete pricing information and descriptions

Prevalent malware families like VAWTRAK and CryptoLocker are using TOR as part of their configuration.

Anonymity in the Deep Web will continue to raise a lot of issues and be a point of interest for both law enforcers and Internet users who want to circumvent surveillance.
