Information Security magazine: How honeyclients make a better Internet

Is it a coincidence that Seinfeld’s Bee Movie comes out the same day I get the opportunity to pepper my own promo with honey-related puns? I will resist, not that there is anything wrong with that.

We all know that our Web browsers are security sinkholes, and that we run the risk of inadvertently downloading something nasty when we visit some random Web site. Sadly, the risk is huge, and doing research on honeyclients for a story in this month’s Information Security initially got me very depressed.

Microsoft and a rag-tag collection of open source volunteers are working on two independent efforts at honeyclients. In one of those Internet oddities, the project leaders are both named Wang — but are unrelated.

The two projects go by somewhat different names, but both build on the idea of a honeynet, a way to collect new security exploits across the Internet. Honeynets aren’t new – they have been around for several years and have a wide following in security circles. More recently, security researchers from private industry, academia, and government have developed the honeyclient variation.

The difference between honeynets and honeyclients is subtle but important. Honeynets are passive collectors: they sit idle on some random network on the Internet, waiting for a hacker to connect to them and leave evidence behind. Typically, they consist of a Web server and a stripped-down operating system with tracking software that registers when a hacker stops by and tries to compromise the system. While they are great at documenting the exploits used, they have one big disadvantage: they can’t go out and actively search for the bad guys who are running Web sites designed to infect unsuspecting visitors. They have to wait until someone tries to connect to them.


That is where honeyclients come into play. Instead of a Web server sitting idle, the honeyclients run Web browsers and are more active machines that seek out the infected sites around the Internet.


“Browsers and other client-side applications have become more and more the weakest link in the security chain,” says Thorsten Holz, one of the founders of the German Honeynet Project and co-author of a recent O’Reilly book on honeynets. “The vendors now take a closer look at hardening the OS, but client-side applications still have tons of vulnerabilities.”


There are several reasons for this shift from server to browser attacks, according to Michael Sutton, the security evangelist for SPIDynamics, a security software company recently acquired by HP. “This has been driven both by advancements in secure coding practices for server side software and more importantly by the explosion of phishing and identity theft attacks. Attackers have realized that it is easier to find a weak point when targeting employees and end users vs. a hardened server, which is actively protected.”


The data is fairly depressing. There are compromised Web sites in almost any subject category, according to honeynet researchers: “Anybody accessing the Web is at risk regardless of the type of content they browse for or the way the content is accessed. Adjusting browsing behavior is not sufficient to entirely mitigate such risk. Even if a user makes it a policy to only type in URLs rather than following hyperlinks, they are still at risk from typo-squatter URLs.”



Honeyclients have three components:

  • An automated script-based system that drives the PC and Web browser to visit a series of URLs in the hopes of finding a compromised Web server,
  • A recording program that documents changes to the PC – just like the one used on the honeynet, and
  • A series of virtual machines that are constructed so that multiple PC and browser sessions can be run on the same physical system. After each session is completed and any changes are recorded, the virtual machine is restarted with a clean image before trying the next URL in the sequence. This ensures that a known starting point is used for all connections, and that a potentially infected machine is contained as well.
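To make the three pieces concrete, here is a minimal sketch of the visit, record, and revert loop in Python. It is not code from either project: the `SimulatedVM` class, the `crawl` function, and the toy `web` table are all assumptions made up for illustration, and a real honeyclient would drive an actual browser inside a hypervisor rather than a Python set.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedVM:
    """Hypothetical stand-in for a virtual machine running a browser."""
    clean_state: frozenset
    state: set = field(default_factory=set)

    def revert(self):
        # Restore the known-clean image so every visit starts from the same point.
        self.state = set(self.clean_state)

    def visit(self, url, web):
        # Simulate a browser visit: a malicious site leaves artifacts behind.
        self.state |= web.get(url, set())

    def snapshot(self):
        # The "recording program": capture the current machine state.
        return frozenset(self.state)


def crawl(urls, vm, web):
    """Visit each URL in a clean VM and report any state changes it caused."""
    findings = {}
    for url in urls:
        vm.revert()
        before = vm.snapshot()
        vm.visit(url, web)
        changes = vm.snapshot() - before
        if changes:
            findings[url] = sorted(changes)
    return findings


# Toy "Internet" (assumed data): URL -> artifacts that a visit leaves behind.
web = {
    "http://benign.example/": set(),
    "http://bad.example/": {
        "file:C:\\Windows\\System32\\evil.dll",
        "reg:HKLM\\Software\\Run\\evil",
    },
}
vm = SimulatedVM(clean_state=frozenset({"file:C:\\Windows\\notepad.exe"}))
findings = crawl(list(web), vm, web)
print(findings)
```

The point of reverting before every URL is that each finding can be pinned on exactly one site, and an infection from one visit never contaminates the next.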


Because the honeynet server isn’t a destination site for any ordinary user, security researchers say that any access recorded by the server is probably from a hacker or someone up to no good. In contrast, researchers using honeyclients must discern which of the sites they visit are malicious and which are benign, since they are working from a collection of URLs whose security state they don’t know ahead of time.


Microsoft’s approach is called HoneyMonkey and is part of their research group. As you might imagine, it is focused on Windows and Internet Explorer specifically. The other effort is part of a larger open source group that has a series of programs that look at both IE and Firefox browser sessions, and grew out of the honeynet projects.


Let’s examine how the two approaches differ in the types of browser configurations used, how they are constructed, how many clients are known to be running at any given moment, what happens when they find a malicious site and whether the code is freely available for security researchers to examine and use themselves. We also talk about some of the lessons that IT security managers can learn from these efforts and the types of threats that they uncover (see sidebars 1 and 2 respectively).


Microsoft’s HoneyMonkey


The Microsoft project is not an actual product that is available to the public but an ongoing research effort that is one piece of an overall program for improving Windows and Internet security. This program has the following components:

  • The Flight Data Recorder component to track OS configuration changes caused by malicious sites,
  • The HoneyMonkey URL collection component,
  • A search page link scanning component


The project has evolved over time, to match the increasing sophistication that Internet hackers have gained. The HoneyMonkey project first began with a more general effort to try to better document Windows crashes and “blue screens” and track down their causes, what the company called building a “flight data recorder.” This “tracks everything that updates the file system and Windows registry,” says Yi-Min Wang, manager at Microsoft Research’s Cybersecurity and systems group.


Once the recorder was built, Wang wanted to expand the project’s focus beyond just finding bad Web sites, and examine the entire ecosystem in which a hacker operates to drive traffic to these sites using highly placed results on search pages.


“We now have a much broader understanding on how malicious sites fit into the bigger picture,” he says. “People use these Internet scams by first, getting placed in search places, then getting lots of traffic to visit their sites, and then exploiting the browsers of these visitors by placing malicious software on these machines and charging the authors of that software for these placements.”


As a result, finding the malicious Web sites is just the first step in a long chain of digital tracking, according to Wang. The bad sites also have to be removed from search results pages, so that unsuspecting visitors won’t click on links leading them to these sites. And the malware discovered needs to be sent to security specialists who can write the antidotes or create protection signatures so that future interactions can be prevented.


“There are now 2000 PCs running this software for different purposes, including 1,000 production servers,” he says. Each PC runs Virtual PC along with some custom code to drive Internet Explorer to visit a series of Web sites, and then record any changes to the operating system and browser configuration.


“We get a list of malicious URLs from the 2,000 PCs, and then seed this list to a second network of ten PCs where we visit the site again, but this time with a fully patched PC to see if a hacker can still get through to the PC. If they can that is a very serious exploit,” he says.
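The two-stage idea Wang describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `triage` function and its two predicate arguments are hypothetical, standing in for a real visit by an unpatched machine and by a fully patched one.

```python
def triage(urls, exploits_unpatched, exploits_patched):
    """Two-stage check: unpatched machines first, then a fully patched one.

    Both predicates are hypothetical stand-ins for a real honeyclient visit;
    each takes a URL and returns True if the visit compromised the machine.
    """
    # Stage 1: the large pool of unpatched machines flags suspicious URLs.
    stage1 = [u for u in urls if exploits_unpatched(u)]
    # Stage 2: only the flagged URLs are revisited with a fully patched machine.
    possible_zero_day = [u for u in stage1 if exploits_patched(u)]
    known_exploit = [u for u in stage1 if u not in possible_zero_day]
    return {"known_exploit": known_exploit, "possible_zero_day": possible_zero_day}


# Assumed sample data for illustration only.
urls = ["http://a.example/", "http://b.example/", "http://c.example/"]
unpatched_hits = {"http://b.example/", "http://c.example/"}
patched_hits = {"http://c.example/"}

result = triage(urls, unpatched_hits.__contains__, patched_hits.__contains__)
print(result)
```

Running the expensive second stage only on URLs that already succeeded against an unpatched machine is what lets a small second network keep up with a much larger first one.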


The project began back in July 2005, and it discovered a then-unknown exploit that has since been labeled Jview. This points out the real benefit of the honeyclient projects: because they are looking at the results of the underlying OS and browser configuration and not scanning for attack signatures or other behavioral patterns, they can uncover new forms of malware that may not be reported or publicized, giving security researchers a jump on the bad guys. “In the first few months when I ran this project, I discovered two instances of possible zero-day threats,” he says.


Given that Microsoft runs this operation, they also have their legal team involved when they find the actual owner of a malicious site. “Every time we detect a new malicious site, our legal department sends a takedown notice to the site’s ISP. This is a pretty expensive process. I don’t think the open source people are involved in this legal activity,” says Wang. Microsoft filed its first lawsuit last year as a result of using its HoneyMonkey data.


Open HoneyClient


A second effort is underway, led by researchers from Mitre Corporation, a government contract research shop, and with others from Germany and New Zealand. They have taken a different approach, more in keeping with a classical open source effort. They began by working to extend the original work on the honeypot server-based project. They publish the code and VMware images that can be used to construct their honeyclient system, and it is geared toward both IE and Firefox browsers. “We also have different configurations that we are testing, such as ranging from Windows XP with SP2 to XP without any patches,” says Kathy Wang, a lead information security engineer at Mitre and no relation to her Microsoft colleague.


Testing different XP versions is critical because it mimics the actual user experience, she says. “This is because machines running pirated versions of XP aren’t going to be able to obtain SP2 patches. We also are planning to look at more than browser exploits. This includes peer-to-peer applications and Domain Name System clients,” she says.


The open source efforts are looking at similar things as Microsoft’s HoneyMonkey: changes to the underlying Windows operating system, such as modified registry keys or new or deleted files in the system folders, as well as processes that have been changed or created. Changes that happen without any user intervention or consent tip off researchers that they are dealing with a malicious Web server that may be trying to take control of the user’s PC. The main difference is that the open source teams don’t have any legal firepower behind them, and rely on publicity and cooperation from security vendors and ISPs to block the malicious sites that are discovered.
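As a rough illustration of this kind of state-based detection, the following Python sketch compares two snapshots of a machine and reports what was created, deleted, or modified. The snapshot layout (category, then name, then a value such as a file hash) and the `diff_state` function are assumptions for this example, not the format either project actually uses.

```python
def diff_state(before, after):
    """Compare two snapshots shaped as {category: {name: value}}.

    Returns only the categories that changed, each with sorted lists of
    created, deleted, and modified names.
    """
    report = {}
    for category in sorted(set(before) | set(after)):
        old = before.get(category, {})
        new = after.get(category, {})
        changes = {
            "created": sorted(set(new) - set(old)),
            "deleted": sorted(set(old) - set(new)),
            "modified": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
        }
        if any(changes.values()):
            report[category] = changes
    return report


# Assumed sample snapshots taken before and after visiting one URL.
before = {
    "files": {"C:\\Windows\\System32\\kernel32.dll": "aa11",
              "C:\\Windows\\System32\\old.dll": "bb22"},
    "registry": {"HKLM\\Software\\Example\\Run": "updater.exe"},
    "processes": {"explorer.exe": "running"},
}
after = {
    "files": {"C:\\Windows\\System32\\kernel32.dll": "ff99",
              "C:\\Windows\\System32\\dropper.exe": "cc33"},
    "registry": {"HKLM\\Software\\Example\\Run": "dropper.exe"},
    "processes": {"explorer.exe": "running", "dropper.exe": "running"},
}

report = diff_state(before, after)
print(report)
```

Because nothing here depends on knowing what the malware looks like, the same comparison would flag a brand-new exploit just as readily as an old one.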


Mitre started in 2005 with seven machines, and the New Zealand group at Victoria University has another dozen. There are certainly more systems scattered all over the world, but the exact number is unknown because anyone can download the code and install it at will.


So far, the group at Mitre has found at least ten new malware variants. “All of these are ones that the major anti-virus products weren’t able to initially detect,” says Ms. Wang. Another hotbed of honeyclient activity is with a German/New Zealand group of researchers, who found 306 malicious URLs earlier this year, from 194 hosts from an initial population of over 300,000 URLs.


The German/New Zealand group also did tests to compare whether Firefox is more or less vulnerable to exploits than IE. While IE’s security issues have gotten more press attention, Firefox has more known issues and patches listed on SecurityFocus. The team concludes that Firefox is actually a better and safer browser, in spite of all of these patches. “We suspect that attacking Firefox is a more difficult task as it uses an automated and immediate update mechanism.” IE has the ability for automatic updates, but only when users enable the overall Windows update feature. “Since Firefox is standalone application that is not as integrated with the operating system as Internet Explorer, we suspect that users are more likely to have this update mechanism turned on. Firefox is truly a moving target.”


The German/New Zealand team has also developed a test that anyone can run against a suspected Web server: you enter a suspect URL, and the service reports whether it suspects the site of running malware.


What is next for the open systems honeyclient project teams is to coordinate how all of their downloaded tracking systems scan the overall Internet, similar to how SETI@Home coordinates the scanning of radio signals from outer space. They are working on extensions to the honeyclient project that will enable wide scale distribution of their software.


Ms. Wang sums it up nicely: “It is time to start learning by winning this war, and we need to find the attackers and stop them before they come to us and compromise our machines. Most of us are far too reactive in defending our systems, and we need to be a lot more proactive. Once we get a lot more players in this arena, then we have a club where we can share information on trends and attack vectors. Then you don’t have to be defenseless from zero day attacks.”



Sidebar 1: How an IT Manager can benefit


The honeyclient research efforts have very real practical application for IT and security managers, and can help improve browser and network security practices in everyday corporate use.


A good place to start to understand the scope of honeyclients is to download the German/New Zealand paper referenced earlier along with the data set of suspected URLs and description of potential mitigation actions that an IT security administrator can take to try to avoid Web-based infections. They have suggestions for creating URL blacklists, what to patch and when, and choosing the right desktop browser software.
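For the blacklist suggestion, one plausible approach is to block by host name rather than by individual URL, since many malicious URLs can live on the same host. The Python sketch below shows that idea using only the standard library; the function names and sample URLs are made up for illustration and are not taken from the paper.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def build_blacklist(malicious_urls):
    """Collapse a list of malicious URLs into a set of host names."""
    return {host_of(u) for u in malicious_urls if host_of(u)}


def is_blocked(url, blacklist):
    """True if the URL's host, or any parent domain of it, is blacklisted."""
    parts = host_of(url).split(".")
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


# Assumed sample data: three malicious URLs on two hosts.
blacklist = build_blacklist([
    "http://bad.example/a",
    "http://bad.example/b",
    "http://evil.example:8080/x",
])
print(sorted(blacklist))
print(is_blocked("http://www.bad.example/index.html", blacklist))
print(is_blocked("http://good.example/", blacklist))
```

Matching parent domains is a design choice that trades precision for coverage: it also blocks subdomains of a listed host, which may or may not be what an administrator wants.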


One of their recommendations is to use a browser with minimal market share: “The tests we conducted show that a simple but effective way to remove yourself as a targeted user is to use a non-mainstream application, such as Opera. Despite the existence of vulnerabilities, this browser didn’t seem to be a target,” said the paper’s authors. Of course, one problem with picking a less-known browser is that many sites don’t work well when viewed with it.


Next, IT managers need to ensure that their users’ machines follow best practices: running personal firewalls or host-based intrusion detection software, upgrading to IE7 if they are using IE, and running users in non-administrative modes to prevent possible infections. Unlike earlier versions, IE7 runs a separate “sandbox” by default to limit its exposure.


Thorsten Holz also reminds IT managers to “make sure that you also patch third-party applications and your client software,” particularly those that make up any supported browser plug-ins, viewers and other secondary pieces that are commonly used along with the main browser software. “Everyone now should have a pretty good understanding of the Microsoft patch cycle, but do you also patch all of your Shockwave or Flash clients too?”


Academic and research institutions can also download the honeyclient software and participate in the testing process. The team behind the project would like wider dissemination since “the more people who run it and report data back, the more correlation we can do,” says Ms. Wang. “We want to know if some organization is being targeted by a malicious Web server that is going after particular IP addresses,” such as those owned by a financial or government institution.



Sidebar 2: The Web malware infection cycle


“We are fighting a very hard battle, our adversaries are very motivated,” says Mitre’s Kathy Wang. “They have a super easy way of making money without a lot of consequences with law enforcement, they are very clever and can get around things.”


So how can a hacker make money at browser exploits? It is a rich and varied ecosystem, supported by many different players and income streams. (A more complete explanation can be found in a paper from Exploit Prevention Labs.)


First, someone develops the exploit code itself, typically a rootkit, keylogger, browser toolbar or some other kind of program. This code is then sold to a third party, who places it on a variety of Web sites around the Internet. These may be legitimate sites that have been compromised, or infected banner ads that are inserted on an ad-serving network, or adware distributors. When a visitor connects to these sites, the code is downloaded without their knowledge and placed on their machine. These machines form the basis of a botnet that can be controlled by the hacker.


But that is just the beginning of the process. The sites need traffic, and the best way they can get this is to be found by search engines that will direct visitors to them. “A lot of sites are doing redirection, the URL goes to a server and that is what serves up the exploit. So we have to trace each redirect to see who is doing the exploit, and that is done automatically,” says Yi-Min Wang of Microsoft. There are also so-called typo-squatter domains that are a letter or two off from popular destinations that try to capture legitimate traffic, too.
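The redirect tracing Wang mentions can be pictured as following a chain of hops until it ends at the server that actually delivers the exploit. Here is a minimal Python sketch of that idea. It works on an in-memory table of redirects instead of live HTTP requests, and the `trace_redirects` function, the hop limit, and the sample URLs are all assumptions for illustration rather than Microsoft's actual method.

```python
def trace_redirects(start_url, redirects, max_hops=10):
    """Follow a redirect chain and return every URL visited, in order.

    `redirects` maps a URL to the URL it redirects to. The walk stops at a
    URL with no further redirect, on a loop, or after `max_hops` hops.
    """
    chain = [start_url]
    seen = {start_url}
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in seen:
            # Redirect loop: stop rather than spin forever.
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


# Assumed sample data: a landing page that bounces through one hop.
redirects = {
    "http://landing.example/": "http://hop.example/r",
    "http://hop.example/r": "http://exploit.example/payload",
}

chain = trace_redirects("http://landing.example/", redirects)
print(" -> ".join(chain))
print("exploit served by:", chain[-1])
```

Keeping the whole chain, not just the final URL, matters because every intermediate host is a participant in the scheme and a candidate for blocking.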


The botnets are used to visit sites that are owned by other parties and collect pageviews that will elevate them in the search engine rankings, so that even more traffic will come their way. “Some sites don’t have any malicious software and just serve up banner advertisements and profit from the traffic,” says Microsoft’s Wang.


The bad guys are getting better at spotting the honeyclients, according to Mitre’s Wang: “Because we use VMware server, the hackers are looking for obvious signs that the incoming request is coming from a VM environment, such as querying for an I/O port, instruction set, and device driver information.”









0 thoughts on “Information Security magazine: How honeyclients make a better Internet”

  1. Malicious code on websites is an ever increasing problem. Previously, you could pretty much avoid hitting this type of stuff if you steered clear of certain categories of websites. Kind of like in the physical world where you were “safer” if you didn’t walk through that bad part of the city. Unfortunately, bad guys are finding more ways to inject their malicious code into legitimate, highly trafficked websites. This makes the problem infinitely worse…

    I am actually heading up the WASC Distributed Open Proxy Honeypot Project and we are seeing more and more examples of sites sending malicious javascript to clients to try and exploit browser flaws and install malware. I am actually giving a project update presentation at the upcoming WASC/OWASP AppSec 2007 conference and I will be discussing 2 specific examples of this. One tries to exploit the MS Windows Media Player 10 Plugin Overflow vulnerability (MS06-006), while the other was actually part of a web defacement that used JS to try and install an EXE program through both ActiveX and VBS.

    Scary stuff out there.

    Ryan C. Barnett
    ModSecurity Community Manager
    Breach Security: Director of Training
    Web Application Security Consortium (WASC) Member
    CIS Apache Benchmark Project Lead
    Author: Preventing Web Attacks with Apache

  2. One reader writes:

    Sadly, the dangers appear to infest even the most reputable sites. A guy I work with, an otherwise quite savvy senior technical architect, was ripped off for $3000 last week through an unfortunate series of events that began with his clicking a Buy It Now button on an eBay-hosted page that he’d arrived at from an eBay search. The link took him off site to a well-executed phishing page, where he was informed that the merchant didn’t accept Paypal and would prefer a wire transfer. According to my friend, there were moments that didn’t quite smell right, but the stink was never rank enough to overcome the excitement of having gotten a great deal (on a motorcycle, I believe), especially since the transaction had started on a legitimate eBay page. He was never aware of having left eBay, although when he reconstructed it later he discovered that none of the pages after the Buy It Now button were actually hosted by eBay.

    When he realized what had happened, my friend contacted eBay, only to be told that they had no responsibility. I’m sure that’s true, but I wonder how many eBay customers would realize that. The whole story strikes me as somewhat shocking. For all the warnings about the dangers of phishing, etc., sites like eBay and Amazon are so well-established and legitimate in the eyes of consumers that I’m sure many people disregard their natural caution in dealing with them, as did my colleague. Clearly, doing so can be a costly mistake.
