When anonymous web data isn’t anymore

One of my favorite NY Times technology stories (other than, ahem, my own articles) is one that ran more than ten years ago. It was about a supposedly anonymous AOL user who was picked out of a huge database of search queries by researchers. They were able to correlate her searches and track down Thelma, a 62-year-old widow living in Georgia. The database was originally posted online by AOL as an academic research tool, but it was removed after the Times story broke. The data “underscore how much people unintentionally reveal about themselves when they use search engines,” said the Times story.

In the years since that story, tracking technology has gotten better and Internet privacy has all but disappeared. At the DEFCON trade show a few weeks ago in Vegas, researchers presented a paper on how easy it can be to track down folks based on their digital breadcrumbs. The researchers set up a phony marketing consulting firm and requested anonymous clickstream data to analyze. They were able to tie real users to the data through a series of well-known tricks, described in this report in Naked Security. They found that if they could correlate personal information across ten different domains, they could figure out which user was common to all of those sites, as shown in this diagram published in the article.
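To see why a handful of domains can be enough, here is a minimal sketch in Python of the general idea of narrowing down a pseudonymous ID by intersection. It is not the researchers’ actual method or data: the clickstream records, the `.example` domains and the function names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical "anonymous" clickstream: (pseudonymous_id, domain) pairs.
CLICKSTREAM = [
    ("u1", "news.example"), ("u1", "bank.example"), ("u1", "gardening.example"),
    ("u2", "news.example"), ("u2", "bank.example"), ("u2", "sports.example"),
    ("u3", "news.example"), ("u3", "gardening.example"), ("u3", "travel.example"),
]


def visitors_by_domain(clickstream):
    """Map each domain to the set of pseudonymous IDs that visited it."""
    index = defaultdict(set)
    for user_id, domain in clickstream:
        index[domain].add(user_id)
    return index


def candidates(clickstream, known_domains):
    """Return the IDs that visited every domain in known_domains."""
    index = visitors_by_domain(clickstream)
    remaining = None
    for domain in known_domains:
        visitors = index.get(domain, set())
        # Each extra domain we know about can only shrink the candidate pool.
        remaining = visitors if remaining is None else remaining & visitors
    return remaining or set()


# One known domain leaves three candidates, two domains leave two,
# and three domains leave a single pseudonymous ID.
for count in (1, 2, 3):
    known = ["news.example", "bank.example", "gardening.example"][:count]
    print(count, sorted(candidates(CLICKSTREAM, known)))
```

Each additional site you know a person visits shrinks the pool of matching IDs, which is why “anonymous” browsing histories are so hard to keep anonymous.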

The culprits are browser plug-ins and embedded scripts on web pages, which I have written about before here. “Five percent of the data in the clickstream they purchased was generated by just ten different popular web plugins,” according to the DEFCON researchers.

So is this just some artifact of gung-ho security researchers, or does this have any real-world implications? Sadly, it is very much a reality. Last week Disney was served legal papers alleging that it secretly collects kids’ usage data from its mobile apps. The suit says that the apps (which don’t ask for parents’ permission before kids use them, which is illegal) can track the kids across multiple games, all in the interest of serving up targeted ads. The full list of 43 apps that have this tracking data can be found here, including the one shown at right.

So what can you do? First, review your plug-ins and delete the ones that you don’t really need. In my article linked above, I try out Privacy Badger and have continued to use it. It can be entertaining or terrifying, depending on your POV. You could also regularly delete your cookies and always run private browsing sessions, although you give up some usability by doing so.

Privacy just isn’t what it used to be. And it is a lot of hard work to become more private these days, for sure.

Everyone is now a software company (again)

Several years ago I wrote, “everyone is in the software business. All of the interesting business operations are happening inside your company’s software.” Since then, this trend has intensified. Today I want to share with you three companies that should come under the software label. And while you may not think of these three as software vendors, all three run themselves like a typical software company.

The three are Tesla, Express Scripts, and the Washington Post. It is mere happenstance that they also make cars, manage prescription benefits and publish a newspaper. Software lies at the heart of each company, as much as it does at a Google or a Microsoft.

In my blog post from 2014, I talked about how the cloud, big data, creating online storefronts and improving the online customer experience are driving more companies to act like software vendors. That is still true today. But now there are several other things to look for that make Tesla et al. into software vendors:

  • Continuous updates. One of the distinguishing features of the Tesla car line is that the cars update themselves while they are parked in your garage. Most car companies can’t update their fleet as easily, if at all. You have to bring those cars in for servicing to make any changes to how they operate. Tesla’s dashboard is mostly contained inside a beautiful and huge touch LED screen: the days of dedicated dials are so over. These continuous updates are also the case for The Washington Post website, so they can stay competitive and current. The Post publishes more total articles than the NYTimes, even though the Times has double the reporting staff of the DC-based paper. That shows how seriously they take their digital mission too.
  • These companies are driven by web analytics and traffic and engagement metrics. Just like Google or some other SaaS-based vendor, The Washington Post post-Bezos is obsessed with stats. Which articles are being read more? Can they get quicker load times, especially on mobile devices? Will readers pay more for this better performance? The Post will try out different news pegs for each piece to see how it performs, just like a SaaS vendor does A/B testing of its pages.
  • Digital products are the drivers of innovation. “There are no sacred cows [here, we] push experimentation,” said one of the Post digital editors. “It is basically, how fast do you move? Innovation thrives in companies where design is respected.” The same is true for Express Scripts. “We have over 10 petabytes of useful data from which we can gain insights and for which we can develop solutions,” said their former CIO in an article from several years ago.
  • Scaling up the operations is key. Tesla is making a very small number of cars at present. They are designing their factories to scale up, to where they can move into a bigger market. Like a typical SaaS vendor, they want to build in scale from the beginning. They built their own ERP system that shortens the feedback loop from customers to engineers and manages their entire operations, so they can make quick changes when something isn’t working. You don’t think of car companies being so nimble. The same is true for Express Scripts. They are in the business of managing your prescriptions, and understanding how people get their meds has become more of a big data problem. They can quickly figure out if a patient is following their prescription and predict the potential pill waste if they aren’t. The company has developed a collection of products that tie in an online customer portal to their call center and mobile apps.

I am sure you can come up with other companies that make normal stuff like cars and newspapers that you can apply some of these metrics to. The lessons learned from the software industry are slowly seeping into other businesses, particularly those businesses that want to fail fast and adapt more quickly as their markets and customers change.

SecurityIntelligence blog: Tracking Online Fraud: Check Your Mileage Against Endpoint Data

A recent Simility blog post detailed how it is tracking online fraud. With the help of a SaaS-based machine learning tool, the company and its beta customers have seen a 50 to 300 percent reduction in fraudulent online transactions. This last January, they looked at 100 different behaviors across 500,000 endpoints scattered around the world. They found more than 10,000 of those devices were compromised, and then looked for patterns of similar behavior. They found seven commonalities, and some of them are surprising.

You can read my blog post on IBM’s SecurityIntelligence.com here.

IBM SecurityIntelligence blog: Can You Still Protect Your Most Sensitive Data?

An article in The Washington Post called “A Shift Away From Big Data” chronicled several corporations that are actually deleting their most sensitive data files rather than saving them. This runs counter to today’s collect-it-all, data-heavy landscape.

However, enterprises are looking to own their encryption keys and protect the privacy of their metadata. Plus, there is a growing concern that American-based companies are more vulnerable to government requests than offshore businesses.

You can read more on IBM’s SecurityIntelligence.com blog here.

The blockchain world gets more interesting by the day

I was at a conference last week where everyone was doing some interesting things with blockchain technology. This is the not-so-secret sauce behind Bitcoin: a transaction log that is verifiable and can be synchronized across distributed servers and still handle multiple trust relationships, where chargebacks can’t happen and where the crypto is strong enough to have banks and other financial institutions spending millions of dollars supporting dozens of startups.
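To make the “verifiable transaction log” idea concrete, here is a minimal sketch in Python of a hash-chained log, where each entry’s hash covers the previous entry’s hash. It is only an illustration under simplifying assumptions: it leaves out the distribution across servers, the consensus rules and the digital signatures that real blockchains depend on, and the function names and sample payloads are made up.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry that is chained to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev_hash,
                "hash": entry_hash(prev_hash, payload)})


def verify(log: list) -> bool:
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"from": "alice", "to": "bob", "amount": 5})
append(log, {"from": "bob", "to": "carol", "amount": 2})
print(verify(log))               # the untouched log checks out
log[0]["payload"]["amount"] = 500
print(verify(log))               # rewriting history is detected
```

Because every hash depends on everything that came before it, quietly changing an old transaction means recomputing every later entry, and that is the property that makes reversing a recorded transaction so difficult.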

I have written before about blockchain tech for IBM’s SecurityIntelligence blog here, but what got me interested in the conference was how practical blockchain implementations have been and will be. This is especially true in the world of supply chains, where goods move across the globe under a variety of incomplete and error-prone tracking circumstances.

Indeed, at the conference I saw lots of blockchain apps that related to supply chains and had almost nothing to do with cryptocurrencies. This is an industry that is ripe for change. As one analyst has written, many supply chains have data quality issues and automation has failed to deliver significant productivity gains. That could change with these new apps.

For example, there is a company called Everledger.io. The idea is to attach a unique digital signature to each and every diamond that is traded on the various international exchanges. This signature can be immediately verified against the actual item itself – like the way a checksum can be used to verify if a digital file has been altered – to ensure that the diamond hasn’t been tampered with or substituted. So far they have been able to track close to a million diamonds in this fashion. According to insurers, about seven percent of the world’s diamonds are fraudulent in one way or another. Last fall, data from the Gemological Institute of America, the main diamond industry certification body, was altered by hackers.
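The checksum analogy is easy to show in a few lines of Python. This is only a sketch of the file-checksum half of the comparison: how Everledger actually derives a signature from a physical stone isn’t described here, and the certificate text below is a made-up example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_checksum: str) -> bool:
    """Compare a fresh checksum against the one recorded earlier."""
    return fingerprint(data) == recorded_checksum


# Hypothetical certificate text, recorded when the item is first registered.
original = b"carat=1.02;cut=round;color=G;clarity=VS1"
recorded = fingerprint(original)

print(is_unaltered(original, recorded))   # same bytes, same checksum
tampered = b"carat=2.02;cut=round;color=G;clarity=VS1"
print(is_unaltered(tampered, recorded))   # one changed digit, different checksum
```

Changing even a single character produces a completely different checksum, so a recorded checksum that can’t be rewritten makes any later alteration stand out.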

We are still in early days, but you can see that detecting when counterfeit goods enter a supply chain is a problem that is ripe for blockchain applications. Sending prescription drugs around the world is another high-value use case, and several teams are working on blockchain apps for it.

One FedEx manager was on a panel where they spoke about how they need new technology for managing their supply chain. “The immutability of the transaction is important for us: are you who you say you are, and are you shipping what you say you are shipping?” They spend a lot on insurance and it would be nice if they could leverage blockchain tech to prove that a package actually did make it to the final destination, with something other than an illegible signature.

While they can track a package from when it leaves your door through their shipment network, that only works if they have control over the shipment from end to end. That isn’t always the case, especially internationally, where it can be more cost-effective to hand off a package to another shipper. The panel also brought up an interesting question about what constitutes a delivery address, with one of the panelists holding up his phone and saying that he wants to be able to have something delivered right to where he is at the moment. That has a lot of appeal to me, as I recall how many hours I have spent trying to find a package delivery person when I stepped out of my office for a moment.

Also speaking was a representative of Chattanooga-based Dynamo, a new accelerator for supply chain ventures. They are funding several blockchain-related startups. “It isn’t just about saving money with these kinds of businesses, but about finding opportunities to expand commerce.”

The conference started off with a speech from Brian Behlendorf, who is now in charge of the Hyperledger project that is part of the Linux Foundation. He has been around the tech industry for a long time, putting up Wired magazine’s early website and developing numerous open source projects. The idea behind Hyperledger is to have an open source project that can be used in a number of blockchain circumstances. Think of what the Apache programmers did for web servers decades ago: Hyperledger will attempt the same thing, with a set of protocols and standard infrastructure on which to build blockchain apps.

Before the conference took place, a pre-conference hackathon was held and more than a dozen teams and 50 people participated to win the top prize of $20k. The winners included college students, which should give you an idea of how quickly blockchain is evolving. Unlike many hackathons where the winners get to pose with an oversize check, in this case the winning teams’ prize money was preloaded in bitcoin on a special cryptokey, which was quite fitting. The first place finishers wrote an app to eliminate ID fraud, using blockchain to encrypt and validate who you actually are.

Blockchain isn’t all about the supply chain: the banks are getting involved too. A private effort from R3 has more than 40 financial services supporters trying to create standards for distributed ledgers. Barclays has more than 45 Bitcoin-related projects. Deloitte has a group based in Toronto doing cryptocurrency and blockchain consulting. A Berlin neighborhood has dozens of retailers who accept bitcoins. Finally, there are other currencies that are gaining traction, including Ethereum and Dash.org, that attempt to improve upon the original bitcoin specifications and are further fueling blockchain interest.

It looks like there will be lots of blockchain-related news in the coming months.

For Immediate Release: a podcast for B2B Marketers

I am returning to a regular series of podcasts with my long-time former partner Paul Gillin, called For Immediate Release: B2B. Paul and I co-hosted almost 100 episodes of MediaBlather several years ago, and many of those shows have held up well. They talk about how technical PR and marketing communications professionals can leverage new media and other strategies.

In this week’s show, we talk about the upcoming merger between Microsoft and LinkedIn (Paul and I are split on whether it is a good thing), and interview Radius.com CEO Darian Shirazi about predictive analytics and its utility for marketing and customer retention.

iBoss blog: When geolocation goes south

What do a Kansas farm and a seaside McMansion have in common? Both have been discovered as the result of various geolocation-programming errors over the past several years.

Certainly the use of global positioning system (GPS) chips now built into tablets and smartphones is mostly a benefit when it comes to navigating to a meeting spot or finding a nearby gas station or restaurant. But the ubiquity of GPS tech also has its downsides.

Take the case of that Kansas pasture. For more than a decade, the owners of a small family farm outside of Wichita got regular visitors and calls from people who thought their farmhouse was the center of criminal activity or digital abuse. The reason had to do with a deliberate rounding of the latitude and longitude for the center of the continental US. Thanks to software that matches up IP addresses with locations, their farm was showing up on thousands of records as the default location for scammers and other questionable activity.

“The harassment continued to the point where the local sheriff had to intervene. He placed a sign at the end of their driveway warning people to stay away from the house and to call him with questions,” according to the post. Sadly, that didn’t help. One irate visitor even dumped a defective toilet on their driveway in frustration.

Others around the country have suffered a similar fate, such as a man in Ashburn, Virginia, whose home has been attached to millions of IP addresses from the Internet service providers located in nearby data centers. Think of them as “living in an IP flood zone,” as the above article calls these geolocation disasters.
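The mechanism behind both cases can be sketched in a few lines of Python. This is a simplified illustration, not how any particular geolocation vendor’s database works: the lookup table, the coordinates and the accuracy radius values are all made-up placeholders, and the IP addresses come from the ranges reserved for documentation.

```python
from collections import Counter
import ipaddress

# Hypothetical lookup table: network -> (latitude, longitude, accuracy radius in km).
KNOWN_NETWORKS = {
    ipaddress.ip_network("203.0.113.0/24"): (41.88, -87.63, 10),
}

# Illustrative country-level fallback: one rounded point standing in for
# "somewhere in the US", with a very large accuracy radius.
COUNTRY_DEFAULT = (38.0, -97.0, 3000)


def locate(ip: str):
    """Return (lat, lon, radius_km), falling back to the country default."""
    address = ipaddress.ip_address(ip)
    for network, location in KNOWN_NETWORKS.items():
        if address in network:
            return location
    return COUNTRY_DEFAULT


def pileup(ips):
    """Count how many addresses resolve to each (lat, lon) point."""
    return Counter(locate(ip)[:2] for ip in ips)


lookups = ["203.0.113.5", "198.51.100.1", "198.51.100.2", "192.0.2.9"]
print(pileup(lookups))
```

Every address the table doesn’t recognize lands on the same rounded point. The accuracy radius is the only hint that the point means “somewhere in the country” rather than a specific house, and anyone who reads just the latitude and longitude will miss it.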

However, there certainly are other unintended consequences. One report credits the tracking bracelet worn by noted cartel boss El Chapo with helping his confederates aim the escape tunnel they dug so that it came out precisely inside his cell. And an ISIS fighter found out too late that his Tweets were being geo-tagged, broadcasting his whereabouts. In another case, a divorce lawyer monitored the social media of his client’s Gen Y children to geolocate properties that weren’t mentioned in the original filings. “We were able to go to the court with a list of assets that we conservatively estimated at $60 million, which the court then seized.”

Even if you don’t geotag your social media posts, there are still ways to figure out where you live, according to this academic research paper published last year. The scientists examined their friends’ geolocations and were able to estimate the target within a few miles.

So what can you do to prevent this? First, understand the accuracy limitations of any enterprise-level geolocation technology that you use. Actual mileage, as the saying goes, can vary. Although geolocation technology has been around for more than a decade, it isn’t precise enough to pinpoint a particular household or street address. Facebook’s “safety check” warnings that its users might be inside a disaster zone turned out initially to not be very accurate, after people around the world were warned they might be near a bombing in Pakistan. Hopefully, the alert location algorithms have been improved since then.

Second, examine the developer tools that are available to employ geolocation and understand what apps you are trying to build. Look at what Google and Facebook are doing in this field and how you can tie into existing mapping efforts from these giant software vendors.

Finally, examine the settings for any corporate-owned phones and tablets and make sure you turn the geolocation features off if this is a concern.

Using citizen science to hunt for new planets

When I was growing up, one of my childhood heroes was Clyde Tombaugh, the astronomer who discovered Pluto. Since then, we have demoted Pluto from its planetary status. But it still was a pretty cool thing to be someone who discovered a planet-like object. Today, you have the opportunity to find a new planet, and you don’t even need a telescope or have to spend lonely cold nights at some mountaintop observatory. It is all thanks to an aging NASA spacecraft and how the Internet has transformed the role of public and private science research.

Let’s start at the beginning, seven years ago when the Kepler spacecraft was launched. Back then, it was designed to take pictures of a very small patch of space that had the most likely conditions to find planets orbiting far-away stars. (See above.) By closely scrutinizing this star field, the project managers hoped to find variations in the light emitted by stars that had planets passing in front of them. It is a time-tested method that Galileo used to discover Jupiter’s moons back in 1610. When you think about the great distances involved, it is pretty amazing that we have the technology to do this.
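The basic idea of looking for a dip in a star’s brightness can be sketched in Python. This is a toy illustration, not the Kepler team’s actual pipeline: the brightness readings are made up, and the one percent threshold is an arbitrary assumption.

```python
from statistics import median

# Made-up relative brightness readings for one star, taken at regular intervals.
FLUX = [1.000, 1.001, 0.999, 0.985, 0.984, 0.986, 1.000, 1.002, 0.999, 1.001]


def find_dips(flux, depth=0.01):
    """Return the indices where brightness falls more than `depth`
    (as a fraction) below the star's median brightness."""
    baseline = median(flux)
    return [i for i, value in enumerate(flux) if value < baseline * (1 - depth)]


print(find_dips(FLUX))  # the three readings in the middle stand out
```

A real planet would produce the same dip again and again at regular intervals as it orbits, which is the kind of repeating sequence that both the algorithms and the volunteers are looking for in much noisier data.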

Since its launch, key parts of the spacecraft have failed, but researchers have figured out how to keep it running by using the pressure of sunlight to keep the cameras properly aligned. As a result, Kepler has been collecting massive amounts of data and downloading the images faithfully over the years, and more than 1,000 Earth-class (or M class, from Star Trek) planets have already been identified. There are probably billions more out there.

NASA has extended Kepler’s mission as long as it can, and part of that extension was to establish an archive of the Kepler data that anyone can examine. This effort, called Planethunters.org, is where the search for planets gets interesting. NASA and various other researchers, notably from Chicago’s Adler Planetarium and Yale University, have enlisted hundreds of thousands of volunteers from around the world to look for more planets. You don’t need a physics degree, a sophisticated computer or any Big Data algorithms. Instead, you need a keen mind, the eyesight to pore over the data and the motivation to try to spot a sequence that would indicate a potential planetary object.

What is fascinating to me is how this crowd-based effort has been complementary to what has already happened with the Kepler database. NASA admits that it needs help from humans. As they state online, “We think there will be planets which can only be found via the innate human ability for pattern recognition. At Planet Hunters we are enlisting the public’s help to inspect the Kepler [data] and find these planets missed by automated detection algorithms.”   

Think about that for a moment. We can harness the seemingly infinite computing power available in the cloud, but it isn’t enough. We still need carbon-based eyeballs to figure this stuff out.

Planet Hunters is just one of several projects that are hosted on Zooniverse.org, a site devoted to dozens of crowdsourced “citizen science” efforts that run the gamut of research. Think of what Amazon’s Mechanical Turk does by parcelling out pieces of data that humans classify and interpret. But instead of helping some corporation, you are working together on a research project. And it isn’t just science research: there is a project to help transcribe notes from Shakespeare’s contemporaries, another one to explore WWI diaries from soldiers, and one to identify animals captured by webcams in Gorongosa National Park in Mozambique. Many of the most interesting discoveries from these projects have come from discussions between volunteers and researchers. That is another notable aspect: in the past, you needed at least a PhD or some kind of academic street cred to get involved with this level of research. Now anyone with a web browser can join in. Thousands have signed up.

Finally, the Zooniverse efforts are paying another unexpected benefit: participants are actually doing more than looking for the proverbial needle in the haystack. They are learning about science by doing the actual science research. It is taking something dry and academic and making it live and exciting. And the appeal isn’t just to adults, but to kids too: one blog post on the site showed how nine-year-old Czech kids got involved in one project. That to me is probably the best reason to praise the Zooniverse efforts.

So far, the Planet Hunters are actually finding planets: more than a dozen scientific papers have already been published, thanks to these volunteers around the world on the lookout. I wish I could have had this kind of access back when I was a kid, but I also have no doubt that Tombaugh would be among these searchers, had he lived to see this all happening.