
Gatineau A Threat?

In its Web Analytics Report, CMS Watch, a company that analyzes content management technologies, identifies the arrival of Microsoft's analytics solution (code named Gatineau) as one of several disruptive events in the web analytics market. The report points to three events that have recently had a major impact on web analytics customers:

  • Changes in WebTrends' senior management,
  • The acquisition of Visual Sciences by Omniture,
  • The Gatineau beta.

One of Gatineau's differentiating features is the ability to display demographic data—age, gender, and occupation—about a site's visitors.

The CMS Watch report strikes a note of caution about Gatineau's introduction of visitors' demographic data into web analysis.

“Customers will be able to use ‘anonymized’ demographic data from site visitors who have signed up for a Live ID through Microsoft's Hotmail or Messenger, however, this will need to be spelled out clearly in the privacy policies of web sites using Gatineau to allay possible site visitor concerns.”

Is Gatineau a threat to privacy?

If you manage a commercial web site, your purpose is likely to sell a product, acquire a lead, or reinforce a brand. The more you know about your site's visitors, the better you can accomplish your goal by serving the right content to the right people. Gatineau provides unique insight into the age, gender, and occupation of your site's visitors, but is it a potential risk to your visitors' privacy?

CHS: Capitol Hill Seattle offers an example of the value of demographic data. It's a blog about the fancy-pants part of Seattle's Capitol Hill neighborhood, and it uses Gatineau to understand which articles interest different segments of its audience. Some of the data is predictable (men prefer articles about sex, women prefer articles about baby names), but the overall age of the audience is surprisingly mature for a blog that caters to the trendy Capitol Hill crowd. If you're Justin Carder, the person responsible for CHS, knowing that a large number of your readers are more than 35 years old makes a tremendous difference in your choice of content and writing style.

What's useful to a publisher may be threatening to the public, however. Concern over the privacy of personal information is evident even in the comments on CHS.

Ian Thomas is Microsoft's Director of Customer Intelligence, the man responsible for Gatineau. On his blog, Lies, Damned Lies, he refers to a white paper describing Microsoft's anonymization process (I'm pretty sure that's a legitimate word) with the imposing title Privacy Protections in Microsoft's Ad Serving System and the Process of “De-identification”.

Although it's a surprisingly accessible white paper, I don't expect you to read all 11 pages. Here are the Cliff Notes.

What does Microsoft know about you?

Microsoft stores demographic and behavioral targeting data about a person separately from their contact information with strong safeguards in place to prevent “unauthorized correlation” of the separate data sets. The key that joins these data sets is an Anonymous ID—the ANID.

What kind of data is accumulated? Certainly the information you supply when signing up for Hotmail or any number of Microsoft services. As well, your behavior on Microsoft web sites—which sites you visit, which parts of those sites, and how often. Also, publicly available data supplied by third parties may be used to complete your profile. That's potentially a lot of data. Safeguarding that data is a big deal.

All of this data depends upon a cookie. There are a great many misconceptions about cookies. Basically, a cookie is a small text file that resides on your hard drive, simple but very specific. It contains at least one unique value that identifies a specific web browser on a specific machine. As long as you use the same browser on the same machine, you can be identified upon your return. Unless, of course, you delete your cookies.

A cookie provides an identity that persists across web sessions. Web sites have no memory of their own—they live entirely in the moment—but your browsing behavior can be saved to a database. Without the cookie to bridge separate sessions, however, that data would be useless. The cookie provides continuity.
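To make the idea concrete, here is a minimal Python sketch of how a site might hand a browser an identifier and recognize it on a later visit. The cookie name `visitor_id` and both helper functions are made up for illustration; this is not how Microsoft actually issues its cookies.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical name, for illustration only


def issue_cookie() -> str:
    """Build the Set-Cookie header value a site would send on a first visit."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = uuid.uuid4().hex  # random value unique to this browser
    cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # persist for about a year
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie[COOKIE_NAME].OutputString()


def read_visitor_id(cookie_header: str):
    """Return the stored ID from a request's Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None


# First visit: the site sends the cookie to the browser.
set_cookie_value = issue_cookie()

# Later visit: the browser sends back only the name=value pair.
returned_header = set_cookie_value.split(";")[0]
visitor_id = read_visitor_id(returned_header)
```

If the visitor deletes the cookie, the next request arrives with no ID, `read_visitor_id` returns `None`, and the site has no way to connect the new session to the old one.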

Microsoft actually uses three different cookies—the Machine Unique ID (MUID), the Windows Live User ID (LiveID), and the Anonymous ID (ANID). (The US military is possibly the only organization more enamored of acronyms than Microsoft.)

Behavioral vs. demographic targeting

The MUID identifies your browsing behavior and enables the behavioral targeting of ads. It doesn't require that you log into an account, it hasn't a clue as to your gender, and it doesn't need to know your name. It just provides the unique value that threads together your history of pages visited and content viewed within Microsoft's domains.

From this data a site can build a detailed profile of the content that interests you and then use that profile to provide additional content or offers relevant to your interests. You get more relevant content, the site gets a higher conversion rate and everyone wins! Unless you end up giving away more than you get, like personal information you'd rather keep personal.
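As a rough illustration of how such a profile could be assembled, the Python sketch below tallies page views by cookie ID and picks the most-viewed category. The log entries, the category labels, and the function names are all invented for the example; the source does not describe how Microsoft actually stores or scores this data.

```python
from collections import Counter, defaultdict

# Hypothetical page-view log: (cookie ID, content category of the page viewed).
page_views = [
    ("muid-001", "sports"),
    ("muid-002", "travel"),
    ("muid-001", "sports"),
    ("muid-001", "autos"),
    ("muid-002", "travel"),
]


def build_profiles(views):
    """Tally how often each cookie ID viewed each content category."""
    profiles = defaultdict(Counter)
    for cookie_id, category in views:
        profiles[cookie_id][category] += 1
    return profiles


def top_interest(profiles, cookie_id):
    """Return the most-viewed category for a cookie ID, or None if unseen."""
    counts = profiles.get(cookie_id)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


profiles = build_profiles(page_views)
print(top_interest(profiles, "muid-001"))  # sports
```

Note that nothing in the log names a person. The cookie ID alone is enough to thread the visits together.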

The LiveID is the passport to Microsoft's sites and services. (In fact, it was previously called the Passport ID.) Whenever you log into Windows Live Hotmail, Messenger, Spaces or a continually increasing number of Windows Live web services, a cookie with a LiveID is placed on your hard drive. The LiveID is linked to the information you provided when you first signed up for the service—personally identifiable information (PII).

Microsoft's online advertising principles

Because the LiveID is associated with personally identifiable information, Microsoft doesn't use it to target ads. In fact, one of Microsoft’s online advertising principles is that its ad targeting platform can select ads based only on data that does not personally and directly identify individual users. The same is true of Gatineau.

Instead, the ANID (Anonymous ID) is used for serving ads and identifying demographic information. The ANID is placed in a cookie on your hard drive at the same time as the LiveID.

If even the mention of cryptography makes your head hurt, you're not alone.

The ANID is a cryptographic hash of the LiveID. I know nothing of cryptography or hash functions, so I'll quote from the white paper.

“The ANID is derived by applying a one-way cryptographic hash function to the LiveID. A one-way cryptographic hash function ensures that there is no practical way of deriving the original value from the resulting hash value—that is, the process cannot be reversed to obtain the original number.”
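For readers who think better in code, the quoted passage can be sketched in a few lines of Python. The quote doesn't name the hash function Microsoft uses, so SHA-256 here is purely an assumption, as are the function name and the made-up ID values.

```python
import hashlib


def derive_anonymous_id(live_id: str) -> str:
    """Return a hex digest that stands in for the ANID (SHA-256 is an assumption)."""
    return hashlib.sha256(live_id.encode("utf-8")).hexdigest()


live_id = "0003BFFD12345678"  # made-up value, not a real LiveID
anid = derive_anonymous_id(live_id)

# The same input always yields the same output, so the ANID is a stable key.
same_again = derive_anonymous_id(live_id)

# A nearly identical input yields a completely different-looking output.
other = derive_anonymous_id("0003BFFD12345679")
```

One caveat: a hash can't be run backwards, but anyone who already holds a LiveID and knows the function can recompute the matching hash. Systems of this kind commonly mix in a secret key for that reason; the passage quoted above doesn't say whether Microsoft's does.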

While personally identifiable information is associated with the LiveID, only information that can't be used to identify an individual (age, gender, occupation, and such) is associated with the ANID, and it's the ANID that is used for serving ads and providing demographic data to Gatineau.

For a more detailed and coherent explanation of the ANID and its function in safeguarding personally identifiable information, I'd suggest reading Privacy Protections in Microsoft's Ad Serving System and the Process of “De-identification”. (Cliff Notes never were as good as the original.)

The bottom line

As a web publisher, you have a pressing need to know more about your site visitors in order to deliver a more compelling offer and increase conversions. You also have a responsibility to those same people, a responsibility to safeguard their trust. That trust may be explicit (a sale or subscription) or implicit. Your visitors' personally identifiable information should remain personal until they decide to share it and even then it should be shared only within the scope of their consent. Microsoft has that same responsibility with Gatineau. I'm no authority on security and encryption but it seems to me they've taken that responsibility seriously and delivered on it.

What's your opinion?



November 30, 2007 in Analytics

Comments

ANY TIME Microsoft starts collecting personal information it is scary. Has everyone forgotten that Microsoft sells its membership lists to advertisers and e-mail marketers for the purpose of sending us their crap advertising??

Posted by: Tim | Dec 3, 2007 12:26:22 PM

Tim: It's certainly not an uncommon opinion of Microsoft. From my own limited experience (two years as a MS employee), it seems they take the security of personal information very seriously. I don't have much insight into the company's marketing, however.

Posted by: Charles Thrasher | Dec 3, 2007 2:55:14 PM
