Doubleclick.wtf

Steve and I were asked to add doubleclick.net tracking to some site, and the pitch was “oh you just drop some code in the page and it works great OK?”

No. We never do this, because we’re actually responsible for the crap that gets served from our servers, and there’s already enough clean-up we have to do.

So let’s take a look at this code (with identifying marks removed to protect the funky).

Here’s the original code:

<script type="text/javascript">
var axel = Math.random() + "";
var a = axel * 10000000000000;
document.write('<iframe src="http://fls.doubleclick.net/activityi;src=1234567;type=feline123;cat=tabby012;ord=1;num=' + a + '?" width="1" height="1" frameborder="0"></iframe>'); 
</script> 
<noscript> 
<iframe src="https://fls.doubleclick.net/activityi;src=1234567;type=feline123;cat=tabby012;ord=1;num=1?" width="1" height="1" frameborder="0"></iframe> 
</noscript>

Well that’s special. It generates an iframe so it can load whatever content it wants from doubleclick.net’s servers. That makes me slightly nervous and annoyed, but what’s worse, the code isn’t valid XHTML Strict, so I’m going to have to rewrite it to be valid. Might as well rewrite the whole thing, since the JavaScript is pretty stinky, too. (At least they took the trouble to write a noscript version.)

var axel = Math.random() + "";
var a = axel * 10000000000000;

What does it do? Well, at first glance it looks like it tries to create a very long string. But actually no: in JavaScript, multiplying a numeric string coerces it to a number, so "12345.67" * 1 === 12345.67. The trip through a string accomplishes nothing. So this can be rewritten to make sense, be more efficient, and be one line: var a = 10000000000000 * Math.random();
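If you want to convince yourself that the string detour changes nothing, here’s a quick sketch. The function names ordViaString and ordDirect are mine, made up for the comparison, and I pass the random value in as a parameter so both versions see the same input.

```javascript
// Hypothetical names, for illustration only.

// The original approach: number -> string -> (coerced back to) number.
function ordViaString(r) {
  var axel = r + "";              // now a string, e.g. "0.25"
  return axel * 10000000000000;   // "*" coerces the string back to a number
}

// The one-line rewrite: no string involved.
function ordDirect(r) {
  return 10000000000000 * r;
}

var r = Math.random();
console.log(typeof (r + ""));                   // "string"
console.log(typeof ordViaString(r));            // "number"
console.log(ordViaString(r) === ordDirect(r));  // true
```

Either way you end up with a plain number, which is why the one-liner is a safe replacement.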

Next, we can build the attributes in a way that makes this whole block of code more reusable:

var url_src = 1234567;
var url_type = "feline123";
var url_cat = "tabby012";
var url_ord = 1;
// and just for completeness
var url_num = a;
var data = "http://fls.doubleclick.net/activityi" +
  ";src=" + url_src +
  ";type=" + url_type +
  ";cat=" + url_cat +
  ";ord=" + url_ord +
  ";num=" + url_num + "?";

Then we’ve got the invalid iframe element. The object tag can be used in most cases in place of the iframe tag, so let’s use that. We build the element into the DOM:

var o = document.createElement("object");
o.data = data;
o.width = 1;
o.height = 1;
// Ignore that "frameborder" attribute because
// it's neither valid nor valuable.

…and since we were asked to insert this code “as close as possible to the opening <body> tag,” insert it before the first child of the body element:

var b = document.body;
b.insertBefore(o, b.firstChild);
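One thing to watch: document.body is null if the script runs before the parser has reached the <body> tag (say, from the head). Here’s a minimal sketch of a guarded version. The helper name insertAtBodyStart is hypothetical, and it takes the document as a parameter so it can be tried outside a browser.

```javascript
// Hypothetical helper: inserts `node` as the first child of doc.body.
// Returns false instead of throwing when the body doesn't exist yet.
function insertAtBodyStart(doc, node) {
  var b = doc.body;
  if (!b) {
    return false; // too early: <body> hasn't been parsed
  }
  // If b.firstChild is null (empty body), insertBefore simply appends.
  b.insertBefore(node, b.firstChild);
  return true;
}

// In a browser you would call it like this:
if (typeof document !== "undefined") {
  insertAtBodyStart(document, document.createElement("object"));
}
```

What to do when it returns false (retry on load, or just move the script into the body) is up to the page.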

Putting it all together:

// Remember me? I got renamed!
var url_num = 10000000000000 * Math.random();
var url_src = 1234567;
var url_type = "feline123";
var url_cat = "tabby012";
var url_ord = 1;
var data = "http://fls.doubleclick.net/activityi" +
  ";src=" + url_src +
  ";type=" + url_type +
  ";cat=" + url_cat +
  ";ord=" + url_ord +
  ";num=" + url_num + "?";

var o = document.createElement("object");
o.data = data;
o.width = 1;
o.height = 1;

var b = document.body;
b.insertBefore(o, b.firstChild);
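If more than one of these tags ever lands on the site, the URL-building part could be pulled into a function. This is only a sketch: buildActivityUrl and exampleData are names I made up, and it assumes the values are plain alphanumerics like the ones above, so it does no escaping.

```javascript
// Hypothetical helper, not part of Doubleclick's code: joins a base URL and
// an ordered list of [name, value] pairs in the ";name=value" style above.
// Assumes values need no escaping.
function buildActivityUrl(base, params) {
  var url = base;
  for (var i = 0; i < params.length; i++) {
    url += ";" + params[i][0] + "=" + params[i][1];
  }
  return url + "?";
}

var exampleData = buildActivityUrl("http://fls.doubleclick.net/activityi", [
  ["src", 1234567],
  ["type", "feline123"],
  ["cat", "tabby012"],
  ["ord", 1],
  ["num", 1] // the real tag would pass 10000000000000 * Math.random()
]);
console.log(exampleData);
```

An array of pairs rather than an object keeps the parameters in the same order as the original tag.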

When I ran this code in Firebug, it produced the following DOM node on my page:

<object height="1" width="1" data="http://fls.doubleclick.net/activityi;src=1234567;type=feline123;cat=tabby012;ord=1;num=9608606539790.215?"></object>

So I figured I would grab a copy of that URL using wget and see what it looked like. It looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><title></title></head><body style="background-color: transparent"><img src="http://ad.doubleclick.net/activity;src=1234567;type=feline123;cat=tabby012;ord=1;num=9608606539790.215?" alt=""/></body></html>

So… Wait, what? The only differences between the URL in that <img> and the URL generated for the <object> are that fls has become ad and activityi has become activity. So why didn’t we just load that <img> in the first place? Only Doubleclick knows for sure, but loading the iframe and then the image does tell them a little bit more about browser capabilities, because it makes two different requests to their servers from your browser. Clever, but very irritating. On the other hand, maybe they’re just using the <img>.
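To make the difference between the two URLs concrete, here’s a sketch that turns one into the other. The function name toImageUrl is hypothetical, and the mapping is only what I saw in this one wget response, not anything Doubleclick documents or promises.

```javascript
// Hypothetical helper: applies the two substitutions observed above
// (fls -> ad, activityi -> activity). Based on a single response only.
function toImageUrl(iframeUrl) {
  return iframeUrl
    .replace("//fls.doubleclick.net/", "//ad.doubleclick.net/")
    .replace("/activityi;", "/activity;");
}

var iframeUrl = "http://fls.doubleclick.net/activityi;src=1234567;type=feline123;cat=tabby012;ord=1;num=9608606539790.215?";
console.log(toImageUrl(iframeUrl));
```

Everything after the path, including the num value, is carried over untouched.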