“And it’s so easy, it’s so easy, it’s just so easy, to do the goat dance. Yeah, but it’s so hard, it’s so hard, to love somebody, really love somebody.” -Greg Brown, So Hard
If affiliate marketing is the goat dance (whatever that is), then tracking statistics is loving somebody. In Feeling the Pain, I mentioned that one of the top frustrations for affiliate marketers is having to log into every network to access the tracking statistics from their affiliate programs. Everyone we talk to who runs on more than one affiliate network tells us what a hassle that is.
That raises the question of why it is so hard to aggregate those statistics. Certainly, companies have built databases for personal use that aggregate statistics across multiple networks, and some have even done it for commercial use. But they will tell you it is hard, and not a panacea. When you understand what they have to do, you begin to understand why.
First, there are only so many ways you can pull data from a database. If you have direct access to the database and know what report to run (or write), it is easy. But no network is going to give a publisher or advertiser direct access to its database. Instead, your access is going to come from one of three places: an Application Programming Interface (API) that was written and authorized by the network, reports viewed through a Web browser (which you may or may not be able to export to your hard drive), or reports the network pushes to you via FTP or email.
APIs
Let’s look at APIs first. APIs are great if you can get them. By writing an API, a network essentially tells developers which parts of its database they can access, and provides the vocabulary for pulling the data they are looking for. So if every network had an API, a developer could easily use those APIs to pull all of your tracking statistics into your own database, which you could then run consolidated reports from.
Here are the problems, though. First, not all networks have APIs. Even if they aren’t opposed to having you get your information that way (it is your data, after all), they have to take the time and effort to write the APIs. That includes documentation on which APIs are available, and which data structures, object classes, protocols, etc. are used in them. Second, if they do have APIs (yes, there will most likely be multiple APIs), they may not be written using the same protocol. The same network can have some APIs that use Simple Object Access Protocol (SOAP) and others that use Representational State Transfer (REST). That drives developers nuts. Third, even if they have some APIs, they may not have them for everything you are looking for in your consolidated report. That means you either have to forgo that desired information, or you pull it in with a scrape.
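To make the protocol problem concrete, here is a minimal sketch of what a developer ends up writing: one small adapter per response format, all feeding a common row shape that a consolidated report can sum over. Everything specific in it is an assumption for illustration. The network names, the field names, and the sample payloads are made up, and the XML is a simplified stand-in for a real SOAP envelope (which would carry namespaces and more wrapping).

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict


def from_rest_json(network, payload):
    """Normalize a (hypothetical) REST/JSON response into common rows."""
    rows = []
    for item in json.loads(payload)["stats"]:
        rows.append({
            "network": network,
            "date": item["date"],
            "clicks": int(item["clicks"]),
            "commission": float(item["commission"]),
        })
    return rows


def from_soap_xml(network, payload):
    """Normalize a (hypothetical, namespace-free) SOAP-style XML response."""
    rows = []
    root = ET.fromstring(payload)
    for node in root.iter("Stat"):
        rows.append({
            "network": network,
            "date": node.findtext("Date"),
            "clicks": int(node.findtext("Clicks")),
            "commission": float(node.findtext("Commission")),
        })
    return rows


def consolidate(rows):
    """Sum clicks and commission per date across all networks."""
    totals = defaultdict(lambda: {"clicks": 0, "commission": 0.0})
    for row in rows:
        day = totals[row["date"]]
        day["clicks"] += row["clicks"]
        day["commission"] += row["commission"]
    return dict(totals)


# Made-up sample payloads standing in for real network responses.
REST_SAMPLE = '{"stats": [{"date": "2024-01-01", "clicks": 10, "commission": 2.5}]}'
SOAP_SAMPLE = (
    "<Envelope><Body><Stat>"
    "<Date>2024-01-01</Date><Clicks>4</Clicks><Commission>1.25</Commission>"
    "</Stat></Body></Envelope>"
)

rows = from_rest_json("NetworkA", REST_SAMPLE) + from_soap_xml("NetworkB", SOAP_SAMPLE)
report = consolidate(rows)
```

The consolidation step is the easy part. The work is in the adapters, and you need a new one for every network and every protocol that network happens to use.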
Scrapes
A scrape refers to creating a program that will go to a website, log in as a user with a username and password, go to a particular page on the site, run any necessary reports, and harvest the results of those reports into a database outside the site. This happens automatically, perhaps on a daily basis. This sounds fine, considering you ostensibly should be able to pull all the information you have access to as a real user logging in.
Except for these problems. First, companies don’t particularly like non-humans logging into their sites. They may employ software tools to thwart hackers, and those tools could restrict your access too. Just because an application can scrape the site today doesn’t mean it will have access tomorrow. Second, even if it does, it relies on everything staying the same on the site. If the website changes so that the report shows up in a different place, or with a different name, the data pulled from the scrape could be missing, or worse, replaced with incorrect data. That means your developer has to keep tweaking your report. And third, unlike with APIs, networks don’t have any control over what information you’re pulling from their databases. Besides their not liking that, it may be in violation of their terms of service.
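The "replaced with incorrect data" risk is the one worth guarding against in code. As a rough sketch, assuming the scrape ends up with a CSV export of the report, a parser can check the column layout before trusting any numbers and refuse to load anything if the layout has moved. The column names, the `ReportLayoutChanged` exception, and the sample reports below are all hypothetical.

```python
import csv
import io

# Hypothetical column layout the scraper was written against.
EXPECTED_COLUMNS = ["Date", "Clicks", "Sales", "Commission"]


class ReportLayoutChanged(Exception):
    """Raised when the scraped report no longer matches the expected layout."""


def parse_report(csv_text):
    """Parse a scraped CSV report, failing loudly if the columns changed."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        raise ReportLayoutChanged(
            f"expected columns {EXPECTED_COLUMNS}, got {header}"
        )
    rows = []
    for line in reader:
        if not line:  # skip blank lines
            continue
        record = dict(zip(header, line))
        rows.append({
            "date": record["Date"],
            "clicks": int(record["Clicks"]),
            "sales": int(record["Sales"]),
            "commission": float(record["Commission"]),
        })
    return rows


# Made-up samples: the second has two columns swapped, as a site redesign might do.
GOOD_REPORT = "Date,Clicks,Sales,Commission\n2024-01-01,10,1,2.50\n"
CHANGED_REPORT = "Date,Sales,Clicks,Commission\n2024-01-01,1,10,2.50\n"

good_rows = parse_report(GOOD_REPORT)
```

Without the header check, the swapped report would still parse cleanly if columns were read by position, and clicks and sales would quietly trade places in your database. Failing loudly doesn't spare your developer the fix, but at least you find out.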
The Good News
As we’ve talked to networks about data aggregation, we have yet to have one tell us they are opposed to it.
The Bad News
Not being opposed to it isn’t the same as helping with it. Networks don’t seem to have a big enough incentive yet to help out with the effort, short of some of them providing APIs. Even though their customers are all clamoring for it, they aren’t leaving the network over it. After all, where would they go?




