What is the best source of information for a peer?
-- Michael Bolton
Are you any relation to the singer?
Anyway, I am working on a white paper called The Business Case for Direct Connect (an enterprise form of peering) that documents the various business rationales (financial, security, performance, etc.) for using direct connect services to access cloud services. Amazon was the first to market and, more importantly, has all its prices online, so I compare Internet Connect against Direct Connect. Please e-mail me if you would be open to letting me talk you through the white paper for feedback.
As part of this process I developed a simulation model of the Internet peering ecosystems that was derived from peeringdb among other sources. I stumbled upon some bad data in peeringdb, something you find and watch out for in pretty much all databases.
That peeringdb data is not perfect is not a new story - Job Snijders did a nice presentation at NANOG on PeeringDB Accuracy comparing, among other things, peeringdb and what is seen in the routing tables. I also found “Using PeeringDB to Understand the Peering Ecosystem” helpful for this modeling work.
I found peeringdb, as a source of peering intel, to be pretty good, but suffering from the same thing that makes it powerful - the information is entered and maintained (or not) by humans, working in organizations that peer. The value of the database is proportional to the value of the information that others enter.
The Information People Enter
The information that people enter into peeringdb is subject to at least two challenges. First, people make mistakes. Besides fat-fingered data, I found a number of places where people entered things like “111 8th Avenue”, “111 Eight Avenue”, “111 8th Street” or “111 8th” to all mean the same thing. Some fields seem to have been filled with placeholder text that someone figured they would replace later. As a result I found myself increasingly coding normalization entries, a thesaurus-style dictionary that points all of these variants to the same authoritative entry.
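To make the normalization idea concrete, here is a minimal sketch in Python of what such a thesaurus might look like. It is not the code from my model; the names `ADDRESS_THESAURUS`, `clean` and `normalize_address` are hypothetical, and the only entries are the variants quoted above. It assumes a variant is matched after lowercasing, stripping punctuation and collapsing whitespace, and that anything unrecognized is passed through and flagged for a human to look at.

```python
import re

# Hypothetical thesaurus: each cleaned-up variant points to one authoritative entry.
CANONICAL = "111 8th Avenue"
ADDRESS_THESAURUS = {
    "111 8th avenue": CANONICAL,
    "111 eight avenue": CANONICAL,
    "111 8th street": CANONICAL,
    "111 8th": CANONICAL,
}


def clean(raw):
    """Lowercase, replace punctuation with spaces, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(text.split())


def normalize_address(raw):
    """Return (authoritative entry, True) for a known variant,
    or (input with outer whitespace trimmed, False) so it can be reviewed by hand."""
    key = clean(raw)
    if key in ADDRESS_THESAURUS:
        return ADDRESS_THESAURUS[key], True
    return raw.strip(), False


if __name__ == "__main__":
    for entry in ["111 Eight Avenue", " 111 8th. ", "60 Hudson Street"]:
        print(repr(entry), "->", normalize_address(entry))
```

The design choice here is to keep the mapping explicit rather than fuzzy: a variant only resolves once someone has looked at it and added it to the dictionary, so a new misspelling surfaces as unrecognized instead of being silently matched to the wrong building.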
This leads to the second problem - you generally don’t use peeringDB to find your own information, you use it to find out about others. It doesn’t matter as much to you that your information is up-to-date, particularly if you are not courting peers at this time.
For peeringDB to work for peering automation, which seems to be the direction people are heading these days, the peering information needs to be close to 100% accurate: each record should be at least minimally maintained by the organization that owns it, or removed. The current approach is to encourage networks to keep their information up-to-date through reminders at conferences. Some networks require up-to-date information on peeringDB as a prerequisite to peering with them. This is a good effort, but not quite good enough to allow peeringdb to be the data source for automated configuration.
And then people leave organizations, and they have no incentive to update that fact before they leave. The remaining networking folks may not have any idea that peeringDB exists or that the person who left had maintained it. So for this and other reasons, the information in publicly maintained databases has a natural tendency towards getting stale.
PeeringDB remains one of the best sources of peering information. It is the place many of us start, and even with the problems mentioned, it is the best tool to date. Accept peeringdb for what it is, but pull information from the exchange point web sites as well, and depend on information only when validated by the prospective peer.