So the other day, I was trying to get traceroute working in Python, so that I could run a traceroute to several different computers and see which IPs the routes had in common. I quickly learned that ICMP (the protocol behind ping/traceroute) is nowhere near as well supported by Python's socket library as TCP or UDP. In fact, other than a single IP protocol flag, ICMP has no support!
So after a bit of struggling, I got my own traceroute/ping working in Python using raw sockets. One thing I learned was that you have to be root to open a raw socket on Linux. That made me wonder: "Well, how do the official traceroute/ping applications work without being root?" It turns out that those apps (at least in Ubuntu) are setuid root, so when you run them, they run with the privileges of the root user. Live and learn, I guess.
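To make the raw-socket part a bit more concrete, here is a minimal sketch of building an ICMP echo request by hand. This is not the code from tracert.py; the function names (internet_checksum, build_echo_request, open_probe_socket) are made up for illustration. The only pieces taken from Python itself are the struct module and the standard socket constants.

```python
import socket
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo ("ping") request


def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum used in the ICMP header."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Pack type, code, checksum, id and sequence, followed by the payload."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload


def open_probe_socket(ttl: int) -> socket.socket:
    """Open a raw ICMP socket with the given TTL (hypothetical helper).

    This is the call that needs root on Linux.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock
```

A traceroute would then send the same kind of packet with TTL 1, 2, 3, and so on, for example with sock.sendto(build_echo_request(ident, seq), (host, 0)), and note which router answers each probe.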
Anyways, I got the traceroute working, and then I used the generated routes to create graphs using python-graph showing what servers are common among the different routes. I then plotted the routes using Graphviz. Here are some of the better graphs generated by my program:
First, this is a graph showing the paths from my computer (while logged into the campus wireless) to a few of the computers in the Computer Science Instructional Facility (CSIF), and then to Google, Yahoo, Digg, and PoweredByToast. (PoweredByToast is behind static.theplanet.com... we don't use any of those fancy dedicated servers... :) ) Also note that I blurred the names of some of the CSIF's internal servers. They aren't top-secret or anything, and I don't believe in security-through-obscurity, but it's still probably not a very good idea to publish them... :)
In this graph, I took the top 20 sites on Alexa and mapped the routes to them. (I removed all Microsoft-owned sites, since they seem to reject all ICMP packets anyway, so we wouldn't have learned anything, and we'd have had to wait for the program to time out on every one of those connections.)
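The idea behind these graphs (find the hops that several routes share, then draw every hop-to-hop link once) can be sketched in plain Python. This is only an illustration, not how tracert.py does it: the real program uses python-graph, while this sketch writes Graphviz DOT text by hand. The function names and the addresses (taken from the reserved documentation ranges) are invented.

```python
from collections import defaultdict


def shared_hops(routes):
    """Map each hop to the destinations whose routes pass through it,
    keeping only hops that appear on more than one route."""
    seen = defaultdict(set)
    for dest, hops in routes.items():
        for hop in hops:
            seen[hop].add(dest)
    return {hop: dests for hop, dests in seen.items() if len(dests) > 1}


def routes_to_dot(routes):
    """Render all routes as one Graphviz digraph, one edge per distinct link."""
    edges = set()
    for hops in routes.values():
        edges.update(zip(hops, hops[1:]))  # consecutive hops form a link
    lines = ["digraph routes {"]
    for a, b in sorted(edges):
        lines.append(f'    "{a}" -> "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical routes: destination name -> list of hop addresses in order.
routes = {
    "site-a.example": ["192.0.2.1", "192.0.2.254", "198.51.100.7"],
    "site-b.example": ["192.0.2.1", "192.0.2.254", "203.0.113.9"],
}
```

Saving the output of routes_to_dot(routes) to a file and running it through Graphviz's dot tool should give a picture along the lines of the graphs above.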
Problems with the program:
- I don't check any of the sequence/id fields. In fact, I ignore them completely. This is a problem. For example, try doing a traceroute to a Microsoft server. (Microsoft ignores ICMP messages, so the traceroute will eventually hang.) Now, use the normal ping utility to ping Google. My program will see the ICMP reply meant for ping, and assume it's from Microsoft! Oops!
- The traceroute ignores failed links! For example, say we have servers A, B, and C. A is connected to B, and B to C. If we somehow fail to ping B, then the traceroute program will happily state that A is connected directly to C. Oops!
- If there are multiple paths to a destination, or one address resolves to multiple destinations, then the resulting route may mix nodes from the different paths. This happened several times when I was mapping the route taken to Google.
- This doesn't work behind a NAT router. I know it is possible to get it working eventually, but I'll figure that out at some other time...
- Try doing a traceroute on yourself. The program freaks out, because it sees its own ping request come back on the socket and doesn't understand why the remote server would be sending an ICMP echo request. :)
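A few of these problems could probably be fixed with a stricter reply check. Below is a rough sketch, not what tracert.py currently does. It assumes the raw socket hands back the full IP packet (as it does on Linux), and the names match_reply and build_route are invented for this example. It accepts only replies that carry our own id and sequence number, drops stray echo requests, and keeps failed hops in the route as explicit placeholders instead of silently skipping them.

```python
import struct

ECHO_REPLY, ECHO_REQUEST, TIME_EXCEEDED = 0, 8, 11


def match_reply(packet, ident, seq):
    """Return "reply" (destination answered), "hop" (a router on the path
    answered), or None if the packet does not belong to our probe."""
    ihl = (packet[0] & 0x0F) * 4          # outer IP header length in bytes
    icmp = packet[ihl:]
    if len(icmp) < 8:
        return None
    icmp_type, _code, _csum, r_id, r_seq = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type == ECHO_REPLY:
        return "reply" if (r_id, r_seq) == (ident, seq) else None
    if icmp_type == TIME_EXCEEDED:
        # The body holds the IP header and first 8 bytes of the probe that
        # expired, so our id/seq can be read back from it.
        inner = icmp[8:]
        if len(inner) < 20:
            return None
        inner_ihl = (inner[0] & 0x0F) * 4
        original = inner[inner_ihl:inner_ihl + 8]
        if len(original) < 8:
            return None
        o_type, _, _, o_id, o_seq = struct.unpack("!BBHHH", original)
        if o_type == ECHO_REQUEST and (o_id, o_seq) == (ident, seq):
            return "hop"
    return None  # echo requests and anything else are ignored


def build_route(dest, hop_results):
    """hop_results holds one address (or None on timeout) per TTL.
    Timeouts become placeholder nodes so A never looks directly linked to C."""
    return [
        addr if addr is not None else f"unknown hop {ttl} to {dest}"
        for ttl, addr in enumerate(hop_results, start=1)
    ]
```

Putting the destination into the placeholder name keeps unknown hops on different routes from being merged into one node when the routes are graphed.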
Anyways, the code can be found here:
Dependencies: python-graph, Graphviz
Source: http://www.poweredbytoast.com/files/tracert.py.txt
Note: You need to be root to use this program! So usage would be something like:
$ sudo python tracert.py google.com yahoo.com youtube.com ...
