Internet Mapping : from Art to Science 2008

Value to me :
* The tuple space idea is interesting
* Co-ordination of data is key
* Africa lacks anything like this (to my knowledge, and at this scale, there is only 1 Ark node :/ )

Paper's Angle :
Architecture of a distributed system for monitoring and taking measurements, designed to handle parallelization, distribution and co-ordination so that researchers who care about the measurements themselves don't have to deal with those issues.

General :
-> Dependence on the Internet (yadada)
-> Comparison of the Internet to a trillion dollar ecosystem (interesting, I kind of like the idea)
-> Some data points cast doubt on the Internet's ability to act as the world's communication medium
-> Lack of data sharing due to distributed ownership
-> Project to support large scale active measurement studies of the Internet
    -> secure
    -> analysis/processing
    -> annotation
    -> topology generation
    -> interactive visualization

Archipelago
-> is CAIDA's new measurement infrastructure
-> skitter based active measurement infrastructure
-> 31 "Arks" (only one in Africa ...)
-> Plans to deploy more in Africa/South America
-> most monitors are in academia (plans to have some organization etc)
-> Promotes rapid and easy dev
    -> high productivity
    -> more "risky" experiments/measurements
    -> possibly better results
-> Scamper
    -> steers measurements
    -> general purpose management engine
    -> handles parallelized traceroutes
-> talk about other stuff specific to their platform

Dynamic and Co-ordinated measurements
-> At its simplest, a measurement infrastructure executes a pre-configured set of measurements to a static set of targets
-> Some measurements have dynamic components and need co-ordination among measurement nodes
    -> prefix monitoring to detect prefix hijacking (as an example)
-> Major focus on co-ordination
    -> planning, co-ordinating and executing a series of computations
    -> allows measurement components to work well together
-> New implementation of Ark for co-ordination -> Marinda
-> tuple space co-ordination model (ref to original paper is there)
-> basically distributed shared memory with simple ops
-> stores tuples and retrieves them through pattern matching (sounds quite strange ...)
-> supports 1-1 and many-many communication
-> decentralized measurements (those at the actual nodes) automagically communicate with centralized control when needed, and can trigger further measurements depending on results
-> Simple to use + don't have to worry about network faults etc in deployment
-> Discusses how tuples aren't addressed to a particular recipient (and a lot of other complex stuff)
-> The above allows complex measurements to be broken into phases

Measurement Services
-> Section mainly discusses the ways that Ark provides services
-> XML-RPC and SOAP are the traditional approaches
-> Tuple space allows for the same sort of thing (see the ping example, pretty neat idea)
-> rest of section discusses advantages of this approach

Macroscopic IP Topology
-> Measures IP level paths to a dynamically generated list of IP addresses covering all /24 prefixes in routable IPv4 space
-> Because the probing is distributed, it's possible for 13 monitors probing at 100pps to collect traceroutes to all 7.4 million /24's
-> Do lots of random probing of /24 space
    -> avoid possible routing problems
    -> avoid complaints
-> Make use of scamper for traceroutes
    -> ICMP Paris Traceroute (something to look into)
-> Use a bulk DNS lookup service due to the number of lookups required (Scapy)
-> Two datasets
    -> Simple IP-Hostname map
    -> Raw DNS queries and the responses (clever...)

Alias Resolution
-> is reconstructing the router topology from traceroutes
-> requires grouping IPs belonging to the same router together
-> CAIDA iffinder tool
    -> Paper describes how this tool works (it is rather hectic)
    -> Should add more here when I review paper
-> The rest of the paper deals with AS-level Internet topology maps, which while interesting isn't very relevant
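
Side note to self on the tuple space idea (from the Dynamic and Co-ordinated measurements notes): the "store tuples, retrieve by pattern matching" model felt strange, so here is a minimal in-process sketch of it. This is NOT Ark's actual API. The names TupleSpace, write, read, take and ANY are all made up for illustration, the real thing is distributed over the network, and the "ping" is simulated rather than sent. It only shows how a request and a result can be exchanged without either side addressing the other.

```python
import threading

ANY = object()  # hypothetical wildcard: matches any value in that field


class TupleSpace:
    """Toy single-process tuple space (illustrative, not Ark's API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        """Store a tuple and wake up anyone waiting for a match."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # Same length, and every field is either a wildcard or equal.
        return len(pattern) == len(tup) and all(
            p is ANY or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def _get(self, pattern, timeout, remove):
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                return None  # timed out with no matching tuple
            tup = self._find(pattern)
            if remove:
                self._tuples.remove(tup)
            return tup

    def read(self, pattern, timeout=None):
        """Return a matching tuple, leaving it in the space."""
        return self._get(pattern, timeout, remove=False)

    def take(self, pattern, timeout=None):
        """Return a matching tuple and remove it from the space."""
        return self._get(pattern, timeout, remove=True)


def demo():
    """A 'monitor' thread serves one simulated ping request."""
    space = TupleSpace()

    def monitor():
        # The monitor does not know who asked; it just matches a pattern.
        req = space.take(("ping-request", ANY), timeout=5)
        if req is not None:
            _, target = req
            space.write(("ping-result", target, "simulated"))

    worker = threading.Thread(target=monitor)
    worker.start()
    space.write(("ping-request", "192.0.2.1"))
    result = space.take(("ping-result", "192.0.2.1", ANY), timeout=5)
    worker.join()
    return result
```

Calling demo() hands back the result tuple written by the monitor thread. The point to notice is that neither side names the other: the requester and the monitor only agree on the shape of the tuples, which is what makes 1-1 and many-many communication fall out of the same three operations.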