teach computers to read the internet (open-source)
structure > semi-structured > unstructured
1. parse what you expect
2. see what else is there
3. repeat real fast
parse: grammar says what to expect
see: sample what you got from visualization, saving best to a file
factbook has characters
factbook has keys and values
factbook key capitalization means something
factbook has a few keys used everywhere
real fast trying experiments
real fast reading files
read all of wikipedia in under an hour
exploratory mode works because of innovation in 2004 (2 minutes)
this is not the stuff you used to learn in compiler class
scanner v. parser v. PEG
wiki syntax == repeated scanning by php
preview: restoring old programs (45 seconds)
preview: reinventing wiki for open-data QR Code (15 seconds)
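
A rough sketch of the parse / see / repeat loop, in Python. This is an illustration under stated assumptions, not the tooling used in the talk: the `SAMPLE` text, the `EXPECTED` pattern, and the `explore` helper are all hypothetical, and a single regular expression stands in for a real grammar. The "key: value" layout with a capitalized key is assumed from the factbook notes in the outline.

```python
import re
from collections import Counter

# Hypothetical sample text; the real factbook layout is not shown in the outline.
SAMPLE = """\
Population: 1000
Area: 50 sq km
GDP - real growth rate: 2.5%
*** unexpected banner ***
Capital: Exampleville
population: lowercase key
"""

# Step 1, "parse what you expect": a capitalized key, a colon, then a value.
# A single regex is a stand-in for a grammar here (an assumption of this sketch).
EXPECTED = re.compile(r"^(?P<key>[A-Z][^:]*):\s*(?P<value>.+)$")


def explore(text, max_samples=3):
    """Tally keys on lines that match EXPECTED; sample lines that do not."""
    keys = Counter()
    unexpected = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = EXPECTED.match(line)
        if match:
            keys[match.group("key").strip()] += 1
        elif len(unexpected) < max_samples:
            # Step 2, "see what else is there": keep a small sample only.
            unexpected.append(line)
    return keys, unexpected


keys, unexpected = explore(SAMPLE)
print(keys.most_common())  # which keys turn up, and how often
print(unexpected)          # lines the pattern did not anticipate
```

Step 3, "repeat real fast", would then be: look at the sampled leftovers, adjust `EXPECTED` (for example, decide what a lowercase key should mean), and rerun.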