One of the things that lights my brain up is being able to process, analyze, and visualize cool datasets. I’ve been incredibly fortunate to be able to make a living out of that. This tag is a catch-all for projects related to data processing, statistics, and data engineering.
— City, with natural stripes — or nature, with city stripes? Annotations for A Pattern Language, and tracking public opinion of where Americans want to live over the past 50 years. (4 min read)
— When it comes to collecting data, there's a fine line between making a good site and making an invasive product. To make this site more useful, I want to collect enough data to improve, but never enough to undermine anybody's privacy. (3 min read)
— Is Jane Street run by soccer-loving ants? Inconclusive. We *can* conclude that they're fans of Markov Chains, though — an invaluable tool for understanding complex data structures. (1 min read)
— The beginning of a topological review of the 1977 urban design and architecture reference book A Pattern Language, and a journey to understand Earth's greatest graph: the Earth itself. (12 min read)
— Stop reprocessing your entire dataset every time new data arrives.
A practical guide to Spark Structured Streaming with code examples and cost logic.
(9 min read)
— Efficiency: spending six hours building a web scraper to avoid five minutes of daily work.
Automating a business simulation because checking in is for chumps.
(8 min read)
— My college presentation on the Gale-Shapley paper, recorded on an iPad, like a true professional.
Non-market environments, matchmaking lattices, and gratitude for good professors.
(1 min read)
— A first foray into network visualization: messy graphs, abject terror.
Early data viz experiments searching for supply loops. Bad graphs; interesting questions.
(2 min read)