Object Oriented Scraper Backed With Tests Pt. 2

I’m picking up from where I left off last night. If you look back at the previous post, we ended with a spec’d out Runner object. Now we need to build our Crawler, which will slurp up all the content from our posts and return it as meaningful data.

Object Oriented Scraper Backed With Tests

I just drank a ton of coffee and I’m blasting music in my headphones, so this post might be a bit more scatter-shot than most since I can’t really focus :]

Yesterday I managed to build a pretty naive scraper using Nokogiri which would count how often each word was used in the first 10 posts of this blog. Basically, it scrapes the home URL of the site and grabs everything inside of the div.entry-content selector.
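
To give a rough idea of the counting half of that, here is a minimal sketch in plain Ruby. It assumes the text of each post has already been pulled out of div.entry-content (with Nokogiri that would be something along the lines of `doc.css('div.entry-content').map(&:text)`), so the `word_counts` method, the `posts` array and the sample sentences are all made up for illustration and are not the code from the original post.

```ruby
# Count how often each word appears across a list of post bodies.
# `post_texts` is a hypothetical array of strings, one per post,
# assumed to be the text already extracted from div.entry-content.
def word_counts(post_texts)
  counts = Hash.new(0)
  post_texts.each do |text|
    # Lowercase the text, then treat runs of letters/apostrophes as words.
    text.downcase.scan(/[a-z']+/).each { |word| counts[word] += 1 }
  end
  counts
end

# Stand-in data; a real run would use the scraped post text instead.
posts = [
  "Scales map data to screen space.",
  "Data drives the scales."
]

# Only look at the first 10 posts, as described above.
counts = word_counts(posts.first(10))

# Most frequent words first, ties broken alphabetically.
top = counts.sort_by { |word, n| [-n, word] }.first(3)
top.each { |word, n| puts "#{word}: #{n}" }
```

Using `Hash.new(0)` keeps the loop free of nil checks, since a word that has not been seen yet simply starts at zero.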

Today I want to convert it into a more OO library so it’s a bit more modular and reusable. I also want to back everything with RSpec tests to get into the practice. While it won’t be true TDD, I’ll try to write the tests for the library before putting the classes together.

Building a Simple Scraper With Nokogiri in Ruby

Since I’ve been talking so much about D3.js lately I thought it might be fun to start a little project which combines D3 and Ruby. The idea is to build a very simple page scraper that counts how often certain words are used in each post. I’ve also decided to start adding a little block of metadata at the end of each post so I can graph that over time as well.

D3 Basics: The Linear Scale

In the last post we did a basic introduction to the concept of scales in D3.js. Today we’ll look at our first scale and write some code to visualize it.

D3 Basics: An Introduction to Scales

After selections, scales are probably the most frequently used element in D3 because they facilitate such great control over data and screen space. I want to spend several posts documenting how scales work to help out anyone who is struggling with the concept. We’ll start with a high level overview of what a scale is in D3 and then explore the individual objects to learn their nuances.

D3.js and Octopress

This morning I was hoping to cover some of the basics of using D3.js. Along the way I realized I really wanted people to be able to see the graphs on the blog itself. I could have used JSFiddle, but I didn’t like all that chrome repeated across the page. So I came up with my own solution with a little bit of hacking :)