I did some cleanup this morning on the Options class and the options_spec, mainly to remove items that seemed like they shouldn’t be tested. Here’s where I’m currently at:
Previously I was testing against @config = YAML.load(File.open(config)) to see if it would raise an error when passed nil or an empty string. I've since realized that's basically testing Ruby core to see if it works as described in the docs, which seems silly to me. Now, if I were rescuing those exceptions and doing something in response, then yeah, I'd want to test it. But since I'm letting the program blow up if you try to load an empty config file, I figure it's best to let core and the stdlib do their thing and assume they're well tested.

Having said that, I think we've got decent coverage on Options and can move back to the Runner and then the Crawler.
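To illustrate the kind of behavior I was re-testing, here's a quick sketch (plain Ruby, no spec harness) showing that these failures come straight from core, not from anything in Options:

```ruby
require "yaml"

# Passing nil to File.open raises TypeError -- that's Ruby core,
# not our Options class, so there's nothing of ours to test here.
begin
  YAML.load(File.open(nil))
rescue TypeError => e
  puts e.class   # => TypeError
end

# An empty string isn't a real path, so File.open raises Errno::ENOENT.
begin
  YAML.load(File.open(""))
rescue Errno::ENOENT => e
  puts e.class   # => Errno::ENOENT
end
```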
By the way, if you want a more visual representation of our tests you can run bundle exec rspec -f html -o index.html, which will generate an HTML file showing which examples passed, failed, or are still pending.
Mocking Nokogiri requests with FakeWeb
I was curious if it would be possible to mock the Nokogiri requests from our Crawler so I did a bit of googling. It looks like the best options would be either Artifice or FakeWeb. I’m not super familiar with Rack and I don’t want to write a separate app just to mock a few calls so I’ve decided to go with FakeWeb.
First we add it to our Gemfile
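The entry is a one-liner (scoping it to a :test group is my preference, not a requirement):

```ruby
# Gemfile
group :test do
  gem "fakeweb"
end
```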
and do the usual bundle install. Next we’ll stub out our crawler_spec and verify that it’s at least detecting all the methods on the class.
I also want to verify that my class responds to an alternative constructor. Rather than just saying Crawler.new I’d prefer to use Crawler.from_uri. It doesn’t serve much of a purpose but I think it’s a good exercise. Here’s the modified test to support it.
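The pattern itself is just a class method that wraps new. A minimal, Nokogiri-free sketch of the idea:

```ruby
class Crawler
  attr_reader :uri

  def initialize(uri)
    @uri = uri
  end

  # Alternative constructor: reads a little better at the call site
  # than Crawler.new when the argument is a URI.
  def self.from_uri(uri)
    new(uri)
  end
end

crawler = Crawler.from_uri("http://example.com")
puts crawler.uri            # => http://example.com
puts crawler.is_a?(Crawler) # => true
```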
And here is our Crawler class based largely on our original Crawler from the first post.
If we run the specs now they should pass, but they're EXTREMELY slow! Just 4 examples take 6 seconds O_O. Can you spot the source of all that lag? Take a look at what happens inside Crawler#initialize. Notice how it creates a new Nokogiri doc every time? Since we have a before block in our spec, each example goes out and parses our website all over again. Let's see if FakeWeb can help us out some.
While it's not the prettiest test ever written, it does get the job done: 0.00359 seconds for 4 examples, down from 6 seconds! That's going to wrap it up for tonight. Tomorrow we'll finish off the spec and the implementation and finally get some data coming down from the live site. Until then!