ALIVE and Treetop

February 23rd, 2008

A sweet project

I’m proud to say a really fun mini-project went live today: OKWU Alive, for Oklahoma Wesleyan University, a very forward-thinking college. They wanted a site that is mainly updated by Twitter messages.

We used a combination of Radiant, Twitter4R, and Treetop, an up-and-coming library that I heard about at RubyConf 2007. Ever since attending Nathan Sobo’s presentation I’ve wanted to put it to use, but kept putting it off.

The “challenge”

To give some context to the site, OKWU wanted to parse direct Twitter messages and add them to the site. What made this interesting is that they wanted to be able to tag each message. Most messages take the form of:


  tag : message

Now I could obviously use regular expressions to parse out both the tag and the message, but what fun is that?
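
For comparison, the regular expression route might look something like the sketch below. The pattern, the constant, and the parse_status helper are all made up for illustration and were not part of the site’s code; the character class is simply copied from the tag format described above.

```ruby
# Hypothetical regex equivalent of the "tag : message" format.
# The tag allows letters, digits, underscores and dashes; spaces around
# the colon are optional.
STATUS_PATTERN = /\A([a-zA-Z0-9_-]+) *: *(.*)\z/

# Returns a hash with :tag and :message, or nil when the text doesn't match.
def parse_status(text)
  match = STATUS_PATTERN.match(text)
  return nil unless match

  { tag: match[1], message: match[2] }
end

result = parse_status("awesomified : you won't believe it's that easy")
puts "message: #{result[:message]} classified under: #{result[:tag]}"
```

It works, but everything about the format lives inside one dense pattern.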

Treetop to the rescue

Treetop takes a grammar file and turns it into a parser you can use from Ruby code. Here is the grammar we used to define the Twitter message:


  grammar Twitter
    rule status
      tag delimiter message
    end

    rule tag
      [a-zA-Z_0-9-]+
    end

    rule message
      .*
    end

    rule delimiter
      space* ':' space*
    end

    rule space
      ' '
    end
  end

If you haven’t worked with grammar specifications before, don’t feel overwhelmed. What this essentially says is “a twitter status (another definition of a message from twitter) is composed of a tag followed by a delimiter followed by a message.” With each part, you can find a more specific definition. For example, a tag can only take the form of alphanumerical characters, underscores and dashes.

“Ok, that’s neat, but how is it useful?”

The coolness comes in with the consumption of the grammar. Here’s the code that uses Treetop:


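  require "treetop"

  # Compiles twitter.treetop and defines the TwitterParser class
  Treetop.load "twitter"

  parser = TwitterParser.new
  # parse returns a syntax tree, or nil if the input doesn't match the grammar
  parsed_results = parser.parse("awesomified : you won't believe it's that easy")

  tag = parsed_results.get_tag
  message = parsed_results.get_message
  puts "message: #{message} classified under: #{tag}"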

As you can see, Treetop loaded in the grammar and immediately gave me a TwitterParser. From there I parsed an example twitter message, and with the results I retrieved the tag and message.

“Wait, how did you get the tag and message?”

Well, I didn’t exactly show the entire grammar. Here’s the final one:


  grammar Twitter
    rule status
      tag delimiter message {
        def get_tag
          tag.text_value
        end

        def get_message
          message.text_value
        end
      }
    end

    rule tag
      [a-zA-Z_0-9-]+
    end

    rule message
      .*
    end

    rule delimiter
      space* ':' space*
    end

    rule space
      ' '
    end
  end

Almost identical to the above except…it has friggin’ ruby code attached! That means when given a status, I can call #get_tag and #get_message to return the items. Pretty doggone easy.

“Impressive, but how is this better than just using regular expressions?”

So I will not deny the same thing could be accomplished with a single regex, but this looks sexy. And it has additional benefits. Let’s say in the future they want to:

  1. Allow multiple tags
  2. Allow spaces, and commas to be valid tag delimiters
  3. Allow the tags to be optional

Here’s a grammar modified with those exact requests:


  grammar Twitter
    rule status
      (tags delimiter)? text {
        def get_tags
          if self.class.method_defined? "tags" 
            tags.get_tags
          else
            []
          end
        end

        def get_message
          text.text_value
        end
      }
    end

    rule tags
      tag optional_tags:(optional_tag*) {
        def get_tags
          [tag.get_tag] + optional_tags.elements.map { |e| e.get_tag }
        end
      }
    end

    rule optional_tag
      tag_delimiter tag {
        def get_tag
          tag.text_value
        end
      }
    end

    rule tag
      [a-zA-Z_0-9-]+ {
        def get_tag
          text_value
        end
      }
    end

    rule text
      .*
    end

    rule delimiter
      space* ':' space*
    end

    rule space
      ' '
    end

    rule tag_delimiter
      space* ',' space* / space+
    end
  end

Some examples and their output:

  results = parser.parse("tag1 : the message")
  results.get_tags      # => ["tag1"]
  results.get_message   # => "the message" 

  results = parser.parse("tag1 tag2, tag3 : the message")
  results.get_tags      # => ["tag1", "tag2", "tag3"]
  results.get_message   # => "the message" 

  results = parser.parse("the message")
  results.get_tags      # => []
  results.get_message   # => "the message" 

  results = parser.parse(": the message")
  results.get_tags      # => []
  results.get_message   # => ": the message" 

  # Yeah, well, not bad for only 15 minutes. Let’s chalk the last one up to user error.
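
To make the comparison with regular expressions concrete, here is roughly what the same three changes might look like with regexes alone. This is only a sketch: the constant names and the parse_tagged_status helper are hypothetical, and the behavior is modeled on the example outputs above rather than taken from the project.

```ruby
# Regex-only sketch of the extended format; all names are hypothetical.
TAG      = /[a-zA-Z0-9_-]+/
TAG_SEP  = /(?: *, *| +)/                 # a comma with optional spaces, or just spaces
TAG_LIST = /#{TAG}(?:#{TAG_SEP}#{TAG})*/  # one tag followed by any number of others
STATUS   = /\A(?:(#{TAG_LIST}) *: *)?(.*)\z/m

# Returns a hash with :tags (possibly empty) and :message.
def parse_tagged_status(text)
  match = STATUS.match(text)
  tags = match[1] ? match[1].scan(TAG) : []
  { tags: tags, message: match[2] }
end

p parse_tagged_status("tag1 tag2, tag3 : the message")
p parse_tagged_status("the message")
```

It can be done, but the pattern has to be assembled from pieces to stay readable, and there is no obvious place to hang behavior like get_tags.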

I want to thank Nathan Sobo for putting together such a useful and intuitive library. For more information about Treetop, you can check out the site as well as the mailing list.

Combined PGP Keys Into One

February 18th, 2008

After finding out today that PGP keys can have multiple email addresses, I revoked all but one public key and combined all my emails into that one key.

How

Edit a key in the terminal:


> gpg --edit-key KEY_ID
Command> adduid

Add a name and email address. Then remember to set a primary uid:

Command> uid NUMBER_IN_PARENTHS_TO_SELECT
Command> primary
Command> save

Should be all good. Remember to revoke the old keys and re-export the public key.

Additional Tip

You can share your public key with others more easily by uploading it to a key server like MIT’s.

Git Stash is Sweet

February 16th, 2008

Earlier I mentioned a method of storing away changes in a patch, rebasing and then applying the patch so you don’t have to commit beforehand.

Yesterday Janson found git-stash, and it is sweet.

  1. type “git stash”
  2. do your changes
  3. type “git stash apply”
  4. profit!

Background

When we started development on Ascribe about 3 months ago, I had a hankering to try out RSpec. We already used a lot of similar concepts on the test/unit side, such as small readable tests and mock testing, but we hadn’t given view specs a try. To be fair, I had been pretty critical of view specs prior to the project, not really seeing the benefits. This is a write-up of our experiences. There will be a lot of verbiage like “felt” and “thought”, and not a lot of hard numbers.

Expectations

As I said, I had been pretty critical about using view specs. I had seen them used together in functional tests, and I strongly disliked the “cluttered” look of controller assertions intermixed with the view assertions. My main concerns using them were:

  1. A significant amount of time to initially create the spec
  2. The additional rigidity of changing views
  3. Brittleness in comparison to other tests we value (unit, controller, and integration)
  4. The added technical hurdle for allowing designers or html developers to contribute

In practice

“A significant amount of time to write the specs”

We were actually quite surprised to find that the view specs were the easiest of all our tests to write. Essentially, all that is asserted is the presence of text. Compared to testing object interaction or algorithms, this felt brain-dead simple. It also reinforces the idea that there shouldn’t be a lot of code in the views. If we noticed a view or spec getting complicated, it generally helped to pull code out into helpers or additional model behaviors. Of course we found testing pitfalls, and became more judicious about what we would assert.

Examples of what we would test:

  1. Presence of html controls like text fields, text areas, select, etc.
  2. Presence of labels
  3. Important attributes for html controls (names, ids, and sometimes classes [1])
  4. Form actions, and alternate http methods (delete, put)
  5. Partial renders
  6. Custom helper calls
  7. Iterators and If/Else controls
  8. Anchors

Examples of what we would not test:

  1. HTML containment. We would only assert this if it was required for RJS
  2. Exact text. More often than not, we would assert a portion of the text with case insensitive regular expressions
  3. Common helpers like h()
  4. HTML classes not used for client javascript or RJS. We found this hindered css from an html developer’s point of view, and did not provide a lot of value.
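
To give a feel for how small these assertions are, here is a rough sketch in plain Ruby with ERB rather than RSpec’s view spec helpers, so that it runs on its own. The template, render_view, and check are all hypothetical stand-ins and not code from Ascribe.

```ruby
require "erb"

# Hypothetical template standing in for a Rails view.
TEMPLATE = <<~HTML
  <form action="/projects" method="post">
    <label for="project-name">Project Name</label>
    <input type="text" id="project-name" name="project[name]" />
    <% if show_delete %>
      <a href="/projects/1" class="delete">Delete this project</a>
    <% end %>
  </form>
HTML

# Renders the template with the given local, much like a view spec would.
def render_view(show_delete:)
  ERB.new(TEMPLATE).result_with_hash(show_delete: show_delete)
end

# Prints a one-line result and returns whether the check passed.
def check(description, passed)
  puts "#{passed ? 'ok  ' : 'FAIL'} #{description}"
  passed
end

html = render_view(show_delete: true)

# Presence of a control, a label and the attributes that matter
check("has the name text field", html.include?('name="project[name]"'))
check("label points at the dashed id", html.include?('for="project-name"'))
check("form posts to /projects", html.include?('action="/projects" method="post"'))

# A portion of the text, case insensitive, instead of the exact wording
check("mentions deleting the project", html.match?(/delete/i))

# The other side of the if/else
check("hides the delete link when asked", !render_view(show_delete: false).match?(/delete/i))
```

Nothing here asserts which tag contains which, or the exact copy, so the markup can move around without breaking the checks.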

“Additional rigidity for changing views”

To discuss this, it may help to give some insight into our team. There was Janson and me, both developers who wrote and maintained ruby, all specs, and html / css. We also had Aaron, who is not a developer but is really knowledgeable about html / css. One of our concerns was that every time Aaron wanted to make a change to the design, either we would have to be there with him, or we would have to teach him how to run, fix, and create view specs.

We ended up opting for a completely different approach. We told him to ignore the specs in all regards save one. If he was going to add rhtml calls (typically anything with <%= %>) we asked if he would open up the spec for the view and create a pending statement.


    it "should have a link to the dashboard" 

That way we would be able to return later and flesh out the stubbed specs. Although this isn’t exactly promoting TDD, it allowed him to keep running, as well as placing a bookmark in the code for us. We never felt as if we were getting bombarded with pending statements, and quickly writing them and moving on didn’t feel time consuming.

“Brittleness in comparison to other tests we value (unit, controller, and integration)”

With Aaron working on the html, we were even more concerned with the brittleness of our tests than if the entire team were ruby developers. At Elevator Up it is very common that we have non-developers contributing to the html / css, so obviously I was apprehensive of being overwhelmed by constantly breaking specs. No developer likes to constantly fix other developers’ broken tests. And it wasn’t exactly like we could yell at Aaron for not doing his job. This is one expectation that played out in the course of development, but in a different light.

When Aaron would make a commit it was pretty common for him to break tests. In fact I believe his record was 38, and without any numbers, my guess was he averaged around 5 – 10 breaks per commit. What we didn’t anticipate was that fixing 30+ tests took no more than 15 minutes. That was a huge shocker to me. A non-developer would rip out the code we were asserting, make additional calls that weren’t stubbed, and it only took us 15 minutes to fix them all.

We found that the biggest breakages were due to calling mocked helpers in a valid, but unexpected manner. Since all the other specs relied on a valid render, they would break as well. In some cases, fixing one line would fix 10-20 assertions. There were also a few instances where the specs actually broke for the right reasons. For example, a link_to was modified and the href wasn’t pointing to the right location.

“The added technical hurdle for allowing designers or html developers to contribute”

With only having Aaron add pending statements on code additions, he didn’t feel uncomfortable, or feel like he had to learn how to program just to make visual changes. This felt like a huge win for us. While it would be nice to have all html / css developers understand and maintain ruby, this isn’t realistic at Elevator Up.

Other Surprises

Up to now I’ve only talked about how our expectations matched up to our experiences; I haven’t mentioned any benefits of view testing. Talking about the benefits of testing views is just as difficult as talking about the benefits of unit testing, or using mocks. You can’t convince another developer with anecdotes alone. Using them in practice normally speaks volumes in a much more articulate way.

That being said, as we worked on Ascribe, I kept a list of successes that I felt the view specs were the prime contributor to [2].

  1. Checking label “for” misspellings
  2. Ensuring all input fields are present
  3. Asserting non “sunny day” code paths (if/then)
  4. Using fields that could potentially be nil
  5. Ensuring links are pointing to the correct places
  6. Ensuring forms are posting to the right places
  7. Ensuring forms are posting with the right method
  8. Incorrect usage of named routes
  9. Renaming / Refactoring Routes
  10. Ensuring render partials have correct locals
  11. Better confidence in refactoring views into partials / or helpers
  12. Ensuring correct usage of helpers link_to, link_to_remote, form_for
  13. Better testing-as-documentation for RJS
  14. Using pending tasks as view centric TODOs

Summary

A realization that struck us at the end of the project was that compared to unit, controller, and integration tests, view tests are cheap. The cost of creating and maintaining them is very low, and for that small amount of effort, noticeable benefits arise. It feels comparable to investing pennies and earning $200. We were very pleased we went out on a limb and gave them a shot. Elevator Up has made it a point to add view specs to our development routine.

[1] We found this useful since we use dashed ids instead of underscored. Also we found on more than one occasion, we were using rails tags incorrectly. One of the biggest culprits was setting ids and classes on select tags.

[2] I’m not saying that these are unique to view specs, and can’t be achieved with other testing techniques.

Using Git

February 8th, 2008

Janson and I have been using Git for a few months now at work. We’re still keeping an open mind with regards to Mercurial and Bazaar; however, it’s clear that we’ll be moving to a distributed version control system.

Working with git, I’ve come across a few gotchas and a few takeaways, especially in regards to git-svn.

Some terminology

Here are some layman’s definitions of two very common git terms. They are a very narrow view of what the commands actually do, but they help when starting out.

  1. “rebase” typically means to take the changes from a source branch and bring them into the branch you are currently working with, replaying your own commits on top of them
  2. “merge” typically means to combine the changes from one branch into another; in practice you switch to the branch that should receive the changes and merge your working branch into it

Ignore files and svn:ignore

There are two ways for git to track ignored files. You can track them for your local repository only in DIR/.git/info/exclude, or in a shared, revisioned file at DIR/.gitignore. We’ve been preferring the latter on most of our projects.

When using git-svn, you can have it dump all svn:ignores into a git file, which is convenient; however, I ran into a few issues where git was trying to add / remove files that svn thought were ignored. If you’re going to interface with svn, I’d recommend keeping all ignores at the git level.

Merging conflicts

One thing that’s cool about git is the amount of docs out there. That being said, it was hard for me to get the hang of merging conflicts. It’s actually a relatively easy process once you’ve done it a few times. More often than not, the conflicts will happen on a rebase.

  1. First edit the conflicted file and merge the changes by hand
  2. Run a git add path/to/file
  3. Tell git to finish the rebase by git rebase --continue

Don’t under any circumstances delete the .dotest folder. I repeat, leave that folder alone. I think there’s some carry-over from using svn that makes some people think that’s how you finish the merge. I was one of those people, and I know a few people who did the same. The .dotest folder is what allows you to run git rebase --abort to quit a rebase in the middle of a conflict.

Committing files

Most people come across this within the first few days of using git. Unlike svn, you have to tell git what files you want to commit in every changeset. You do this by running git add path/to/file, even on already revisioned files. To save the hassle, you can run git commit -a to include every modified tracked file in the commit.

On a related note, you can use wildcards with git add. So git add * will add all the untracked files in the directory and any subdirectories. That can be helpful at times.

Reverting changes

This one seemed a bit unintuitive to me. Let’s say you have a file, made some changes, then wanted to dump all of them and go back to the latest known revision. To do this you run git checkout path/to/file, or git checkout . to dump your entire changeset.

Don’t use git reset unless you really know what you are doing. When using git-svn, this can revert all changes since your last rebase with svn. Which can be a real pain in the ass.

Showing old revisions

You can do this pretty easily by running git log path/to/file, taking the commit id, and then running git show commit-id:path/to/file.

Dumping changes in order to rebase with svn

This can be a pain, and if anyone has an easier solution, please share. Let’s say you have a repository that has recently been rebased against svn. You start making some changes, and then you realize someone else committed something to svn that you need. The approach I’ve been taking is this:

  1. First make a patch of your current changes: git diff > changes.patch
  2. Reset all your changes: git checkout .
  3. Rebase with svn: git-svn rebase
  4. Re-apply the changes to your directory: git-apply changes.patch

I hope this helps. If you have any other suggestions or tips, by all means share them.

Hamachi

February 5th, 2008

Me and a few buddies play Command and Conquer Generals. It’s a decent RTS, but has problematic networking issues. Before giving up, I came across Hamachi which is an awesome VPN app.

It’s free for small networks, and it is brain-dead easy to set up. Once all of us had it installed, we connected to the same network and were able to play as if we were at a LAN party. I was expecting some type of network latency going through the software, but I was pleasantly surprised at the results. I couldn’t tell the difference.

Soon I’ll probably install it on our office Media / File Server so we can transfer files and screen share remotely using bonjour without touching our router.

Hacking Nite Recap

February 5th, 2008

I’ve been so busy with work that I didn’t get to post how the Hacking session went. Two weeks ago we met at the Elevator Up office and worked on a Kayak API library Brandon Keepers had started earlier. We also used iui to allow searching from an iPhone.

I created a mailing list so we could coordinate for the next session.

Ascribe goes live

February 5th, 2008

We have been busting ass for over 2 solid months and last night we launched a new application. Check out Ascribe, a great portfolio management system specifically designed for contractors.

If you have any questions feel free to contact Jason Carpenter.