collect {thoughts}

drew olson, software developer, chicago

Good Software Developers

I’m approaching 8 years as a professional software developer. I’ve written a lot of code, worked on a few teams and helped build a team from 4 developers to ~60. Now, more than ever, I find myself thinking about what it means to be a good software developer. Not good as in “good is the enemy of great”, but good as in “I want to work with this person again”. I think there are three characteristics that make someone good in this industry.

  • Curiosity
  • Humility
  • Discipline

Curiosity is important because it fuels the drive to learn more, try new things, tinker and fail.

Humility is necessary to internalize feedback, keep trying after failure and learn from others.

Discipline, I’ve found, is both more subtle and more important than the other characteristics. It tells me how an individual will react under pressure, how they’ll perform repetitive tasks, how they’ll commit their code and run their builds. Here’s a non-exhaustive list of the ways discipline manifests itself on real software development teams.

A disciplined software developer will:

  • Make small, intentional commits
  • Write code with tests
  • Run the tests before committing
  • Ensure passing builds after committing
  • Understand their VCS tool and use it well
  • Understand the team’s release process
  • Write tested scripts to perform one-off maintenance
  • Ask for feedback (via pairing, PRs, etc.) on their code
  • Know team conventions and know when to break them

The list could go on and on. It may not be obvious that all these things are related to discipline, but when I see a developer doing these things on a daily basis it builds trust and respect. I would work with that person again.

I met a friend in college who had started his own awning cleaning business in high school. He still runs it today and it’s extremely successful. I was talking with him one day and found myself wondering why one awning cleaning business would succeed where others failed. How do you differentiate yourself? When I asked him, his response was both simple and insightful: “We show up on time and do the job we said we would do.” That’s it.

We can be successful in many facets of life following those same rules. Notice that nothing in the list above says “genius coder”, “knows algorithms”, “cranks out solutions quickly” or anything of the sort. Most of being a good software developer involves showing up on time and doing your job well. It’s the little things that count.

We are what we repeatedly do. Excellence, then, is not an act, but a habit.

Aristotle

Node Streams for APIs

Node streams are a fantastic abstraction for evented programming. They’re also notoriously hard to implement. In this post, I’d like to walk through implementing a streams2 Readable stream to wrap an API.

The API

Suppose we have a web service that returns a list of customers. There might be a large number of customers, so this service paginates the results and expects us to provide a page number when requesting data. A sample request to the API might look something like the line below.

curl http://localhost:3000/customers?page=1

And here’s an example of a response.

{"customers":[{"name":"drew"},{"name":"john"}],"nextPage":"/customers?page=2","isLastPage":false}

This API makes it convenient to fetch the next page of results by providing the url in the nextPage key of the response. It also tells us if we’ve reached the last page of results.
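Before wrapping this contract in a stream, it helps to see how a plain recursive fetch would walk the pages. In the sketch below, fetchPage is a hypothetical stand-in for an HTTP call that yields a parsed response shaped like the one above; this is an illustration, not part of the post's code.

```javascript
// Collect all customers by following nextPage links until isLastPage.
// fetchPage(path, callback) stands in for an HTTP request that yields
// a parsed response: {customers, nextPage, isLastPage}.
function collectCustomers(fetchPage, path, callback) {
  var all = [];

  function step(currentPath) {
    fetchPage(currentPath, function (response) {
      all = all.concat(response.customers);

      if (response.isLastPage) {
        callback(all);
      } else {
        step(response.nextPage);
      }
    });
  }

  step(path);
}
```

The stream we build below does essentially this, but lazily, one _read call at a time.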

To make it simple to test the stream that will wrap this API, I’ve created an express app that implements our description. Let’s take a look at the code.

var express = require('express');
var app = express();

var callCount = 0;
var customers = [
  {name: "drew"},
  {name: "john"},
  {name: "bill"},
  {name: "bob"},
  {name: "sam"}
];
var pageSize = 2;

app.get('/customers', function (req, res) {
  callCount += 1;

  var page = parseInt(req.query.page);
  var startingIndex = (page - 1) * pageSize;
  var endingIndex = Math.min(startingIndex + pageSize, customers.length);

  var currentCustomers = customers.slice(startingIndex, endingIndex);

  res.send(JSON.stringify({
    customers: currentCustomers,
    nextPage: "/customers?page=" + (page + 1),
    isLastPage: endingIndex === customers.length
  }));
});

app.resetCallCount = function () {
  callCount = 0;
};

app.getCallCount = function () {
  return callCount;
};

module.exports = app;

Note that I’ve specified an artificially small pageSize as well as a small number of customers to make testing easier. Also note that the callCount variable and its associated functions are provided so that we can make assertions about the number of requests we’ve made to the API in our tests.

Tests

Before we dive into implementing our stream, let’s write a few tests to specify the desired behavior. Here are the two key things we should test:

  1. We can treat our stream as any other node stream
  2. It lazily fetches results from the API as needed

Below are two tests written using mocha. The method we’ll be implementing is customer.all(). This will eventually return our stream that wraps the API above.

var assert = require('assert');
var api = require('./support/api');
var customer = require('../lib/customer');
var Writable = require('stream').Writable;
var _ = require('lodash');

describe('customer', function () {
  before(function () {
    api.listen(3000);
  });

  beforeEach(function () {
    api.resetCallCount();
  });

  describe('#all', function () {
    it('returns a stream of customers from the api', function (done) {
      var customers = [];

      var stream = customer.all();

      stream.on('data', function (customer) {
        customers.push(customer);
      });

      stream.on('end', function () {
        assert.equal(5, customers.length);
        assert.equal('drew', customers[0].name);
        assert.equal(3, api.getCallCount());

        done();
      });
    });

    it('consumes lazily when piped', function (done) {
      var ws = new Writable({objectMode: true});
      var writeCount = 0;

      ws._write = function (chunk, enc, next) {
        writeCount += 1;

        if (writeCount < 2) {
          next();
        } else {
          assert.equal(1, api.getCallCount());
          done();
        }
      };

      customer.all().pipe(ws);
    });
  });
});

First, we require the API we discussed above along with the customer module we’re about to write. Before running any tests, we start our API server listening on port 3000. Before each test, we reset the call count on our API server.

Our first test verifies that we can interact with the return value from customer.all() using the stream API. We attach data and end handlers to the stream to switch it into flowing mode. When the stream emits the end event, we verify that we’ve received all 5 customers, that the first customer has the name “drew” and that we’ve made only 3 calls to the API.

Our second test verifies that we fetch results from our API lazily. We ensure this by writing a custom Writable stream that only asks for the first two customers from our stream and verifies that it has only called the API once to retrieve them. Finally, we pipe the result of customer.all() into our Writable stream to kick things off.

Implementing the Stream

We know what our API looks like and we have tests to verify that our stream will work correctly. Now we need to implement it. First, let’s take a look at the customer module. It’s extremely simple because it just returns a new instance of CustomerStream.

var CustomerStream = require('./customer-stream');

exports.all = function () {
  return new CustomerStream();
};

Now we’ve arrived at the real meat of this post. It’s time to dive into the implementation of our stream itself. Before doing so, let’s talk a bit about how to implement a Readable stream.

First, we need to inherit from the Readable stream base class. Next, we need to implement a _read method. Our _read method will be called each time someone requests data from our stream. Each time _read is called, we’re expected to call the push method (provided by the base class) at least once. In fact, _read will not be called again until we’ve pushed at least one value. Calling push pushes a value onto the stream to allow our consumer access to it.

There are a few other things we need to know about the push method. First, if push ever returns false, that tells us we should not push any more values until _read is called again. Second, when we’re done pushing all our data, we need to call push with null to signal the end of the stream.

So what does our _read method need to do? The high level steps are as follows:

  1. If we have any customers in memory that we’ve fetched previously, we should push them.
  2. If we don’t have any customers and we’re on the last page, it’s time to stop.
  3. If neither of the previous two statements is true, we need to fetch new results from the server, buffer them, and push them.

Let’s take a look at the code. I’ll discuss a few pieces of it in detail.

var _ = require('lodash');
var Readable = require('stream').Readable;
var request = require('request');
var util = require('util');

function CustomerStream() {
  Readable.call(this, {objectMode: true});

  this.customers = [];
  this.nextPage = '/customers?page=1';
  this.isLastPage = false;
}
util.inherits(CustomerStream, Readable);

CustomerStream.prototype.fetchNextPage = function () {
  request('http://localhost:3000' + this.nextPage, function (error, response, body) {
    var data = JSON.parse(body);

    this.customers = this.customers.concat(data.customers);
    this.nextPage = data.nextPage;
    this.isLastPage = data.isLastPage;

    this.pushBufferedCustomers();
  }.bind(this));
};

CustomerStream.prototype.pushBufferedCustomers = function () {
  while (this.customers.length > 0) {
    var customer = this.customers.shift();

    if (this.push(customer) === false) {
      break;
    }
  }
};

CustomerStream.prototype._read = function () {
  if (this.customers.length > 0) {
    this.pushBufferedCustomers();
  } else if (this.isLastPage) {
    this.push(null);
  } else {
    this.fetchNextPage();
  }
};

module.exports = CustomerStream;

Let’s first look at the constructor. We call the Readable stream’s constructor, making sure it is in object mode. This allows us to push objects to the stream. Next we initialize our customers, nextPage and isLastPage variables. Finally, we make sure CustomerStream inherits from Readable.

We’ll examine the _read method next. Notice that the body of the method very closely follows the steps we described above. If we have any customers available we push them. If we’re on the last page we push null to indicate that the stream is complete. Otherwise, we fetch our next page of results.

The pushBufferedCustomers method pops customers off the customers array one at a time and pushes them. If any of those pushes returns false, it stops.

Finally, the fetchNextPage method uses the request library to call our API. It then parses the body, adds the customers to the array, updates nextPage and updates isLastPage. The call to pushBufferedCustomers inside the request callback is very important. As I said earlier, each call to _read expects at least one call to push. If we didn’t push any results after fetching customers from the API, our stream would hang indefinitely waiting on results.

Wrap Up

We now have a working stream that wraps our paginated API. If you’re interested in the code used for this blog post it can be found here.

Understanding gen_server with Elixir and Ruby

Recently, I’ve been spending some time working in Erlang and Elixir. I had tried to break into Erlang in the past but I was always stymied by the steep learning curve of OTP. gen_server in particular always seemed like black magic to me. However, after attending an Erlang workshop at Lambda Jam this year it finally clicked for me. After I finally “got it” I had another realization: it isn’t that complicated, but there aren’t very many good explanations. In this post I’m going to attempt to explain gen_server using Elixir and Ruby code and some simple diagrams.

I chose to use Elixir rather than Erlang because I’ve really enjoyed working in it recently and I think the syntax is approachable for those new to Erlang/Elixir as well as those already familiar with Erlang.

Disclaimer: My goal is to help you understand the concepts around gen_server not how the actual underlying implementation works. Specifically, the Ruby code I link to at the end of the post is meant to help you understand the idea of what’s happening. It is in no way a real implementation of gen_server in Ruby.

Concepts

The two key concepts we’re going to focus on in this blog post are state and behavior.

In traditional OO languages classes store behavior and instances store state. You write a class definition with methods that specify the behavior of an object and then you instantiate that class to create an instance of an object that holds its own state.

       +-----------------+                   +-----------------+
       |      Class      |                   |     Instance    |
       |-----------------|                   |-----------------|
       |                 |                   |                 |
       |                 |                   |                 |
       |    Behavior     +------------------>|      State      |
       |                 |                   |                 |
       |                 |                   |                 |
       |                 |                   |                 |
       |                 |                   |                 |
       +-----------------+                   +-----------------+

Let’s write a simple ruby class that has behavior and state.

class Greeter
  def initialize(name)
    @name = name
  end

  def greet
    "Hello, #{@name}"
  end
end

drew = Greeter.new("Drew")
puts drew.greet

In the example above, the Greeter class defines some behavior, namely the greet and initialize methods. The instance of the Greeter class, drew, is instantiated with some state (the string “Drew”) and this state can be referenced from behavior that the class defines. The greet method uses the instance variable @name. Those of us who have spent any time with an OO language understand these concepts intuitively. Objects are the fusion of behavior and state, classes and instances.

But there’s a problem. Elixir doesn’t have state. You can’t (really) store stuff anywhere. We can fake something like the code above using modules and records.

defrecord Person, greet: ""

defmodule Greeter do
  def new(name) do
    Person.new(greet: "Hello, #{name}")
  end
end

drew = Greeter.new("Drew")
IO.puts drew.greet

But there’s a key difference here. We’re creating records to hold values rather than state. These values can never change over time, so if we wanted to modify the greet attribute of our Person record we would actually create a totally new record. What if we want to have a true concept of state that can change over time and that we can use as part of the behavior we describe in a module?

Well, as you might have guessed, gen_server fills this role in Elixir. We will use gen_server to create processes that will be associated with our modules. Modules will contain behavior and processes will contain state.

       +-----------------+                   +-----------------+
       |      Module     |                   |     Process     |
       |-----------------|                   |-----------------|
       |                 |                   |                 |
       |                 |                   |                 |
       |    Behavior     +------------------>|      State      |
       |                 |                   |                 |
       |                 |                   |                 |
       |                 |                   |                 |
       |                 |                   |                 |
       +-----------------+                   +-----------------+

An Example

We’re going to build a simple stack using Elixir and gen_server. We’ll be able to push values onto the stack and pop values off of the stack. Our module will define the behavior of the stack but gen_server will store the state.

Because our module isn’t responsible for storing the state, the methods we define inside of it must assume that the state will be passed in as arguments. The return values from these methods will need to include a new value for the state after the call. This makes the return values from our methods look a little weird because we have to both return the result of the method call (if there is one) as well as the new state. But don’t worry, it’s not too bad.

It will be gen_server’s responsibility to find the state associated with our process and pass it to our methods as well as storing the resulting state after the method call.
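That division of labor can be sketched in any language with closures. Here is a toy JavaScript dispatcher (entirely my own illustration, nothing like the real OTP implementation) in which a module holds only behavior and a closure plays the role of the process holding state:

```javascript
// A toy gen_server-style loop: the "module" (mod) is a table of handler
// functions that take the current state and return the new state; the
// "process" is a closure holding that state. A sketch of the concept
// only, not how gen_server actually works.
function startLink(mod, initialState) {
  var state = initialState;

  return {
    // call is synchronous: the handler produces a reply and a new state
    call: function (message) {
      var result = mod.handleCall(message, state);
      state = result.newState;
      return result.reply;
    },
    // cast produces no reply; the handler only returns the new state
    cast: function (message) {
      state = mod.handleCast(message, state);
    }
  };
}

// A Stack "module": pure behavior, no state of its own.
var Stack = {
  handleCall: function (message, state) {
    if (message === 'pop' && state.length > 0) {
      return {reply: state[0], newState: state.slice(1)};
    }
    return {reply: null, newState: state};
  },
  handleCast: function (message, state) {
    return [message.push].concat(state);
  }
};
```

Every handler receives the state as an argument and returns the next state, which is exactly the shape of the Elixir callbacks below.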

First, let’s take a look at our Stack module.

defmodule Stack do
  use GenServer.Behaviour

  def handle_call(:pop, _from, []) do
    {:reply, nil, []}
  end

  def handle_call(:pop, _from, state) do
    [head|new_state] = state

    {:reply, head, new_state}
  end

  def handle_cast({:push, value}, state) do
    {:noreply, [value|state]}
  end
end

Ok, so there’s some weird stuff going on here. First, why are all our methods named handle_call and handle_cast? It’s because when gen_server finds our process’s state for us, it calls these callback methods on our module and passes along the arguments we provided, the pid of the caller (the _from argument, which we ignore) and the current state. Note that there are two (for now) types of callbacks: call and cast. Call is synchronous: it updates the state and sends a reply. Cast is asynchronous: it updates the state but doesn’t send a reply.

Next, let’s look at how we actually create a gen_server process (our equivalent of an instance) associated with this module and interact with it.

iex(1)> {:ok, pid} = :gen_server.start_link(Stack, [], [])
{:ok, #PID<0.40.0>}
iex(2)> :gen_server.call(pid, :pop)
nil
iex(3)> :gen_server.cast(pid, {:push, 1})
:ok
iex(4)> :gen_server.cast(pid, {:push, 2})
:ok
iex(5)> :gen_server.call(pid, :pop)
2
iex(6)> :gen_server.call(pid, :pop)
1

First, we start a new gen_server process associated with the Stack module using the start_link method on gen_server. The first argument is the module with our behavior and the second is the initial state. Ignore the third. This call returns the atom :ok along with the pid of the process. Think of this as our instance or a pointer to our state.

Now, when we want to actually call (or cast) a function defined in our module, we just use the call and cast functions on gen_server. We always pass our pid as the first argument and then the arguments to our Stack function as the second argument. If we want to pass anything other than the function name, we use a tuple.

Behind the scenes gen_server gets our pid, finds the state associated with it and then calls the appropriate method on our module passing the state along as an argument. That’s it!

“But Drew!”, you shout, “That looks hideous!” Well, you’re correct. But we can hide the complexity of gen_server from our callers by writing a few more functions on our module.

defmodule Stack do
  use GenServer.Behaviour

  def start_link(state) do
    {:ok, pid} = :gen_server.start_link(Stack, state, [])
    pid
  end

  def pop(pid) do
    :gen_server.call(pid, :pop)
  end

  def push(pid, value) do
    :gen_server.cast(pid, {:push, value})
  end

  def handle_call(:pop, _from, []) do
    {:reply, nil, []}
  end

  def handle_call(:pop, _from, state) do
    [head|new_state] = state

    {:reply, head, new_state}
  end

  def handle_cast({:push, value}, state) do
    {:noreply, [value|state]}
  end
end

Now when we interact with our Stack, it looks nice.

iex(1)> pid = Stack.start_link([])
#PID<0.45.0>
iex(2)> Stack.push(pid, 1)
:ok
iex(3)> Stack.push(pid, 2)
:ok
iex(4)> Stack.pop(pid)
2
iex(5)> Stack.pop(pid)
1

Wrapping Up

I hope this helps you gain a better understanding of what gen_server provides and, on a conceptual level, how it works. If you’re interested in digging further, take a look at this demonstration of implementing the spirit of gen_server in Ruby. Again, this is not a true implementation but a learning tool to help those who understand Ruby get a better feel for what gen_server is doing.

Now go out there and write some Elixir (or Erlang, if you must).

Clojure core.async and Go: A Code Comparison

Last week, Rich Hickey announced Clojure core.async in a blog post. As mentioned in the post, the new core.async library has a lot in common with Go. In this post, I’ll compare the fundamental building blocks of concurrency in core.async and Go with code examples.

Note: Clojure core.async provides two sets of operations on channels. The blocking operations are for use with native threads and the non-blocking operations are for use with go blocks. In this post, I’ll be focusing on the non-blocking operations used with go blocks but I’ll briefly mention the blocking versions.

Update: It is important to note that I’m using Thread/sleep in the clojure examples for clarity. This will block the entire thread and eventually starve the thread pool used for go blocks. Don’t use it in real code, use a timeout instead (thanks MBlume and pron).

Setup

To install Go on OSX, just use homebrew.

$ brew install go

For clojure, you’ll want to install leiningen via homebrew.

$ brew install leiningen

After generating a leiningen project, you’ll need to add core.async as a dependency. Unfortunately it’s not yet available on maven central, so you’ll need to clone it and install it in your local maven repository first.

$ git clone git@github.com:clojure/core.async.git
$ cd core.async
$ mvn install

Now, we can add core.async as a dependency in our project.clj file.

(defproject async_example "0.1.0-SNAPSHOT"
  :description "Async example"
  :url "http://example.com"
  :license {:name "MIT License" :url "http://opensource.org/licenses/MIT"}
  :main async-example.core
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [org.clojure/core.async "0.1.0-SNAPSHOT"]])

Update: To avoid having to install core.async locally, you can add the following line to your project.clj (thanks weavejester):

:repositories {"sonatype-oss-public" "https://oss.sonatype.org/content/groups/public/"}

We’re all set to start comparing Go and core.async.

Goroutines and Go Blocks

Both core.async and Go provide a facility for spawning “lightweight threads”. In core.async, this is handled via go blocks. In Go, we use goroutines.

Let’s write an example spawning 10 lightweight threads that will sleep for a random amount of time and then print a number (0-9).

package main

import (
  "fmt"
  "math/rand"
  "time"
)

func main() {
  for i := 0; i < 10; i++ {
    go func(i int) {
      sleep := time.Duration(rand.Intn(1000))
      time.Sleep(sleep * time.Millisecond)
      fmt.Println(i)
    }(i)
  }

  time.Sleep(2000 * time.Millisecond)
}

As you can see, we use the go keyword to spawn goroutines and each waits a bit and prints its designated number.

(ns async-example.core
  (:require [clojure.core.async :refer :all])
  (:gen-class))

(defn -main [& args]
  (doseq [i (range 10)]
    (go
      (Thread/sleep (rand-int 1000))
      (println i)))

  (Thread/sleep 2000))

The clojure code looks quite similar (besides being a lisp) to the Go code. The main difference is we use the (go ...) macro to spawn a go block.

Channels

While goroutines and go blocks are slightly interesting in isolation, they become much more powerful when combined with channels. Channels can be thought of as blocking queues that goroutines or go blocks can push messages onto and pull messages off of. In Go, we use ch <- and <-ch to push and pull from a channel respectively. In clojure, we use >! and <!.

To construct channels in Go we use make(chan <type>); in clojure we use (chan).

It is important to remember that, by default, when a value is pushed onto a channel it blocks until it is pulled off. Likewise, when a value is pulled from a channel it blocks until there is something to pull.

Below is an example of 10 goroutines/go blocks pushing values onto a channel and a main goroutine/go block pulling values off the channel and printing them.

package main

import (
  "fmt"
  "math/rand"
  "time"
)

func main() {
  c := make(chan int)

  for i := 0; i < 10; i++ {
    go func(i int) {
      sleep := time.Duration(rand.Intn(1000))
      time.Sleep(sleep * time.Millisecond)
      c <- i
    }(i)
  }

  for i := 0; i < 10; i++ {
    fmt.Println(<-c)
  }
}
(ns async-example.core
  (:require [clojure.core.async :refer :all])
  (:gen-class))

(defn -main [& args]
  (let [c (chan)]
    (doseq [i (range 10)]
      (go
        (Thread/sleep (rand-int 1000))
        (>! c i)))

    (<!!
      (go
        (doseq [_ (range 10)]
          (println (<! c)))))))

There are a few differences here to point out. First, you’ll notice that we didn’t spawn a goroutine for the main loop that reads the values in the go example. This is because the main program itself is running in a goroutine. In clojure, because core.async is a library, we must put the pulling component in a go block as well.

Second, you’ll notice that the last go block in the clojure example is surrounded by (<!! ...). This function is equivalent to <! except that it is used with native threads instead of go blocks. In core.async, go blocks return a channel that has the last value of the go block pushed onto it when execution is complete. By wrapping the final go block in a call to <!!, we block the main thread of the program until all the pulling is complete.

Select and Alts!

The last piece of the puzzle is the ability to pull a value off many channels. Go provides select and core.async provides alts!. Each will take a collection of channels and execute some code based on the first channel with activity.

We can use select or alts! to add timeouts to our actions. Suppose we have a goroutine/go block that will put a value onto a channel sometime between now and a second from now, but we want to stop the operation if it takes longer than half a second. The following code would accomplish this task.

package main

import (
  "fmt"
  "math/rand"
  "time"
)

func main() {
  rand.Seed(time.Now().UTC().UnixNano())

  c := make(chan string)

  go func() {
    sleep := time.Duration(rand.Intn(1000))
    time.Sleep(sleep * time.Millisecond)
    c <- "success!"
  }()

  select {
  case <-c:
    fmt.Println("Got a value!")
  case <-time.After(500 * time.Millisecond):
    fmt.Println("Timeout!")
  }
}

It’s important to understand that the function time.After returns a channel onto which a value will be pushed after the specified timeout. Note that I’m seeding the rand package so that we get different results every time the program is run.

(ns async-example.core
  (:require [clojure.core.async :refer :all])
  (:gen-class))

(defn -main [& args]
  (let [c (chan)]
    (go
      (Thread/sleep (rand-int 1000))
      (>! c "success!"))

    (<!!
      (go
        (let [[result source] (alts! [c (timeout 500)])]
          (if (= source c)
            (println "Got a value!")
            (println "Timeout!")))))))

Similar to time.After, the timeout function returns a channel that will have a value pushed onto it after the timeout. The call to alts! returns a vector of the value from the channel and the channel that returned the value (called source in the above example).

Wrap Up

After spending a few days with clojure’s core.async, I’m very excited about the possibilities. Previously, I was using Go because I enjoyed its approach to concurrency. Now, this same functionality has been added to clojure via a library. To me, this is a huge win. It means I can program using the concurrency style from Go without fighting its type system and verbosity. To make things even better, you retain all the benefits of lisp and the java ecosystem.

You can learn more about core.async from the excellent code walkthrough.

Rails Mass Assignment Protection in the Controller

GitHub recently had a rails mass assignment bug that caused quite a stir. In the aftermath, several people proposed new ways of handling mass assignment protection in rails. One of the proposals, authored by Yehuda Katz, advocated for protecting against mass assignment in the controller rather than the model. I thought it was a great idea, so I built a little gem called Params Cleaner.

Params Cleaner

Params Cleaner provides you with a module that mixes into your controllers. Once mixed in, an ‘allowed_params’ method is available to define which params keys are allowed under a given root key. Here’s an example:

class PlayersController < ApplicationController
  include ParamsCleaner

  allowed_params :player => [:name, :email]

  def create
    @player = Player.new(clean_params[:player])

    if @player.save
      redirect_to player_path(@player)
    else
      render :new
    end
  end
end

Note that now, instead of accessing your params via the ‘params’ method, you’re accessing them using the ‘clean_params’ method.

The symbols provided to ‘allowed_params’ will work at any level of nesting inside the params hash. For example, assume this ‘allowed_params’ declaration:

allowed_params :player => [:name, :email],
               :name => [:first, :last]

Next, assume our params look like this:

{
  :player => {
    :email => "drew@drewolson.org",
    :bad_key => "nefarious stuff",
    :name => {
      :first => "Drew",
      :last => "Olson",
      :nested_bad_key => "more nefarious stuff"
    }
  }
}

Here’s what you’d see when calling the ‘clean_params’ method:

clean_params[:player]
# => {:email => "drew@drewolson.org", :name => {:first => "Drew", :last => "Olson"}}

clean_params[:player][:name]
# => {:first => "Drew", :last => "Olson"}
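The recursive filtering shown above can be made concrete with a rough sketch. The JavaScript below is my own illustration of the idea behind clean_params, not the gem's actual Ruby implementation:

```javascript
// Recursively keep only allowed keys at each level of a params object.
// allowed maps a key name to the list of keys permitted beneath it.
// A sketch of the idea only, not Params Cleaner's real code.
function cleanParams(params, allowed, rootKey) {
  var result = {};
  var permitted = allowed[rootKey] || [];

  permitted.forEach(function (key) {
    if (!(key in params)) return;
    var value = params[key];

    if (value !== null && typeof value === 'object' && allowed[key]) {
      // nested hash with its own allowed list: filter it recursively
      result[key] = cleanParams(value, allowed, key);
    } else {
      result[key] = value;
    }
  });

  return result;
}
```

Running this sketch over the sample params above drops bad_key and nested_bad_key while keeping the permitted keys at every level of nesting.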

Use It!

ParamsCleaner is ready for use. Just add ‘params_cleaner’ to your Gemfile. The source is available here if you’re interested in contributing.

Testing Express with Jasmine

I recently worked on a side project using node. In the past, I’ve used vows extensively as a testing framework. There are many great things about vows, including speed of execution and seamless support for testing asynchronous functions. However, looking back on that project, I feel I spent more time debugging vows issues than actually testing and writing my code. I’ve previously used jasmine for in-browser testing and enjoyed it, so I decided to give jasmine-node a try this time around.

The project used express, a micro-framework for building web apps in node. Express is a great library, but the best way to test an express application with jasmine wasn’t immediately obvious. I put together a few simple helpers that made the process painless.

The Approach

My general approach was straightforward: spin up the express app, use the request library to hit the running server, make assertions and then stop the express app. First, let’s look at the project layout.

The Layout

The project layout is below, not much more to say here.

example_app
|-- lib
|   `-- app.coffee
`-- spec
    |-- app-spec.coffee
    `-- spec-helper.coffee

2 directories, 3 files

The App

Our example app is very simple. It includes two routes, one get and one post. The important piece here is that we export the app for testing and only start the server when the file is run directly, not when it is required.

express = require 'express'

exports.app = app = express.createServer()

app.get "/", (req, res) ->
  res.send "Hello, world!"

app.post "/", (req, res) ->
  res.send "You posted!"

if __filename == process.argv[1]
  app.listen 6789

The Spec Helper

Now, let’s take a look at the spec helper for our application. It exposes a function called withServer that we’ll use to test our express application. The withServer function creates the server, starts listening, and then calls the provided callback with a nice wrapper around request and a callback that must be called at the end of your spec. withServer also calls asyncSpecWait and the provided callback calls asyncSpecDone.

request = require "request"

class Requester
  get: (path, callback) ->
    request "http://localhost:3000#{path}", callback

  post: (path, body, callback) ->
    request.post {url: "http://localhost:3000#{path}", body: body}, callback

exports.withServer = (callback) ->
  asyncSpecWait()

  {app} = require "../lib/app.coffee"

  stopServer = ->
    app.close()
    asyncSpecDone()

  app.listen 3000

  callback new Requester, stopServer

The Spec

Finally, we’ll take a look at how the withServer function is actually used in a spec.

helper = require './spec-helper'

describe "App", ->
  describe "get /", ->
    it "responds successfully", ->
      helper.withServer (r, done) ->
        r.get "/", (err, res, body) ->
          expect(res.statusCode).toEqual 200
          done()

    it "has the correct body", ->
      helper.withServer (r, done) ->
        r.get "/", (err, res, body) ->
          expect(body).toEqual "Hello, world!"
          done()

  describe "post /", ->
    it "has the correct body", ->
      helper.withServer (r, done) ->
        r.post "/", "post body", (err, res, body) ->
          expect(body).toEqual "You posted!"
          done()

Next time you start a project based on express, consider using this technique to aid testing.

Make Your Cucumber Step Definitions Time Aware

If you’re like me, you’ve found yourself with a cucumber step definition like this:

Given /^I received an invitation$/ do
  # ...
end

And you want to write a step definition like this:

Given /^I received an invitation 2 days ago$/ do
  # ...
end

Instead of doing all that extra work, I threw together a cucumber step that lets you add times to any existing step definition:

Given /^(.+) (\d+) (seconds?|minutes?|hours?|days?|months?|years?) (ago|from now)$/ do |string, number, time_unit, time_direction|
  Timecop.freeze(number.to_i.send(time_unit).send(time_direction.gsub(' ','_'))) do
    Given string
  end
end
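To see what the wrapper is doing, here’s how that regex decomposes a time-aware step and how the captured direction becomes an ActiveSupport method name. This is plain Ruby for illustration, no cucumber required:

```ruby
# How the wrapper's regex splits a time-aware step into its parts.
pattern = /^(.+) (\d+) (seconds?|minutes?|hours?|days?|months?|years?) (ago|from now)$/

match = pattern.match("I received an invitation 2 days ago")

match[1] # => "I received an invitation" -- re-dispatched via Given
match[2] # => "2"                        -- the number, converted with to_i
match[3] # => "days"                     -- the ActiveSupport duration method
match[4] # => "ago"                      -- the direction method

# "from now" needs the gsub to become a valid method name:
"from now".gsub(' ', '_') # => "from_now"
```

Put together, the step evaluates `2.to_i.send("days").send("ago")`, i.e. `2.days.ago`, and freezes time there while re-running the wrapped step.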

To make things even easier I created a gem called Timebomb. It lets you append time constraints to your existing cucumber step definitions by mixing and matching:

  • seconds, minutes, hours, days, weeks, months, years
  • ago, from now

So with the following step definition:

Given /^I received an invitation$/ do
  # ...
end

I can write any of the following:

  • Given I received an invitation 1 day from now
  • Given I received an invitation 2 weeks ago
  • Given I received an invitation 3 months from now
  • Given I received an invitation 15 years ago

Timebomb to the rescue, BOOM!

Installing

First, install Timebomb:

gem install timebomb

Now, just require it in your cucumber env.rb file. BOOM goes the dynamite.
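Assuming the standard cucumber layout, that require lives in features/support/env.rb (the inline step shown earlier also assumes timecop is loaded):

```ruby
# features/support/env.rb
require 'timebomb'
```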