5

I'm trying to find a good pattern to execute a bunch of parallel tasks.

Let me define some task to exemplify. Tasks a, b, c, d, e, f, g execute as a(function(er, ra){//task a returned, ra is result}), so do b to g

There are also some tasks that should be execute after some task is done, let's call them ab, bc, abc, bd, bcd, af, fg, means when a and b has returned ab(ra, rb) should be executed at once, and when b and c returned, bc(rb, rc) should be executed at once, and if a, b, c all returned, abc(ra, rb, rc) should be executed.

For the simplest case, if there is only a and b, I can do something like this:

(function(cb){
    var count = 2, _ra, _rb;
    function update(){if(--count == 0) cb(null, _ra, _rb)}
    a(function(er, ra){_ra = ra; update()});
    b(function(er, ra){_rb = rb; update()});
})(function(er, ra, rb){
    ab(ra, rb);
});

As you can see, a and b execute in parallel, and when both are done, ab(ra, rb) execute.

But how can I do more things for a lot of parallel tasks?

6 Answers 6

14

What you actually want is a deferred pattern though like futures.

function defer(f) {
    // create a promise.
    var promise = Futures.promise();
    f(function(err, data) {
        if (err) {
            // break it
            promise.smash(err);
        } else {
            // fulfill it
            promise.fulfill(data);
        }
    });
    return promise;
}
var da = defer(a), db = defer(b), dc = defer(c), dd = defer(d), de = defer(e), df = defer(f), dg = defer(g);

// when a and b are fulfilled then call ab
// ab takes one parameter [ra, rb]
Futures.join(da, db).when(ab);
Futures.join(db, dc).when(bc);
// abc takes one parameter [ra, rb, rc]
Futures.join(da, db, dc).when(abc);
Futures.join(db, dd).when(bd);
Futures.join(db, dc, dd).when(bcd);
Futures.join(da, df).when(af);
// where's e ?
Futures.join(df,dg).when(fg);
Futures.join(da,db,dc,dd,de,df,dg).fail(function() {
    console.log(":(");
});
Sign up to request clarification or add additional context in comments.

5 Comments

It's almost like music it's so beautiful
Is this genuinely handling those tasks in parallel, when node.js is single threaded running in a single process? Or is it pushing the tasks to the end of the event queue, to be handled later, when the event queue has nothing else to do?
@AntKutschera there is no multi threading. It's simply saying run a-g in parallel (there asynchronous handled by a different process in a different thread) and when any pair finishes run this function
who starts that child process? The Futures library? How?
@AntKutschera I assumed a,b,c,d, etc all either start child processes or do asynchronous IO through a socket or stream. Node does a minimal amount of lifting. Futures doesn't. Futures just says when both da and db return run ab.
8

You should check out Step ( https://github.com/creationix/step ). It's just over a hundred lines of code, so you can read the whole thing if needed.

My preferred pattern looks something like this:


function doABunchOfCrazyAsyncStuff() {
  Step (
    function stepA() {
      a(arg1, arg2, arg3, this); // this is the callback, defined by Step
    }
    ,function stepB(err, data) {
      if(err) throw err; // causes error to percolate to the next step, all the way to the end.  same as calling "this(err, null); return;"
      b(data, arg2, arg3, this);
    }
    ,function stepC(err, data) {
      if(err) throw err;
      c(data, arg2, arg3, this);
    }
    ,function stepDEF(err, data) {
      if(err) throw err;
      d(data, this.parallel());
      e(data, this.parallel());
      f(data, this.parallel());
    }
    ,function stepGGG(err, dataD, dataE, dataF) {
      if(err) throw err;
      var combined = magick(dataD, dataE, dataF);
      var group = this.group();  // group() is how you get Step to merge multiple results into an array
      _.map(combined, function (element) {
        g(element, group()); 
      });
    }
    ,function stepPostprocess(err, results) {
      if(err) throw err;
      var processed = _.map(results, magick);
      return processed; // return is a convenient alternative to calling "this(null, result)"
    }
    ,cb // finally, the callback gets (err, result) from the previous function, and we are done
  );
}

Notes

  • My example also uses the underscore library, "the tie to match JQuery's tux": http://documentcloud.github.com/underscore/
  • Naming each step function stepXXXXX is a good habit so that the stack traces are clear and readable.
  • Step lets you make powerful and elegant combinations of serial and parallel execution. These patterns are straightforward and comprehensible. If you need anything more complex like "when 3 of 5 of these methods completes, go to the next step", seriously RETHINK your design. do you really need such a complex pattern? (maybe you are waiting for a quorum set). Such a complex pattern deserves a function of it's own.

2 Comments

is group tasks are parallel or serial
There are a couple downsides to this over async, namely if you want to call a utility method, such as with waterfall, you'd need to have a wrapper to handle the err. ex: collection.find.bind(...), where you can use function binders to methods in your chain... this is really useful, and doesn't work with step.
2

Try to look at step module and this article.

2 Comments

That should be a compliment. Explaining how to use step is an answer.
@Raynos it took me a couple times reading that to realize I think you meant 'comment', but don't change it because that's pretty funny
1

yeah, look at flow control module, like step, chain, or flow ~ and i think there is something like that in underscore.js too

Comments

0

A very simple barrier just for this: https://github.com/berb/node-barrierpoints

Comments

0

nimble is another good choice.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.