Fixing "Error: EMFILE, too many open files" in Node.js

At Endorse.me, we use Amazon S3 as a CDN for all our static assets - images, scripts, and CSS. To make sure we only upload new assets to Amazon, and that the production site always serves the latest versions, we fingerprint all of our static assets by hashing their contents.

The result is that, using Node.js's asynchronous filesystem functions, we open nearly all of our static files at once to create the fingerprint hashes. I recently ran up against OS X's default maxfiles limit of 256, which produced the helpfully descriptive Error: EMFILE, too many open files and crashed the Node process.
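For context, the failing pattern looked something like the following sketch (simplified from our build step; the asset path and the choice of md5 are illustrative, not our actual code):

var fs = require('fs');
var crypto = require('crypto');

fs.readdir('/path/to/assets/', function(err, files) {
  if(err) {
    throw err;
  }
  files.forEach(function(file) {
    // every readFile fires immediately, so nearly every file is open at once
    fs.readFile('/path/to/assets/' + file, function(err, data) {
      if(err) {
        throw err; // this is where EMFILE surfaces once the limit is hit
      }
      var fingerprint = crypto.createHash('md5').update(data).digest('hex');
      // use the fingerprint to version the uploaded asset
    });
  });
});

With a few hundred assets, the forEach loop kicks off every read before any of them finishes, blowing past the 256-descriptor limit.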

I didn't find a good solution on the trusty Stack Overflow, in the Node.js docs, on Google, or in the #nodejs channel on IRC. It seems this isn't a common enough use case to have produced a best practice.

Based on this Stack Overflow answer, I created my own replacement for some of the most common fs methods that avoids the EMFILE error. It does this by keeping track of the total number of files the module has open at once, and queuing up any operations that would take us over a predefined limit (200 by default).
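The core idea is simple enough to sketch. The following is a minimal illustration of the approach, not Filequeue's actual implementation (the FileQueue name, the counter, and the pending array are all just for demonstration):

var fs = require('fs');

function FileQueue(limit) {
  this.limit = limit || 200; // max files open at once
  this.open = 0;             // files currently open
  this.pending = [];         // operations deferred until a slot frees up
}

FileQueue.prototype.readFile = function(path, callback) {
  var self = this;
  if(self.open >= self.limit) {
    // at the limit: queue this call instead of opening another file
    self.pending.push(function() {
      self.readFile(path, callback);
    });
    return;
  }
  self.open++;
  fs.readFile(path, function(err, data) {
    self.open--;
    // a descriptor just freed up, so run the next queued operation
    if(self.pending.length) {
      self.pending.shift()();
    }
    callback(err, data);
  });
};

The same wrapper pattern extends to readdir, writeFile, and the other fs methods: increment on open, decrement on close, and drain the queue as slots free up.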

A solution like this (as opposed to, say, processing files in large batches) lets us keep the maximum concurrency the filesystem can handle, without changing the underlying pattern of opening, reading, and writing files with Node's built-in fs module.

Using my module, reading files from a directory looks the same, just without the crash:

var Filequeue = require('filequeue');
var fq = new Filequeue(200); // max number of files to open at once

fq.readdir('/path/to/files/', function(err, files) {
  if(err) {
    throw err;
  }
  files.forEach(function(file) {
    fq.readFile('/path/to/files/' + file, function(err, data) {
      // do something besides crash
    });
  });
});

The module, Filequeue, is available on npm and GitHub, and is MIT licensed. Let me know if you have any questions or suggestions for improvement.