broccoli-persistent-filter
Helper base class for Broccoli plugins that map input files into output files one-to-one, with a persistent cache for fast restarts.
API
class Filter {
  /**
   * Abstract base-class for filtering purposes.
   *
   * Enforces that it is invoked on an instance of a class which prototypically
   * inherits from Filter, and which is not itself Filter.
   */
  constructor(inputNode: BroccoliNode, options: FilterOptions): Filter;

  /**
   * Method `processString`: must be implemented on subclasses of
   * Filter.
   *
   * The resolved return value can either be an object or a string.
   *
   * An object can be used to cache additional meta-data that is not part of the
   * final output. When an object is returned, the `.output` property of that
   * object is used as the resulting file contents.
   *
   * When a string is returned it is used as the file contents.
   */
  processString(contents: string, relativePath: string): string | object;

  /**
   * Method `getDestFilePath`: determine whether the source file should
   * be processed, and optionally rename the output file when processing occurs.
   *
   * Return `null` to pass the file through without processing. Return
   * `relativePath` to process the file with `processString`. Return a
   * different path to process the file with `processString` and rename it.
   *
   * By default, if the options passed into the `Filter` constructor contain a
   * property `extensions`, and `targetExtension` is supplied, the first matching
   * extension in the list is replaced with the `targetExtension` option's value.
   */
  getDestFilePath(relativePath: string): string;

  /**
   * Method `postProcess`: may be implemented on subclasses of
   * Filter.
   *
   * This method can be used in subclasses to do processing on the results of
   * each file's `processString` method.
   *
   * A common scenario for this is linting plugins, where on initial build users
   * expect to get console warnings for lint errors, but we do not want to re-lint
   * each file on every boot (since most of them will be able to be served from the
   * cache).
   *
   * The `.output` property of the return value is used as the emitted file contents.
   */
  postProcess(results: object, relativePath: string): object;
}
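As a rough illustration of the linting scenario described above, the sketch below has processString return an object whose output property becomes the file contents, alongside extra metadata that postProcess reports. The class name TodoLint, the warnings property, and the TODO check are all invented for this example. The class deliberately does not extend Filter so that the snippet runs on its own; a real plugin would extend Filter.

```javascript
// Hypothetical example: in a real plugin this would be declared as
// `class TodoLint extends Filter`. The base class is left out so the
// snippet can run without Broccoli installed.
class TodoLint {
  // Returns an object: `output` becomes the file contents, and the other
  // properties travel with it as metadata.
  processString(contents, relativePath) {
    const warnings = [];
    contents.split('\n').forEach((line, index) => {
      if (line.includes('TODO')) {
        warnings.push(`${relativePath}:${index + 1} contains a TODO`);
      }
    });
    return { output: contents, warnings };
  }

  // Receives the result of processString, so warnings can be reported
  // without linting the file again.
  postProcess(results, relativePath) {
    results.warnings.forEach((warning) => console.warn(warning));
    return results;
  }
}

const lint = new TodoLint();
const results = lint.postProcess(
  lint.processString('let a = 1;\n// TODO: rename a', 'lib/a.js'),
  'lib/a.js'
);
```

Keeping the warnings on the returned object, rather than printing them inside processString, is what lets them be reported again when the result comes from the cache.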
Options
annotation: Same as broccoli-plugin; see there.
async: Whether the create and change file operations are allowed to complete asynchronously (true|false, default: false).
concurrency: Used with async: true. The number of operations that can be run concurrently. This overrides the value set with the JOBS=n environment variable. (default: the number of detected CPU cores - 1, with a min of 1)
dependencyInvalidation: Defaults to false. Setting this option to true will allow the plugin to track other files as dependencies that affect the output for that file. See Dependency Invalidation below for more information.
extensions: An array of file extensions to process, e.g. ['md', 'markdown'].
inputEncoding: The character encoding used for reading input files to be processed (default: 'utf8'). For binary files, pass null to receive a Buffer object in processString.
name: Same as broccoli-plugin; see there.
outputEncoding: The character encoding used for writing output files after processing (default: 'utf8'). For binary files, pass null and return a Buffer object from processString.
persist: Defaults to false. When true, causes the plugin to cache the results of processing a file to disk so that it can be re-used during the next build. See Persistent Cache below for more information.
targetExtension: The file extension of the corresponding output files, e.g. 'html'.
All options except name and annotation can also be set on the prototype instead of being passed into the constructor.
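To make the interplay of extensions and targetExtension concrete, here is a small sketch of the renaming rule documented for getDestFilePath. It is an approximation written for this README, not the library's own code: the function name destFilePath is hypothetical, and it assumes extensions is always an array.

```javascript
// Approximation of the documented default behaviour of getDestFilePath.
// Assumes `extensions` is an array; `destFilePath` is a hypothetical name.
function destFilePath(relativePath, extensions, targetExtension) {
  for (const ext of extensions) {
    if (relativePath.endsWith('.' + ext)) {
      // First matching extension wins. Without a targetExtension the file
      // is processed under its original name.
      if (targetExtension == null) return relativePath;
      return relativePath.slice(0, -ext.length) + targetExtension;
    }
  }
  // No extension matched: null means "pass through without processing".
  return null;
}

const renamed = destFilePath('docs/readme.md', ['md', 'markdown'], 'html');
const skipped = destFilePath('docs/logo.png', ['md', 'markdown'], 'html');
```

Here renamed is 'docs/readme.html', while skipped is null because png is not in the list.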
Example Usage
const Filter = require('broccoli-persistent-filter');

class Awk extends Filter {
  constructor(inputNode, search, replace, options = {}) {
    super(inputNode, {
      annotation: options.annotation
    });
    this.search = search;
    this.replace = replace;
    this.extensions = ['txt'];
    this.targetExtension = 'txt';
  }

  processString(content, relativePath) {
    return content.replace(this.search, this.replace);
  }
}
In Brocfile.js, use your new Awk plugin like so:
var node = new Awk('docs', 'ES6', 'ECMAScript 2015');
module.exports = node;
Persistent Cache
Adding the persist flag allows a subclass to persist state across restarts. This exists to mitigate the upfront cost of some more expensive transforms on warm boot. It does not aim to improve incremental build performance; if it does, that indicates something is wrong with the filter or input filter in question.
By default, if the CI=true environment variable is set, persistent caches are disabled. To force persistent caches on CI, set the FORCE_PERSISTENCE_IN_CI=true environment variable.
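The rule above can be sketched as a small decision function. This is only an illustration of the documented behaviour: the function name persistenceEnabled is hypothetical, and treating only the exact string 'true' as "set" is an assumption of this example.

```javascript
// Sketch of the documented CI rule. `persistenceEnabled` is a hypothetical
// helper; comparing against the string 'true' is an assumption.
function persistenceEnabled(persist, env) {
  if (!persist) return false; // persistence is opt-in
  if (env.CI === 'true') {
    // On CI the cache is off unless explicitly forced back on.
    return env.FORCE_PERSISTENCE_IN_CI === 'true';
  }
  return true;
}

const onLaptop = persistenceEnabled(true, {});
const onCI = persistenceEnabled(true, { CI: 'true' });
const forcedOnCI = persistenceEnabled(true, {
  CI: 'true',
  FORCE_PERSISTENCE_IN_CI: 'true',
});
```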
How does it work?
It does so by establishing a two-layer file cache. The first layer is the entire bucket. The second, cacheKeyProcessString, is a per-file cache key.
Together, these two layers should provide the right balance of speed and sensibility.
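Conceptually, the two layers behave like a map of maps. The sketch below is only an in-memory model meant to show how the keys interact; the real cache lives on disk, and the class name TwoLayerCache and the key strings are invented for illustration.

```javascript
// Conceptual in-memory model only; the real cache is stored on disk.
class TwoLayerCache {
  constructor() {
    this.buckets = new Map(); // bucket key -> Map(per-file key -> output)
  }

  get(bucketKey, fileKey) {
    const bucket = this.buckets.get(bucketKey);
    return bucket ? bucket.get(fileKey) : undefined;
  }

  set(bucketKey, fileKey, output) {
    if (!this.buckets.has(bucketKey)) {
      this.buckets.set(bucketKey, new Map());
    }
    this.buckets.get(bucketKey).set(fileKey, output);
  }
}

const cache = new TwoLayerCache();
cache.set('my-plugin@1.0.0', 'checksum-of-a', 'compiled a');

const hit = cache.get('my-plugin@1.0.0', 'checksum-of-a');
// A new bucket key (for example after a plugin upgrade) misses for every file.
const missAfterUpgrade = cache.get('my-plugin@2.0.0', 'checksum-of-a');
// A changed file yields a new per-file key and misses only for that file.
const missAfterEdit = cache.get('my-plugin@1.0.0', 'checksum-of-a-edited');
```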
The bucket-level cacheKey must be stable but also never become stale. If the key is not stable, state between restarts will be lost and performance will suffer. On the flip side, if the cacheKey becomes stale, changes may not be correctly reflected.
It is configured by subclassing and refining the cacheKey method. A good key here is likely the name of the plugin, its version, and the actual versions of its dependencies.
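const Filter = require('broccoli-persistent-filter');

class Subclass extends Filter {
  cacheKey() {
    // md5, inputOptionsChecksum and dependencyVersionChecksum are placeholders
    // that the plugin author is expected to supply.
    return md5(super.cacheKey() + inputOptionsChecksum + dependencyVersionChecksum);
  }
}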
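One possible way to fill in those placeholders is sketched below using Node's built-in crypto module. The helper names md5 and bucketCacheKey, and the shape of the options and dependencyVersions arguments, are assumptions made for this example; baseKey stands in for the value returned by the base class's cacheKey().

```javascript
const crypto = require('crypto');

// Hypothetical helper: hex-encoded MD5 digest of a string.
function md5(text) {
  return crypto.createHash('md5').update(text).digest('hex');
}

// Hypothetical helper combining the three ingredients of a bucket key.
// `baseKey` stands in for the base class's cacheKey() return value.
function bucketCacheKey(baseKey, options, dependencyVersions) {
  const inputOptionsChecksum = md5(JSON.stringify(options));
  const dependencyVersionChecksum = md5(JSON.stringify(dependencyVersions));
  return md5(baseKey + inputOptionsChecksum + dependencyVersionChecksum);
}

const keyBefore = bucketCacheKey('base', { minify: true }, { compiler: '1.2.0' });
const keyAgain = bucketCacheKey('base', { minify: true }, { compiler: '1.2.0' });
const keyAfterUpgrade = bucketCacheKey('base', { minify: true }, { compiler: '1.3.0' });
```

The key is stable for identical inputs (keyBefore equals keyAgain) and changes when a dependency version changes. Note that JSON.stringify is sensitive to property order, so options should be built the same way on every run.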
The second key represents the contents of the file. Typically the base class's functionality is sufficient, as it merely generates a checksum of the file contents. If for some reason this is not sufficient (e.g. if the file name should be considered), it can be reconfigured via subclassing.
Note that this method is not useful for general-purpose cache invalidation, since it is only used to restore the cache across processes and does not apply to rebuilds. See the dependencyInvalidation option above to invalidate files that have dependencies that affect the output.
const Filter = require('broccoli-persistent-filter');

class Subclass extends Filter {
  cacheKeyProcessString(string, relativePath) {
    return superAwesomeDigest(string);
  }
}
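For the case mentioned above where the file name should be considered, the body of such an override might look like the following sketch. It is written as a standalone function so it can run without Broccoli; the choice of MD5 and of a NUL separator are assumptions of this example, not something the library prescribes.

```javascript
const crypto = require('crypto');

// Hypothetical override body: hashes the path together with the contents,
// so identical contents under different names get different keys.
function cacheKeyProcessString(string, relativePath) {
  return crypto
    .createHash('md5')
    .update(relativePath)
    .update('\0') // separator so ('ab', 'c') and ('a', 'bc') hash differently
    .update(string)
    .digest('hex');
}

const keyA = cacheKeyProcessString('body {}', 'app/a.css');
const keyB = cacheKeyProcessString('body {}', 'app/b.css');
```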
It is recommended that persistent rebuilds be opt-in by the consuming plugin author: if no reasonable cache key can be created, persistence should not be used.
var myTree = new SomePlugin('lib', { persist: true });
Warning
By using the persistent cache, a lot of small files will be created on disk without being deleted. This might use up all the inodes of your disk. Make sure to clean up old files regularly, or configure your system to do so.
On OSX, files that aren't accessed in three days are deleted from /tmp. On systems using systemd, systemd-tmpfiles should already be present and regularly clean up the /tmp directory. On Debian-like systems, you can use tmpreaper. On RedHat-like systems, you can use tmpwatch.
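If none of those tools is available, a small Node script can at least report which files have gone unused. The sketch below only lists candidates and deletes nothing; the function name findStaleFiles is hypothetical, the directory to scan is left to the caller, and it assumes the file system records access times.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: recursively lists files under `dir` whose last access
// time is more than `maxAgeMs` milliseconds before `now`.
function findStaleFiles(dir, maxAgeMs, now = Date.now()) {
  const stale = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      stale.push(...findStaleFiles(fullPath, maxAgeMs, now));
    } else if (now - fs.statSync(fullPath).atimeMs > maxAgeMs) {
      stale.push(fullPath);
    }
  }
  return stale;
}

const THREE_DAYS_MS = 3 * 24 * 60 * 60 * 1000;
```

Calling findStaleFiles(someCacheDirectory, THREE_DAYS_MS) returns the paths that a cleanup job could then review or remove.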
By default, the files are stored in the operating system's default directory for temporary files, but you can change this location by setting the BROCCOLI_PERSISTENT_FILTER_CACHE_ROOT environment variable to the path of another folder.
To clear the persistent cache on any particular build, set the CLEAR_BROCCOLI_PERSISTENT_FILTER_CACHE environment variable to true like so:
CLEAR_BROCCOLI_PERSISTENT_FILTER_CACHE=true ember serve
Dependency Invalidation
When the output of processString() can depend on files other than the primary input file, the broccoli plugin should use the dependencyInvalidation option and these related APIs to cause the output cache to become automatically invalidated should those other input files change.
Plugins that enable the dependencyInvalidation option will have an instance property dependencies that can be used to register dependencies for a file.
During either processString or postProcess, the plugin should call this.dependencies.setDependencies(relativeFile, arrayOfDeps) to establish which files this file depends on.
Dependency invalidation works during rebuilds as well as when restoring results from the persistent cache.
When tracking dependencies, setDependencies() should always be called when processing a file that could have dependencies. If a file has no dependencies, pass an empty array. Failure to do this can result in stale dependency information about the file.
The dependencies passed to setDependencies() can be absolute paths or relative. If relative, the path will be assumed relative to the file being processed. The dependencies can be within the broccoli tree or outside it (note: adding dependencies outside the tree does not cause those files to be watched). Files inside the broccoli tree are tracked for changes using a checksum because files in broccoli trees do not have stable timestamps. Files outside the tree are tracked using modification time.
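A minimal sketch of registering dependencies from processString follows. The class name, the @import syntax being scanned, and the recorder object are all invented for illustration. In a real plugin the class would extend Filter with dependencyInvalidation: true, and this.dependencies would be provided by the base class rather than injected.

```javascript
// Stand-in for a Filter subclass created with { dependencyInvalidation: true }.
// `dependencies` is injected here only so the snippet runs on its own.
class ImportAwareFilter {
  constructor(dependencies) {
    this.dependencies = dependencies;
  }

  processString(contents, relativePath) {
    // Collect paths from lines such as: @import "./colors.css";
    const importPattern = /@import "([^"]+)";/g;
    const deps = Array.from(contents.matchAll(importPattern), (m) => m[1]);
    // Always call setDependencies, passing an empty array when there are
    // none, so stale information from an earlier build is replaced.
    this.dependencies.setDependencies(relativePath, deps);
    return contents;
  }
}

// Minimal recorder standing in for the real `dependencies` object.
const recorded = new Map();
const filter = new ImportAwareFilter({
  setDependencies: (file, deps) => recorded.set(file, deps),
});
filter.processString('@import "./colors.css";\nbody {}', 'app/main.css');
filter.processString('p {}', 'app/plain.css');
```

The relative path './colors.css' is left as written, since relative dependencies are interpreted relative to the file being processed.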
FAQ
Upgrading from 0.1.x to 1.x
You must now call the base class constructor. For example:
// broccoli-filter 0.1.x:
function MyPlugin(inputTree) {
  this.inputTree = inputTree;
}

// broccoli-filter 1.x:
function MyPlugin(inputNode) {
  Filter.call(this, inputNode);
}
Note that "node" is simply new terminology for "tree".
Source Maps
Can this help with compilers that are almost 1:1, like a minifier that takes a .js and .js.map file and outputs a .js and .js.map file?
Not at the moment. I don't know yet how to implement this and still have the API look beautiful. We also have to make sure that caching works correctly, as we have to invalidate if either the .js or the .js.map file changes. My plan is to write a source-map-aware uglifier plugin to understand this use case better, and then extract common code back into this Filter base class.