Simplex
;// => { name: 'foo', value: 'bar' }
Simpler than regular expressions
Let's say you have some text like this:
Bob 35 (M)
Suzie 42 (F)
Phil 29 (M)
Marlene 26 (F)
Quick! Can you write code to parse those lines using a regular expression?
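// `text` holds the four lines shown above.
var pattern = /(\w+)\s+(\d+)\s+\((\w)\)/g,
    people  = [],
    match;

while (match = pattern.exec(text)) {
  people.push({
    name: match[1],
    age: Number(match[2]),
    gender: match[3]
  });
}

console.log(people);
// [ { name: 'Bob', age: 35, gender: 'M' },
//   { name: 'Suzie', age: 42, gender: 'F' },
//   { name: 'Phil', age: 29, gender: 'M' },
//   { name: 'Marlene', age: 26, gender: 'F' } ]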
Not too bad. But for such a simple pattern-matching problem, that feels like more code than we ought to need.
This scenario, matching simple patterns, is what Simplex is designed for. Check it out:
var people = Simplex('name age (gender)').matchAll(text);

console.log(people);
// [ { name: 'Bob', age: 35, gender: 'M' },
//   { name: 'Suzie', age: 42, gender: 'F' },
//   { name: 'Phil', age: 29, gender: 'M' },
//   { name: 'Marlene', age: 26, gender: 'F' } ]
Way easier!
Usage
The Simplex constructor (which can be called with or without new) takes two
arguments: a pattern expression (required) and an optional options object.
var simplex = new Simplex('expression', { /* options */ });
The match method takes a string, matches it against the given pattern, and
returns the first result. To return an array of all results, use matchAll.
How it works
A pattern expression consists of fields that will be matched. For example, take this expression:
term (type): def*
The fields are term, type, and def. When you call match, Simplex will
look for a word where each field is. The '*' after def tells Simplex to match
(potentially) multiple words for that field.
When all fields are matched, the result will be an object whose keys are the names of the fields, and whose values are the matches:
Simplex('term (type): def*').match('coffee (noun): a tasty hot beverage');
// => { term: 'coffee', type: 'noun', def: 'a tasty hot beverage' }
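To make the matching described above more concrete, here is a minimal sketch of one way a pattern expression could be turned into a regular expression. The helpers compilePattern and matchFirst are hypothetical names invented for this illustration; they are not part of Simplex, and the library's real implementation may work quite differently.

```javascript
// Hypothetical helpers for illustration only; not Simplex's actual code.

// Turn a pattern expression into a RegExp plus the ordered list of field names.
function compilePattern(expression) {
  var fields = [];
  var tokenRe = /(\w+)(\*?)|\s+|[^\w\s]/g;
  var source = '';
  var token;

  while ((token = tokenRe.exec(expression)) !== null) {
    if (token[1] !== undefined) {
      // A word-like token is a field; a trailing '*' allows several words.
      fields.push(token[1]);
      source += token[2] === '*' ? '(\\w+(?:\\s+\\w+)*)' : '(\\w+)';
    } else if (/^\s+$/.test(token[0])) {
      // Whitespace in the pattern matches any run of whitespace.
      source += '\\s+';
    } else {
      // Anything else is literal punctuation, escaped for use in a RegExp.
      source += token[0].replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }
  }

  return { regex: new RegExp(source), fields: fields };
}

// Return an object keyed by field name, or null when nothing matches.
function matchFirst(expression, text) {
  var compiled = compilePattern(expression);
  var m = compiled.regex.exec(text);
  if (m === null) {
    return null;
  }
  var result = {};
  compiled.fields.forEach(function (name, i) {
    result[name] = m[i + 1];
  });
  return result;
}

matchFirst('term (type): def*', 'coffee (noun): a tasty hot beverage');
// => { term: 'coffee', type: 'noun', def: 'a tasty hot beverage' }
```

The sketch leaves every value as a string and ignores options entirely; it only shows how field names can line up with capture groups.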
By default Simplex treats every word-like token as a field. This means
letters, numbers, and underscores (same as with regular expressions). If you
want to represent fields differently, you can use the fieldMarkers option,
which is described in more detail further down.
Options
Parsing
By default, Simplex does some very basic type inference for numbers and boolean values.
;// => { year: 2014, month: 2, day: 10 }
;// => { verbose: true }
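As a rough illustration of what "very basic type inference" could look like, the sketch below converts numeric-looking strings to numbers and the strings 'true' and 'false' to booleans. The function name inferType and the exact rules (for example, which number formats are recognized) are assumptions made for this example, not a description of Simplex's internals.

```javascript
// Hypothetical helper for illustration only; the rules here are assumptions.
function inferType(value) {
  if (/^-?\d+(\.\d+)?$/.test(value)) {
    return Number(value);   // '2014' -> 2014, '02' -> 2
  }
  if (value === 'true') {
    return true;
  }
  if (value === 'false') {
    return false;
  }
  return value;             // anything else stays a string
}

inferType('2014'); // => 2014
inferType('02');   // => 2
inferType('true'); // => true
inferType('html'); // => 'html'
```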
You can customize this behavior by specifying the parser option:
var simplex = 'month/day' { return ; }; simplex;// => { month: 10, day: 21 }
If you want to parse different fields differently, you can pass an object mapping field names to parsing functions:
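var simplex = new Simplex('name percentile%', {
  parser: {
    name: function(name) { return name.toUpperCase(); },
    percentile: function(percentile) { return Number(percentile) / 100; }
  }
});

simplex.match('dtao 88%');
// => { name: 'DTAO', percentile: 0.88 }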
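The two forms of the parser option (a single function, or an object mapping field names to functions) can be pictured with a small sketch. The helper applyParsers is a hypothetical name, and leaving fields without a parser as raw strings is an assumption of this example rather than documented Simplex behavior.

```javascript
// Hypothetical helper for illustration only; not Simplex's actual code.
// `raw` maps field names to matched strings; `parser` is either one function
// applied to every field, or an object of per-field functions.
function applyParsers(raw, parser) {
  var result = {};
  Object.keys(raw).forEach(function (field) {
    var parse = typeof parser === 'function' ? parser : (parser || {})[field];
    // Assumption: a field with no parser keeps its raw string value.
    result[field] = typeof parse === 'function' ? parse(raw[field]) : raw[field];
  });
  return result;
}

applyParsers({ width: '10', height: '20' }, Number);
// => { width: 10, height: 20 }

applyParsers({ label: 'box', width: '10' }, { width: Number });
// => { label: 'box', width: 10 }
```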
Whitespace
By default Simplex is lenient with whitespace, meaning every space in a pattern expression matches any number of spaces.
;// => { salutation: 'hello', valediction: 'goodbye' }
Enable the strictWhitespace option to make it strict, so that whitespace
must be matched exactly.
var simplex = new Simplex('first \tlast', { strictWhitespace: true });
simplex; // => null
simplex; // => null
simplex; // => { first: 'joe', last: 'schmoe' }
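The difference between lenient and strict whitespace can be sketched by looking at how whitespace in a pattern might be translated into a regular expression. The helper buildRegex is hypothetical, it only handles patterns made of word-like fields and whitespace, and reading "any number of spaces" as "one or more whitespace characters" is an assumption of this example.

```javascript
// Hypothetical helper for illustration only; not Simplex's actual code.
// It only understands patterns made of word-like fields and whitespace.
function buildRegex(expression, strictWhitespace) {
  var source = expression.replace(/\w+|\s+/g, function (token) {
    if (/^\w/.test(token)) {
      return '(\\w+)'; // a field: capture one word
    }
    // Lenient: any run of whitespace. Strict: the exact characters given.
    return strictWhitespace ? token : '\\s+';
  });
  return new RegExp('^' + source + '$');
}

var lenient = buildRegex('first \tlast', false);
var strict = buildRegex('first \tlast', true);

lenient.test('joe     schmoe'); // => true
strict.test('joe schmoe');      // => false (the tab is missing)
strict.test('joe \tschmoe');    // => true
```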
Field markers
As mentioned earlier, Simplex defaults to treating every word-like token in a pattern expression as a field. This may not be what you want if:
- There are some words in the pattern itself that are not fields
- You want to give some field(s) a name that is multiple words
Here's an example of the first case:
// The first 'format' below isn't a field; it's actually part of the pattern.
;// => { format: 'html' }
And here's an example of the second case:
// Without field markers, Simplex will think that 'silly' and 'greeting' below
// name two separate fields. But we actually just want *one* field, called
// 'silly greeting'. Notice that we also add the '*' to indicate that the
// matched value can also consist of multiple words.
;// => { 'silly greeting': 'Oh hai' }
Notice how we specified those field markers in the second example using just a
simple string, '[]'. This tells Simplex that placeholders start with '[' and
end with ']'. This works with any string; Simplex will take the first half as
the left marker, and the second half as the right.
;// => { content: 'Hello, world!' }
For strings with an odd number of characters, the assumption is that the middle character goes on both sides.
;// => { 'rose color': 'red' }
Of course, as shown in the '--format' example, you can also specify an array.
You'll need to do this if you're using crazy asymmetrical field markers for some
ridiculous reason.
;// => { adjective: 'CRAZY' }
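The rules for interpreting field markers (first half and second half of a string, the middle character shared when the length is odd, or an explicit array) can be summarized in a short sketch. The function splitMarkers is a hypothetical name used only for this illustration and is not taken from Simplex's source.

```javascript
// Hypothetical helper for illustration only; not Simplex's actual code.
// Returns [leftMarker, rightMarker] for a string or two-element array.
function splitMarkers(markers) {
  if (Array.isArray(markers)) {
    return [markers[0], markers[1]]; // explicit, possibly asymmetrical
  }
  var half = markers.length / 2;
  // For odd lengths, ceil/floor make both halves include the middle character.
  return [markers.slice(0, Math.ceil(half)), markers.slice(Math.floor(half))];
}

splitMarkers('[]');          // => [ '[', ']' ]
splitMarkers('{{}}');        // => [ '{{', '}}' ]
splitMarkers('*');           // => [ '*', '*' ]
splitMarkers(['<<', '>']);   // => [ '<<', '>' ]
```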
Questions? Open a ticket!
This library is a work in progress.