org-mode-parser

A parser for the Emacs org-mode package. DRAWER and archive tag supported. Stronger API

npm install org-mode-parser
18 downloads in the last week
41 downloads in the last month
-*- mode: org ; mode: visual-line; -*- -*
#+SETUPFILE: ~/.emacs.d/config.org
#+TITLE:README.org
* A Nodejs org mode parser module
** "So What?"
This node.js module implements a [[http://orgmode.org/][org-mode]] file format parser.
Org-mode is a cool Emacs package let you structuring information in a nice way, 
and exporting it in html, latex, pdf and so on

** What does it basically do org-mode-parser I either couldn't do before?
I was unable to find a JavaScript parser for org-mode. Now we have one.
Org-mode is so useful you will start writing your data in org-mode instead of text files for everything
needing a bit of structuring.
It is also _XML-free_ and yes, we like it  :)

** what are its key benefits?
A lot of unit testing (over 90) and a pretty fast parser (see below).
Also, it has minimal requirements.
** Where can I find more news?
Take a look to http://gioorgi.com/tag/org-mode-parser/
for latest news
** Installation
On a shell, which can run node, try out:
#+BEGIN_SRC shell
curl http://npmjs.org/install.sh | sh
npm install org-mode-parser
#+END_SRC

** Usage example:
#<<opening Example>>
# Look  http://orgmode.org/manual/Code-block-specific-header-arguments.html
# for the syntax of BEGIN_SRC :tangle, 
# anyway org-babel-tangle
# will export this source
#+BEGIN_SRC javascript -n -r  :tangle examples/basic-example.js
var org=require('../lib/org-mode-parser');
org.makelist("README.org", function (nodelist){
   // Here nodelist is a list of Orgnode objects (ref:putyourcode)
   console.dir(nodelist[0]);
});
#+END_SRC
In the line [[(putyourcode)]] you are free to process the nodes in order.

* API
The main entry point is the makelist function, which accepts a filename and a callback function.
makelist() will pass to the function the list of parsed nodes as first parameter, as described in the [[openening Example]]

You can optionally build a query object called OrgQuery to easily select subtree,
searching tags, et cetera:

#+BEGIN_SRC javascript -n -r :tangle examples/org-query-example.js
  var org=require('../lib/org-mode-parser');
  org.makelist("./test/treeLevel.org",function(nodes){
      var ofd=new org.OrgQuery(nodes);
      //  ofd is a complex object wiith let you do query and so on
      console.log(ofd.selectTag('complex').first().toOrgString());
  });
  
#+END_SRC

Supported methods of OrgQuery are:
 + selectSubtree(node)
   Extract the nodes hierarchically below the given input
 + selectTag(tagString)
 + sortBy
 + reject(function)
 + rejectTag(tagName)
 + toArray()
 + each(functionToPassEach)
 + random()
   Extract a random element


See the unit test section 'basicLibraries OrgQuery-Complex' on test/parseTest.js 
for more usage examples, and do not miss the [[FAQ]]

** Known Defects?
The input must be a well formed org-mode file. 
Parser can detect some corruptions, but it is not provided a specific sanity check.


* Design Goals
** API Stability
    In general, every API described in the [[API]] section is here to stay. 
    Compatibility will be retained as much as possible in the future releases.
** Performance
    At the time of writing, the parser is pretty fast. On a Linux virtualized machine, we get about 20.000 nodes per seconds.
    We will keep an eye on performance.
    Please help us to stress the parser and find its true limit.
** Minimize dependency
    Org-mode-parser depends only on two packages, underscore ad vows. Vows dependency is used only 
    for regression tests, so the parser really depends only on underscore.
** Tests are documentation.
   Take a look to the examples/ directory for some tiny examples.
   Please look at test/parserTest.js file for API usage examples.
   Tests are commenteted and pretty self explanatory: they are the primary source for correctness of this module.
* FAQ
** Where can I find stable packages?
On npm repository. 
<<Github>>'s latest checkout is the developement version, to use it at your own risk.

** Who is my best friend?
OrgQuery is a very handy object (see below), because allow you to filter nodes in a structured way.
Use it instead of hand-parsing.

** How can I get rid of archived nodes?
Use the OrgQuery.rejectArchived() method

** Undeclared drawer are parsed?
Yes, but org mode wants them to be declared (see par 2.9 Drawers on documentation).
So do not relay on undeclared drawers, because we can change the parser to be more stringent.
Also undeclared drawer are not indented accordingly!

** Querying Questions
*** How can I find all the subnodes of node tagged releaseNotes and the query it?
#+BEGIN_SRC javascript  -n -r :tangle examples/org-query-example2.js
  var org=require('../lib/org-mode-parser');
  org.makelist("./README.org",function(nl){
      var q=new org.OrgQuery(nl);
      var subtree=q.selectSubtree(q.selectTag('releaseNotes').first());
      console.log("Dev version is:"+subtree.selectTag('dev').first().headline); 
  });  
#+END_SRC
** SETUPFILE and INCLUDE are supported?
No, at least at the moment.
    
* RELEASE NOTES :releaseNotes:
** ORG_MODE_PARSER_0.0.9 						:dev:
+ API Change: IllegalArgumentException and ParseError disappears. They are replaced with simple Error object
  because in node 0.10.x the toString() of the old objects was wrong.
  /Parsing error are targeted to human beings, so avoid trapping them for complex recovery puropse/
** ORG_MODE_PARSER_0.0.8 						:published:
+ Added the examples/site-publisher example
+ Tested against latest node version
+ Removed direct dependecy from vows, only used for testing
** ORG_MODE_PARSER_0.0.7						:published:
*** Stringent selectSubtree API (BUGFIX).
anOrgQuery.selectSubtree(i)  now will accept only these types of i objects:
 + A Orgnode
 + A 1-size OrgQuery collection
 + empty/null

/Keep in mind the following rule for understanding the behavior:/

q.selectSubtree(emptyNodeList) === q.selectSubtree(null) == q.selectSubtree() === q

which is a bit naif, but it will not break the existing API.

*** Magic avoided
Evaluted the option of omitting first() on mono results, but API gets dirty. 
For play with it, see commit tag dev_orgquery_one_node_merge  (6dd58da5e3a90e3f651bba4949cbe7b95155bc6b)
and serch for the use of the "mergeFrom" method, now disabled.

** ORG_MODE_PARSER_0.0.6					  :published:
Minor documentation fixes

** ORG_MODE_PARSER_0.0.5					  :published:
1. Addedd support for generic :DRAWER: syntax
2. Archive tag is supported
Missed:
+ links/ Footnotes  are still completly missed
+ Ordered/Bulletin list are still missed, 
  but org-mode will present it in a nice way anyway

** ORG_MODE_PARSER_0.0.4 :published:
 1) Added new OrgQuery methods: 
    1. sortBy
    2. reject
    3. toArray
    4. each
 2) :PROPERTIES: without :END: generates an error now.
    The parser is quite weak, but can detect this simple case.
BUGFIXES:
 + OrgQuery had a bug, and collected nodes could not be unique in some rare situations.
   Now we relay on underscore library for generating unique id
 + Added a set of stronger guards on constructors
   
** ORG_MODE_PARSER_0.0.3   :published:
  1) Added the ability to regenerate the Orgnode as string using the method
     toOrgString()
     Be carefully, the method is still experimental and do not emit:
      a) Comments
      b) SCHEDULE,DEADLINE and CLOCK directive
  2) Added the OrgQuery object, for doing queries like 
     + subtree extraction with .selectSubtree
     + tag-based searches with selectTag   
Even if the OrgQuery try to play nice, it is not yet an array, so
avoid using it directly with _.each(...)


*** KNOWN LIMITATIONS
  1) Comments are stripped off during parsing.
  2) Special directive starting with '#+' are mostly ignored during the parsing, 
     for instance #+AUTHOR etc
  3) Tables are not parsed at all. 
  4) In org-mode tags cannot have "-" character in name. They are split in subwords. 
     The parser allow this instead, so be careful when editing by hand org files.
  5) properties can have "-" but this will force 
     you to access them with the array syntax instead of the dot notation, so we
     strongly suggest to avoid "-" and special java character in property names.
     Relay on "_", for instance.

** ORG_MODE_PARSER_0.0.2 					  :published:
  1) SCHEDULE,DEADLINE and CLOCK directives now are correctly parsed
  2) Added a performance watchdog to track slowdowns
  3) Added the ability to return performance data via makelist
  4) Started restructuring parser for better performance.
  5) Minor API Change: null is the default value for tag,priority,scheduled, deadline 
     when not set.
     e.tags.existingtag is true if existingtag is there.
     Anyway is better to use 
       "existingtag" in e.tags
     which is a better syntax
** ORG_MODE_PARSER_0.0.1					  :published:
First revision

* Github
https://github.com/daitangio/org-mode-parser

* Developing HOWTO
Install globally vows and try out something like:
npm install -g vows@0.7.0
NODE_PATH=$(dirname $(which node))/../lib/node_modules:. ./bin/testme
* Release command sequence / Kitchensink 		  :kitchensink:
At the time of writing, the github repository is the master code repository

1. Check the package.json version
2. Issue the following commands:
#+BEGIN_SRC shell
./bin/releaseVersion.sh ORG_MODE_PARSER_0.0.6
#+END_SRC



npm loves you