Lightweight full-text search engine in JavaScript for browser search and offline search.
Elasticlunr.js is a lightweight full-text search engine in JavaScript for browser search and offline search. It is based on lunr.js but is more flexible: it provides query-time boosting and field search. Elasticlunr.js is a bit like Solr, but much smaller and not as bright, while still offering flexible configuration.
var index = elasticlunr(function () {
  this.addField('title');
  this.addField('body');
  this.setRef('id');
});
Adding documents to the index is as simple as:
var doc1 = {
  "id": 1,
  "title": "Oracle released its latest database Oracle 12g",
  "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
  "id": 2,
  "title": "Oracle released its profit report of 2015",
  "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
};

index.addDoc(doc1);
index.addDoc(doc2);
Then searching is as simple as:
index.search("Oracle database");
You can also do query-time boosting by passing in a configuration:
index.search("Oracle database profit", {
  fields: {
    title: {boost: 2},
    body: {boost: 1}
  }
});
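To make the effect of the per-field `boost` values concrete, here is a minimal, self-contained sketch of how boosts supplied at query time could weight per-field matches. It is not Elasticlunr.js's scoring code: `tokenize` and `scoreDoc` are hypothetical helpers, and the score is a plain match count with no IDF, stemming, or normalization.

```javascript
// Hypothetical helper: lower-case the text and split on non-alphanumerics.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical scorer: for each configured field, count the tokens that
// match a query token, then weight that count by the field's boost.
function scoreDoc(doc, query, config) {
  const queryTokens = tokenize(query);
  let score = 0;
  for (const [field, { boost }] of Object.entries(config.fields)) {
    const fieldTokens = tokenize(String(doc[field] || ""));
    const matches = fieldTokens.filter((t) => queryTokens.includes(t)).length;
    score += boost * matches;
  }
  return score;
}

const doc = {
  id: 2,
  title: "Oracle released its profit report of 2015",
  body: "Oracle's profit of 2015 reached 12.5 Billion."
};

// The same document and query, scored under two boosting schemes.
const titleHeavy = scoreDoc(doc, "Oracle profit", {
  fields: { title: { boost: 2 }, body: { boost: 1 } }
});
const bodyHeavy = scoreDoc(doc, "Oracle profit", {
  fields: { title: { boost: 1 }, body: { boost: 3 } }
});

console.log(titleHeavy, bodyHeavy); // 6 8
```

Because the boosts are only read when scoring, trying a different scheme means changing the configuration object, not rebuilding anything.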
- 1. Query-time boosting: you don't need to set up boosting weights when building the index, so you can try different boosting schemes at query time.
- 2. A more rational scoring mechanism: Elasticlunr.js uses nearly the same scoring mechanism as Elasticsearch, which is also the one used by Lucene.
- 3. Field search: you can choose which fields to index and which fields to search.
- 4. Boolean model: you can set which fields to search and the boolean model applied to each query token, such as "OR" or "AND".
- 5. A combination of the Boolean model, the TF/IDF model, and the vector space model, which makes result ranking more reliable.
- 6. Fast: Elasticlunr.js removed TokenCorpus and Vector from lunr.js. With the combined model there is no need to compute a document's vector in order to score it, which improves search speed significantly.
- 7. Small index file: Elasticlunr.js does not store TokenCorpus because there is no need to compute query and document vectors, so the index file is very small. This is especially helpful when elasticlunr.js is used for offline search.
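The Boolean model mentioned in the list can be illustrated with a small inverted index. The following is only a sketch under simplifying assumptions: `buildIndex` and `booleanSearch` are hypothetical names, there is no stemming or scoring, and it does not reflect how Elasticlunr.js stores its index internally.

```javascript
// Split text into lower-case word tokens (hypothetical helper).
function words(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Hypothetical inverted index: token -> Set of ids of documents containing it.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const token of words(doc.text)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(doc.id);
    }
  }
  return index;
}

// "AND" intersects the posting sets of all query tokens; "OR" unions them.
function booleanSearch(index, query, bool) {
  const postings = words(query).map((t) => index.get(t) || new Set());
  if (postings.length === 0) return [];
  const combined = postings.reduce((acc, set) =>
    bool === "AND"
      ? new Set([...acc].filter((id) => set.has(id)))
      : new Set([...acc, ...set])
  );
  return [...combined].sort((a, b) => a - b);
}

const idx = buildIndex([
  { id: 1, text: "Oracle released its latest database Oracle 12g" },
  { id: 2, text: "Oracle released its profit report of 2015" }
]);

const orHits = booleanSearch(idx, "database profit", "OR");   // [1, 2]
const andHits = booleanSearch(idx, "database profit", "AND"); // []
const bothHits = booleanSearch(idx, "oracle released", "AND"); // [1, 2]
```

With "OR", a document matching any query token is returned; with "AND", only documents containing every token survive, which is why "database profit" matches nothing here.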
- elasticlunr.js - uncompressed
- elasticlunr.min.js - minified
Every document and search query that enters Elasticlunr.js is passed through a text processing pipeline. The pipeline is simply a stack of functions that perform some processing on the text. Pipeline functions act on the text one token at a time, and what they return is passed to the next function in the pipeline.
By default, a stop word filter and a stemmer are added to the pipeline. You can also add your own processors or remove the default ones depending on your requirements. The stemmer currently used is an English language stemmer, which could be replaced with a non-English language stemmer if required, or a Metaphoning processor could be added.
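var index = elasticlunr(function () {
  // Append a custom processor to the end of the pipeline.
  this.pipeline.add(function (token, tokenIndex, tokens) {
    // text processing in here
    return token; // return the (possibly modified) token to pass it on
  });

  // Insert a custom processor directly after the stop word filter.
  this.pipeline.after(elasticlunr.stopWordFilter, function (token, tokenIndex, tokens) {
    // text processing in here
    return token;
  });
});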
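The "stack of functions" idea can be sketched without the library. The code below is a simplified illustration, not the actual pipeline implementation: `runPipeline`, `dropStopWords`, and `naiveStem` are hypothetical names, the stop word list is a toy one, and the sketch assumes that a function returning `undefined` or `null` removes the token from the output.

```javascript
// Hypothetical minimal pipeline runner: each token is passed through every
// function in the stack, in order; each function's return value becomes the
// input of the next one.
function runPipeline(stack, tokens) {
  const out = [];
  for (let i = 0; i < tokens.length; i++) {
    let token = tokens[i];
    for (const fn of stack) {
      token = fn(token, i, tokens);
      // Assumption of this sketch: undefined/null means "drop this token".
      if (token === undefined || token === null) break;
    }
    if (token !== undefined && token !== null) out.push(token);
  }
  return out;
}

// Toy stop word filter with a tiny, made-up word list.
const stopWords = new Set(["the", "its", "of", "a"]);
const dropStopWords = (token) => (stopWords.has(token) ? undefined : token);

// Toy stand-in for a stemmer: only strips a single trailing "s".
const naiveStem = (token) => token.replace(/s$/, "");

const processed = runPipeline(
  [dropStopWords, naiveStem],
  ["oracle", "released", "its", "profit", "reports"]
);

console.log(processed); // [ 'oracle', 'released', 'profit', 'report' ]
```

Swapping `naiveStem` for a different function is all it takes to change the processing, which is the same property that lets a non-English stemmer replace the default one.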