Full Text Search: What to Use?

A problem everyone using a NoSQL database faces (NB: I actually think this applies to most storage engines that don't support full text indexing):

The problem now is: what to use? Currently I'm toying with 3 options:

  1. Use Sphinx Search; it's pretty powerful and pretty damn fast, but it requires me to feed it data through XML, and only when the indexer runs. That makes real-time indexes quite hard to get going, and the delta updates are something I'd rather not mess with.
  2. Use Solr; I'd go for this if it weren't for the fact that it's Java and needs a servlet container such as Tomcat to work. Our entire application infrastructure is basically MongoDB and Perl, and I don't want to go and set up a Tomcat instance just for Solr; on top of which I have a pathologically deep hatred for Java, but that aside…
  3. Roll my own. Full text search the way we need it doesn't actually require things like stemming or fancy analysis. What it does need is the ability to search a schema-less database… Solr and Sphinx both suffer from the fact that you need to tell them what to index, and even then you need a double pass: the first pass gets the search results, and the second pass checks whether the user doing the search can actually see each document.
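
To make option 3 a bit more concrete, here is a minimal sketch of what "rolling your own" could look like: an in-memory index that walks every string in a schema-less document (no field list needed), followed by the permission check as a second pass. It is written in Python rather than the original author's Perl, and every name in it (TinyIndex, iter_strings, the readers set used as an access list) is hypothetical, not something from the original post.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")


def iter_strings(value):
    """Yield every string value found anywhere inside a nested document."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


class TinyIndex:
    """Hypothetical in-memory index; no stemming, no ranking."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of docs containing it
        self.readers = {}                 # doc id -> users allowed to see it (assumed ACL)

    def add(self, doc_id, doc, readers):
        self.readers[doc_id] = set(readers)
        for text in iter_strings(doc):
            for token in TOKEN_RE.findall(text.lower()):
                self.postings[token].add(doc_id)

    def search(self, query, user):
        tokens = TOKEN_RE.findall(query.lower())
        if not tokens:
            return []
        # First pass: documents that contain every query token.
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        # Second pass: keep only the documents this user may see.
        return sorted(d for d in hits if user in self.readers[d])


index = TinyIndex()
index.add("a", {"title": "Sphinx notes", "body": {"text": "real-time indexes"}}, {"alice"})
index.add("b", {"tags": ["sphinx", "solr"], "views": 3}, {"alice", "bob"})
print(index.search("sphinx", "bob"))
```

Both documents match "sphinx", but bob only gets "b" back because the second pass drops "a". This sketch does not handle updates, deletes, or persistence, which is where most of the real work in a home-grown index tends to go.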

Couple of thoughts:

  1. There are a couple of solutions out there, among both relational and NoSQL databases, that support different degrees of full text indexing (e.g. Riak Search, MarkLogic).
  2. Even if your database supports some form of full text search, the implementation might not be complete or optimal.
  3. Initially it may sound like building your own reverse (inverted) index is the best solution. Twitter's story of migrating from their own reverse indexes in MySQL to a Lucene-based solution should change your mind.
  4. Some NoSQL databases provide good mechanisms for enabling full text indexing: Riak has post-commit hooks, and CouchDB has a _changes feed.
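
As a rough illustration of the last point, the sketch below shows how an external indexer might consume a change feed to stay current. It assumes the feed has already been fetched and parsed, and that it has the shape of a CouchDB _changes response requested with include_docs=true (a "results" list of rows with "id", optional "deleted" and "doc", plus a "last_seq" checkpoint). The FeedIndexer class and its method names are hypothetical, and the HTTP polling itself is left out.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")


def string_values(value):
    """Yield string values from a nested document, skipping keys such as _id and _rev."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, item in value.items():
            if not key.startswith("_"):
                yield from string_values(item)
    elif isinstance(value, list):
        for item in value:
            yield from string_values(item)


class FeedIndexer:
    """Hypothetical indexer kept in step with a change feed."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> doc ids
        self.tokens_by_doc = {}           # doc id -> tokens currently indexed
        self.last_seq = None              # checkpoint to resume the feed from

    def _remove(self, doc_id):
        for token in self.tokens_by_doc.pop(doc_id, set()):
            self.postings[token].discard(doc_id)

    def apply(self, feed):
        """Apply one parsed feed response (a dict) to the index."""
        for row in feed["results"]:
            doc_id = row["id"]
            self._remove(doc_id)  # drop stale tokens before re-indexing
            if row.get("deleted"):
                continue
            tokens = set()
            for text in string_values(row.get("doc", {})):
                tokens.update(TOKEN_RE.findall(text.lower()))
            for token in tokens:
                self.postings[token].add(doc_id)
            self.tokens_by_doc[doc_id] = tokens
        self.last_seq = feed["last_seq"]
```

The stored last_seq is what a poller would pass back as the since parameter on the next request, so the indexer only sees documents that changed after its last run. A Riak post-commit hook would push similar per-document events instead of being polled.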

Original title and link: Full Text Search: What to Use? (NoSQL database©myNoSQL)

On latest trunk, full text search ORDER BY on text fields isn't working; it did before
Jun 10, 2011
I have an application that uses the full text search capabilities of sqlite3 and on some queries it uses order by on fts fields other than docid and it used to work before: create virtual table myfts using fts3(day text, number text, header text,…

full text search
Mar 4, 2011
The full text search in Postgres does not support Chinese. Can anyone give me some advice?

full text search
Oct 6, 2010
Hi, Is there a way to extract the text or sentences in which the searched keyword occurs when doing a full text search in a document stored in JackRabbit? Thanks a lot. Jason

Full Text Search
Mar 26, 2011
Hi, I am Sumesh, a student from India. I am currently doing a project using Qt and SQLite, and I want to implement Full Text Search in that project. Can anyone please tell me where to start learning about FTS, how it works, and how it is implemented? …

full text search in jackrabbit
Aug 25, 2010
Hi, First of all, in order to use full-text search for files stored in JackRabbit, do I need to set up the indexing configuration manually? If so, how do I set it up? Currently I just run the following code: nothing is returned (no exception either)…

Update on full text search?
Feb 3, 2011
It would be great to get an update on full text search integration into MongoDB - has a search technology been chosen? Is the process going to plan? How long till we are likely to see FTS? This is a critical feature for us and we're keen to see…

A full text search for MongoDB
May 18, 2011
Hi everybody, I just want to ask whether there is any open source search server, like Sphinx for example, which supports MongoDB. Thanks

Full Text Search jackrabbit 2.1.0
Jul 12, 2010
Hi everyone, I use Jackrabbit 2.1.0, and I'd like to do full text search in nodes that hold documents (Word, PDF, and so on). I wrote the following code, and the problem is that it never returns any results! Although the documents are there and the…

Problem with full text search on PDFs
Jul 13, 2010
I have got a problem with Jackrabbit 2.1.0 and full text search on PDFs. I have created a repository containing several plain text and PDF documents using the Java APIs. I am able to use the Java API to perform full text search on the text…

Full text search FTS3 of files
Oct 17, 2010
Hi: I am trying to build a sqlite3 database to index files. What I want to do is to keep the files in the file system on the disk (not in the database) and index the files with keywords such that when a search is performed, the right file names are…