What does Lucene do?

Lucene is a full-text search library. Search has two principal stages: indexing and retrieval.

During indexing, each document is broken into words, and the list of documents containing each word is stored in a list called the postings list. The index consists of all the postings lists for the words in the corpus. Indexing must be done before retrieval, and only documents that were indexed can be retrieved.

Retrieval is the process that starts with a query and ends with a ranked list of documents. To find matches for the query, it is broken into individual words, and the postings list of each word is looked up. The full list of documents containing the keywords is assembled from those postings lists. Note that the query is broken into words (terms) and each term is matched against the terms in the index. Lucene's default operator is OR, so a query such as "my fudge" retrieves the documents that contain my OR fudge. To retrieve only the documents that contain both my and fudge, rewrite the query as "+my +fudge".

Lucene doesn't work as a LIKE operator on steroids; it works on single terms. Terms are produced by analyzing the provided text, so the right analyzer should be configured. On the other hand, Lucene offers a complete, well-documented query language.

Index creation

To create an index based on Lucene:

CREATE INDEX <name> ON <class-name> (prop-names) FULLTEXT ENGINE LUCENE

The following SQL statement creates a FullText index on the property name for the class City, using the Lucene engine:

CREATE INDEX City.name ON City(name) FULLTEXT ENGINE LUCENE

Indexes can also be created on n properties. For example, create an index on the properties name and description of the class City:

CREATE INDEX City.name_description ON City(name, description) FULLTEXT ENGINE LUCENE

When multiple properties should be indexed, define a single multi-field index over the class. A single multi-field index needs fewer resources, such as file handles. Moreover, it makes it easier to write better Lucene queries.

The default analyzer used by OrientDB when a Lucene index is created is the StandardAnalyzer. The StandardAnalyzer usually works fine with western languages, but Lucene offers analyzers for different languages and use cases.

Open Studio or the console and create a sample dataset:

CREATE CLASS Item
CREATE PROPERTY Item.text STRING
CREATE INDEX Item.text ON Item(text) FULLTEXT ENGINE LUCENE
INSERT INTO Item (text) VALUES ('My sister is coming for the holidays.')
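The two stages described in this article, indexing into postings lists and retrieval by matching query terms, can be illustrated with a small Python sketch. This is a toy model and not Lucene's implementation: the tokenize function stands in for an analyzer, the names build_index, search_or and search_and are hypothetical, documents 2 and 3 are invented for the example, and ranking is left out entirely.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Toy analyzer (an assumption): lowercase the text and keep runs of letters.
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    # Indexing stage: map each term to the set of document ids containing it.
    # Each set plays the role of a postings list.
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return postings


def search_or(index, query):
    # Default OR behaviour: a document matches if it contains any query term.
    result = set()
    for term in tokenize(query):
        result |= index.get(term, set())
    return result


def search_and(index, query):
    # Equivalent of "+my +fudge": a document must contain every query term.
    term_sets = [index.get(term, set()) for term in tokenize(query)]
    if not term_sets:
        return set()
    return set.intersection(*term_sets)


docs = {
    1: "My sister is coming for the holidays.",  # sentence from the sample dataset
    2: "I love fudge",                           # hypothetical document
    3: "My fudge recipe",                        # hypothetical document
}
index = build_index(docs)

print(sorted(search_or(index, "my fudge")))   # [1, 2, 3]
print(sorted(search_and(index, "my fudge")))  # [3]
```

The OR search returns every document that holds at least one of the terms, while the AND search keeps only the documents present in both postings lists, which is the difference between the queries "my fudge" and "+my +fudge".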