I'm looking into implementing full text search on our Firebird database. Our requirements are:
Every field in several tables should be indexed. When a result is found we should be able to find out the originating table.
The index can be stored in the database or in the file system.
The results of the search (BigInt primary keys) must be used to join with the original records in the database to display the records in a table.
Can anybody recommend a decent way to achieve what we need? I've looked at integrating DotLucene into Delphi, but can't find much information on how to go about it.
Here are a few resources for you to consider:
Sphinx, a very powerful and popular free, open-source full-text search engine.
Textolution Fulltext search for Interbase and Firebird.
IBObjects Full Text Search ("Fuzzy Search") module, a fully working module that can be used to set up your search indexes or as a model for your own custom implementation.
Rubicon is a Delphi add-on that lets you put full text search capabilities into your applications.
Fulltext Search for Firebird SQL By Dan Letecky on CodeProject using DotLucene full-text search engine.
Mutis is a Delphi port of the Lucene search engine. It provides a flexible API to index, catalog and search text-based information with great performance. Excellent for implementing custom search engines, research, text retrieval, data mining and more.
There is a fork of Firebird code made by a company called Red Soft. It's licensed under the same license as Firebird, so you can take a look at their version which can support full-text searches using Lucene engine via JavaVM interfaces.
You can also read a paper titled "Full text search in Firebird without a full text search engine" by Bjoern Reimer and Dirk Baumeister, presented at 4th Firebird Conference.
I think you will have a problem with requirement 2: The index can be stored in the database or in the file system. Most indexing services create their own index file which stores data in a highly optimized way. If you really want it, maybe it is possible to load and save an index to a single blob field but I don't really see a reason for this.
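To make the requirements in the question concrete (every field indexed, the originating table recoverable, BigInt primary keys returned for a later join), here is a minimal in-memory sketch of an inverted index. It is not how any of the products listed above work internally; the class name `TinyIndex`, the table names and the sample rows are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the value and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


class TinyIndex:
    """In-memory inverted index: token -> set of (table, primary_key)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_row(self, table, pk, row):
        # Index every field of the row; NULL (None) values are skipped.
        for value in row.values():
            if value is None:
                continue
            for token in tokenize(value):
                self.postings[token].add((table, pk))

    def search(self, query):
        """Return {table: [primary keys]} for rows containing all query words."""
        tokens = tokenize(query)
        if not tokens:
            return {}
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        by_table = defaultdict(list)
        for table, pk in sorted(hits):
            by_table[table].append(pk)
        return dict(by_table)


# Hypothetical sample data standing in for rows read from Firebird.
index = TinyIndex()
index.add_row("CUSTOMER", 1, {"NAME": "Alice Smith", "CITY": "Berlin"})
index.add_row("CUSTOMER", 2, {"NAME": "Bob Jones", "CITY": "Paris"})
index.add_row("SUPPLIER", 7, {"NAME": "Berlin Paper GmbH", "NOTES": None})
results = index.search("berlin")
```

The keys returned for each table can then be fed into a `WHERE ID IN (...)` clause or a temporary table to join back to the original records.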
On our website, it is possible to tag content by a country list. This country list could be implemented as a tag control but I'm concerned about mis-spellings creeping in over time. However, the country list is very long (150+) so not ideal for a dropdown multiple control either.
What I'm looking to do is have a control that has the same type + autocomplete functionality as the existing tags control but limit the possible values to those retrieved from a database table.
I also want to be able to list all tags that a piece of content has been tagged against as well as searching for content based on tags e.g. GetNodesWithTags
Has anyone developed anything like this before? I've had a look at packages etc but can't see anything similar. Does anyone have any advice before I start off?
Definitely, using Tags datatype for this may cause a lot of problems :)
In my opinion, the perfect solution will be to use nuPickers (https://our.umbraco.org/projects/backoffice-extensions/nupickers/) package and available there TypeaheadList Picker.
Depending on your additional requirements, you can use a Lucene index, a C#-accessed source (totally custom: db, static, enum etc.) or an XML file as the source of prevalues for your control.
Then, you'll be able to create logic which will enable you to perform search based on this field as it will be a typical property with value on the nodes. Once again - suggested way is to use Lucene Examine index as it's tailored to be fast with searching. You can read more about searching with Examine here: https://our.umbraco.org/documentation/reference/searching/examine/.
Hopefully it will solve your problem.
I am using RavenDB for my intranet website and need to implement full-text search across the whole site. I can use RavenDB's LINQ search queries for documents, which are Lucene-based in the background.
The other approach is to use the Lucene.Net library to implement full-text search independently.
Whichever approach I choose, it should be able to search through attachments stored in blob format in RavenDB.
Any ideas or suggestions, please?
RavenDB is fully integrated with Lucene. There would be little point to using it independently.
But by definition, attachments are not searchable. You can certainly store very large documents that are fully searchable, but they wouldn't be attachments. The whole point of attachments is that they hold things you wouldn't want to search, for example videos, photos, music, etc.
Review:
http://ravendb.net/docs/client-api/attachments
http://ravendb.net/docs/client-api/querying/linq-extensions/search
http://ravendb.net/docs/appendixes/lucene-indexes-usage
Revised Answer
I have written a bundle that uses IFilters to have RavenDB automatically extract the contents of attachments and index them with Lucene. It is available here.
Enjoy!
I am trying to implement a web/smartphone app that allows users to search for places based on keywords and location. Here are the requirements:
Users shall be able to search by typing in keywords and location; Locations can be zip code, city/state or current location from the mobile app (lat and long)
We would like to be able to customize relevance score; We need to be able to define our own relevance algorithm based on keyword matching, location matching and some other parameters.
We use ASP.NET MVC as our web development framework and MongoDB as a data store. We also maintain a list of all zip codes and city/state pairs, as well as their centroids (lat/long), in our database. Our thought is to override the scoring that the full-text system provides (like Lucene scoring) with our own algorithm. I am trying to find the best solution for this. Should we use MongoDB full-text search, Lucene.NET, or perhaps Solr? Any help/pointer/comment is always appreciated!
So as a starting point, MongoDB does not have support for full-text search.
It has some regex capabilities and you can index on arrays. So you can do some things here, like building an array of keywords to make basic text search possible.
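The keyword-array idea can be sketched as follows. This is only an illustration under assumptions: the field name `keywords`, the stop-word list and the helper names are hypothetical, and the code only builds plain dictionaries (the document to store and a query using MongoDB's `$all` operator) without talking to a server.

```python
import re

# Illustrative stop-word list, not exhaustive.
STOP_WORDS = {"a", "an", "and", "the", "of", "in"}


def extract_keywords(text):
    """Return the sorted, de-duplicated lowercase words of the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({w for w in words if w not in STOP_WORDS})


def build_place_document(name, description):
    """Document to store; 'keywords' is the array field that would be indexed."""
    return {
        "name": name,
        "description": description,
        "keywords": extract_keywords(name + " " + description),
    }


def build_keyword_query(user_input):
    """Query matching documents whose keywords array contains every typed word."""
    return {"keywords": {"$all": extract_keywords(user_input)}}


doc = build_place_document("Corner Coffee", "The best espresso in Austin")
query = build_keyword_query("espresso austin")
```

An index on the `keywords` field would let such a query avoid a full collection scan, but this gives exact word matches only: no stemming, ranking or fuzzy matching.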
However, this is a long way from what Solr and Sphinx offer.
The other big problem you'll have is with relevance scoring. It's going to be very difficult to perform any type of server-side relevance scoring with MongoDB. There's no really efficient version of a server-side stored procedure. You'll likely have to pull the results to a client or server dedicated to that scoring.
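Scoring on the application side, after pulling candidate results, could look roughly like the sketch below. The weighting scheme, the 50 km cutoff and the field names (`keywords`, `lat`, `lon`) are assumptions made for illustration, not a recommended relevance formula; the distance uses the standard haversine formula against a centroid such as the ones the question mentions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def relevance(place, query_words, origin, keyword_weight=0.7, max_km=50.0):
    """Blend keyword overlap and proximity into one score between 0 and 1."""
    words = set(query_words)
    if words:
        keyword_score = len(words & set(place["keywords"])) / len(words)
    else:
        keyword_score = 0.0
    km = haversine_km(origin[0], origin[1], place["lat"], place["lon"])
    # Proximity falls linearly from 1 at the origin to 0 at max_km and beyond.
    distance_score = max(0.0, 1.0 - km / max_km)
    return keyword_weight * keyword_score + (1.0 - keyword_weight) * distance_score


def rank(places, query_words, origin):
    """Sort candidate places from most to least relevant."""
    return sorted(places, key=lambda p: relevance(p, query_words, origin),
                  reverse=True)


# Hypothetical candidates already fetched from the data store.
austin = (30.2672, -97.7431)
places = [
    {"name": "Far Cafe", "keywords": ["coffee", "espresso"],
     "lat": 32.7767, "lon": -96.7970},
    {"name": "Near Cafe", "keywords": ["coffee", "espresso"],
     "lat": 30.2700, "lon": -97.7400},
]
ranked = rank(places, ["coffee"], austin)
```

Because the weights are ordinary function parameters, further signals can be added to `relevance` without touching the data store.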
I'm currently building an e-commerce website that's going to be backed by a SQLite3 database. I'm looking for a way to do an in-browser search, with the results being links to products that match the search query. I've got no idea where to start. Any suggestions?
Why? If you have enough products that you need search, use a database that will grow with you. SQLite is more for development and small/low-traffic applications. For example, Google Chrome uses it to store your history.
For very basic search, SQL-based finding is okay. You can do
SELECT * FROM foo WHERE bar LIKE '%query%' LIMIT 10;
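…fairly easily. Be aware that a pattern with a leading wildcard cannot use an ordinary index on that column, so this scans the whole table; that is quick enough for a small catalog but slows down as it grows.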
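Here is a minimal sketch of that LIKE search run from Python's built-in `sqlite3` module, with the search term passed as a bound parameter rather than pasted into the SQL string. The `products` table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def search_products(conn, term, limit=10):
    """Return (id, name) rows whose name contains the term."""
    # Escape LIKE wildcards typed by the user so they match literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cursor = conn.execute(
        "SELECT id, name FROM products "
        "WHERE name LIKE ? ESCAPE '\\' ORDER BY name LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return cursor.fetchall()


# Hypothetical catalog in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("Red Mug",), ("Blue Mug",), ("100% Cotton Shirt",)],
)
matches = search_products(conn, "mug")
```

Each returned id can be turned into a product link by the page that renders the results. SQLite's LIKE is case-insensitive for ASCII letters by default, which is why "mug" matches "Mug".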
For more advanced or high-performance search with conditions, partial matches and searching across tables, you can use SOLR or another third-party search-specific application.
I have a mediawiki installation that I've customized with some of my own extensions. Here is the basic platform, pretty standard LAMP install.
Ubuntu Server
Apache 2
Mediawiki 1.15
PHP 5.2.6
MySQL 5.0.67
For the actual MW search I use Lucene (EzMwLucene). I also have a custom extension that displays tabular data from a separate database within a MW page. Lucene doesn't index this info (which, in my case, is actually good because it would clutter the expected search results). For this installation I didn't do anything to Lucene other than install it; I wouldn't know how to customize it for my needs, and it may be "too powerful".
At any rate, I need to create a search for the data in my other database. I have a master table that is updated daily based on data stored in other (normalized) tables. At the moment it is one of these searches that basically creates a SQL query based on the criteria you enter. This is a lot of work, though. I would like it to be more of a "type and submit" type search.
I don't think I need a comprehensive "cut & paste" type answer, but if anybody has something that I can google I would be very appreciative. I don't want to reinvent the wheel, which is what I would be doing if I followed what I see on Google.
If you would like to see my master database, let me know, I would want to sanitize it to make me more anonymous (whatever that means). Also, if you're familiar with MW and would like to see any of my extension code, again, let me know.
TL;DR: need to make a custom search feature with LAMP (displayed in Mediawiki). Any guidance appreciated.
Thanks SO!
Why do you need to add custom search? This will relate to the best answer.
For simplicity, you could use the Google Custom Search Engine extension - http://www.mediawiki.org/wiki/Extension:Google_Custom_Search_Engine
Otherwise it sounds like you need to write a full-text query for the database.
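As a rough sketch of what such a full-text query might look like for a "type and submit" box, the helper below builds a MySQL `MATCH ... AGAINST` statement with placeholders in the `%s` style used by common Python MySQL drivers. It only constructs the SQL and its parameters; the table and column names are hypothetical, and it assumes a FULLTEXT index exists on exactly the listed columns (on MySQL 5.0 that means a MyISAM table).

```python
import re

# Table and column names cannot be bound as parameters, so validate them.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_fulltext_query(table, columns, user_text, limit=20):
    """Return (sql, params) for a natural-language full-text search."""
    for name in [table, *columns]:
        if not IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    column_list = ", ".join(columns)
    # The same MATCH expression both filters rows and supplies the ranking.
    sql = (
        f"SELECT *, MATCH({column_list}) AGAINST (%s) AS score "
        f"FROM {table} "
        f"WHERE MATCH({column_list}) AGAINST (%s) "
        f"ORDER BY score DESC LIMIT %s"
    )
    return sql, (user_text, user_text, limit)


# Hypothetical master table with two text columns.
sql, params = build_fulltext_query("master", ["title", "notes"], "pump valve")
```

The resulting pair would be handed to a driver's `cursor.execute(sql, params)`; the same statement can be issued from PHP with a prepared statement.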