Commenting: electronic.alchemy
where the past meets the future

Commenting: FederatedSearch

Some ideas for a federated search engine:

Features

  • Searching should be performed by one or more plugins selected for a given search query.
  • Should allow asynchronous searches, with results returned as processing permits.
  • Should be possible to set per-user and per-site preferences for certain plugins (such as pay databases).
  • Should provide result clustering, possibly in a parallel process, using something like Carrot2?
  • Should provide a means of selecting groups of plugins to use for a query.
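
A rough sketch of how plugin groups and per-user preferences might fit together. Everything here is an assumption for illustration: the `PluginRegistry` class, its method names, and the idea that a plugin is just a named callable taking the query string. Restricted plugins (a pay database, say) are only selected for users who have been granted access.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical plugin shape: a callable that takes the query string
# and returns a list of hits.
SearchFn = Callable[[str], list[str]]


@dataclass
class PluginRegistry:
    plugins: dict[str, SearchFn] = field(default_factory=dict)
    groups: dict[str, set[str]] = field(default_factory=dict)
    restricted: set[str] = field(default_factory=set)       # e.g. pay databases
    grants: dict[str, set[str]] = field(default_factory=dict)  # user -> plugin names

    def register(self, name, search, groups=(), restricted=False):
        """Add a plugin and file it under zero or more named groups."""
        self.plugins[name] = search
        for group in groups:
            self.groups.setdefault(group, set()).add(name)
        if restricted:
            self.restricted.add(name)

    def grant(self, user, name):
        """Allow one user to use a restricted plugin."""
        self.grants.setdefault(user, set()).add(name)

    def select(self, group, user):
        """Return the plugin names in a group that this user may use."""
        allowed = self.grants.get(user, set())
        return sorted(
            name
            for name in self.groups.get(group, set())
            if name not in self.restricted or name in allowed
        )
```

Per-site preferences could be handled the same way as per-user grants, with a site identifier in place of the user name.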
Plugins
  • Google
  • Yahoo
  • OpenSearch compliant engines
  • Z39.50?
  • ???
Query architecture
  1. A master query is given an identifier
  2. Each selected plugin is started, and given an identifier, which is registered back to the master query.
  3. The individual queries can report back to the master query: error, results, complete.
  4. The master query can report current status and result sets to interested parties in a synchronous or asynchronous manner.
  5. A synchronous query would wait until all results are in, or until a configurable time limit has elapsed.
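
The steps above could be prototyped with Python's `asyncio`, roughly as follows. This is only a sketch under assumptions: `MasterQuery`, `snapshot`, and the status strings are invented names, and each plugin is assumed to be an async callable that takes the query text and returns a list of hits.

```python
import asyncio
import itertools
from dataclasses import dataclass, field

_next_id = itertools.count(1)  # shared id source for master and sub-queries


@dataclass
class MasterQuery:
    text: str
    query_id: int = field(default_factory=lambda: next(_next_id))
    status: dict = field(default_factory=dict)   # sub-query id -> state
    results: list = field(default_factory=list)  # (plugin name, hit) pairs

    async def _run_plugin(self, sub_id, name, search):
        """Run one plugin and report error, results, or complete."""
        try:
            hits = await search(self.text)
        except Exception as exc:
            self.status[sub_id] = f"error: {exc}"
            return
        self.results.extend((name, hit) for hit in hits)
        self.status[sub_id] = "complete"

    def snapshot(self):
        """Current status and results; safe to call while plugins still run."""
        return dict(self.status), list(self.results)

    async def run(self, plugins, timeout):
        """Synchronous-style query: wait for all plugins or for the time limit."""
        tasks = {}
        for name, search in plugins.items():
            sub_id = next(_next_id)  # register the sub-query with the master
            self.status[sub_id] = "running"
            task = asyncio.create_task(self._run_plugin(sub_id, name, search))
            tasks[task] = sub_id
        if not tasks:
            return self.snapshot()
        _, pending = await asyncio.wait(list(tasks), timeout=timeout)
        for task in pending:  # anything still running has hit the time limit
            task.cancel()
            self.status[tasks[task]] = "timed out"
        await asyncio.gather(*pending, return_exceptions=True)
        return self.snapshot()
```

An asynchronous caller would skip `run`'s waiting behaviour and poll `snapshot()` instead, or the master could push updates to listeners as each plugin reports in.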
Questions
  • Do we need a query meta-language, or should queries be "pass-through"?
  • How do we rank and combine hits within the search result set?
  • If the same content comes back as a hit from multiple plugins, does its ranking increase?
  • How would clustering work in an asynchronous query, where results arrive piecemeal?
  • Is this a Java project? Java would be an obvious choice, but could it be prototyped more quickly in another language? Lots of dynamic languages provide bridges into Java, so we could define a plugin API for Java and do the rest in something else. Java's asynchronous functionality is horrible, as well.
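
One possible answer to the ranking questions, offered as a sketch and not a decision: reciprocal rank fusion, a standard technique where each plugin contributes 1 / (k + rank) for every hit and the contributions are summed, so a hit returned by several plugins rises in the merged list. The function name and the use of URLs as hit identifiers are assumptions; k = 60 is the constant conventionally used with this method.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked hit lists from several plugins into one ranking.

    result_lists maps a plugin name to its hits, best first.
    """
    scores = defaultdict(float)
    for hits in result_lists.values():
        # dict.fromkeys drops repeats within one plugin's list, keeping order.
        for rank, url in enumerate(dict.fromkeys(hits), start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties broken by URL for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

Because it only needs each plugin's rank order, this approach sidesteps the problem of comparing raw relevance scores across engines. In an asynchronous query the merged ranking could simply be recomputed each time another plugin reports results.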
