Monday, December 21, 2009

Automated approach for initializing Enterprise Search experience

Whenever you want to use the Enterprise Search functionality in your application, you must take care of correct initialization: set up a content source, administer its crawl schedules, create the searchable managed properties, and set up a Search Scope. Although each of these steps can be done manually via the SharePoint GUI (a combination of Central Administration and your own application), this is hardly workable within the context of an ALM-based project. Your application is then (re)deployed many times, and to different environments (development, test, staging, production). Each time, the manual installation and initialization steps would have to be repeated. This is cumbersome, and thus error-prone. A better approach (as always) is to strive for a fully automated initialization of Enterprise Search. I’ve applied this several times via the approach outlined here:
  • Create a new Feature, with a FeatureReceiver code-behind.
  • In the activation method of the feature, do the following steps:

    Create the content source

      1. Use the Search Object Model to create the content source.
      2. If applicable, administer include and exclude rules.
      3. Create the crawl schedules: full and incremental.

    Create managed properties

    Important to realize here is that a managed property can only be created if the crawled property it maps to already exists.

      4. Make sure the crawled content source contains at least one item, by either adding a dummy list item (for regular lists) or uploading a dummy document (for a document library).
      5. Use the content type definition(s) to determine the fields of the searchable content, and assign each field a non-empty value in the dummy item.
      6. Initiate a full crawl, so that the crawler creates a crawled property for each of the content type fields.
      7. After the full crawl, loop through the collection of content type fields, and for each field create a Managed Property of the proper type and map it to the automatically created crawled property.
      8. Remove the dummy content of step 4.

    Create the Search Scope

      9. Use the Search Object Model API to create a Search Scope.

  • In the deactivation method of the feature, do the proper reverse actions of the feature activation event.
What is proper depends on the situation and the application. Normally, you would implement feature deactivation as a full restore to the state before feature activation. Here that means removing the managed properties, crawled properties, search scope and content source. However, when you delete the content source, you typically undo more than strictly the feature activation. In a production situation the content source has been crawled over and over, building up the index. Upon deleting the content source, you also lose all of this crawling and indexing work. Feature deactivation would thus undo not only the feature activation itself, but also work done later. Whether this is appropriate depends on the application and content specifications. All content can be recrawled. However, for a large and complex set (documents, PDFs, TIFF files, LOB data via BDC…) this can be time-consuming, and during the required full crawl the application's search cannot yet find and return all matching content.
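
To make the ordering constraint and the deactivation trade-off concrete, here is a minimal, language-neutral sketch in Python. It is not the SharePoint Search Object Model: `FakeSearchAdmin`, `activate`, `deactivate` and the `keep_content_source` flag are all hypothetical names, and the `ows_` prefix only imitates the way crawled properties for list columns tend to be named. The sketch models just two points from the text: a managed property cannot be created before a crawl has discovered the matching crawled property, and deactivation may choose to leave the content source (and its index) in place.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSearchAdmin:
    """Hypothetical in-memory stand-in for a search administration API."""
    content_sources: set = field(default_factory=set)
    crawled_properties: set = field(default_factory=set)
    managed_properties: dict = field(default_factory=dict)  # managed -> crawled
    scopes: set = field(default_factory=set)
    items: dict = field(default_factory=dict)  # item name -> {field: value}

    def full_crawl(self):
        # A property is only discovered when some item holds a value for it.
        for values in self.items.values():
            for name, value in values.items():
                if value is not None:
                    self.crawled_properties.add("ows_" + name)

    def create_managed_property(self, name, crawled):
        # Mirrors the constraint in the text: the crawled property must exist.
        if crawled not in self.crawled_properties:
            raise LookupError(f"crawled property {crawled!r} does not exist yet")
        self.managed_properties[name] = crawled


def activate(admin, source, fields, scope):
    admin.content_sources.add(source)                          # steps 1-3
    admin.items["dummy"] = {f: "placeholder" for f in fields}  # steps 4-5
    admin.full_crawl()                                         # step 6
    for f in fields:                                           # step 7
        admin.create_managed_property(f, "ows_" + f)
    del admin.items["dummy"]                                   # step 8
    admin.scopes.add(scope)                                    # step 9


def deactivate(admin, source, fields, scope, keep_content_source=True):
    admin.scopes.discard(scope)
    for f in fields:
        admin.managed_properties.pop(f, None)
    if not keep_content_source:
        # Full restore: this also throws away everything crawled since activation.
        admin.content_sources.discard(source)
        admin.crawled_properties.clear()
```

Defaulting `keep_content_source` to `True` reflects the cautious choice for production discussed above; a development or test environment might prefer the full restore.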
