The Umbraco CMS comes with a built-in search engine called Examine, which is both powerful and fast. Strictly speaking, Examine is a separate project, but its assemblies are distributed with Umbraco and the editor uses them for its own internal searching of content nodes and members.

Examine is effectively a wrapper around Lucene.Net that integrates it into Umbraco. The Examine project itself can be found on CodePlex.

Examine is a fairly configurable and extensible beast that lets you do some quite complex things, but using it to add a simple site search to your website is quite easy.

The site structure

Let's consider the following site which consists of just three document types:

  1. Home
  2. Page
  3. Folder

[Image: Site structure]

and just two properties between them all:

  1. umbracoNaviHide
  2. bodyText

The 'umbracoNaviHide' property lets us easily hide a node from navigation menus. The 'bodyText' property is a Richtext editor property and contains the page content that we primarily want to search.

Configuring Examine

Out of the box Umbraco will already be indexing this content. Within the 'config' folder is the ExamineIndex.config file and it is this file that contains the list of Examine indexes that have been set up. By default three indexes will be in place:

  1. InternalIndexSet
  2. InternalMemberIndexSet
  3. ExternalIndexSet

The first two are in place for the searches that exist within the Umbraco editor. The 'ExternalIndexSet' index set is in place for us to use for our site search.

<!-- Default Indexset for external searches, this indexes all fields on all types of nodes-->
<IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/" />

As you can see from the comment, this will index all properties on all document types. That might be fine for a demo or a small site like our example, but for a more complex site you are better off fine-tuning the index to concentrate on just the data that you want to search.

Here we can configure the index to specify which attributes, document types and document properties we're interested in and which we are not:

<IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/">
    <IndexAttributeFields>
      <add Name="id" />
      <add Name="nodeName"/>
      <add Name="nodeTypeAlias" />
      <add Name="parentID" />
    </IndexAttributeFields>
    <IndexUserFields>
      <add Name="umbracoNaviHide" />
      <add Name="bodyText" />
    </IndexUserFields>
    <ExcludeNodeTypes>
      <add Name="Folder" />
    </ExcludeNodeTypes>
</IndexSet>

We've included the 'umbracoNaviHide' property in the index as that will enable us to easily ignore any content that has been hidden by the editor.
We've excluded the 'Folder' document type because nodes of that type contain no meaningful information.

That's our index configured.

Also in the 'config' folder is the ExamineSettings.config file. This contains the default Indexer and Searcher provider settings for each of the index sets.

The only change we're going to make to the 'ExternalIndexer' is to modify the analyzer. The default is the WhitespaceAnalyzer, but this analyzer is case sensitive, so searching for "umbraco" will not return results that contain "Umbraco".
A quick run-down of some of the commonly used Examine analyzers and how they behave can be found in this blog post.

<!-- default external indexer, which excludes protected and unpublished pages-->
<add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
          supportUnpublished="false"
          supportProtected="false"
          interval="10"
          analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>

and we'll make the same change to the 'ExternalSearcher':

<add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
   analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" enableLeadingWildcards="true"/>

Be careful when making this change. Do not simply replace 'WhitespaceAnalyzer' with 'StandardAnalyzer', because the full namespace is different: it's 'Lucene.Net.Analysis.Standard.StandardAnalyzer'.

Notice how, through the 'supportUnpublished' and 'supportProtected' attributes, the Indexer will automatically ignore any content that has not been published or that is protected by membership restrictions.

It is possible to use different analyzers in your indexer and its corresponding searcher. However, this may lead to strange results and is not recommended.
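
To make the case-sensitivity and mismatch problems more concrete, here is a small, self-contained C# sketch. It does not use Lucene.Net at all: the 'WhitespaceTerms' and 'StandardLikeTerms' helpers are hypothetical stand-ins that only approximate how the two analyzers break text into terms (the real StandardAnalyzer also removes stop words, among other things). The point is simply that the terms produced at search time have to line up with the terms stored at index time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Illustration only: simplified stand-ins for the analyzers, NOT the Lucene.Net classes.
public static class AnalyzerSketch
{
    // Roughly like WhitespaceAnalyzer: split on whitespace and keep the original case.
    public static List<string> WhitespaceTerms(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).ToList();
    }

    // Roughly like StandardAnalyzer: split on non-alphanumeric characters and lower-case.
    public static List<string> StandardLikeTerms(string text)
    {
        var terms = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                terms.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0) terms.Add(current.ToString());
        return terms;
    }

    // A document matches if any query term equals an indexed term exactly.
    public static bool Matches(List<string> indexedTerms, List<string> queryTerms)
    {
        return queryTerms.Any(q => indexedTerms.Contains(q));
    }

    public static void Main()
    {
        const string content = "Umbraco ships with Examine";

        // Whitespace-style terms on both sides: "umbraco" does not equal "Umbraco".
        Console.WriteLine(Matches(WhitespaceTerms(content), WhitespaceTerms("umbraco")));   // False

        // Standard-style terms on both sides: everything is lower-cased, so it matches.
        Console.WriteLine(Matches(StandardLikeTerms(content), StandardLikeTerms("Umbraco"))); // True

        // Mismatch: lower-cased index terms, but the query keeps its case, so no match.
        Console.WriteLine(Matches(StandardLikeTerms(content), WhitespaceTerms("Umbraco")));  // False
    }
}
```

The last case is the "strange results" scenario: the content is clearly there, but the searcher is asking for a term in a form the indexer never stored.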

You may be wondering how these index sets, index providers and search providers all know about each other. The index provider and search provider configs have an 'indexSet' attribute, which can be used to specify the name of the index set to use. If you do not specify the index set name explicitly, Examine looks for one with a matching name, <yourname>IndexSet. That is why, for example, the 'ExternalIndexer' and 'ExternalSearcher' providers both end up using 'ExternalIndexSet'.
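
If you would rather not rely on the naming convention, the link can be spelled out. The fragment below is only a sketch of that: it is the same 'ExternalSearcher' entry from ExamineSettings.config shown earlier, with the 'indexSet' attribute added. For our example site it should behave exactly as before, since 'ExternalIndexSet' is the set that would have been matched by name anyway.

```xml
<!-- Same searcher as before, but naming its index set explicitly -->
<add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
   indexSet="ExternalIndexSet"
   analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" enableLeadingWildcards="true"/>
```

Being explicit is mostly useful when a provider's name does not follow the <yourname>Indexer / <yourname>Searcher pattern.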

The Examine Dashboard package

There will be times when the search does not seem to behave as you'd expect.
The Examine Dashboard package is an essential tool to install because it provides a very simple way to rebuild the search indexes when necessary.

The search macro

With the indexing in place, all we need now is a macro that searches the index using the Examine API and displays the results in a meaningful way.

The first thing we want to add is a search textbox and button at the top of every page. Clicking the button redirects to a page that displays the search results, passing the user-entered search text in the query string. To do this we'll add the following code to the site's master template:

<%@ Master Language="C#" MasterPageFile="~/umbraco/masterpages/default.master" AutoEventWireup="true" %>

<script runat="server">
    protected void searchButton_Click(object sender, EventArgs e)
    {
        Response.Redirect(string.Format("~/system-pages/search-results.aspx?s={0}",
            HttpUtility.UrlEncode(searchTextBox.Text)));
    }
</script>

<asp:Content ContentPlaceHolderID="ContentPlaceHolderDefault" runat="server">
    <form id="form1" runat="server" defaultbutton="searchButton">
        <div class="search">
            <asp:TextBox ID="searchTextBox" runat="server" />
            <asp:Button ID="searchButton" runat="server" Text="Search" OnClick="searchButton_Click" />
        </div>
        <umbraco:Macro Alias="Navigation" runat="server" />
   
        <asp:ContentPlaceHolder Id="MainContent" runat="server">
            <!-- Insert default "MainContent" markup here -->
        </asp:ContentPlaceHolder>
    </form>
</asp:Content>

For the search results macro itself we'll create an ASP.NET Web User Control. This control uses the ListView and DataPager controls to display the search results with paging, with very little effort on our part.

<%@ Control Language="C#" AutoEventWireup="true" CodeBehind="SiteSearchResults.ascx.cs" Inherits="WebApplication1.usercontrols.SiteSearchResults" %>
<%@ Import Namespace="WebApplication1.usercontrols" %>

<div id="SiteSearchResults">           
    <asp:ValidationSummary ID="ValidationSummary" runat="server" DisplayMode="List" CssClass="validation" ValidationGroup="SiteSearchResults" />
    <asp:CustomValidator ID="CustomValidator" runat="server" Visible="False" ValidationGroup="SiteSearchResults"></asp:CustomValidator>

    <asp:Literal ID="summaryLiteral" runat="server" />

    <div style="display:none;">
        <asp:Literal ID="queryLiteral" runat="server" />
    </div>

    <div id="TopPagingControls">
        <asp:DataPager ID="topDataPager" PagedControlID="searchResultsListView" PageSize="10" Visible="false" runat="server">
        <Fields>
            <asp:NextPreviousPagerField ShowNextPageButton="False" RenderDisabledButtonsAsLabels="true" />
            <asp:NumericPagerField ButtonCount="10" CurrentPageLabelCssClass="currentpage" />
            <asp:NextPreviousPagerField  ShowPreviousPageButton="False" RenderDisabledButtonsAsLabels="true" />
        </Fields>
        </asp:DataPager>
    </div>

    <div id="SearchResults">           
        <asp:ListView ID="searchResultsListView"  runat="server" ItemPlaceholderID="itemPlaceholder" OnPagePropertiesChanging="searchResultsListView_PagePropertiesChanging">
            <LayoutTemplate>
                <ul>
                    <asp:PlaceHolder ID="itemPlaceholder" runat="server"></asp:PlaceHolder>
                </ul>
            </LayoutTemplate>
            <ItemTemplate>
                <li>
                    <a href="<%# ((Examine.SearchResult)Container.DataItem).FullURL() %>">
                        <%# ((Examine.SearchResult)Container.DataItem).Fields["nodeName"] %>
                    </a>
                    <p><%# ((Examine.SearchResult)Container.DataItem).Fields.ContainsKey("bodyText") ? ((Examine.SearchResult)Container.DataItem).Fields["bodyText"] : "" %></p>
                </li>
            </ItemTemplate>
            <ItemSeparatorTemplate></ItemSeparatorTemplate>
            <EmptyDataTemplate></EmptyDataTemplate>
        </asp:ListView>
    </div>
           
    <div id="BottomPagingControls">
        <asp:DataPager ID="bottomDataPager" PagedControlID="searchResultsListView" PageSize="10" Visible="false" runat="server">
        <Fields>
            <asp:NextPreviousPagerField ShowNextPageButton="False" RenderDisabledButtonsAsLabels="true" />
            <asp:NumericPagerField ButtonCount="10" CurrentPageLabelCssClass="currentpage" />
            <asp:NextPreviousPagerField  ShowPreviousPageButton="False" RenderDisabledButtonsAsLabels="true" />
        </Fields>
        </asp:DataPager>
    </div>
</div>

You may notice that we have imported the namespace 'WebApplication1.usercontrols'. This is because we want to make use of an extension method in the code behind called FullURL to get the URL of each page.

The rest is fairly straightforward: a ListView (which will be bound to our search results in the code behind) displays each result's node name as a hyperlink to the actual content page, followed by the contents of its bodyText field.

The DataPager is set to 10 results per page but we'll override that with a public property which can then be set using a parameter on the macro when it is inserted on the page.
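
As a rough sketch of what that looks like when the macro is placed directly in a template rather than through the rich text editor: the macro alias 'SiteSearchResults' below is an assumption (use whatever alias you give the macro), and it assumes the macro has been given a parameter whose alias matches the control's 'PageSize' property.

```xml
<!-- Hypothetical macro alias; PageSize maps onto the user control's public PageSize property -->
<umbraco:Macro Alias="SiteSearchResults" PageSize="2" runat="server" />
```

Because the property is declared as a string and parsed with int.TryParse, a missing or non-numeric value simply falls back to 10 results per page.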

One little extra is the hidden 'queryLiteral' control. This will be populated with the actual query that Examine executes. We are adding it to the page purely as a debugging aid: if we are ever unsure about the results we are getting, we can paste this query into Luke (the Lucene index browser) to verify them.

The code behind this control will make use of some Umbraco and Examine libraries so it is important to add references to the following assemblies:

  1. Examine
  2. umbraco
  3. UmbracoExamine
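
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI.WebControls;
using umbraco;
using Examine;
using UmbracoExamine;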

namespace WebApplication1.usercontrols
{
    public static class SiteSearchResultExtensions
    {
        public static string FullURL(this Examine.SearchResult sr)
        {
            return umbraco.library.NiceUrl(sr.Id);
        }
    }

    public partial class SiteSearchResults : System.Web.UI.UserControl
    {
        #region Properties

        private int _pageSize = 10;
        public string PageSize
        {
            get { return _pageSize.ToString(); }
            set
            {
                int pageSize;
                if (int.TryParse(value, out pageSize))
                {
                    _pageSize = pageSize;
                }
                else
                {
                    _pageSize = 10;
                }
            }
        }

        private string SearchTerm
        {
            get
            {
                object o = this.ViewState["SearchTerm"];
                if (o == null)
                    return "";
                else
                    return o.ToString();
            }

            set
            {
                this.ViewState["SearchTerm"] = value;
            }
        }

        protected IEnumerable<Examine.SearchResult> SearchResults
        {
            get;
            private set;
        }

        #endregion

        #region Events

        protected override void OnLoad(EventArgs e)
        {
            try
            {
                CustomValidator.ErrorMessage = "";

                if (!Page.IsPostBack)
                {
                    topDataPager.PageSize = _pageSize;
                    bottomDataPager.PageSize = _pageSize;

                    string terms = Request.QueryString["s"];
                    if (!string.IsNullOrEmpty(terms))
                    {
                        SearchTerm = terms;
                        PerformSearch(terms);
                    }
                }

                base.OnLoad(e);
            }
            catch (Exception ex)
            {
                CustomValidator.IsValid = false;
                CustomValidator.ErrorMessage += Environment.NewLine + ex.Message;
            }
        }

        protected void searchResultsListView_PagePropertiesChanging(object sender, PagePropertiesChangingEventArgs e)
        {
            try
            {
                if (SearchTerm != "")
                {
                    topDataPager.SetPageProperties(e.StartRowIndex, e.MaximumRows, false);
                    bottomDataPager.SetPageProperties(e.StartRowIndex, e.MaximumRows, false);
                    PerformSearch(SearchTerm);
                }
            }
            catch (Exception ex)
            {
                CustomValidator.IsValid = false;
                CustomValidator.ErrorMessage += Environment.NewLine + ex.Message;
            }
        }

        #endregion

        #region Methods

        private void PerformSearch(string searchTerm)
        {
            if (string.IsNullOrEmpty(searchTerm)) return;

            var criteria = ExamineManager.Instance
                .SearchProviderCollection["ExternalSearcher"]
                .CreateSearchCriteria(UmbracoExamine.IndexTypes.Content);

            // Find pages that contain our search text in either their nodeName or bodyText fields...
            // but exclude any pages that have been hidden.
            var filter = criteria
                .GroupedOr(new string[] { "nodeName", "bodyText" }, searchTerm)
                .Not()
                .Field("umbracoNaviHide", "1")
                .Compile();

            SearchResults = ExamineManager.Instance
                .SearchProviderCollection["ExternalSearcher"]
                .Search(filter);

            if (SearchResults.Count() > 0)
            {
                searchResultsListView.DataSource = SearchResults.ToArray();
                searchResultsListView.DataBind();
                searchResultsListView.Visible = true;
                bottomDataPager.Visible = topDataPager.Visible = (SearchResults.Count() > _pageSize);
            }

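            // HTML-encode the user-supplied search term before echoing it back to the page.
            summaryLiteral.Text = "<p>Your search for <b>" + System.Web.HttpUtility.HtmlEncode(searchTerm) + "</b> returned <b>" + SearchResults.Count().ToString() + "</b> result(s)</p>";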

            // Output the query, which can be useful for debugging.
            queryLiteral.Text = criteria.ToString();
        }

        #endregion
    }
}

Simply compile the control, assign it to a macro and then add that macro to the bodyText field of the 'Search results' page. With everything in place and the 'Page Size' property set to 2, the results of a search should look something like this:

[Image: Search results]

Hopefully this example has been a useful introduction to using Examine for Umbraco and set you on your way to creating a mighty fine search for your website.

To download the complete source for this web user control macro click here.