Using SQL Server Full Text Index / Search with a Thesaurus

0 Flares Twitter 0 Facebook 0 StumbleUpon 0 Google+ 0 LinkedIn 0 0 Flares ×

Using a thesaurus with sql full text index is really useful, mostly when developing some form of generic search functionality. The idea with the thesaurus is that you can provide sql server with a sets of words that can act interchangeably or provide a particular pattern that is replaced with a given phrase.

For example, you could specify that when a search is submitted for the word author, all results containing author, writer or journalist are returned. (yes, I’m getting my examples from msdn). This is an expansion set.

Or you could specify that when a search is submitted for the word W2K, all results containing windows 2000, XP or vista are returned, but not results containing W2K. This is a replacement set.

To implement this you must have a table that has a full text index configured, then…

Adding the Thesaurus

The thesaurus files are stored within “<SQL_Server_data_files_path>\MSSQL11.MSSQLSERVER\MSSQL\FTDATA\” and look something like “ts” + 3 letter language acronym + “.xml”. For English language the file is called tsenu.xml.

Open this file and you’ll see examples of what to configure. An expansion specifies a list of different terms that can be interchanged:

 

A replacement set looks like:

Where pat defines a pattern to be replaced, and sub defines the value(s) to replace it with.

After making these changes you need to save the file, and then execute the following stored proc in your database:

Here 1033 is your language lcid (1033 is english)

Using the Thesaurus in Queries

Implementing this is only half the job, as you have to tell your full text queries to use the thesaurus. You do this using the FORMSOF command in your CONTAINS, CONTAINSTABLE (etc) commands. For example, the following query uses the CONTAINS command with the FORMSOF command:

 

And this example uses CONTAINSTABLE:

 

All in all it works pretty well and performs excellently.

,

  • James

    Hi Matt,

    Thanks for the article. How would you pass a parameter where ”writer” is?

    I’m currently trying something like this but no results are being returned.

    WHERE CONTAINS(l.LocAddressFormatted, ‘FORMSOF(THESAURUS, @p_address)’)

  • James

    Matt,

    I solved it. This part must be passed in as a string literal

    DECLARE @v_searchPhraseThesaurus NVARCHAR(100)

    SET @v_searchPhraseThesaurus = ‘FORMSOF(THESAURUS, “‘ + @p_address + ‘”)’

    WHERE CONTAINS(l.LocAddressFormatted, @v_searchPhraseThesaurus)

The Essential App Marketing Kit
Subscribe To My Newsletter To Get an Entire Chapter From The Book for FREE
Never display this again
0 Flares Twitter 0 Facebook 0 StumbleUpon 0 Google+ 0 LinkedIn 0 0 Flares ×