Sunday, January 09, 2011

SQL Server Full-Text Search

Episode 61 of the DeepFriedBytes podcast recently introduced me to 2 new SQL Server concepts. The first one was SQL Server Full-Text Search which I'd heard of as we use it for some of our business applications, but I've personally never really been involved. The second one is SQL Server Service Broker, which I must admit to having never heard of, but from the use cases described in the podcast it looks like something worth exploring further.

SQL Server Full-Text Search allows you to:

"perform linguistic searches against text data in full-text indexes by operating on words and phrases based on rules of a particular language such as English or Japanese. Full-text queries can include simple words and phrases or multiple forms of a word or phrase"

The basic steps to configure table columns in a database for full-text search are as follows:

1. Create a full-text catalog e.g.
use mydatabase
go
EXEC sp_fulltext_database 'enable'
go
CREATE FULLTEXT CATALOG mycatalog
go
2. On each table that you want to search, create a full-text index e.g.
CREATE FULLTEXT INDEX ON mydatabase.dbo.mytable
(
column_to_index
Language 0X0
)
KEY INDEX myindex ON mycatalog
WITH CHANGE_TRACKING AUTO
a. Identify each text columns that you want to include in the full-text index.
b. If a given column contains documents stored as binary data (varbinary, varbinary(max), or image data), you must specify a table column (the type column) that identifies the type of each document in the column being indexed.
c. Specify the language that you want full-text search to use on the documents in the column.
d. Choose the change-tracking mechanism that you want to use on the full-text index to track changes in the base table and its columns.


3. After the columns have been added to a full-text index, applications and users can run full-text queries on the text in the columns. These queries can search for any of the following:
  • One or more specific words or phrases (simple term)
  • A word or a phrase where the words begin with specified text (prefix term)
  • Inflectional forms of a specific word (generation term)
  • A word or phrase close to another word or phrase (proximity term)
  • Synonymous forms of a specific word (thesaurus)
  • Words or phrases using weighted values (weighted term)
SELECT    product_id
FROM products
WHERE CONTAINS(product_description, ”Snap Happy 100EZ”
OR FORMSOF(THESAURUS,’Snap Happy’)
OR ‘100EZ’)
AND product_cost<200
SELECT    candidate_name,SSN
FROM candidates
WHERE CONTAINS(candidate_resume,”SQL Server”)
AND candidate_division =DBA
And that's about it to get started...!

No comments: