Lesson Four:
Power Search Techniques (Boolean and Field Searching)

This week we are going to explore Power Search Techniques. There are many excellent guides describing Boolean logic and charts showing the features available at the "major" search engines, which you should consult for more details (please see the section For More Information). Here, our purpose is a brief overview of possible techniques. Please note that while most of the major search engines we have been using offer some advanced search capabilities, each one presents or implements them differently.

And of course, we'll be going on another Info Quest.

### Why "Power Search?"

We've all had the experience where we query a search engine and come back with thousands, maybe hundreds of thousands, of "hits" or "matches" for our search terms. Unfortunately, sifting through all those web sites is time-consuming and usually not very profitable. A search that casts a wide net and brings back a large number of "hits" is known as a high-recall search. While this might be the goal in some cases (for example, if you are working on a topic that is relatively new and you want everything published on it -- and the number of "hits" is only a couple of hundred), for the most part people searching the web are interested in high precision. High precision means that the retrieved documents are highly relevant to your subject; it is achieved by fine-tuning your search so that it accurately describes your topic and its unique aspects.
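To make the trade-off concrete, here is a minimal sketch in Python. It assumes the standard information-retrieval definitions (precision is the share of hits that are relevant; recall is the share of all relevant pages that were found), which are a little stricter than the informal usage in this lesson. The function name and all the numbers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: set of page ids the engine returned (the "hits")
    relevant:  set of page ids that actually answer the question
    """
    retrieved, relevant = set(retrieved), set(relevant)
    useful_hits = retrieved & relevant
    precision = len(useful_hits) / len(retrieved) if retrieved else 0.0
    recall = len(useful_hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical situation: 10 pages on the web truly answer the question.
relevant_pages = set(range(1, 11))

# A broad query returns 1,000 pages, including all 10 relevant ones.
broad = precision_recall(set(range(1, 1001)), relevant_pages)

# A fine-tuned query returns 8 pages, 6 of them relevant.
narrow = precision_recall({1, 2, 3, 4, 5, 6, 2001, 2002}, relevant_pages)

print(broad)   # (0.01, 1.0)
print(narrow)  # (0.75, 0.6)
```

The broad query finds everything but buries it among 990 irrelevant pages; the narrow query misses a few pages, but most of what it returns is worth reading.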

Many advanced web searching techniques are old friends of folks used to searching more traditional databases, such as those containing bibliographic citations or references to journal articles. Some techniques are unique to the web because of its media and structure.

Important Note: not all advanced techniques are enabled by all search engines! Consult one or two of the charts in For More Information and/or read the Help documentation of the search facility you are using.
Another Note: for the most part, search engines at directory sites do not offer advanced searching features.

### What is Boolean Searching?

Boolean searching is an implementation of Boolean logic and set theory. Boolean operators, such as AND, OR and NOT, are used to combine search sets in a variety of ways and appear within Internet search engines in a range of disguises. A very brief overview:
Search phrase: cats and dogs
means find web pages in which both terms occur
Search phrase: cats or dogs
means find web pages in which either term occurs
Search phrase: cats not dogs
means find web pages in which the term cats appears but not the term dogs
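Since Boolean searching rests on set theory, the three operators can be sketched with Python's built-in sets. The tiny index below is invented for illustration; a real search engine's index is far larger and is not necessarily built this way.

```python
# Hypothetical mini-index: each term maps to the set of pages containing it.
index = {
    "cats": {"page1", "page2", "page3"},
    "dogs": {"page2", "page3", "page4"},
}

cats, dogs = index["cats"], index["dogs"]

both = cats & dogs        # cats AND dogs: pages containing both terms
either = cats | dogs      # cats OR dogs: pages containing either term
cats_only = cats - dogs   # cats NOT dogs: pages with cats but without dogs

print(sorted(both))       # ['page2', 'page3']
print(sorted(either))     # ['page1', 'page2', 'page3', 'page4']
print(sorted(cats_only))  # ['page1']
```

Notice that AND and NOT shrink the result set while OR enlarges it.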

Most web search engines can implement these basic Boolean operators but may present them in a different way. You will almost always need to go to an "Advanced" search function to use true Boolean operators; however, you may be able to use implied Boolean searching from the "Basic" search interface with the symbols + (must include) and - (exclude).

Examples of usage:

AND
Use this operator to search for documents where you'd like both terms to appear, narrowing a search.
Dalmatians AND feeding

OR
Use this operator to include synonyms, particularly where there are several terms or names used for a topic, or you would like to broaden a search.
Dalmatians OR spotted dogs

NOT
Use this operator to exclude terms, particularly when your search terms have more than one meaning.
Blues NOT depression

Special Note: these Boolean operators are often presented as options like "include all the words" (AND operator), "include any of the words" (OR operator), and "exclude" (NOT operator).

Another special note: while you might expect search engines to default to an implied AND (which means that if you enter 2 search terms, the engine returns documents in which they BOTH occur), in fact this is not always the case -- some search engines default to the initially unhelpful OR (the engine returns documents in which EITHER term occurs).
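The difference between the two defaults, and the effect of the + and - symbols, can be sketched as follows. This is a toy matcher written for this lesson, not the behavior of any particular search engine; the `search` function, its `default` parameter, and the three sample pages are all assumptions made for illustration.

```python
def search(query, pages, default="AND"):
    """Return the names of pages matching a simple +/- query.

    +term  -> the page must contain term
    -term  -> the page must not contain term
    term   -> combined with the other plain terms using `default`
    """
    required, excluded, plain = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            plain.add(token)

    hits = []
    for name, text in pages.items():
        words = set(text.lower().split())
        # Skip pages missing a required term or containing an excluded one.
        if not required <= words or excluded & words:
            continue
        if plain:
            matched = plain <= words if default == "AND" else bool(plain & words)
            if not matched:
                continue
        hits.append(name)
    return sorted(hits)


pages = {
    "a": "rainbow trout fishing guide",
    "b": "rainbow over the valley",
    "c": "brown trout recipes",
}

print(search("rainbow trout", pages, default="AND"))    # ['a']
print(search("rainbow trout", pages, default="OR"))     # ['a', 'b', 'c']
print(search("+rainbow +trout", pages, default="OR"))   # ['a']
print(search("rainbow -trout", pages, default="OR"))    # ['b']
```

With an OR default, the plain two-word query returns every page; adding + to both terms forces the AND behavior you probably wanted.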

### What are "Proximity Operators"?

Proximity operators such as NEAR and ADJ are usually grouped with the Boolean operators. They control how close together your terms must occur in the web documents that are retrieved. For example, NEAR/3 means that the terms must occur within 3 words of each other. Proximity operators ensure that your terms are more closely related to one another.

Examples of usage:

NEAR/x
Use this operator to search for documents where you'd like both terms to appear within a specified distance of each other, narrowing a search.
Dalmatians NEAR/3 feeding (web documents will be returned that have the term Dalmatians occurring within 3 words of the term feeding)

ADJ
Use this operator to search for documents where you'd like both terms to appear next to each other, narrowing a search.

Special Note: the ADJ Boolean operator is often disguised as the option "exact phrase."
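Here is a minimal sketch of how NEAR/x and ADJ could be checked against the words of a document. It assumes that "within 3 words" means the word positions differ by at most 3, in either order, and that ADJ requires the first term to come immediately before the second; real search engines may count differently. The function names and sample sentences are invented for illustration.

```python
def positions(words, term):
    """Return every index at which term occurs in the list of words."""
    return [i for i, word in enumerate(words) if word == term]


def near(text, a, b, distance):
    """True if a and b occur within `distance` words of each other, in either order."""
    words = text.lower().split()
    return any(
        abs(i - j) <= distance
        for i in positions(words, a.lower())
        for j in positions(words, b.lower())
    )


def adj(text, a, b):
    """True if a is immediately followed by b (the "exact phrase" case)."""
    words = text.lower().split()
    return any(
        j - i == 1
        for i in positions(words, a.lower())
        for j in positions(words, b.lower())
    )


doc1 = "Dalmatians need careful feeding as puppies"
doc2 = "Dalmatians are spotted dogs and a guide to feeding them is rare"

print(near(doc1, "Dalmatians", "feeding", 3))  # True
print(near(doc2, "Dalmatians", "feeding", 3))  # False
print(adj(doc2, "spotted", "dogs"))            # True
print(adj(doc2, "dogs", "spotted"))            # False
```

Both documents would match Dalmatians AND feeding, but only the first passes the NEAR/3 test, which is why proximity operators narrow a search further than AND does.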

### What is Field Searching?

Remember that a web search engine is only as good as its database and indexes. Databases are collections of records organized in a similar manner; simply put, this means they are divided into fields that contain the same kind of information in each record. If data is entered into a separate field, you can retrieve it using its field label. This means that if you want to search by title, the search engine looks in a special title index (or searches notations that indicate that the term occurs in the title field) where it has collected data from the field labeled title.

Field searching is so wonderful because you can specify where to look in the web document; for example, in the title only, or in the URL field. Field searching allows you to be very specific about where you want your terms to occur and hence is a very powerful tool.
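The idea of records divided into labeled fields can be sketched with a list of Python dictionaries. The records, URLs, and function names below are all made up for illustration, and a real search engine would consult prebuilt indexes rather than scanning every record as this sketch does.

```python
from urllib.parse import urlparse

# Hypothetical records: one dict per indexed page, one key per field.
records = [
    {"title": "Rainbow Trout Fact Sheet",
     "url": "http://www.example.gov/fish/rainbow-trout.html",
     "body": "Habitat and range of the rainbow trout."},
    {"title": "Fly Fishing Gear",
     "url": "http://www.example.com/shop/flies.html",
     "body": "Flies that work well for rainbow trout."},
    {"title": "Trout Biology Course Notes",
     "url": "http://www.example.edu/bio/trout.html",
     "body": "Lecture notes on salmonids."},
]


def search_field(records, field, phrase):
    """Return records whose given field contains the phrase (case-insensitive)."""
    phrase = phrase.lower()
    return [r for r in records if phrase in r[field].lower()]


def search_domain(records, suffix):
    """Return records whose URL host name ends with suffix, e.g. ".edu"."""
    return [r for r in records if urlparse(r["url"]).hostname.endswith(suffix)]


print([r["title"] for r in search_field(records, "title", "rainbow trout")])
# ['Rainbow Trout Fact Sheet']
print([r["title"] for r in search_field(records, "body", "rainbow trout")])
# ['Rainbow Trout Fact Sheet', 'Fly Fishing Gear']
print([r["title"] for r in search_domain(records, ".edu")])
# ['Trout Biology Course Notes']
```

Restricting the phrase to the title field returns one page instead of two, which is the kind of narrowing field searching is meant to give you.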

Using Search templates:

I'm a big fan of advanced search templates such as the ones used by Hotbot (http://www.hotbot.com) and Snap (http://www.snap.com). These templates let you use many Boolean and field searching techniques without having to learn the syntax of yet another search engine.

Recommended:

### Assignments:

1. Info Quest:
• Where can I find a picture of a dragon in the Western Sahara?
• Why was the Taj Mahal built? Use a web site that includes a picture.
• Where can I find out about recalls on children's toys?
• Find out why crime doesn't pay in Brazil.
• What happened to Mr. Earnshaw on p. 39 of Wuthering Heights?
• Listen to Martin Luther King discussing defense expenditures. What did he think we should be spending more money on?
• Where can you take a tour of this native of Queensland that is invisible to the naked eye?
• What class of people came after the Pharaoh in the social pyramid?
2. Suggest 2 implementations (search queries that demonstrate the use) of each of the following techniques:
• Boolean Searching
• Proximity Searching
• Nesting
• Field Searching
3. Try the following searches in 2 different search engines, noting the differences. Do you get the results you expect? (For example, if you think you are narrowing the search, does this in fact happen?) Use a search template for the field searches, or try out some of the field searching techniques.
• rainbow trout
• rainbow and trout
• "rainbow trout"
• rainbow +trout
• rainbow -trout
• rainbow trout but not farm
• rainbow trout and not farm
• (rainbow trout) and (fly or flyfishing)
• rainbow trout in title field
• rainbow trout in top level field
• rainbow trout in .edu domain
• rainbow trout in .gov domain
• rainbow trout in .com domain
