WEEK FOUR
Doing Research Online:
Planning Your Search, Exploring and Using Search Tools
Exactly What Is It That We Are Looking At When We Are Searching?
Everything from specialized full-text and statistical databases to online library catalogs to web sites and web pages found by using a search engine or directory is available online. It's important to begin by knowing exactly what we are searching.
As we learned in our earlier lesson, when we are looking for web documents (web pages or websites), what we are really doing is using a search tool that looks at its own database or collection of information about websites that live on computers (servers) located worldwide. The World Wide Web contains billions of documents, and unlike a library's catalog, which is indexed using standard terms, the Web is not indexed using any common vocabulary. This means when we enter our search term(s) we are really guessing what terms someone used to organize a web page or website on a particular topic, and hoping that our guess will match those terms so we can find the link to that document!
So, we enter our search term and hope for the best!
Even though any search tool we choose is really only looking at a small subset of the entire World Wide Web, the number of links to websites that come to us in response to a search can be overwhelming. So, a good concept to keep in mind is that it is impossible for any search to look at the entire web, but you can develop your searching techniques so that you choose the best search tool, maximize your efforts and find what you need.
The basic categories of search tools include search engines, metasearchers, subject directories, and library gateways/specialized databases. For each of these, we'll look at what it is, how it works, the pros and cons of using it, and examples of searches for which you would want to use it.
What Is A Search Engine?
Search engines are huge databases of web page files that have been assembled automatically by machine. There are two types of search engines: individual search engines that compile their own searchable databases (Google, alltheweb) and metasearchers that search the databases of multiple individual search engines simultaneously (ixquick, vivisimo, surfwax).
Remember when we said we are guessing what terms were used to define or organize a web site? Whenever you search the web using a search engine, you're asking the engine to scan its index of sites and match your keywords and phrases with those in the search engine's database -- hoping your guess matches the terms used.
Search engines use sets of rules or guidelines (varying from one engine to another) to rank pages. They want to return the most relevant pages at the top of their lists, so they look for the location and frequency of keywords and phrases in the web page document and sometimes in the HTML coding that doesn't appear on the web page. They check out the title field and scan the headers and text near the top of the document. Some of them assess popularity by the number of links that are pointing to sites; the more links, the greater the popularity, and so the presumed value, of the page.
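To make the idea of ranking by keyword location and frequency concrete, here is a minimal Python sketch. It is only an illustration, not how any real search engine works: the function names, the sample pages and the weights (3 for a match in the title, 2 for a match near the top of the text, 1 elsewhere) are all assumptions chosen for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_page(page, query, title_weight=3.0, top_weight=2.0, top_words=50):
    """Count keyword matches, weighting the title and early text more heavily.

    The weights are made-up values for illustration only.
    """
    terms = set(tokenize(query))
    title_hits = sum(1 for w in tokenize(page["title"]) if w in terms)
    body = tokenize(page["body"])
    top_hits = sum(1 for w in body[:top_words] if w in terms)
    rest_hits = sum(1 for w in body[top_words:] if w in terms)
    return title_weight * title_hits + top_weight * top_hits + rest_hits

def rank(pages, query):
    """Return the pages sorted from highest score to lowest."""
    return sorted(pages, key=lambda p: score_page(p, query), reverse=True)

# Hypothetical pages standing in for a search engine's database.
pages = [
    {"url": "c.example", "title": "Gardening tips",
     "body": "Plant tomatoes in spring."},
    {"url": "b.example", "title": "Weather glossary",
     "body": "A hurricane is a tropical storm."},
    {"url": "a.example", "title": "Hurricane Hugo damage report",
     "body": "Hurricane Hugo struck the coast."},
]

ranked = rank(pages, "hurricane hugo")
print([p["url"] for p in ranked])
```

The page that uses both keywords in its title and again near the top of its text comes first, the page that mentions only one keyword once comes second, and the page with no match comes last.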
How Current Is the Data Retrieved by a Search Engine?
Spiders regularly return to the web pages they index to look for changes. When changes occur, the index is updated to reflect the new information. However, the process of updating can take a while, depending upon how often the spiders make their rounds and then, how promptly the information they gather is added to the index. When you are using a search engine, you are NOT searching the entire web as it exists at this moment. You are really using a search tool to look at a portion of the web, indexed previously.
Most search engine companies have partnered with specialized news databases, allowing the search engine to provide up-to-the-minute news, usually accessible by clicking a tab or link labeled "news." Good examples include All the Web News, Yahoo! News and Google News.
NOTE: Today, the line between search engines and subject directories is blurring. Search engines are partnering with subject directories, or creating their own directories, and returning results gathered from a variety of other guides and services as well.
What Are Some Examples of Search
Engines?
Google, Teoma and All the Web are good search engines to look
at.
What Is A Metasearcher?
Unlike search engines, which crawl the web compiling their own searchable databases, metasearchers search the databases of multiple individual search engines simultaneously, from a single site and using the same interface. Metasearchers provide a quick way of finding out which engines are retrieving the best results for you in your search.
How Does A Metasearcher Work?
After
compiling results from several search engines, metasearchers present the results
of their searches in either a single merged list (without duplicate entries) or
in separate lists as they were received from each search engine (duplicate
entries may show up).
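The "single merged list without duplicate entries" can be sketched in a few lines of Python. This is a hypothetical illustration: the engine names, the URLs and the rule for spotting duplicates (ignoring a trailing slash) are assumptions, and a real metasearcher would need much more careful matching.

```python
def normalize(url):
    """Crude duplicate check: treat URLs that differ only by a trailing slash as equal."""
    return url.rstrip("/")

def merge_results(results_by_engine):
    """Merge per-engine result lists into one list with no duplicate URLs.

    Each merged entry also records which engines returned it.
    """
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            entry = merged.setdefault(normalize(url), {"url": url, "engines": []})
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    return list(merged.values())

# Hypothetical results returned by two individual search engines.
results = {
    "engine_a": ["http://example.org/lions", "http://example.org/tigers"],
    "engine_b": ["http://example.org/tigers/", "http://example.org/bears"],
}

merged = merge_results(results)
for entry in merged:
    print(entry["url"], entry["engines"])
```

Here the "tigers" page appears only once in the merged list, tagged with both engines. Showing the lists separately instead would mean printing `results` as received, duplicates included.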
What Are the Pros and Cons of Metasearchers?
On the plus side, they can give you a fair picture of what's available on the Web and where it can be found, and they are usually very fast. On the minus side, you generally can't choose how your search is configured or conducted, so you are at the mercy of the metasearcher's own handling of your query.
How Does A Subject Directory Work?
When you enter your search term, a directory tries to match your term or phrase with those in its written descriptions. Subject directories include general directories, academic directories, commercial directories, and portals. Portals are directories that private interests or companies create or use as gateways to the web. Another new trend is toward "Vortals" (vertical portals) that are subject-specific. Examples include the Internet Movie Database, SportSearch and WebMD.
Dead links (often created because a web page
changed content after inclusion in a directory) tend to be a real problem
for subject directories, and some people view them as being too heavily
populated with e-commerce sites.
When Should You Use Directories?
These are best for general searching and browsing (think of the telephone book or Yellow Pages: if your picnic table is broken and you want to find a repair person, first you find "furniture," then you go to "outdoor furniture" and then "repair").
Library Gateways, Specialized Databases and
"Vortals":
What Are Library Gateways and
Subject-Specific Databases?
Library gateways are collections of databases
and organized lists of informational sites, created, recommended and reviewed by
specialists (usually librarians). These support reference and research by
identifying and pointing to academically-oriented pages on the Web.
Subject-specific databases or vortals ("vertical portals") are databases devoted
to a single subject. They tend to be created by governmental agencies, business
interests, professors, researchers, and other subject specialists in a
particular field.
How Can You Access the "Invisible Web" Sites?
In order to get to much of the Invisible or "Deep" web, you
have to point your browser directly at the sites. This is exactly what many
library gateways and subject-specific databases do. They are good sources for
direct links to database information stored on the "Invisible Web."
What Are Some Examples of "Vortals" (Subject-Specific Databases)?
It's tempting to just dive into your search, but it's a good idea to THINK about your search before you begin. Create a search strategy in your head by asking yourself "What is it I want to do? Browse? Locate a specific piece of information? Find everything I can on a subject?"
The answer to this will steer you toward the best search tools to use and help you formulate your search strategy.
If you enter more than one keyword into your search without any accompanying sign, mark or symbol, the search engine will usually add either AND or OR automatically to link your search terms together. Which one it adds depends on that engine's defaults, and this could radically alter your search in unexpected ways. The defaults are the basic settings of the search engine you are using, and can often explain why your search results may not be what you expect them to be.
Strange things can happen for other reasons as well. Sometimes search engines use relevance ranking systems that can throw off your search by ignoring some of the words in your search statement. This might happen when the search engine recognizes your string of separate keywords as a phrase in its list of pre-determined phrases.
Another time this can happen is when the search engine is responding to its own internal list of "stop words" (these are words that some search tools ignore in order to cut down response time). Stop word lists tend to include small common words, such as a, about, an, and, are, as, at, be, by, from, how, i, in, is, it, of, on, or, that, the, this, to, we, what, when, where, which, with, etc. If you initiate a search at a site that maintains a list of stop words and you type any of those words into your search statement (even in phrases surrounded by quotes), they may well be ignored. An exception to this is Google, which has a stop word list but recognizes stop words within phrases surrounded by quotation marks, e.g., "to be or not to be" or "what you see is what you get".
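The effect of a stop word list can be imitated with a short Python sketch. The stop word list below is the one quoted above; everything else (the function name, the `keep_quoted_phrases` switch, the way quotes are detected) is an assumption made for illustration and does not describe any particular search engine's code.

```python
import re

# The stop words listed in the paragraph above.
STOP_WORDS = {
    "a", "about", "an", "and", "are", "as", "at", "be", "by", "from", "how",
    "i", "in", "is", "it", "of", "on", "or", "that", "the", "this", "to",
    "we", "what", "when", "where", "which", "with",
}

def strip_stop_words(query, keep_quoted_phrases=False):
    """Return the query as a stop-word-aware search tool might see it.

    With keep_quoted_phrases=True, phrases in quotation marks are left
    untouched, imitating the phrase-aware behavior described above.
    """
    kept = []
    # Split into quoted phrases and single words.
    for part in re.findall(r'"[^"]*"|\S+', query):
        is_phrase = len(part) >= 2 and part.startswith('"') and part.endswith('"')
        if is_phrase and keep_quoted_phrases:
            kept.append(part)
            continue
        words = part.strip('"').split() if is_phrase else [part]
        kept.extend(w for w in words if w.lower() not in STOP_WORDS)
    return " ".join(kept)

print(strip_stop_words("how to repair a picnic table"))
print(strip_stop_words('"to be or not to be"'))
print(strip_stop_words('"to be or not to be"', keep_quoted_phrases=True))
```

With this list, the famous phrase shrinks to the single word "not" unless quoted phrases are protected, which is why a search full of small common words can return such unexpected results.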
You may never know the real reason why your search retrieves so many irrelevant responses, and it can be frustrating!
You can use "Boolean operators" such as AND, OR and NOT to include or exclude keywords from a search. For example, if you were trying to find information about lions and tigers, you could structure the following searches:
lions AND tigers
This would retrieve any sites that include references to both lions and tigers.
lions OR tigers
This would retrieve any sites that include a reference to the keyword lions or the keyword tigers, but not necessarily to both in the same site.
lions OR tigers NOT Detroit
This would retrieve any sites that include a reference to the keyword lions or the keyword tigers but not the keyword Detroit, so it would exclude search results that were about either Detroit's baseball team (Tigers) or NFL football team (Lions). Here is a PDF guide to assist you in constructing Boolean searches.
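The three searches above can be tried out on a tiny, made-up collection of pages. The Python sketch below is an illustration only: the page texts and the function and parameter names are hypothetical, and the last search is read as (lions OR tigers) NOT Detroit, which matches the explanation above.

```python
import re

def words(text):
    """Return the set of lowercase words that appear in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(pages, include_all=(), include_any=(), exclude=()):
    """Simple Boolean search.

    include_all -> terms joined by AND
    include_any -> terms joined by OR
    exclude     -> terms after NOT
    """
    hits = []
    for name, text in pages.items():
        found = words(text)
        if include_all and not all(term in found for term in include_all):
            continue
        if include_any and not any(term in found for term in include_any):
            continue
        if any(term in found for term in exclude):
            continue
        hits.append(name)
    return hits

# Hypothetical pages for the example.
pages = {
    "zoo": "Our zoo houses lions and tigers.",
    "safari": "See lions on the savanna.",
    "baseball": "The Detroit Tigers won last night.",
    "football": "The Detroit Lions play on Sunday.",
}

print(search(pages, include_all=["lions", "tigers"]))                      # lions AND tigers
print(search(pages, include_any=["lions", "tigers"]))                      # lions OR tigers
print(search(pages, include_any=["lions", "tigers"], exclude=["detroit"])) # lions OR tigers NOT Detroit
```

AND narrows the results to the one page that mentions both animals, OR widens them to all four pages, and NOT removes the two sports pages again.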
Research Steps Simplified:
To perform research effectively, both online and using print materials:
1. Identify your topic (a good technique is to state your topic as a question).
2. Find background information (look up keywords in subject encyclopedias).
3. Use catalogs to find books (start with the MPC Library online catalog).
4. Use indexes to find periodical articles (available from the MPC Library web page).
5. Find internet, audio and video resources.
6. Evaluate your search results.
7. Cite your sources in a standard format (MPC Library has online information).
The following tips can help determine the terms to use when formulating your search query for online searching:
- Be specific
EXAMPLE: Hurricane Hugo
- Put the most important terms first in your keyword list; to ensure that they will be searched, put a + sign in front of each one
EXAMPLE: +hybrid +electric +gas +vehicles
- Use at least three keywords in your query
EXAMPLE: medication alcohol interaction
- Combine keywords, whenever possible, into phrases
EXAMPLE: "website creation tutorial"
- Avoid common words, e.g., water, unless they're part of a phrase
EXAMPLE: "bottled water"
- Think about words you'd expect to find in the body of the page, and use them as keywords
EXAMPLE: compulsive eating bulimia eating disorder
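If you build queries often, the tips above can be turned into a small helper. The Python sketch below is a hypothetical convenience function, not part of any search tool: it puts a + sign in front of required terms, wraps phrases in quotation marks and places the most important terms first. Whether a given search engine honors the + sign depends on that engine.

```python
def build_query(required=(), phrases=(), optional=()):
    """Assemble a query string: +required terms first, then quoted phrases, then other keywords."""
    parts = ["+" + term for term in required]
    parts += ['"' + phrase + '"' for phrase in phrases]
    parts += list(optional)
    return " ".join(parts)

# The examples from the tips above, rebuilt with the helper.
print(build_query(required=["hybrid", "electric", "gas", "vehicles"]))
print(build_query(phrases=["website creation tutorial"]))
print(build_query(optional=["medication", "alcohol", "interaction"]))
```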
Online Tutorials and Additional Resources
© 2004 Stephanie Tetter