  Using Internet as a Research Tool

Topic 1 Introduction to Internet Resources

Search Engines

  • The most commonly used facility for finding information on the Internet.
  • Use software to automatically generate a database of websites and pages.
  • Work behind the scenes: automated programs called spiders or crawlers do the indexing for the search engine. They go out onto the Web and look at pages and the words on those pages, building huge lists of terms.
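
As a rough illustration of how a spider might turn one page into a list of terms, here is a minimal Python sketch. The names TermCollector and extract_terms and the sample page are assumptions made for this example, not part of any real search engine. It also treats all text between tags as page content, which a real spider would refine.

```python
from html.parser import HTMLParser
import re


class TermCollector(HTMLParser):
    """Collects lowercase words from the text of an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.terms = []

    def handle_data(self, data):
        # Called for the text between tags; split it into runs of letters and digits.
        self.terms.extend(w.lower() for w in re.findall(r"[A-Za-z0-9]+", data))


def extract_terms(html):
    """Return the list of terms found on one page, in page order."""
    collector = TermCollector()
    collector.feed(html)
    collector.close()
    return collector.terms


# Made-up sample page for illustration only.
page = "<html><body><h1>Search Engines</h1><p>Spiders read pages.</p></body></html>"
terms = extract_terms(page)
print(terms)  # ['search', 'engines', 'spiders', 'read', 'pages']
```

A real spider would repeat this for every page it visits and store the terms for later searching.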


How a Search Engine Works

  • Spiders/Crawlers
    • Visits a web page, reads it, then follows links to other pages within the site
    • Returns to the site on a regular basis, e.g. every month, to look for changes
    • Differs in how deeply it reads a page or site:
      • First 100 keywords
      • The full text
      • Top level of a site
    Some spiders look only at the first part of each page or only at the top level of a site. The top level might be the opening pages of the website or the first few pages of a multilevel site. Other spiders scan the full text of a page.
    With about 4 billion pages on the Web and the number still increasing, search engines cannot reach all of them. How deep a spider goes therefore determines how many pages the search engine retrieves.
  • Index
    • Also known as the catalog.
    • Contains a copy of every web page that the spider finds
    • If a web page changes, then this index is updated with new information.

  • Search engine software
    • Sifts through the millions of pages recorded in the index
    • Finds matches to the search terms
    • Ranks them in order of what it believes is most relevant
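
The three parts above (spider, index, search engine software) can be sketched together in a few lines of Python. This is a toy model under stated assumptions, not how any particular search engine is built: SITE is a made-up three-page site held in memory, the max_words parameter stands in for a spider that reads only the first keywords of each page, and relevance is simply how often the search terms occur on a page.

```python
from collections import deque

# Hypothetical three-page site: URL -> (page text, links to other pages).
SITE = {
    "/": ("welcome to the library search guide", ["/engines", "/tips"]),
    "/engines": ("search engines use spiders to index pages", ["/"]),
    "/tips": ("search tips use specific search terms", []),
}


def crawl(site, start, max_words=None):
    """Spider: visit pages from `start`, following links within the site.

    max_words limits how much of each page is read (None means full text).
    """
    seen, queue, pages = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = site[url]
        pages[url] = text.split()[:max_words]
        for link in links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Index: map each term to {url: number of occurrences on that page}."""
    index = {}
    for url, words in pages.items():
        for word in words:
            counts = index.setdefault(word, {})
            counts[url] = counts.get(url, 0) + 1
    return index


def search(index, query):
    """Search software: find pages matching any term, most matches first."""
    scores = {}
    for term in query.lower().split():
        for url, count in index.get(term, {}).items():
            scores[url] = scores.get(url, 0) + count
    # Ties are broken alphabetically by URL so the order is predictable.
    return sorted(scores, key=lambda u: (-scores[u], u))


index = build_index(crawl(SITE, "/"))
print(search(index, "search terms"))  # ['/tips', '/', '/engines']

# A shallow spider that reads only the first 3 words misses later terms.
shallow = build_index(crawl(SITE, "/", max_words=3))
print(search(shallow, "spiders"))  # []
```

The last two lines show why depth matters: the word "spiders" is the fourth word on its page, so an index built from only the first three words of each page cannot return that page.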


© 2005 Temasek Polytechnic