User:Liblamb/how to search

From Wikibooks, open books for an open world

This page has moved to a more public area at How to search.


Ideas about a new book. Still sketchy. I'm trying to come up with an introduction and an approximate first chapter before I move it to a more publicly accessible location. If someone stumbles upon this and wants to add their own thoughts, please do so or start a discussion. I'm considering giving this more of an academic-journal feel by using many citations. I don't know whether this would work in a wiki environment.

Purpose: {Still deciding; options below; maybe the two points below are two parts of the book.}

  1. Teach information seekers how to form accurate searches in electronic environments.
  2. Provide a quick reference guide to the advanced search mechanisms for electronic information.

Scope: Any electronic search environment. Examples include: Google, PubMed, grep, Amazon, local library online catalogs...

Audience: All information seekers in electronic environments.

Authority: The book will carry more authority if authors link to the search engine's or software provider's own documents.

How to search...

Introduction

When people began using computers to store electronic documents, they needed new methods of finding those documents, and technical requirements governed the methods. Computers were built from logic gates, which are either on or off (1 or 0) and which evaluate operations in Boolean algebra. They recognized no other language and could accept no other input. With this in mind, it is little surprise that methods of finding documents also relied on Boolean algebra.

Today, computers are still built from logic gates, and we still use Boolean algebra to search for documents. However, ingenious people have created ways to search that work even for someone who knows little about Boolean algebra. These aids, often in the form of templates, are available in many popular search interfaces, but not in all. For exact searches, though (those that return the most accurate set of results for the user's need), knowing Boolean algebra and the inner workings of the search is very helpful.
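
To make the link between Boolean algebra and document searching concrete, here is a minimal sketch in Python. The three-document collection and the helper name build_index are invented for illustration, and real search engines are far more elaborate. The point is only that AND, OR, and NOT correspond to set intersection, union, and difference.

```python
# Hypothetical three-document collection, invented for illustration.
documents = {
    1: "skunk smell removal with tomato juice",
    2: "growing tomato plants at home",
    3: "skunk habitats and behaviour",
}

def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

index = build_index(documents)
skunk = index.get("skunk", set())
tomato = index.get("tomato", set())

print(sorted(skunk & tomato))  # skunk AND tomato -> [1]
print(sorted(skunk | tomato))  # skunk OR tomato  -> [1, 2, 3]
print(sorted(skunk - tomato))  # skunk NOT tomato -> [3]
```

Each set operation is what a Venn diagram of the two terms would show: the overlap, the whole area, and one circle minus the overlap.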


  1. The Theory of Search
    • rooted in IT (computer searches), which requires 1s and 0s: logic
    • Boolean
    • Venn diagrams
  2. History of information searching before computers. {While this chapter may relate to the history of the book, remember that this wikibook is titled "How to search" and should compare and contrast searching methods.}
  3. Specific searches
    • Categorization methods
      1. A-Z list by name
        • A,B,C...
      2. Date (most likely not worth doing)
        • ???Date added to wiki, date of database creation???
      3. Subject
        • Wiki method (whatever wiki people choose)
        • Dewey
        • LC
      4. Group search engines by common semantics
        • Phrase searches that use " "
          1. Google, Yahoo...
        • AND searches that use "AND" (not "and")
        • AND searches that use "and" (not "AND")
  1. Information to have on each search page. {I would like to come up with a template for the search pages. By search page I mean a page that explains "how to search Google," "how to search my local library online catalog," or "how to search my Linux box using the software grep." Cite sources with external links to the appropriate web pages. The search engine or provider's own documents are preferred.}
    • Purpose of the database
    • Audience
    • Authority
    • Scope of the database
      1. Library catalog - mostly an index of physical items such as books, magazines, music CDs, but also hyperlinks to digital content on the web.
      2. Google - the Internet; all documents that Google's web crawling software can process (social/cultural/legal exceptions aside)
      3. grep - all recognizable data on a computer
    • Automated vs. human created metadata
      1. Automated: Google, grep
      2. Human made: Yahoo directory, Periodical index, library catalogs
      3. {Many are a mix. Other details should describe this further.}
    • Full content vs. index only
      1. Full: Google
      2. Index only:
    • Boolean logic
      1. Example as typed in search field
      2. Prose explanation of the above example.
  2. Different sites to Search
    • Internet search engines
      1. Google, Yahoo...
    • Library Catalogs
      1. Swan/Mobius - iii
    • Published Document databases
      1. PubMed
    • Personal Computer programs
      1. grep, Windows file search
    • Online catalogs
      1. Amazon

Reference Guide

Google

  • Purpose: "Google's mission is to make the world's information universally accessible and useful." http://www.google.com/corporate/index.html
  • Audience: All Internet users.
  • Scope: The Internet. All documents that Google's web crawling software can process.
  • Refresh Rate: Constant

Boolean Operators

Boolean Operator   Syntax      Example
AND                {Default}   skunk tomato
OR                 OR          skunk OR tomato
NOT                -           skunk -tomato

Other Operators

Syntax   Example                 Description
""       "skunk smell removal"   phrase search
+        skunk +and tomato       include a common word that Google usually does not search
~        {no example yet}        synonym search
1..100   {no example yet}        number range search

Search Fields

Syntax   Example                              Description
site:    site:wikibooks.org "how to search"   restrict results to one domain
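
As a rough illustration of how the Google operators listed above combine, the following Python sketch assembles the text a user would type into the search box. The function google_query and its parameter names are hypothetical, it calls no Google service, and it assumes the domain follows site: with no space.

```python
def google_query(all_of=(), any_of=(), none_of=(), phrase=None, site=None):
    """Build a query string from the operators in the tables above.

    Hypothetical helper: it only produces the text to type, nothing more.
    """
    parts = []
    if site:
        parts.append(f"site:{site}")        # domain restrict (assumed: no space)
    if phrase:
        parts.append(f'"{phrase}"')         # phrase search uses double quotes
    parts.extend(all_of)                    # AND is the default: just a space
    if any_of:
        parts.append(" OR ".join(any_of))   # OR is written out between terms
    parts.extend(f"-{word}" for word in none_of)  # NOT is a leading minus
    return " ".join(parts)

print(google_query(all_of=["skunk"], none_of=["tomato"]))
# skunk -tomato
print(google_query(phrase="how to search", site="wikibooks.org"))
# site:wikibooks.org "how to search"
```

Keeping each operator in its own argument means the caller states the intent (all, any, none) and the helper supplies the engine-specific spelling.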

Other Features

Google also offers a number of search options that are available only through the templates on its web pages. These include the search language and the SafeSearch filter.

Other features not directly impacting search results include interface language, number of results, and results window.

PubMed

  • Purpose: "to provide access to citations from biomedical literature." http://www.ncbi.nlm.nih.gov/entrez/query/static/overview.html#Introduction
  • Audience: PubMed is run by the National Center for Biotechnology Information (NCBI), which is part of the National Library of Medicine (NLM) in the United States. NCBI was established in 1988 as a national resource for molecular biology information. It creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease. http://www.ncbi.nlm.nih.gov/
  • Scope: Biomedical literature. PubMed "allows users to access a superset of NLM's MEDLINE database containing MEDLINE, in-process citations, and citations to articles from selectively indexed journals that normally would not be selected for MEDLINE indexing." http://www.nlm.nih.gov/services/pubmed.html
  • Refresh Rate: ?

Boolean Operators

Boolean Operator   Syntax      Example
AND                {Default}   skunk tomato
OR                 OR          skunk OR tomato
NOT                NOT         skunk NOT tomato
NEAR               {N/A}
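
The Google and PubMed Boolean tables in this guide differ only in how NOT is written, which is the kind of grouping by common semantics the outline proposes. The sketch below shows one way that comparison could be captured in Python. The SYNTAX dictionary layout and the render function are hypothetical, and the operator spellings are copied from the two tables rather than checked against either service.

```python
# Operator spellings copied from the Google and PubMed tables in this guide.
# The dictionary layout and the function name are hypothetical.
SYNTAX = {
    "google": {"and": " ", "or": " OR ", "not": " -"},
    "pubmed": {"and": " ", "or": " OR ", "not": " NOT "},
}

def render(engine, left, operator, right):
    """Join two search terms with the engine's spelling of a Boolean operator."""
    return f"{left}{SYNTAX[engine][operator]}{right}"

for engine in SYNTAX:
    print(engine, "->", render(engine, "skunk", "not", "tomato"))
# google -> skunk -tomato
# pubmed -> skunk NOT tomato
```

Adding another search environment would then mean adding one more row to the dictionary rather than writing new logic.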

Other Operators

Syntax            Example   Description
{phrase search}   {N/A}     phrase search is not possible in PubMed

Search Fields

Syntax       Example      Description
                          Author
                          Journal Title
                          MeSH (Medical Subject Headings)
YYYY/MM/DD   1999/04/19   Date of Publication
YYYY/MM/DD   1999/04/19   Entrez Date - Date publication was entered into PubMed
YYYY/MM/DD   1999/04/19   Date publication was given MeSH terms
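
The three date fields above share the YYYY/MM/DD pattern. As a small sketch, a date can be put into that form with Python's standard library. The function name pubmed_date is hypothetical, and the sketch covers only the date text, not any field tag PubMed may require alongside it.

```python
from datetime import date

def pubmed_date(d):
    """Format a date as YYYY/MM/DD, the pattern shown in the table."""
    return d.strftime("%Y/%m/%d")

print(pubmed_date(date(1999, 4, 19)))  # 1999/04/19
```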

Other Features

Theory and Instruction

The following book sections give general theory and instruction about searching in an electronic environment:

  • Types of resources

There is great variety within the set of searches that authors may include in this book. While the differences between the searches are self-evident to some, they are not to others. For this reason, the types of searches are explained below.

  1. Internet search engine
  2. Indexing/abstracting journal database
  3. Citation index
  • Boolean Logic
  • The information search process