Statistics/Different Types of Data/PS

From Wikibooks, open books for an open world


Primary and Secondary Data

Data can be classified as either primary or secondary.

Primary Data

Primary data is original data that has been collected specifically for the purpose in mind. Someone gathered it first hand, directly from the original source.

The people who gather primary data may be an authorized organization, an investigator, an enumerator, or just someone with a clipboard. They act as witnesses, so primary data is considered only as reliable as the people who gathered it.

Research where one gathers this kind of data is referred to as field research.

For example: the responses to a questionnaire you wrote and administered yourself.

Secondary Data

Secondary data is data that was collected for another purpose and is being reused, usually in a different context. When we apply statistical methods to primary data that was gathered for some other purpose, we refer to it as secondary data. In other words, one purpose's primary data is another purpose's secondary data.

Research where one gathers this kind of data is referred to as desk research.

For example: figures taken from a published book.
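
The point that the same data can be primary for one purpose and secondary for another can be illustrated with a minimal Python sketch. The `Dataset` class, the `classify` function, and the study names are hypothetical, introduced only for this illustration. The sketch assumes each data set records the purpose it was originally collected for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A data set together with the purpose it was originally collected for."""
    name: str
    collected_for: str


def classify(dataset: Dataset, study_purpose: str) -> str:
    """Label the data relative to a study: primary if it was collected
    for that study's purpose, secondary if it is being reused."""
    return "primary" if dataset.collected_for == study_purpose else "secondary"


# Hypothetical example: a questionnaire run for a commuting study.
survey = Dataset(name="commuter questionnaire", collected_for="commuting study")

print(classify(survey, "commuting study"))    # primary
print(classify(survey, "air quality study"))  # secondary
```

The label is not a property of the data alone. It depends on the pairing of the data with the purpose it is being used for.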

Why Classify Data This Way?

Knowing how the data was collected allows critics of a study to search for bias in how it was conducted. A good study will welcome such scrutiny. Each type of data has its own strengths and weaknesses. Primary data is gathered by people who can focus directly on the purpose in mind. This helps ensure that the questions are meaningful to that purpose, but it can also introduce bias into those same questions. Secondary data does not have the benefit of this focus. The bias its users can add comes from the choice of what data to reuse, although any bias from the original collection carries over as well. Stated another way, those who gather secondary data get to pick the questions, while those who gather primary data get to write them.
