Talk:Statistics
From Wikibooks, the open-content textbooks collection
About
- Started: 29 February 2004
- Size: 43,200 words. (June 2009)
Practice Problems
I added a "practice problems" section. I see no precedent for this, nor any discussion of it, so I'm just going ahead and doing something. I'll mark each new problem as new. Please double-check those and then remove the "new" marking. Thanks. AdamRetchless 14:05, 24 Aug 2004 (UTC)
- It might make more sense to put a "practice problems" section at the end of each chapter. But what if the last chapter (where you have these practice problems now) held solutions to the odd-numbered problems, as is often the case in textbooks? --Murraytodd 19:32, 15 Sep 2004 (UTC)
Quantiles/Quartiles
This is probably fairly moot since the actual section hasn't even been written yet, but should the section be titled "Quantiles" or "Quartiles"? Both are common in statistics and I don't know which would be more likely to be the header of a section. --Carraway 19:05, 25 Aug 2004 (UTC)
- Good point. "Quartiles" is a better title given the context of the chapter. --Llywelyn 08:18, 26 Aug 2004 (UTC)
- I would disagree. "Quantiles" is a more general term, i.e., a quartile is a type of quantile, but the converse is not true. In fact, the quartile section now discusses quantiles, quintiles and percentiles, none of which are quartiles! --Murraytodd 19:32, 15 Sep 2004 (UTC)
- I agree with Murraytodd: the section should be titled "Quantiles", since it mentions not only quartiles but also quintiles, deciles and percentiles. Nijdam 16:05, 31 March 2006 (UTC)
- As far as I know, quantile is the general term. For example, the value x satisfying F(x) = 1 − α is called the alpha-quantile of the stochastic variable X. Quartiles are the fixed quantiles in steps of 25%, where alpha is 25%, 50% (the median) and 75%. For steps of 10% we have deciles, and for single percentages we have percentiles. I hope this sheds some light on the discussion. I will see if I can fix it.
--Whisky Brewer 22:54, 27 Dec 2006 (UTC)
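To make the terminology in this thread concrete, here is a minimal sketch, assuming Python 3.8+ and its standard statistics module. It treats quartiles, quintiles, deciles and percentiles as the same operation with a different number of groups. The helper name cut_points and the sample data are hypothetical, and the "inclusive" interpolation method is just one of several conventions a textbook could adopt.

```python
import statistics

def cut_points(data, n):
    """Return the n - 1 cut points that split `data` into n equally sized groups."""
    # method="inclusive" treats the data as the whole population; this is
    # an assumption made for the sketch, not the only valid convention.
    return statistics.quantiles(data, n=n, method="inclusive")

sample = list(range(1, 101))           # hypothetical data: the integers 1..100

quartiles = cut_points(sample, 4)      # 3 cut points (25%, 50%, 75%)
quintiles = cut_points(sample, 5)      # 4 cut points (steps of 20%)
deciles = cut_points(sample, 10)       # 9 cut points (steps of 10%)
percentiles = cut_points(sample, 100)  # 99 cut points (steps of 1%)

print("quartiles:", quartiles)
print("median:", statistics.median(sample))
```

The middle quartile, the fifth decile and the fiftieth percentile all coincide with the median, which is one argument for titling the section with the general term.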
Distributions
I took the liberty of rearranging the distributions chapter a little. First, I separated the discrete and continuous distributions; then I placed them in a more natural order, from the simplest to the more complex. We should be careful when writing this chapter to create a consistent "train of thought" that links these distributions together, lest our "textbook" turn into an unreadable encyclopedia. I would suggest that, whenever applicable, we show real-world phenomena that follow these distributions. For example, the Exponential distribution is a good model of "time to failure", such as the lifespan of lightbulbs; the Poisson distribution describes how many customers walk into a store in any 5-minute interval; and the Binomial distribution describes the number of students in a classroom with green eyes, and is thus a sum of simple Bernoulli trials. (Oh, and I added the Bernoulli distribution!) --Murraytodd 19:32, 15 Sep 2004 (UTC)
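The "sum of simple Bernoulli trials" link mentioned above could be shown to readers with a small simulation. The following is only a sketch in Python using the standard random module. The function names, the class size of 30 and the 10% green-eye probability are all hypothetical values chosen for illustration.

```python
import random

def bernoulli(p, rng):
    """One Bernoulli trial: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def binomial(n, p, rng):
    """One Binomial(n, p) draw, built as the sum of n independent Bernoulli trials."""
    return sum(bernoulli(p, rng) for _ in range(n))

rng = random.Random(0)  # fixed seed so reruns give the same draws
class_size = 30         # hypothetical number of students in the classroom
p_green = 0.1           # hypothetical probability that a student has green eyes

# Simulate many classrooms and count the green-eyed students in each.
draws = [binomial(class_size, p_green, rng) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)

print("simulated mean:", sample_mean)
print("theoretical mean n * p:", class_size * p_green)
```

The simulated mean should land close to n * p, which gives students a quick check that the Binomial really does behave like a sum of Bernoulli trials.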
A Programming Language
What would people think of the idea of adopting R as an optional tool for students within this text? R is arguably the up-and-coming standard in academic statistics programs (at least the graduate programs) and it's Open Source and available for Windows, Mac OS X and Linux. It would be nice to occasionally show how results can be computed using R. When we show graphs of distributions, etc., we could include the R code that was used to generate those graphs. It would be an excellent resource for potential students. --Murray Todd Williams 22:38, 15 Sep 2004 (UTC)
- While I like the idea of adding a section on R, I do not believe that it should be integrated throughout the text. There is too much emphasis today on getting a piece of software and/or hardware to do someone's thinking for them. I, at least, would argue that an introductory course in something as basic as statistics should emphasize the process and the understanding rather than the function, which is what software packages emphasize.
- That having been said, I personally think a chapter or set of chapters on R, R sections integrated into chapters on more advanced statistics (e.g., multivariate analysis), or even a separate textbook on R would be incredibly valuable. --Llywelyn 01:39, 16 Sep 2004 (UTC)
- I agree with the spirit of your concern, but I still think R would provide the student with a quick way to explore the field. Back in the days of SAS and SPSS I would wholeheartedly have agreed with you. Those packages focused on "load the data into the black box, hit go, and assume the reams of meaningless output were a good thing". R, however, is more of an interactive language that makes it easy for the student to "play around" and explore these concepts quickly.
- Instead of looking around for a calculator that can somehow handle factorials and punching in esoteric equations, the student can quickly type something like dpois(0:6, 3) and see the probabilities of a Poisson distribution with lambda parameter 3 for X = 0..6. Wrap a quick plot() call around that and the student can see them graphically. With a little more work the student can overlay a normal distribution and see how well (or poorly) a normal curve approximates it. Just as easily, the student can create a vector holding a random sample from that same Poisson distribution.
- Frankly, I can't imagine teaching statistics without providing some mechanism for playing around with examples and encouraging the student to do the same. Also, with some pre-generated sample data it becomes easy to create homework problems that encourage exploration, examination, etc. I don't mean saying "load this data set and perform an ANOVA and give me a p-value from the pre-canned report" but rather "Here's a set of data. Do you think an ANOVA would be suitable? What problems and questions are we facing? Do you feel the sample size is sufficient, or is the problem over-parameterized?"
- Of course, this discussion is (pardon the pun:) academic. Probably the best thing to do would be to go ahead and write a section or two (with problems) to demonstrate what I'm thinking and get feedback to see if people concur. --Murray Todd Williams 16:44, 16 Sep 2004 (UTC)
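For readers without R at hand, the kind of exploration described in this comment can be approximated in other languages. The following is a rough Python analogue, not R code, using only the standard library. The helper name poisson_pmf is hypothetical, and matching the normal curve to the Poisson mean and variance is one reasonable choice of approximation rather than the only one.

```python
import math
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed from the definition."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3
ks = range(0, 7)  # X = 0..6, the same range as in the comment above
pmf = [poisson_pmf(k, lam) for k in ks]

# Normal curve with the same mean and variance (both equal lam for a Poisson).
approx = NormalDist(mu=lam, sigma=math.sqrt(lam))
normal_density = [approx.pdf(k) for k in ks]

# Side-by-side table: how well does the normal curve track the Poisson?
for k, p, q in zip(ks, pmf, normal_density):
    print(f"X={k}: Poisson {p:.4f}   normal {q:.4f}")
```

Printing the two columns side by side plays the role of the overlaid plot: the student can see where the normal curve tracks the Poisson probabilities and where it misses, especially near zero.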
- It sounds like there are two independent issues here: 1) whether to add practical problems with online data that students can use to better understand the concepts, and 2) whether to include statistical-package-specific information on how to implement different types of analysis. In my mind, these can be implemented separately.
- For the first case, the homework problems can include data for students to use with any of the common packages. For the second case, my opinion is that this should not be directly integrated into the text: people might be reading it to understand the core concepts, and forcing them to read through a bunch of programming syntax while they do so will lose a lot of these readers.
- Rather, an external page with a title like "implementing histograms" or "how it's done", with a step-by-step process and code for major languages, Excel, oo and the like, would be a great resource for more advanced readers. I'd think that a lot of this type of info could reference other textbooks, in Wikibooks or elsewhere, to reduce duplication. Antonrojo 13:38, 17 March 2006 (UTC)
Random Variables?
The way I was taught probability, it doesn't make much sense to discuss distributions without any discussion of random variables. Is this a standard approach, or are the probability courses I have taken odd? I'm willing to add a section on random variables and adjust the sections regarding distributions if other people think this is a good idea. Adam Wolfe Gordon 02:34, 7 March 2006 (UTC)
Just a quick comment...
...And I guess it's a little ridiculous to add it to the discussion page (no response required!), but I just wanted to thank everyone who's contributing to this book. I'm really excited that there'll be an open, online introductory book on statistics, and hopefully a broader array of statistics books to come.
Frequency Density
I haven't done much yet other than the definitions, but in the histogram section I started frequency density. Do you think this can be applied to statistics and the course? Pinkie closes 20:00, 29 March 2006 (UTC)
Merge "Research Methods" into this page?
I actually have a couple of comments.
First, this is fantastic! I hope it's a topic at a future statistics conference.
Second, I definitely like the idea of using R. However, that may well require another separate document on installing/updating/how to...
Finally, some material overlaps with the Research Methods site. Since Statistics is currently more comprehensive, I'd suggest merging that material into this one. I was expecting more of a study-design flavor at that site, but didn't find it.
Who is this book aimed at?
Please make clear who this book is aimed at. Doing so will guide the content. Whole books have been written on linear models, graphing data, testing hypotheses, and so on. This single book cannot cover all areas in the depth they deserve. The level of depth depends on the intended audience, though. A book aimed at 18-year-old students of statistics will be very different from one aimed at 21-year-old students of statistics or at medical practitioners, for example.
Basically, we should ask who is willing to contribute. The contribution from me and my students (Numerical Methods & Analysis of Specific Datasets) might be a little too high-level for students in basic statistics courses. On the other hand, I would not mind moving it to another book, e.g. something like "Various topics in statistics". --sigbert 11:25, 02 April 2007 (MESZ)
Permutations and combinations
Should we have a section on permutations and combinations, and if so, where? Zginder 20:13, 30 September 2007 (UTC)