Saturday, November 20, 2010

RURS - the statistics edition

Last Wednesday, the RURS team had its second-to-last episode, visiting the statistics department on East Campus.

Participants expressed a strong orientation towards clients – clients provide the meaning and the thresholds relevant to the analysis. As a group, statisticians value their neutrality. They see their role as making sense of data and providing clarity. One participant commented that “[t]he career of a statistician exists because of uncertainty.” However, participants were more comfortable describing random processes in terms of variability rather than risk or uncertainty, and their own role as quantifying that variability. Participants also talked about quantified errors in two different ways: as the Type I error rate and as the False Discovery Rate.
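To make the difference between those two error measures concrete, here is a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure as the way of controlling the False Discovery Rate; that choice, the function names, and the p-values are illustrative and were not part of the discussion.

```python
def reject_per_test(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at or below alpha.

    This controls the Type I error rate of each individual test.
    """
    return [p <= alpha for p in p_values]


def reject_benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure, controlling the FDR at level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Made-up p-values from ten hypothetical tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

per_test = reject_per_test(p_values)
fdr = reject_benjamini_hochberg(p_values)
print("Rejected with per-test alpha:", sum(per_test))
print("Rejected with FDR control:", sum(fdr))
```

With these made-up numbers the per-test rule rejects more hypotheses than the FDR rule, because the FDR rule accounts for the number of tests being run at once.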

Participants offered several definitions of risk. Risk is the expected value of a loss function. Risk is the probability of a Type I error, but “[I] would never use risk in a paper”. Risk is the probability of an adverse outcome precipitated by your actions. Risk is a probability, but the outcome doesn’t have to be negative. Students in introductory statistics learn about “relative risk” – of two options, both with some risk, which is the riskier?
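Two of these definitions can be written down directly. The sketch below is only an illustration: the function names and all of the numbers are hypothetical, and "relative risk" is taken in its usual introductory-course sense as the ratio of two event probabilities.

```python
def expected_loss(outcomes):
    """Risk as the expected value of a loss function.

    `outcomes` is a list of (probability, loss) pairs covering every outcome.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * loss for p, loss in outcomes)


def relative_risk(events_a, n_a, events_b, n_b):
    """Ratio of the event probability in group A to that in group B."""
    return (events_a / n_a) / (events_b / n_b)


# Hypothetical decision: a 25% chance of losing 100 units, otherwise no loss.
risk = expected_loss([(0.75, 0.0), (0.25, 100.0)])

# Hypothetical counts: 20 adverse events in 100 under option A, 10 in 100 under B.
rr = relative_risk(20, 100, 10, 100)

print("Expected loss:", risk)
print("Relative risk of A versus B:", rr)
```

A relative risk above 1 marks option A as the riskier of the two; the expected loss also weighs how bad each outcome is, not just how likely.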

One participant said that all of statistics is about reducing uncertainty, while another characterized it as assigning variation to different sources. Statistics is all about quantifying uncertainty, accepting it for what it is. Another participant described introductory statistics students as uncomfortable with uncertainty, whereas she herself is comfortable with uncertainty and with the rules not being clear. Certainty decreases with increasing familiarity with the discipline of statistics, as in life.

Another participant described two types of uncertainty. One type is driven by stochastic processes and can be quantified with a probability distribution. In contrast, uncertainty about which model is appropriate is not quantifiable with a probability distribution.

An example of the stochastic-process type of uncertainty: one has to use the parentage and genetic makeup of a bull to predict the milk production of his daughters. The prediction is not perfect, because each mating of a bull with a cow produces offspring that differ from one another. The uncertainty matters, because for each bull there are two types of risk: keeping the bull when you shouldn’t, costing money unnecessarily, or castrating the bull and losing access to that genetic potential.

Critical thresholds were expressed as the degree of confidence one has in a result – is it good enough to take action on? And this may vary between individuals – “It is different when talking about your surgery than my surgery.” The key piece is that critical thresholds reflect individuals’ tolerance for risk and uncertainty. Statisticians need to extract these “thresholds” in the form of effect sizes in order to provide advice on sample sizes and experimental design. In the absence of thresholds, there is always the economic limit – “How many reps can you afford?”
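As a rough illustration of how an effect size turns into a sample size, here is a sketch that assumes a two-sided comparison of two group means and the common normal-approximation formula n = 2((z_{1-α/2} + z_{power}) / d)² per group, with d a standardized effect size. The design, the function names, the budget, and the cost figures are all assumptions made for the example, not something the participants specified.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate reps per group needed to detect a standardized effect size.

    Uses the normal approximation for a two-sided, two-sample comparison.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def affordable_reps_per_group(budget, cost_per_rep, groups=2):
    """The economic limit: how many reps per group the budget covers."""
    return int(budget // (cost_per_rep * groups))


# Hypothetical client threshold: a standardized effect of 0.5 is worth acting on.
needed = n_per_group(0.5)

# Hypothetical budget of 5000 at 100 per rep, split across two groups.
affordable = affordable_reps_per_group(5000, 100)

print("Reps needed per group:", needed)
print("Reps affordable per group:", affordable)
print("Design is affordable:", affordable >= needed)
```

Smaller effect sizes drive the required number of reps up quickly, which is why the threshold has to come from the client before the design can be settled.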

An example offered was estimating variation in teacher performance; these point estimates come with an estimate of uncertainty in the form of an interval. Communicating that interval to decision-makers is difficult, and dealing with the population of teachers as a whole is different from making a personal decision about which teacher you want for your child.

Like the political scientists, the statisticians did not discuss car accidents.
