R: Current Knowledge on Modeling
Statistics exist to help make sense of a data set. However, a statistical model is only useful if its assumptions match the data. For example, a least-squares linear fit is only appropriate if the data are reasonably linear. The same is true for any other type of fit, such as quadratic or exponential. If you don’t look at your data beforehand and fit a line to widely dispersed data, you will produce output that makes no sense.
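As a minimal sketch of that check in R, assuming made-up data (the variable names and the simulated quadratic relationship are illustrative, not from a real data set), one way to look before fitting is to plot the data, fit the line anyway, and then see whether the residuals and a comparison against a quadratic fit agree with the linear assumption:

```r
# Hypothetical data: y is truly quadratic in x, plus a little noise.
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x^2 + rnorm(length(x), sd = 1)

# Step 1: look at the data before choosing a model.
plot(x, y, main = "Inspect the data before fitting")

# Step 2: fit the candidate models by least squares.
linear_fit <- lm(y ~ x)
quad_fit   <- lm(y ~ x + I(x^2))

# Step 3: plot residuals against fitted values for the linear model.
# A curved pattern here suggests the linear assumption does not hold.
plot(fitted(linear_fit), resid(linear_fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals of the linear fit")
abline(h = 0, lty = 2)

# Step 4: compare the nested models with an F-test.
# A small p-value means the quadratic term explains real structure.
comparison <- anova(linear_fit, quad_fit)
print(comparison)
```

The residual plot is the important part here. The F-test only confirms what the plot should already make visible.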
I found Gigerenzer’s “Mindless Statistics” very interesting on this topic. He points out that the founders of statistics debated their methods seriously and held conflicting ideas. None of them thought that there should be one method for all hypotheses. I have been guilty in the past of simply applying the “null ritual.” Some questions do not call for the Fisher and Neyman-Pearson hybrid. Instead, the model should be chosen critically, based on the questions the data are meant to answer.
I know how to do a least-squares fit, how to find growth rates from fitted models, and how to assess statistical significance. I know probability and how to apply some statistical tests. I also know how to do many of these calculations in R. However, I don’t have a good sense of when to use which model. Converting a question stated in English into a mathematical or statistical model is difficult for me in many contexts, though I expect it to get better with time and practice. I am hoping that with my current project I can apply a model to a real-world example and use a well-thought-out statistical analysis that makes sense for the data.
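For illustration, here is a rough sketch of one common way to get a growth rate from a least-squares fit in R: fit a line to the log of the values, so the slope estimates the exponential growth rate. The data are simulated and the names (`time_pts`, `counts`, the true rate of 0.3) are assumptions made up for this example, and the approach only makes sense if the data really do look exponential.

```r
# Hypothetical data: exponential growth at rate 0.3 per time step,
# with small multiplicative noise.
set.seed(42)
time_pts <- 0:20
counts   <- 10 * exp(0.3 * time_pts) * exp(rnorm(length(time_pts), sd = 0.05))

# On the log scale, exponential growth is a straight line:
# log(counts) = log(10) + rate * time_pts
fit <- lm(log(counts) ~ time_pts)

# The slope is the estimated growth rate per time step.
growth_rate <- unname(coef(fit)["time_pts"])

# p-value for the test that the slope is zero (no growth).
p_value <- summary(fit)$coefficients["time_pts", "Pr(>|t|)"]

print(growth_rate)
print(p_value)
```

A significant slope only says the growth rate is distinguishable from zero. It does not by itself show that the exponential model was the right one to pick.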