![statplus regression wont pick up dependent variable statplus regression wont pick up dependent variable](https://venturebeat.com/wp-content/uploads/2018/09/image12.png)
These statistics might not agree because the manner in which each one defines "most important" is a bit different. Fortunately, there are several statistics that can help us determine which predictor variables are most important in regression models. We ruled out a couple of the more obvious statistics that can’t assess the importance of variables. Do Compare These Statistics To Help Determine Variable Importance Takeaway: Low p-values don’t necessarily identify predictor variables that are practically important. A statistically significant result may not be practically significant. A very low p-value can reflect properties other than importance, such as a very precise estimate and a large sample size.Įffects that are trivial in the real world can have very low p-values. P-value calculations incorporate a variety of properties, but a measure of importance is not among them. The coefficient value doesn’t indicate the importance a variable, but what about the variable’s p-value? After all, we look for low p-values to help determine whether the variable should be included in the model in the first place. Don’t Compare P-values to Determine Variable Importance Takeaway: Larger coefficients don’t necessarily identify more important predictor variables.
![statplus regression wont pick up dependent variable statplus regression wont pick up dependent variable](https://i.stack.imgur.com/XITN5.png)
The coefficient value changes greatly while the importance of the variable remains constant.
![statplus regression wont pick up dependent variable statplus regression wont pick up dependent variable](https://miro.medium.com/max/552/1*5_eMU4QFcbp2-S5Igf3glg.png)
If you fit models for the same data set using grams in one model and kilograms in another, the coefficient for weight changes by a factor of a thousand even though the underlying fit of the model remains unchanged. For example, weight can be measured in grams and kilograms. This problem is further complicated by the fact that there are different units within each type of measurement. For example, the meaning of a one-unit change is very different if you’re talking about temperature, weight, or chemical concentration. However, the units vary between the different types of variables, which makes it impossible to compare them directly. Consequently, it’s easy to think that variables with larger coefficients are more important because they represent a larger change in the response. The coefficient value represents the mean change in the response given a one-unit increase in the predictor. Regular regression coefficients describe the relationship between each predictor variable and the response. Don’t Compare Regular Regression Coefficients to Determine Variable Importance Then, I’ll move on to both statistical and non-statistical methods for determining which variables are the most important in regression models. I’ll start by showing you statistics that don’t answer the question about importance, which may surprise you. With these issues in mind, I’ll help you answer this question. For another, how you collect and measure your sample data can influence the apparent importance of each variable. For one thing, how you define “most important” often depends on your subject area and goals. This question is more complicated than it first appears. At this point, it’s common to ask, “Which variable is most important?” You’ve performed multiple linear regression and have settled on a model which contains several predictor variables that are statistically significant.