Under H0, data are generated by random processes. In other words, the controlled processes (the experimental manipulations, for example) do not affect the data. Usually, H0 is a statement of equality (equality between averages, between variances, or between a correlation coefficient and zero, for example). We have drawn the grid below to guide you through the choice of an appropriate statistical test according to your question and the type of your variables.
The guide proposes a formulation of the null hypothesis, as well as a concrete example for each situation. In the Parametric tests and Nonparametric tests columns, you may click on a link to view a detailed tutorial on the proposed test, including a data file. The conditions of validity of parametric tests are listed in the paragraph following the grid. When available, nonparametric equivalents are proposed.
In some situations, parametric tests do not exist, so only nonparametric solutions are proposed. The displayed tests are the most commonly used tests in statistics; please scroll down to see the grid. The validity conditions we propose are rules of thumb, as there are no precise rules in the literature. Read our guide Which statistical model should you choose? to learn how to choose the right model for your analysis.
XLSTAT provides a high number of statistical tests. For a more theoretical background on statistical testing, please read the articles below:

What is a statistical test?
What is the difference between paired and independent samples tests?
What is the difference between a parametric and a nonparametric test?
What is the difference between a two-tailed and a one-tailed test?
How to interpret the output of a statistical test: the significance level alpha and the p-value

The grid

The displayed tests are the most commonly used tests in statistics. Each row of the grid gives the question, the data format, the null hypothesis (H0), a concrete example, and the proposed test. An excerpt of the grid follows.

(end of a truncated row) Test: biserial correlation test. Validity condition: normality of the quantitative variable.

Question: test the association between a series of proportions and an ordinal variable. Data: a contingency table, or proportions and sample sizes. H0: the proportions do not change according to the ordinal variable. Example: did birth rates change from year to year during the last decade? Test: Cochran-Armitage trend test.

Question: test the association between two tables of quantitative variables. Data: two tables of quantitative variables. H0: the tables are independent. Example: does the evaluation of a series of products on a series of attributes change from one panel to another? Test: RV coefficient test.

Question: test the association between two proximity matrices. Data: two proximity matrices. H0: the proximity matrices are independent. Example: is the geographic distance between populations of bees correlated with their genetic distance? Test: Mantel's test.

Time series tests

Question: test the presence of a trend across time. Data: one series of data sorted by date (a time series). H0: there is no trend across time for the measured variable. Example: did the stock value change across the last 10 years?
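The last row asks for a trend test on a time series. The grid excerpt above does not preserve the name of the proposed test, but the Mann-Kendall test is a common nonparametric choice for exactly this question, so here is a minimal Python sketch of it. The simplified statistic (no correction for ties) and the simulated yearly values are illustrative assumptions, not part of the original guide.

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Two-tailed Mann-Kendall trend test (no correction for ties)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S sums the signs of all pairwise differences x[j] - x[i] for j > i.
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S when there are no ties
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - norm.cdf(abs(z)))  # two-tailed p-value under H0: no trend
    return s, z, p

# Example: ten yearly values with an upward drift plus noise.
rng = np.random.default_rng(0)
values = np.arange(10) * 0.5 + rng.normal(0, 1, 10)
print(mann_kendall(values))
```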
For the ordinal logistic regression example, we first recode the writing score into an ordered variable. This variable will have the values 1, 2 and 3, indicating a low, medium or high writing score. We do not generally recommend categorizing a continuous variable in this way; we are simply creating a variable to use for this example.
We will use gender (female), reading score (read) and social studies score (socst) as predictor variables in this model. We will use a logit link, and on the print subcommand we have requested the parameter estimates, the model summary statistics and the test of the parallel lines assumption. There are two thresholds for this model because there are three levels of the outcome variable. One of the assumptions underlying ordinal logistic (and ordinal probit) regression is that the relationship between each pair of outcome groups is the same.
In other words, ordinal logistic regression assumes that the coefficients that describe the relationship between, say, the lowest versus all higher categories of the response variable are the same as those that describe the relationship between the next lowest category and all higher categories, etc. This is called the proportional odds assumption or the parallel regression assumption.
Because the relationship between all pairs of groups is the same, there is only one set of coefficients (only one model). If this were not the case, we would need different models (such as a generalized ordered logit model) to describe the relationship between each pair of outcome groups.
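As a minimal Python sketch of the ordinal (proportional-odds) logistic regression described above, the code below uses statsmodels' OrderedModel in place of the SPSS procedure. The file name hsb2.csv and the name of the recoded outcome column (write3) are assumptions for illustration.

```python
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

# Treat the 1/2/3 outcome as ordered (low < medium < high writing score).
y = df["write3"].astype(pd.CategoricalDtype([1, 2, 3], ordered=True))

model = OrderedModel(y, df[["female", "read", "socst"]], distr="logit")
res = model.fit(method="bfgs", disp=False)
print(res.summary())  # two thresholds appear because the outcome has three levels
```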
A factorial logistic regression is used when you have two or more categorical independent variables but a dichotomous dependent variable. For example, using the hsb2 data file, we will use female as our dependent variable, because it is the only dichotomous variable in our data set (certainly not because it is common practice to use gender as an outcome variable). We will use type of program (prog) and school type (schtyp) as our predictor variables.
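Here is a sketch of this factorial logistic regression in Python, using statsmodels' formula interface instead of SPSS; C() performs the dummy coding discussed next, and the file name hsb2.csv is an assumption.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

# female ~ prog, schtyp, and their interaction; C() creates dummy codes
# for the categorical predictors, and * expands to main effects plus
# the prog-by-schtyp interaction term.
fit = smf.logit("female ~ C(prog) * C(schtyp)", data=df).fit(disp=False)
print(fit.summary())
```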
Because prog is a categorical variable (it has three levels), we need to create dummy codes for it. SPSS will do this for you by making dummy codes for all variables listed after the keyword with. SPSS will also create the interaction term; simply list the two variables that will make up the interaction separated by the keyword by. In the output, none of the coefficients are statistically significant, which shows that the overall effect of prog is not significant.

A correlation is useful when you want to see the relationship between two or more normally distributed interval variables.
For example, using the hsb2 data file, we can run a correlation between two continuous variables, read and write. In the second example, we will run a correlation between a dichotomous variable, female, and a continuous variable, write. Although it is assumed that the variables are interval and normally distributed, we can include dummy variables when performing correlations. In the first example above, we see that the correlation between read and write is 0.…
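A small Python sketch of these two correlations, with scipy standing in for the SPSS correlation procedure; the file name hsb2.csv is an assumption.

```python
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

r_rw, p_rw = pearsonr(df["read"], df["write"])    # two continuous variables
r_fw, p_fw = pearsonr(df["female"], df["write"])  # dummy variable with continuous
print(f"read/write:   r = {r_rw:.3f}, p = {p_rw:.4f}")
print(f"female/write: r = {r_fw:.3f}, p = {p_fw:.4f}")
print(f"shared variability of read and write: {100 * r_rw**2:.1f}%")
```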
By squaring the correlation and then multiplying by 100, you can determine what percentage of the variability is shared. In the output for the second example, we can see that the correlation between write and female is 0.…; squaring this number yields the proportion of variability the two variables share.

Simple linear regression allows us to look at the linear relationship between one normally distributed interval predictor and one normally distributed interval outcome variable. For example, using the hsb2 data file, say we wish to look at the relationship between writing scores (write) and reading scores (read); in other words, predicting write from read.
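A sketch of this simple linear regression in Python, with statsmodels OLS standing in for the SPSS regression procedure; the file name hsb2.csv is an assumption.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

fit = smf.ols("write ~ read", data=df).fit()
print(fit.params)           # intercept and slope for read
print(fit.pvalues["read"])  # significance of the linear relationship
```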
We see that the relationship between write and read is positive and, based on the t-value and p-value, statistically significant. Hence, we would say there is a statistically significant positive linear relationship between reading and writing.

A Spearman correlation is used when one or both of the variables are not assumed to be normally distributed and interval but are assumed to be ordinal.
The values of the variables are converted into ranks and then correlated. In our example, we will look for a relationship between read and write. We will not assume that both of these variables are normal and interval.

Logistic regression assumes that the outcome variable is binary (i.e., coded 0 and 1). We have only one variable in the hsb2 data file that is coded 0 and 1, and that is female.
We understand that female is a silly outcome variable (it would make more sense to use it as a predictor variable), but we can use female as the outcome variable to illustrate how the code for this command is structured and how to interpret the output.
The first variable listed after the logistic command is the outcome or dependent variable, and all of the rest of the variables are predictor or independent variables.
In our example, female will be the outcome variable, and read will be the predictor variable. As with OLS regression, the predictor variables must be either dichotomous or continuous; they cannot be categorical. The results indicate that reading score (read) is not a statistically significant predictor of gender (i.e., of being female). Likewise, the test of the overall model is not statistically significant (LR chi-squared = 0.…).
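A sketch of this logistic regression in Python, with statsmodels standing in for the SPSS logistic command; the file name hsb2.csv is an assumption.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

fit = smf.logit("female ~ read", data=df).fit(disp=False)
print(fit.summary())            # coefficient and p-value for read
print(fit.llr, fit.llr_pvalue)  # LR chi-squared test of the overall model
```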
Multiple regression is very similar to simple regression, except that in multiple regression you have more than one predictor variable in the equation. For example, using the hsb2 data file, we will predict writing score from gender (female), reading, math, science and social studies (socst) scores. In the output, all of the predictor variables are statistically significant except for read.
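A sketch of this multiple regression in Python, again with statsmodels standing in for the SPSS procedure; the file name hsb2.csv is an assumption.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

fit = smf.ols("write ~ female + read + math + science + socst", data=df).fit()
print(fit.summary())  # per-predictor t-tests and the overall F-test
```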
Analysis of covariance is like ANOVA, except that in addition to the categorical predictors you also have continuous predictors. For example, the one-way ANOVA example used write as the dependent variable and prog as the independent variable; adding a continuous predictor such as read to that model would make it an analysis of covariance.

Multiple logistic regression is like simple logistic regression, except that there are two or more predictors. The predictors can be interval variables or dummy variables, but cannot be categorical variables.
If you have categorical predictors, they should be coded into one or more dummy variables. We have only one variable in our data set that is coded 0 and 1, and that is female. The first variable listed after the logistic regression command is the outcome or dependent variable, and all of the rest of the variables are predictor or independent variables listed after the keyword with.
In our example, female will be the outcome variable, and read and write will be the predictor variables. These results show that both read and write are significant predictors of female.

Discriminant analysis is used when you have one or more normally distributed interval independent variables and a categorical dependent variable. It is a multivariate technique that considers the latent dimensions in the independent variables for predicting group membership in the categorical dependent variable.
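A minimal sketch of the hsb2 discriminant analysis described next, using scikit-learn's LinearDiscriminantAnalysis in place of the SPSS procedure; the file name hsb2.csv is an assumption.

```python
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

X = df[["read", "write", "math"]]
lda = LinearDiscriminantAnalysis().fit(X, df["prog"])

# With three groups there are at most two discriminant (canonical) functions.
print(lda.explained_variance_ratio_)  # share of between-group variance per function
print(lda.predict(X)[:10])            # predicted program membership
```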
For example, using the hsb2 data file, say we wish to use read, write and math scores to predict the type of program a student belongs to (prog). Clearly, the SPSS output for this procedure is quite lengthy, and it is beyond the scope of this page to explain all of it. However, the main point is that two canonical variables are identified by the analysis, the first of which seems to be more related to program type than the second.

A one-way MANOVA is used when there is one categorical independent variable and two or more dependent variables. For example, using the hsb2 data file, say we wish to examine the differences in read, write and math broken down by program type (prog).
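A sketch of this one-way MANOVA in Python, using statsmodels' MANOVA class; the file name hsb2.csv is an assumption.

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

mv = MANOVA.from_formula("read + write + math ~ C(prog)", data=df)
print(mv.mv_test())  # Wilks' lambda, Pillai's trace, etc. for the prog effect
```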
The students in the different programs differ in their joint distribution of read, write and math.

Multivariate multiple regression is used when you have two or more dependent variables that are to be predicted from two or more independent variables.
In our example using the hsb2 data file, we will predict write and read from female, math, science and social studies (socst) scores. These results show that all of the variables in the model have a statistically significant relationship with the joint distribution of write and read.

Canonical correlation is a multivariate technique used to examine the relationship between two groups of variables. For each set of variables, it creates latent variables and looks at the relationships among the latent variables.
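A sketch of a canonical correlation analysis in Python, using scikit-learn's CCA. The two variable sets chosen here (read and write versus math and science) are illustrative assumptions, as the text does not list them; the file name hsb2.csv is also an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.cross_decomposition import CCA

df = pd.read_csv("hsb2.csv")  # hypothetical local copy of the hsb2 data

X = df[["read", "write"]]    # first set of variables
Y = df[["math", "science"]]  # second set of variables

cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)   # latent (canonical) variates for each set

# Correlation between each pair of canonical variates.
for k in range(2):
    r = np.corrcoef(U[:, k], V[:, k])[0, 1]
    print(f"canonical correlation {k + 1}: {r:.3f}")
```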
The most common tests at a glance:

Normality: Kolmogorov-Smirnov test, Shapiro-Wilk test.
One proportion: binomial test, z-test for 1 proportion.
One categorical variable: chi-square goodness-of-fit test.
One median: sign test for 1 median.
Two related medians: Wilcoxon signed-ranks test, sign test for 2 related medians.
Two independent proportions: z-test for 2 independent proportions.
Two categorical variables: chi-square independence test.
Two independent samples: independent samples t-test (means), Levene's test (variances).
Two quantitative variables: Pearson correlation.