Best test to use?

Dear forum,

I have conducted a content analysis in two countries, with 3 newspapers per country. I want to analyze the differences between the countries, and the independent variables have been measured as dichotomous outcomes. My supervisor and one of my colleagues recommended ANOVA, but I think this is not the correct method because the data will not be normally distributed.

Question: What is the best statistical test I can use? I have analyzed 417 articles, but the sample sizes in the two countries are not equal, as I have taken a total sample on a certain topic. I am considering trying to use a multilevel model for dichotomous outcomes, but am a bit of a novice and would like to check things out before I talk to my supervisor.

Do you have any advice for me?

Many thanks!

Re: Best test to use?

Art Kendall
> the independent variables have been measured as dichotomous outcomes.
Outcomes are usually dichotomous. 
Usually in content analysis the researcher decides on the response scale.  What constructs were these measures meant to represent?   How much time would it take to go back and get less coarse measures for these constructs?

What are your independent variables (inputs, predictors, Xs, right-hand-side)? What are your dependent variables (outcomes, criteria, Ys, left-hand side)?  What variables do you want to "control for" (partial out)?

Did you look at inter-coder reliability?

If you have the entire population of articles for a cell, that cell would have a sampling error of zero. Do you have the weights for the other cells, i.e., what were the population sizes for those cells? How were the articles chosen when you did not have the whole population?
Art Kendall
Social Research Consultants
On 5/14/2014 5:05 AM, Anna [via SPSSX Discussion] wrote:

Re: Best test to use?

Maguin, Eugene
In reply to this post by Anna
When you say 'not normally distributed', what is the evidence of 'not-normal'? How many categories do the variables have? Two? Four? 15? What are the skew and kurtosis numbers? ANOVA is generally regarded as being 'robust' to violations of assumptions.

Do your questions involve between-country differences, between-newspaper differences, or a country-by-newspaper interaction?

You do have a nested data structure, with articles nested within newspapers and newspapers nested within country. If your dependent variables are continuous, you can use the MIXED command; otherwise, you will need to use GENLINMIXED. The fact that your level 2 variable, newspaper, has just six subjects will limit what you can do. To see what I mean, suppose you averaged an article DV across articles within each newspaper. So: continuous DV, six subjects, dichotomous IV (country). Thus, a between-groups t-test with N of 6. You'll need big differences to find significance. Now suppose instead you conclude, on the basis of a random effects within country test, that newspaper is irrelevant, i.e., not significant and the ICC is near zero. Now you have a t-test with N=417.
Gene Maguin
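
A minimal sketch of the aggregation argument above, written in Python rather than SPSS syntax purely as a language-neutral illustration. The per-newspaper counts are invented (they only happen to sum to 417), and the names `country_a`, `country_b`, `proportions` and `pooled_t` are hypothetical. It averages the dichotomous DV within each newspaper and then runs an ordinary pooled-variance t-test on the six newspaper-level proportions.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical counts per newspaper: (articles with frame present, total articles).
# Invented for illustration only; the totals sum to 417 articles.
country_a = [(20, 80), (35, 70), (20, 60)]
country_b = [(45, 75), (30, 72), (43, 60)]

def proportions(papers):
    """Proportion of articles with the frame present, one value per newspaper."""
    return [present / total for present, total in papers]

def pooled_t(x, y):
    """Student's two-sample t statistic with pooled variance, plus its df."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df

props_a = proportions(country_a)
props_b = proportions(country_b)
t_stat, df = pooled_t(props_a, props_b)

# With six newspapers there are only 4 df; the two-sided 5% critical value is about 2.776.
print(f"t = {t_stat:.2f} on {df} df")
```

With these invented counts the country gap is roughly 36% versus 58% frame presence, yet |t| stays below 2.776 because only six newspaper-level values enter the test. That is the "you'll need big differences" point in miniature.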

Re: Best test to use?

David Marso
In reply to this post by Anna
You might try to find a local statistician/methodologist to provide professional guidance!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
"Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Best test to use?

In reply to this post by Art Kendall
Thank you for your answer. The outcome variables are frames that can be present or not present in newspaper articles. The intercoder reliability (Krippendorff's alpha) is between .70 and .90. I don't think it's feasible to go back to the data and recode the entire dataset, and I doubt that I will be able to get an acceptable intercoder reliability if I try to code for the extent to which a frame is present.

My independent variables are the countries in which the articles are published, as well as the ideology of the newspapers (right-leaning and left-leaning). The subject of the articles is related to immigration, so I expect that will make a difference. I would like to control for the effect of the format of the article (episodic vs thematic).

I think I have the entire population of articles. I used a Boolean keyword search, and the aim was to generate a database with all the articles on this subject published in the 6 selected newspapers in the 2 countries.

Re: Best test to use?

In reply to this post by Maguin, Eugene
Thank you for your answer. I see that my limited numbers at the higher levels do pose a serious problem. I will try some different options. I have already tried t-tests and ANOVAs and the results seemed promising, but now I am afraid they are not reliable.

There are two possible outcomes: frame present or not present. I didn't compute the skew and kurtosis. I thought that with this type of outcome variable it was not possible to get a normal distribution and error variance.

Re: Best test to use?

Maguin, Eugene
In reply to this post by Anna
I've combined your two replies. To re-summarize the data as it now exists: a total of 417 articles from three newspapers in country A and three newspapers in country B. Articles are coded as to whether certain elements are present or absent. One of these elements, format, is regarded as a control variable. The DV is frame present. Reliabilities are as shown. Newspapers themselves are coded as to ideology.

Question: were all articles coded a) by two coders, b) by the primary coder, with a sample coded by a secondary coder, c) by multiple primary coders who also coded a sample of articles coded by other primary coders, or d) something else?

You have dichotomous data. Thus: crosstabs and logistic regression type models. I don't understand why your advisors suggested ANOVA. I suggest you ask them why they made that recommendation.
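
To make "logistic regression type models" concrete, here is a rough sketch in Python (not SPSS syntax), fitting a logistic regression by Newton-Raphson on grouped counts. Everything in it is an assumption for illustration: the cell counts are invented, and `solve`, `fit_logistic` and the dummy coding (country B, right-leaning, thematic) are hypothetical names. It also ignores the clustering of articles within only six newspapers, so it should be read as a starting point rather than the final model.

```python
from math import exp, log

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_logistic(cells, n_iter=25):
    """Newton-Raphson fit for grouped binary data.

    cells: list of (x, successes, trials); x starts with 1 for the intercept.
    Returns the coefficients on the log-odds scale.
    """
    k = len(cells[0][0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, y, n in cells:
            p = 1 / (1 + exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (y - n * p) * x[i]
                for j in range(k):
                    info[i][j] += n * p * (1 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta

# Hypothetical cell counts (invented; 417 articles in total).
# x = (intercept, country_B, right_leaning, thematic); then frames present, articles.
cells = [
    ((1, 0, 0, 0), 12, 50), ((1, 0, 0, 1), 20, 55),
    ((1, 0, 1, 0), 18, 50), ((1, 0, 1, 1), 25, 55),
    ((1, 1, 0, 0), 22, 50), ((1, 1, 0, 1), 30, 52),
    ((1, 1, 1, 0), 28, 50), ((1, 1, 1, 1), 38, 55),
]
beta = fit_logistic(cells)
for name, b in zip(("intercept", "country_B", "right_leaning", "thematic"), beta):
    print(f"{name:>13}: log-odds {b:+.3f}, odds ratio {exp(b):.2f}")

# Sanity check: with country as the only predictor, the slope equals
# the crude log odds ratio from the 2x2 country-by-frame table.
crude = fit_logistic([((1, 0), 75, 210), ((1, 1), 118, 207)])
crude_log_or = log((118 / 89) / (75 / 135))
```

The country coefficient here is the difference between countries after adjusting for ideology and format, which is the "control for" idea raised earlier in the thread.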

Two suggestions. I think it may be more useful in a productivity sense to find somebody there to work with. I think you're being confused by things that shouldn't be confusing but are. On your own: explore the data terrain by crosstabbing variables with the DV to identify differences with IVs: country, newspaper name, ideology, format.  Crosstab your IVs with each other to see how they are related.
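
As an illustration of the crosstab step, here is a small Python sketch (again not SPSS syntax) that tabulates country against frame presence and computes a Pearson chi-square. The article records are invented, and `crosstab` and `pearson_chi2` are hypothetical helper names. The test treats all articles as independent, which is only reasonable if newspaper turns out not to matter.

```python
from math import erfc, sqrt

# Hypothetical article records: (country, frame_present). Invented counts, 417 in total.
articles = (
    [("A", 1)] * 75 + [("A", 0)] * 135 +
    [("B", 1)] * 118 + [("B", 0)] * 89
)

def crosstab(records):
    """Count (row, col) records into a nested {row: {col: n}} table."""
    table = {}
    for row, col in records:
        table.setdefault(row, {}).setdefault(col, 0)
        table[row][col] += 1
    return table

def pearson_chi2(table):
    """Pearson chi-square for an r x c table, without continuity correction."""
    rows = sorted(table)
    cols = sorted({c for r in table.values() for c in r})
    n = sum(sum(r.values()) for r in table.values())
    row_tot = {r: sum(table[r].values()) for r in rows}
    col_tot = {c: sum(table[r].get(c, 0) for r in rows) for c in cols}
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            chi2 += (table[r].get(c, 0) - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(cols) - 1)
    return chi2, df

table = crosstab(articles)
chi2, df = pearson_chi2(table)

# For df == 1 only, the upper-tail chi-square probability is erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2)) if df == 1 else None

for country in sorted(table):
    present, absent = table[country][1], table[country][0]
    print(f"country {country}: present {present}, absent {absent}")
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.2g}")
```

The same two helpers can be pointed at any pair of variables, for example ideology by format, to see how the IVs relate to each other.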

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Anna
Sent: Thursday, May 15, 2014 4:35 AM
To: [hidden email]
Subject: Re: Best test to use?
