restructure?

Joanne Tsai
Hi, Dear co-listers:
Thank you for your previous responses. They were really helpful.
I am now trying to tackle the following problem.
I have a dataset that looks like the one below.
ID identifies each person. Depending on which group a person is in, he or
she answers a different number of questions, and each answer is true (1) or
false (0). For example, person #1 is in group A and answers questions Q
through q38. Every group is asked the same question, Q, but all the other
questions differ between groups.

ID   GROUP QUESTION ANSWER
1       A       Q       0
1       A       q33     0
1       A       q34     0
1       A       q35     0
1       A       q36     1
1       A       q37     1
1       A       q38     1
2       B       Q       0
2       B       q56     0
2       B       q57     1
2       B       q58     1
3       C       Q       0
3       C       q101    0
3       C       q201    1
3       C       q301    1
3       C       q401    1
3       C       q102    1
3       C       q202    1
3       C       q302    1
4       A       Q       0
4       A       q33     1
4       A       q34     1
4       A       q35     0
4       A       q36     0
4       A       q37     1
4       A       q38     1


I want to restructure the data so that it looks like the table below.
(I don't care about the content of the questions each person is asked, only
about how many there are.) In the end, I'd like to calculate how many
questions each individual was asked and how many yes answers he or she
gave. For example, ID=1 is in group A, was asked 7 questions, and answered
yes to 3 of them.

ID  GROUP  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8
1   A      0   0   0   0   1   1   1
2   B      0   0   1   1
3   C      0   0   1   1   1   1   1   1


Please give me any advice on this topic. I really appreciate your help.

Thanks in advance,

Joanne

Re: restructure?

ViAnn Beadle
You'll get the total count of questions per id and the counts of 0's and 1's
with a simple CROSSTABS of ID by ANSWER.

data list list / ID (f5) Group (a1) Question (a4) Answer (f1).
begin data
rows of data here
end data.
crosstabs /table id by answer.
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Joanne Tsai
Sent: Tuesday, July 10, 2007 1:25 PM
To: [hidden email]
Subject: restructure?


Re: restructure?

Joanne Tsai
In reply to this post by Joanne Tsai
Hi, thank you for the response.
This is the message I got after I ran the crosstabs.


Warnings
The observed number of values for ID exceeds the CROSSTABS limit of 1000
values. To tabulate all values, try the TABLES procedure.
This command is not executed.


Should I now run TABLES or multiple response sets?

Thank you.

-----Original Message-----
From: ViAnn Beadle [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 3:50 PM
To: Joanne Tsai; [hidden email]
Subject: RE: restructure?


Re: restructure?

ViAnn Beadle
Do you have more than 1000 ID values?

You can try the TABLES procedure if you have it (it's not part of the base).
Note that there are no multiple response sets with this data structure. You
have a single column of responses to tabulate here.

What do you ultimately want to do with this information? Perhaps a very
large table is not what you're after.

-----Original Message-----
From: Joanne Tsai [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 2:09 PM
To: ViAnn Beadle; [hidden email]
Subject: RE: restructure?


Re: restructure?

John McConnell
In reply to this post by ViAnn Beadle
Joanne

I think ViAnn's response will give you the crosstab you are looking
for.

If you want to create the file shape that you describe in order to perform
other analyses, then I don't think it is straightforward, and you will
probably need a script (.sbs file) to do it. I have one that does
something similar (below). If you cut and paste it into an SPSS Script
and run it, it might do what you want.

If the cut and paste does not work, I can send you the script as an
attachment to see if that helps.

john

------ SCRIPT STARTS ------

Option Explicit

Const MAX_QUESTIONS As Integer = 100
Const VAR_QUESTION As String = "Question"
Const VAR_ANSWER As String = "Answer"
Const PREFIX_QVAR As String = "QUESTION"
Const BREAK_VAR1 As String = "ID"
Const BREAK_VAR2 As String = "GROUP"
'etc.

Sub Main

Dim strCommand As String
Dim iQuestions As Integer

strCommand = ""

'compute the new variables
For iQuestions = 1 To MAX_QUESTIONS
        strCommand = strCommand & "COMPUTE " & PREFIX_QVAR & _
            CStr(iQuestions) & "=0." & vbCrLf
        'to handle lower and upper case qnn and Qnn...
        strCommand = strCommand & "IF " & VAR_QUESTION & "='Q" & _
            CStr(iQuestions) & "' or " & VAR_QUESTION & "='q" & CStr(iQuestions) & _
            "' " & PREFIX_QVAR & CStr(iQuestions) & "=" & VAR_ANSWER & "." & vbCrLf

Next iQuestions

objSpssApp.ExecuteCommands strCommand, True


'then aggregate to get the summarised file

strCommand = "AGGREGATE OUTFILE=*" & vbCrLf
strCommand = strCommand & " /BREAK " & vbCrLf
strCommand = strCommand & " " & BREAK_VAR1 & vbCrLf
strCommand = strCommand & " " & BREAK_VAR2 & vbCrLf

For iQuestions = 1 To MAX_QUESTIONS

        strCommand = strCommand & " /" & PREFIX_QVAR & CStr(iQuestions) & _
            "=max(" & PREFIX_QVAR & CStr(iQuestions) & ")" & vbCrLf

Next iQuestions
strCommand = strCommand & "."

objSpssApp.ExecuteCommands strCommand, True


End Sub

------ SCRIPT ENDS ------


Re: restructure?

Joanne Tsai
In reply to this post by Joanne Tsai
Hi,
Yes. It's a huge dataset: I have more than 10k ID values and 500
GROUPS.
As stated in my first email, I'd like to calculate the average number of
questions each person is asked, and how many yes/no answers there are for a
given number of questions.
See below: groups A and C were both asked 8 questions. I would like to
compare the average number of yes answers a person gave when 8 questions
were asked with the average when 4 questions were asked, as in group B. I
hope this is clear enough. Thank you so much for your help.



Before data:
ID   GROUP QUESTION ANSWER
1       A       Q       0
1       A       q33     0
1       A       q34     0
1       A       q35     0
1       A       q36     1
1       A       q37     1
1       A       q38     1
2       B       Q       0
2       B       q56     0
2       B       q57     1
2       B       q58     1
3       C       Q       0
3       C       q101    0
3       C       q201    1
3       C       q301    1
3       C       q401    1
3       C       q102    1
3       C       q202    1
3       C       q302    1
4       A       Q       0
4       A       q33     1
4       A       q34     1
4       A       q35     0
4       A       q36     0
4       A       q37     1
4       A       q38     1


The data I want to get:

ID  GROUP  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8
1   A      0   0   0   0   1   1   1
2   B      0   0   1   1
3   C      0   0   1   1   1   1   1   1
4   A      0   1   1   0   0   0   1

-----Original Message-----
From: ViAnn Beadle [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 4:28 PM
To: Joanne Tsai
Cc: [hidden email]
Subject: RE: restructure?


Re: restructure?

Fry, Jonathan B.
In reply to this post by Joanne Tsai
Here is an example job.  With your full dataset, the Compute for NYES (number of Yes answers) will need a different ending variable.

You could directly compute the numbers of questions answered and the number of yes answers using AGGREGATE, but I suspect you will have more questions for these data.  If so, restructuring by subject makes sense.

Jonathan Fry
SPSS Inc.

data list list/ID(f5)   GROUP(A1) QUESTION(a4) ANSWER (f1).
begin data
1       A       Q       0
1       A       q33     0
1       A       q34     0
1       A       q35     0
1       A       q36     1
1       A       q37     1
1       A       q38     1
2       B       Q       0
2       B       q56     0
2       B       q57     1
2       B       q58     1
3       C       Q       0
3       C       q101    0
3       C       q201    1
3       C       q301    1
3       C       q401    1
3       C       q102    1
3       C       q202    1
3       C       q302    1
4       A       Q       0
4       A       q33     1
4       A       q34     1
4       A       q35     0
4       A       q36     0
4       A       q37     1
4       A       q38     1
end data.
CASESTOVARS
 /ID = ID
 /INDEX = QUESTION
 /GROUPBY = INDEX
 /COUNT = nq "Number of questions answered" .
compute nyes = sum(q to q58).
format nyes(f3).
var label nyes 'Number of Yes answers'.
execute.
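
The direct AGGREGATE route mentioned above could look like the sketch below. It is only an illustration: it assumes the long file read by DATA LIST (ID, GROUP, QUESTION, ANSWER) is the active dataset, so it would be run in place of CASESTOVARS rather than after it, and the names nq and nyes are simply reused from the job above. Because ANSWER is coded 0/1, summing it counts the Yes answers.

```spss
* Sketch only, run on the long file before any restructuring.
AGGREGATE OUTFILE=*
  /BREAK=ID GROUP
  /nq = N
  /nyes = SUM(ANSWER).
VARIABLE LABELS nq 'Number of questions answered'
  /nyes 'Number of Yes answers'.
FORMATS nq nyes (F3).
LIST.
```

Note that OUTFILE=* replaces the active dataset with one row per ID and GROUP, so the question-level detail is gone afterwards. For ID 1 in the sample data this should give nq = 7 and nyes = 3, matching the counts in the original question.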

Re: restructure?

ViAnn Beadle
In reply to this post by Joanne Tsai
This is a different description from your original question. Do you want
the average number of yes answers for a & c pooled together vs. b? Or do you
want an average for a, an average for b, and an average for c?
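
Both readings are easy to tabulate once there is one row per person. A minimal sketch, assuming a per-person file that holds GROUP together with the nq (questions asked) and nyes (Yes answers) variables built by the example job earlier in this thread; if the variables are named differently, adjust accordingly:

```spss
* Assumes one row per person with GROUP, nq and nyes already computed.
* Reading 1, a separate average for each group (a, b, c).
MEANS TABLES=nyes BY GROUP
  /CELLS=MEAN COUNT.
* Reading 2, groups pooled by the number of questions asked.
MEANS TABLES=nyes BY nq
  /CELLS=MEAN COUNT.
```

In the second table, groups that asked the same number of questions fall into the same row, which is the pooled comparison.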

-----Original Message-----
From: Joanne Tsai [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 3:05 PM
To: ViAnn Beadle
Cc: [hidden email]
Subject: RE: restructure?


Re: restructure?

Joanne Tsai
In reply to this post by Joanne Tsai
Well, I figured that once I have the data in the shape I am after, I can
aggregate it by GROUP or ID if I want to. GROUP is the survey question
group, so ID=1 can appear in both group A and group C if he or she took
both surveys.
ID  GROUP  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8
1   A      0   0   0   0   1   1   1
2   B      0   0   1   1
3   C      0   0   1   1   1   1   1   1
4   A      0   1   1   0   0   0   1
1   C      0   0   1   0

Someone told me to try pivoting my original data, but I am not sure how to
do so, as I am fairly new to this program.

Thank you guys.
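
If a person can appear in more than one group, one way to pivot is to let CASESTOVARS treat each ID and GROUP combination as its own row. The following is a sketch under assumptions, not a tested job: it assumes the long file (ID, GROUP, QUESTION, ANSWER) is the active dataset, drops QUESTION because only the number of questions matters, and relies on CASESTOVARS numbering the answers in file order when no INDEX is given (by default the new variables would be named ANSWER.1, ANSWER.2, and so on). The names nq and nyes are illustrative.

```spss
* Sketch only, one output row per ID and GROUP combination.
SORT CASES BY ID GROUP.
CASESTOVARS
  /ID = ID GROUP
  /COUNT = nq "Number of questions answered"
  /DROP = QUESTION.
* ANSWER.8 fits the sample data, use the last ANSWER variable in the real file.
COMPUTE nyes = SUM(ANSWER.1 TO ANSWER.8).
EXECUTE.
```

The resulting file can then be aggregated by GROUP or by ID as needed.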



-----Original Message-----
From: ViAnn Beadle [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 5:13 PM
To: Joanne Tsai
Cc: [hidden email]
Subject: RE: restructure?

This is a a different description from your original question. Do you
want
average numbers of yes questions for a & c pooled together vs. b? Or do
you
want an average for a, an average for b, and an average for c.

-----Original Message-----
From: Joanne Tsai [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 3:05 PM
To: ViAnn Beadle
Cc: [hidden email]
Subject: RE: restructure?

Hi,
Yeah. It's a huge dataset, with more than 10k ID values and 500 GROUPS.
As stated in my first email, I'd love to calculate the average number of
questions each person gets asked, and how many yes/no answers there are
for a given number of questions.
See below: Group A and C both asked 8 questions. I would like to compute
the average number of yes answers a person gave when 8 questions were
asked vs. when 4 questions were asked, as in group B. I hope I have
explained this clearly enough. Thank you so much for your help.



Before data:
ID   GROUP QUESTION ANSWER
1       A       Q       0
1       A       q33     0
1       A       q34     0
1       A       q35     0
1       A       q36     1
1       A       q37     1
1       A       q38     1
2       B       Q       0
2       B       q56     0
2       B       q57     1
2       B       q58     1
3       C       Q       0
3       C       q101    0
3       C       q201    1
3       C       q301    1
3       C       q401    1
3       C       q102    1
3       C       q202    1
3       C       q302    1
4       A       Q       0
4       A       q33     1
4       A       q34     1
4       A       q35     0
4       A       q36     0
4       A       q37     1
4       A       q38     1


The data I want to get:

ID  GROUP  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8
1   A      0   0   0   0   1   1   1
2   B      0   0   1   1
3   C      0   0   1   1   1   1   1   1
4   A      0   1   1   0   0   1   1

-----Original Message-----
From: ViAnn Beadle [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 4:28 PM
To: Joanne Tsai
Cc: [hidden email]
Subject: RE: restructure?

Do you have more than 1000 ID values?

You can try the TABLES procedure if you have it (it's not part of the
base). Note that there are no multiple response sets with this data
structure. You have a single column of responses to tabulate here.

What do you ultimately want to do with this information? Perhaps a very
large table is not what you're after.

-----Original Message-----
From: Joanne Tsai [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 2:09 PM
To: ViAnn Beadle; [hidden email]
Subject: RE: restructure?

Hi, thank you for the response.
This is the message I got after I ran the crosstabs.


Warnings
The observed number of values for ID exceeds the CROSSTABS limit of 1000
values. To tabulate all values, try the TABLES procedure.
This command is not executed.


Do I now run the tables/ multiple response sets?

Thank you.

-----Original Message-----
From: ViAnn Beadle [mailto:[hidden email]]
Sent: Tuesday, July 10, 2007 3:50 PM
To: Joanne Tsai; [hidden email]
Subject: RE: restructure?

You'll get the total count of questions per ID and the counts of 0's and
1's with a simple CROSSTABS of ID by ANSWER.

data list list / ID (f1) Group(a1) Question(a4) Answer(f1).
begin data
rows of data here
end data.
crosstabs /table id by answer.

Re: restructure?

Richard Ristow
In reply to this post by Joanne Tsai
At 03:24 PM 7/10/2007, Joanne Tsai wrote:

The following is SPSS 15 draft output (WRR:not saved separately), with
your questions in comments.

>I have a dataset looks like this:

|-----------------------------|---------------------------|
|Output Created               |10-JUL-2007 18:14:06       |
|-----------------------------|---------------------------|
[TestData]

   ID GROUP QUESTION ANSWER

    1 A     Q           0
    1 A     q33         0
    1 A     q34         0
    1 A     q35         0
    1 A     q36         1
    1 A     q37         1
    1 A     q38         1
    2 B     Q           0
    2 B     q56         0
    2 B     q57         1
    2 B     q58         1
    3 C     Q           0
    3 C     q101        0
    3 C     q201        1
    3 C     q301        1
    3 C     q401        1
    3 C     q102        1
    3 C     q202        1
    3 C     q302        1
    4 A     Q           0
    4 A     q33         1
    4 A     q34         1
    4 A     q35         0
    4 A     q36         0
    4 A     q37         1
    4 A     q38         1

Number of cases read:  26    Number of cases listed:  26


*  ................................................................. .
*  I.     "I want to manipulate the data so it would look like this:".

*  I don't think this is very useful. Among other things, it gives   .
*  all the answers with no indication what questions they're answers .
*  to, though that could be recovered from GROUP.                    .

*  If you only need this for the second calculation, skip it; the    .
*  second is easier without it. If you need it for other reasons,    .
*  you may want to think (and perhaps post about) what you're going  .
*  to use it for. It may be better structured some other way.        .

DATASET ACTIVATE TestData.
DATASET COPY     WideForm.
DATASET ACTIVATE WideForm  WINDOW=FRONT.

*  I don't think this form of CASESTOVARS can be clicked up from the .
*  menus. I edited it as syntax, from a clicked-up beginning.        .

CASESTOVARS
    /ID        = ID GROUP
    /DROP      = QUESTION
    /RENAME      ANSWER = Q
    /SEPARATOR = ''
    /GROUPBY = VARIABLE .


Cases to Variables
|-----------------------------|---------------------------|
|Output Created               |10-JUL-2007 18:14:07       |
|-----------------------------|---------------------------|
[WideForm]

Generated Variables
|---------|------|
|Original |Result|
|Variable |------|
|         |Name  |
|-------|-|------|
|ANSWER |1|Q1    |
|       |2|Q2    |
|       |3|Q3    |
|       |4|Q4    |
|       |5|Q5    |
|       |6|Q6    |
|       |7|Q7    |
|       |8|Q8    |
|-------|-|------|

Processing Statistics
|---------------|---|
|Cases In       |26 |
|Cases Out      |4  |
|---------------|---|
|Cases In/      |6.5|
|Cases Out      |   |
|---------------|---|
|Variables In   |4  |
|Variables Out  |10 |
|---------------|---|
|Index Values   |8  |
|---------------|---|


LIST.

List

|-----------------------------|---------------------------|
|Output Created               |10-JUL-2007 18:14:08       |
|-----------------------------|---------------------------|
[WideForm]

   ID GROUP Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8

    1 A      0  0  0  0  1  1  1  .
    2 B      0  0  1  1  .  .  .  .
    3 C      0  0  1  1  1  1  1  1
    4 A      0  1  1  0  0  1  1  .


Number of cases read:  4    Number of cases listed:  4
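
If the wide form is kept for other reasons, the two counts asked for can
also be derived from it directly. The following is only a minimal sketch,
not part of the original run: it assumes the generated variables are
contiguous in the file and run from Q1 to Q8 as in the listing above (with
the real data the last variable name would differ), and Q_ASKED and
Q_ANSYES are illustrative names.

```spss
*  Sketch only: assumes WideForm holds Q1 to Q8 as listed above      .
DATASET ACTIVATE WideForm.

*  NVALID counts the non-missing answers, i.e. the questions asked   .
COMPUTE Q_ASKED  = NVALID(Q1 TO Q8).

*  With 0/1 coding, the sum of the answers is the number of yes's    .
COMPUTE Q_ANSYES = SUM(Q1 TO Q8).

FORMATS Q_ASKED Q_ANSYES (F3).
LIST VARIABLES = ID GROUP Q_ASKED Q_ANSYES.
```

For the four test cases this should list 7 and 3, 4 and 2, 8 and 6, and 7
and 4, since the system-missing cells are skipped by both functions.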


*  ................................................................. .
*  II.    "I'd like to calculate how many questions of each          .
*  individual get asked, and how many yes answers he/she answered.   .
*  For example, for ID=1, he is in group A, and he had 7 questions,  .
*  and he answered 3 yes's."                                         .

*  This is most easily done without the previous restructuring.      .
*  Going back to the original input form:                            .

DATASET ACTIVATE TestData.
DATASET COPY     Aggregate.
DATASET ACTIVATE Aggregate WINDOW=FRONT.
AGGREGATE OUTFILE=*
    /BREAK   = ID GROUP
    /Q_ASKED  'No. of questions asked'          = NU
    /Q_ANSYES "No. of questions answered 'yes'" = SUM(ANSWER).

FORMATS Q_ASKED Q_ANSYES (F3).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |10-JUL-2007 18:14:09       |
|-----------------------------|---------------------------|
   ID GROUP Q_ASKED Q_ANSYES

    1 A         7        3
    2 B         4        2
    3 C         8        6
    4 A         7        4

Number of cases read:  4    Number of cases listed:  4
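
The later request, the average number of yes answers by number of
questions asked, can be sketched from this aggregated file. This is a
hedged illustration rather than part of the original output; it assumes
the dataset named Aggregate created above is still available and uses the
MEANS procedure as one possible way to tabulate it.

```spss
*  Sketch only: assumes the dataset Aggregate built above            .
DATASET ACTIVATE Aggregate.

*  Mean number of yes answers per person, by number of questions     .
MEANS TABLES = Q_ANSYES BY Q_ASKED
    /CELLS = MEAN COUNT.
```

With the test data, the cell means should be 2 for 4 questions (one
person), 3.5 for 7 questions (two people) and 6 for 8 questions (one
person).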


=================================
APPENDIX: Test data, and all code
=================================
DATA LIST LIST SKIP = 1/
     ID   GROUP QUESTION ANSWER
    (F4,    A1,       A6,   F2).
BEGIN DATA.
     ID   GROUP QUESTION ANSWER
     1       A       Q       0
     1       A       q33     0
     1       A       q34     0
     1       A       q35     0
     1       A       q36     1
     1       A       q37     1
     1       A       q38     1
     2       B       Q       0
     2       B       q56     0
     2       B       q57     1
     2       B       q58     1
     3       C       Q       0
     3       C       q101    0
     3       C       q201    1
     3       C       q301    1
     3       C       q401    1
     3       C       q102    1
     3       C       q202    1
     3       C       q302    1
     4       A       Q       0
     4       A       q33     1
     4       A       q34     1
     4       A       q35     0
     4       A       q36     0
     4       A       q37     1
     4       A       q38     1
END DATA.
DATASET NAME     TestData  WINDOW=FRONT.

LIST.


*  ................................................................. .
*  I.     "I want to manipulate the data so it would look like this:".

*  I don't think this is very useful. Among other things, it gives   .
*  all the answers with no indication what questions they're answers .
*  to, though that could be recovered from GROUP.                    .

*  If you only need this for the second calculation, skip it; the    .
*  second is easier without it. If you need it for other reasons,    .
*  you may want to think (and perhaps post about) what you're going  .
*  to use it for. It may be better structured some other way.        .

DATASET ACTIVATE TestData.
DATASET COPY     WideForm.
DATASET ACTIVATE WideForm  WINDOW=FRONT.

*  I don't think this form of CASESTOVARS can be clicked up from the .
*  menus. I edited it as syntax, from a clicked-up beginning.        .

CASESTOVARS
    /ID        = ID GROUP
    /DROP      = QUESTION
    /RENAME      ANSWER = Q
    /SEPARATOR = ''
    /GROUPBY = VARIABLE .

LIST.

*  ................................................................. .
*  II.    "I'd like to calculate how many questions of each          .
*  individual get asked, and how many yes answers he/she answered.   .
*  For example, for ID=1, he is in group A, and he had 7 questions,  .
*  and he answered 3 yes's."                                         .

*  This is most easily done without the previous restructuring.      .
*  Going back to the original input form:                            .

DATASET ACTIVATE TestData.
DATASET COPY     Aggregate.
DATASET ACTIVATE Aggregate WINDOW=FRONT.

AGGREGATE OUTFILE=*
    /BREAK   = ID GROUP
    /Q_ASKED  'No. of questions asked'          = NU
    /Q_ANSYES "No. of questions answered 'yes'" = SUM(ANSWER).

FORMATS Q_ASKED Q_ANSYES (F3).
LIST.

Re: restructure?

ViAnn Beadle
In reply to this post by Joanne Tsai
I don't think you answered my question.

You do NOT need to restructure from narrow to wide if you are not interested
in answers to particular questions.

Note that, given your 0/1 coding, the aggregated sum of the ANSWER variable
by ID is simply the number of yes answers. If you want the mean for each
ID, then the aggregated mean of ANSWER gives you that. If you then want
that average computed across groups, all you need to do is run the
SUMMARIZE command. This will be the mean of means and not the grand mean.

So here's another question--do you want the grand mean or the mean of means?
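
To make the distinction concrete, here is a minimal sketch of both
calculations. It is only an illustration under assumptions: the long-format
data is taken to be a dataset named TestData (the name used in Richard
Ristow's post), PerPerson and P_YES are made-up names, GROUP is assumed to
be a short string or numeric variable, and MEANS is swapped in for
SUMMARIZE purely for brevity.

```spss
*  Sketch only: TestData, PerPerson and P_YES are assumed names      .

*  Mean of means: one row per ID within GROUP, then average the rows .
DATASET ACTIVATE TestData.
DATASET COPY     PerPerson.
DATASET ACTIVATE PerPerson WINDOW=FRONT.
AGGREGATE OUTFILE=*
    /BREAK = ID GROUP
    /P_YES 'Proportion answered yes' = MEAN(ANSWER).
MEANS TABLES = P_YES BY GROUP
    /CELLS = MEAN COUNT.

*  Grand mean: average the raw 0/1 answers directly                  .
DATASET ACTIVATE TestData.
MEANS TABLES = ANSWER BY GROUP
    /CELLS = MEAN COUNT.
```

On the four test cases the two agree within each group, because everyone
in a group was asked the same number of questions, but the totals differ:
the mean of the four per-person proportions is 0.5625, while the grand
mean is 15 yes answers out of 26, about 0.577.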



Re: restructure?

Joanne Tsai
In reply to this post by Joanne Tsai
Hi, thank you for all of your responses. Yeah, I realized I did not need
to restructure the dataset in order to compute the mean of the means.
(Thanks to you, Mr. Ristow, and Mr. Fry.)
It's just how it had been done here previously, and I wanted to learn
how to make the dataset look exactly the same. So thank you all, I
definitely learned something very useful about the CASESTOVARS command.

Joanne

