Combining Columns in SPSS

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Combining Columns in SPSS

Radhi
Hi everyone,

I would like to gather people’s ideas on how to combine data as below.

screenshot#1 in the file attached, I would like to combine several
multiresponse into one single answer variable as below.

<http://spssx-discussion.1045642.n5.nabble.com/file/t341350/screenshot%231.png>

<http://spssx-discussion.1045642.n5.nabble.com/file/t341350/screenshot%232.png>

I have done manually for several IDs and would like to ask if there is any
other way or syntax of solving this task. Thanks!



Regards,
Radhi



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

David Marso-2
See VECTOR , LOOP, CHAR.SUBSTR, CONCAT
Untested ( no SPSS on this tablet device)
you need to fill D...Z.

STRING #CHARS="ABCD...Z".
STRING result (A26).
VECTOR V=var1 TO var26 .
LOOP #=1 TO 26.
IF (V(#) EQ 1) result=CONCAT("+", RTRIM( result), CHAR.SUBSTR(#CHARS,#,1)).
END LOOP.
COMPUTE result = CHAR.SUBSTR(result,2).

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

John F Hall
In reply to this post by Radhi

Radhi

I'm trying to think of a way round your problem.

How many cases do you have?

Why are your variables not named in order of letters?

I've reproduced your sample data, but I can't get David Marso's syntax to work.

 Here is a very crude method using MULT RESPONSE:

 

mult resp groups =

    prod "Product chosen"

    (pr13a_1 to pr13a_88 (1))

    /freq prod

    / tab prod by prod

    / prod by prod by prod.

 

prod Frequencies

 

Responses

Percent of Cases

N

Percent

Product chosena

B

2

25.0%

66.7%

F

3

37.5%

100.0%

G

1

12.5%

33.3%

U

2

25.0%

66.7%

Total

8

100.0%

266.7%

a. Dichotomy group tabulated at value 1.

 

2-way table

prod*prod Crosstabulation

 

Product chosena

Total

B

F

G

U

Product chosena

B

Count

2

2

0

2

2

F

Count

2

3

1

2

3

G

Count

0

1

1

0

1

U

Count

2

2

0

2

2

Total

Count

2

3

1

2

3

Percentages and totals are based on respondents.

 

3-way table . . from which you can do the calculation manually if you don't have too many cases.

 

a. Dichotomy group tabulated at value 1.

 

prod*prod*prod Crosstabulation

Product chosena

Product chosena

Total

B

F

G

U

B

Product chosena

B

Count

2

2

 

2

2

F

Count

2

2

 

2

2

U

Count

2

2

 

2

2

Total

Count

2

2

 

2

2

F

Product chosena

B

Count

2

2

0

2

2

F

Count

2

3

1

2

3

G

Count

0

1

1

0

1

U

Count

2

2

0

2

2

Total

Count

2

3

1

2

3

G

Product chosena

F

Count

 

1

1

 

1

G

Count

 

1

1

 

1

Total

Count

 

1

1

 

1

U

Product chosena

B

Count

2

2

 

2

2

F

Count

2

2

 

2

2

U

Count

2

2

 

2

2

Total

Count

2

2

 

2

2

Percentages and totals are based on respondents.

a. Dichotomy group tabulated at value 1.

 

Programming in ordinary syntax to do exactly what you want for each case is very complex, but I’m thinking about ways to do it.

 

John F Hall

[Retired academic survey researcher]

IBM-SPSS Academic Author 9900074

 

Website:          http://surveyresearch.weebly.com/

SPSS course:   http://surveyresearch.weebly.com/1-survey-analysis-workshop-spss.html

Research:        http://surveyresearch.weebly.com/3-subjective-social-indicators-quality-of-life.html

 

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Radhi
Sent: 15 November 2017 02:14
To: [hidden email]
Subject: Combining Columns in SPSS

 

Hi everyone,

 

I would like to gather people’s ideas on how to combine data as below.

 

screenshot#1 in the file attached, I would like to combine several multiresponse into one single answer variable as below.

 

<http://spssx-discussion.1045642.n5.nabble.com/file/t341350/screenshot%231.png>

 

<http://spssx-discussion.1045642.n5.nabble.com/file/t341350/screenshot%232.png>

 

I have done manually for several IDs and would like to ask if there is any other way or syntax of solving this task. Thanks!

 

 

 

Regards,

Radhi

 

 

 

--

Sent from: http://spssx-discussion.1045642.n5.nabble.com/

 

=====================

To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

David Marso-2
In reply to this post by Radhi
Tested this time.
A few slight changes:
--
NEW FILE.
DATASET CLOSE ALL.
MATRIX.
SAVE MAKE(111,26,0)/OUTFILE =*/VARIABLES var1 TO var26 .
END MATRIX.
DO REPEAT v=var1 TO var26 .
COMPUTE v=RV.BERNOULLI(.20).
END REPEAT.
FORMATS var1 TO var26 (F1.0).
VARIABLE WIDTH var1 TO var26(1).

STRING #CHARS(A26).
COMPUTE #CHARS="ABCDEFGHIJKLMNOPQRSTUVWXYZ".
STRING result (A26).
VECTOR V=var1 TO var26 .
LOOP #=1 TO 26.
IF (V(#) EQ 1) result=CONCAT( RTRIM( result), CHAR.SUBSTR(#CHARS,#,1),"+").
END LOOP.
COMPUTE result = CHAR.SUBSTR(result,1,LENGTH(RTRIM( result))-1 ).
EXECUTE.
LIST CASES=10.

 
va va va va va va va va va var var var var var var var var var var var var
r1 r2 r3 r4 r5 r6 r7 r8 r9 10  11  12  13  14  15  16  17  18  19  20  21  var22 var23 var24 var25 var26 result
 
 0  0  0  1  0  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0   0    0     0     0     0     0   D+O
 0  0  0  1  0  0  0  1  0  0   0   0   0   0   0   0   1   0   1   0   0    1     0     0     1     1   D+H+Q+S+V+Y+Z
 0  0  0  0  0  0  1  0  0  1   1   0   0   0   0   0   0   0   1   0   0    0     0     0     0     1   G+J+K+S+Z
 0  0  0  0  1  0  0  1  0  0   0   0   0   0   0   0   1   0   1   0   0    1     0     0     0     0   E+H+Q+S+V
 0  0  0  0  0  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0   0    0     1     1     0     0   O+W+X
 0  0  1  0  0  1  0  0  0  0   0   0   1   1   0   0   0   1   0   0   0    0     1     0     1     0   C+F+M+N+R+W+Y
 0  0  0  0  0  0  0  0  0  1   0   0   1   0   0   0   0   0   0   0   0    0     0     1     1     0   J+M+X+Y
 0  0  0  0  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0    0     0     0     1     0   L+Y
 0  0  0  1  0  1  0  0  1  1   0   0   1   0   0   0   1   0   0   0   0    0     0     0     0     1   D+F+I+J+M+Q+Z
 0  0  0  0  0  0  1  0  1  1   0   0   0   0   0   0   0   0   0   0   0    0     0     0     0     1   G+I+J+Z
 
 
Number of cases read:  10    Number of cases listed:  10

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

Jon Peck
Just for fun.

begin program.
import string
def select(*x):
    return "+".join([string.ascii_uppercase[i] for i, item in enumerate(x) if item == 1])
end program.

spssinc trans result = result2 type=26 /variables var1 to var26 /formula "select(<>)".

On Thu, Nov 16, 2017 at 9:10 AM, David Marso <[hidden email]> wrote:
Tested this time.
A few slight changes:
--
NEW FILE.
DATASET CLOSE ALL.
MATRIX.
SAVE MAKE(111,26,0)/OUTFILE =*/VARIABLES var1 TO var26 .
END MATRIX.
DO REPEAT v=var1 TO var26 .
COMPUTE v=RV.BERNOULLI(.20).
END REPEAT.
FORMATS var1 TO var26 (F1.0).
VARIABLE WIDTH var1 TO var26(1).

STRING #CHARS(A26).
COMPUTE #CHARS="ABCDEFGHIJKLMNOPQRSTUVWXYZ".
STRING result (A26).
VECTOR V=var1 TO var26 .
LOOP #=1 TO 26.
IF (V(#) EQ 1) result=CONCAT( RTRIM( result), CHAR.SUBSTR(#CHARS,#,1),"+").
END LOOP.
COMPUTE result = CHAR.SUBSTR(result,1,LENGTH(RTRIM( result))-1 ).
EXECUTE.
LIST CASES=10.


va va va va va va va va va var var var var var var var var var var var var
r1 r2 r3 r4 r5 r6 r7 r8 r9 10  11  12  13  14  15  16  17  18  19  20  21  var22 var23 var24 var25 var26 result

 0  0  0  1  0  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0   0    0     0     0     0     0   D+O
 0  0  0  1  0  0  0  1  0  0   0   0   0   0   0   0   1   0   1   0   0    1     0     0     1     1   D+H+Q+S+V+Y+Z
 0  0  0  0  0  0  1  0  0  1   1   0   0   0   0   0   0   0   1   0   0    0     0     0     0     1   G+J+K+S+Z
 0  0  0  0  1  0  0  1  0  0   0   0   0   0   0   0   1   0   1   0   0    1     0     0     0     0   E+H+Q+S+V
 0  0  0  0  0  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0   0    0     1     1     0     0   O+W+X
 0  0  1  0  0  1  0  0  0  0   0   0   1   1   0   0   0   1   0   0   0    0     1     0     1     0   C+F+M+N+R+W+Y
 0  0  0  0  0  0  0  0  0  1   0   0   1   0   0   0   0   0   0   0   0    0     0     1     1     0   J+M+X+Y
 0  0  0  0  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0    0     0     0     1     0   L+Y
 0  0  0  1  0  1  0  0  1  1   0   0   1   0   0   0   1   0   0   0   0    0     0     0     0     1   D+F+I+J+M+Q+Z
 0  0  0  0  0  0  1  0  1  1   0   0   0   0   0   0   0   0   0   0   0    0     0     0     0     1   G+I+J+Z


Number of cases read:  10    Number of cases listed:  10

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

David Marso-2
In reply to this post by Radhi
Jon,
  Just for fun would you mind unpacking that for the less python inclined users?
David

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

Jon Peck
Sure.

SPSSINC TRANS is an extension command implemented in Python.
spssinc trans result = result2 type=26 /variables var1 to var26 /formula "select(<>)".

It can be installed from the Utility or Extension menu depending on the Statistics version, if it isn't already installed.

The parameters used here are the result variable name - result2, the result type - a 26-byte string, a TO construct - var1 to var26, and the formula.  The formula says to call a function named select with arguments of all the variables implied in the variables subcommand as indicated by the <>.  Writing it out, it would be select(var1, var2,..., var26).

Since the formula function did not specify a module, it is assumed to be defined in the current implicit module, which is the function defined in the begin program block.  If it had been something like utility.select, SPSSINC TRANS would have imported a module named utility and made it contents accessible.

So, now the function.

import string
def select(*x):
    return "+".join([string.ascii_uppercase[i] for i, item in enumerate(x) if item == 1])

The import statement loads the string module, which contains, among many other things, a constant named ascii_uppercase, which is just ABCDEFG... and can be indexed to pick out the ith letter.

The def statement defines the function, which takes an argument which is the list of variable names.  The * causes Python to treat the arguments as a single list var1, ...

The key part is the return value, which uses what Python calls a list comprehension.  This is a short way to write a do loop.  So it iterates over the list containing all the variable values, extracting the index and value by applying the enumerate function, but the if condition limits the values to those that are 1.  Then the string.ascii_uppercase[i] term picks out the ith letter from that alphabetical list made available by the import statement.

The result is a list of the letters corresponding to the 1 values, which is then turned into a single string by the join function using "+" as the separator character.  That becomes the value of result2 for that case.

It is possible that there are no values that meet the if criterion item = 1, in which case the join of the empty list  is just "".  (This case isn't handled in the Statistics code version, which results in a mostly harmless error message.)

The formula subcommand of SPSSINC TRANS can take a Python function call directly without the need to define the function separately, but this only works for simple code of the form module.function(arglist), so it was necessary to define the function as here and just reference it in the formula subcommand.

On Thu, Nov 16, 2017 at 10:50 AM, David Marso <[hidden email]> wrote:
Jon,
  Just for fun would you mind unpacking that for the less python inclined users?
David




--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Combining Columns in SPSS

Mike Donatello
This was really a useful exchange. Thanks!

On Thu, Nov 16, 2017 at 1:14 PM, Jon Peck <[hidden email]> wrote:
Sure.

SPSSINC TRANS is an extension command implemented in Python.
spssinc trans result = result2 type=26 /variables var1 to var26 /formula "select(<>)".

It can be installed from the Utility or Extension menu depending on the Statistics version, if it isn't already installed.

The parameters used here are the result variable name - result2, the result type - a 26-byte string, a TO construct - var1 to var26, and the formula.  The formula says to call a function named select with arguments of all the variables implied in the variables subcommand as indicated by the <>.  Writing it out, it would be select(var1, var2,..., var26).

Since the formula function did not specify a module, it is assumed to be defined in the current implicit module, which is the function defined in the begin program block.  If it had been something like utility.select, SPSSINC TRANS would have imported a module named utility and made it contents accessible.

So, now the function.

import string
def select(*x):
    return "+".join([string.ascii_uppercase[i] for i, item in enumerate(x) if item == 1])

The import statement loads the string module, which contains, among many other things, a constant named ascii_uppercase, which is just ABCDEFG... and can be indexed to pick out the ith letter.

The def statement defines the function, which takes an argument which is the list of variable names.  The * causes Python to treat the arguments as a single list var1, ...

The key part is the return value, which uses what Python calls a list comprehension.  This is a short way to write a do loop.  So it iterates over the list containing all the variable values, extracting the index and value by applying the enumerate function, but the if condition limits the values to those that are 1.  Then the string.ascii_uppercase[i] term picks out the ith letter from that alphabetical list made available by the import statement.

The result is a list of the letters corresponding to the 1 values, which is then turned into a single string by the join function using "+" as the separator character.  That becomes the value of result2 for that case.

It is possible that there are no values that meet the if criterion item = 1, in which case the join of the empty list  is just "".  (This case isn't handled in the Statistics code version, which results in a mostly harmless error message.)

The formula subcommand of SPSSINC TRANS can take a Python function call directly without the need to define the function separately, but this only works for simple code of the form module.function(arglist), so it was necessary to define the function as here and just reference it in the formula subcommand.

On Thu, Nov 16, 2017 at 10:50 AM, David Marso <[hidden email]> wrote:
Jon,
  Just for fun would you mind unpacking that for the less python inclined users?
David




--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD



--
Mike Donatello
703.582.5680
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD