SPSSX Discussion

Writing syntax for 500 times

John Watson-12

Writing syntax for 500 times

Team:
I have a complex problem and need a solution as soon as possible. I have 500 x variables, x1 to x500, each with the value 0 or 1. I also have 500 ID variables (id1 to id500) that contain duplicate IDs; the values are the same across the ID variables. I need to find the unique cases for each ID variable among the cases where the corresponding x value is 1.
For example, for the first pair, ID1 and X1, I would first select cases with x1=1 and then run the usual duplicate-identification syntax, which gives the frequency of the primary-first indicator. I do not want to repeat this 500 times. Is there a shortcut for this? I am not very familiar with loops; otherwise I would have tried one.
ID1 X1 ID2 X2
1 1 1 0
1 0 1 1
2 1 2 1
3 1 3 1
4 0 4 1
4 0 4 0
John

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Richard Ristow

Re: Writing syntax for 500 times

I don't entirely understand your problem, but let me see what I can do with it.

At 01:01 PM 8/23/2008, John Watson wrote:

>I have 500 x variable - say it x1 to x500. The value in x is 0 or 1.
>I also have another 500 ID variable (id1 to id500) but has
>duplicates ids. Values [are the] same across ID variable.

That is (also looking at your sample data), all 500 ID variables in
one case have the same value; and an ID value may occur in more than
one case in the file:

ID1 X1 ID2 X2
1 1 1 0
1 0 1 1
2 1 2 1
3 1 3 1
4 0 4 1
4 0 4 0

>Now, I need to find out unique cases for each id variable for cases
>where corresponding x value is 1. i.e. for the first set of ID1 and
>X1, I will first select if x1=1 and then run typical syntax of
>identifying duplicates which will give primary first frequency.

Here, I don't understand you. Could you give the results you'd like
to see from the sample data you've given us?

-Best of luck and best wishes,
Richard

Albert-Jan Roskam

Re: Writing syntax for 500 times

In reply to this post by John Watson-12

Hi,

I read your post about five times, and I still don't entirely get it. Maybe it's because I'm reading it from a 9" screen ;-)

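* Restructure to one case per (index, id, x) triple.
* TO follows file order, so id1..id500 (and x1..x500) must be adjacent in the file.
varstocases
/make id from id1 to id500
/make x from x1 to x500
/index = index.

* Within each index and id, put any x = 1 case first.
sort cases by index (a) id (a) x (d).
* Flag the first case of each (index, id) group.
* LAG is missing on the very first case, so the IF is skipped and the flag stays 1.
compute first_in_grp = 1.
if (index = lag(index) and id = lag(id)) first_in_grp = 0.
select if first_in_grp = 1 and x = 1.
execute.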

... is this what you're looking for?

Cheers!!
Albert-Jan
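
For readers who want to check the restructure-and-select idea outside SPSS, here is a minimal Python sketch of the same steps on the six sample cases from the original post. It assumes the goal is one retained record per distinct id within each id/x pair, kept only when some case for that id has x = 1; the names `wide`, `N_PAIRS`, `long_rows` and `kept` are illustrative and not from the thread.

```python
from collections import Counter

# Sample data from the original post, columns ID1 X1 ID2 X2.
wide = [
    {"id1": 1, "x1": 1, "id2": 1, "x2": 0},
    {"id1": 1, "x1": 0, "id2": 1, "x2": 1},
    {"id1": 2, "x1": 1, "id2": 2, "x2": 1},
    {"id1": 3, "x1": 1, "id2": 3, "x2": 1},
    {"id1": 4, "x1": 0, "id2": 4, "x2": 1},
    {"id1": 4, "x1": 0, "id2": 4, "x2": 0},
]
N_PAIRS = 2  # 500 in the real file (assumption: pairs are numbered 1..N)

# Wide to long: one (index, id, x) record per original case and pair.
long_rows = [
    (k, row[f"id{k}"], row[f"x{k}"])
    for row in wide
    for k in range(1, N_PAIRS + 1)
]

# Sort by index and id ascending, x descending, so an x = 1 record
# (if there is one) leads each (index, id) group.
long_rows.sort(key=lambda r: (r[0], r[1], -r[2]))

# Keep the first record of each (index, id) group, and only if its x is 1.
kept = []
previous_group = None
for index, id_value, x in long_rows:
    if (index, id_value) != previous_group and x == 1:
        kept.append((index, id_value))
    previous_group = (index, id_value)

unique_per_index = Counter(index for index, _ in kept)
print(dict(unique_per_index))  # {1: 3, 2: 4}
```

On the sample data this keeps ids 1, 2 and 3 for the first pair and ids 1 to 4 for the second.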

Carlos Renato (www.estatistico.org)

Re: Writing syntax for 500 times

Dear friend,

Can you explain more about your objective? Do you want to generate an output
table with the unique cases for each variable, or to create a variable that
identifies them for each one?

It's easy to do, but I need more information.

Carlos Renato
Statistician
Recife - PE - Brazil

Carlos Renato (www.estatistico.org)

Re: Writing syntax for 500 times

In reply to this post by John Watson-12

Dear Friend

Try this and let me know how it goes.

DEFINE macrorun500 (!POSITIONAL !TOKENS(1))

!DO !Var= 1 !TO !1.
/* First step.
USE ALL.
COMPUTE filter_$=(!CONCAT('X',!Var)=1).
VARIABLE LABEL filter_$ 'x1=1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
USE ALL.

COMPUTE !CONCAT('FILTER_',!Var)=filter_$.
EXECUTE.

/*Second step.

* Identify Duplicate Cases.
SORT CASES BY !CONCAT('ID',!Var) (A) .
MATCH FILES /FILE = * /BY !CONCAT('ID',!Var)
/FIRST = PrimaryFirst /LAST = PrimaryLast.
DO IF (PrimaryFirst).
COMPUTE MatchSequence = 1 - PrimaryLast.
ELSE.
COMPUTE MatchSequence = MatchSequence + 1.
END IF.
LEAVE MatchSequence.
FORMAT MatchSequence (f7).
COMPUTE InDupGrp = MatchSequence > 0.
SORT CASES InDupGrp(D).
MATCH FILES /FILE = * /DROP = PrimaryFirst InDupGrp MatchSequence.
VARIABLE LABELS PrimaryLast 'Indicator of each last matching case as Primary' .
VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
VARIABLE LEVEL PrimaryLast (ORDINAL).
FREQUENCIES VARIABLES = PrimaryLast .
EXECUTE.

RENAME VARIABLES (PrimaryLast=!CONCAT('IDENT_DUPLICATES_ID',!Var)).
EXECUTE.

!DOEND.
!ENDDEFINE.

MacroRun500 3.

Carlos Renato
Statistician
Brazil
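
The loop-per-pair idea behind a macro like this can be sketched in a few lines of Python, which may help when checking what each pass is supposed to produce. This is not a translation of the macro: it only assumes that, for pair k, the quantity of interest is the number of distinct id values among cases with x = 1. The tuple layout and the names `wide`, `N_PAIRS` and `unique_counts` are illustrative.

```python
# Sample data from the original post, one tuple per case: (ID1, X1, ID2, X2).
wide = [
    (1, 1, 1, 0),
    (1, 0, 1, 1),
    (2, 1, 2, 1),
    (3, 1, 3, 1),
    (4, 0, 4, 1),
    (4, 0, 4, 0),
]
N_PAIRS = 2  # 500 in the real file

unique_counts = {}
for k in range(1, N_PAIRS + 1):
    id_col = 2 * (k - 1)   # position of ID<k> in each tuple
    x_col = id_col + 1     # position of X<k> in each tuple
    # Distinct ids among the cases selected by X<k> = 1.
    ids_with_x1 = {case[id_col] for case in wide if case[x_col] == 1}
    unique_counts[k] = len(ids_with_x1)

print(unique_counts)  # {1: 3, 2: 4}
```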

Art Kendall

Re: Writing syntax for 500 times

I have not done macros in some time, but it looks like you are trying to
find duplicate cases on each of 500 variables, which does not make sense.

Or it seems that you are trying to find duplicates within 500 ID numbers.
There would be no need to use a macro for this.

Please explain in more detail what it is you want to do.

Art

Art Kendall
Social Research Consultants

John Watson-12

Re: Writing syntax for 500 times

Thanks, all, for the great support. With great input from Albert and Richard, I got the problem solved.

Richard Ristow

Re: Writing syntax for 500 times

In reply to this post by John Watson-12

Discussion of this problem moved off-list; here's how it was finally
resolved. It's of interest partly because the problem initially
appeared to need a code loop, but was actually solved more easily
using SPSS's own basic loop.

Test data for the problem eventually solved looked like this, notably
different from that originally posted:
|-----------------------------|---------------------------|
|Output Created |30-AUG-2008 13:56:33 |
|-----------------------------|---------------------------|
[TestData]
id Index1 id_check trans2

1 id_1 1 0
1 id_2 1 0
1 id_3 1 0
2 id_1 2 1
2 id_2 2 1
2 id_3 2 1
3 id_1 3 1
3 id_2 3 1
3 id_3 3 1
4 id_1 4 1
4 id_2 4 0
4 id_3 4 0
5 id_1 5 0
5 id_2 5 0
5 id_3 5 0
6 id_1 6 1
6 id_2 6 0
6 id_3 6 1
7 id_1 7 0
7 id_2 7 1
7 id_3 7 0
8 id_1 8 1
8 id_2 8 1
8 id_3 8 1
9 id_1 9 0
9 id_2 9 0
9 id_3 9 1
10 id_1 10 1
10 id_2 10 0
10 id_3 10 1

Number of cases read: 30 Number of cases listed: 30

and the stated requirement was,

>Below is the syntax that give me the count of unique cases that I am
>looking for. I have 499 similar sets for which I will have to repeat
>the following syntax 499 times. [Syntax reformatted and simplified]

* Identify Duplicate Cases.
***trans2 is x variable.

select if trans2=1.

SORT CASES BY id_check(A) .
MATCH FILES
/FILE = *
/BY id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

FREQUENCIES VARIABLES = PrimaryLast .
...........................
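
As a cross-check on the syntax above, the single-pass logic (select trans2 = 1, sort by id_check, flag the last case of each id_check group as primary, then tabulate the flag) can be sketched in Python on the same test data. The dictionary `TRANS2` is just a compact restatement of the listing (the three trans2 values for id_1 to id_3 of each id); that name and the others in the block are illustrative.

```python
from collections import Counter

# trans2 values for Index1 = id_1, id_2, id_3, keyed by id (from the test data).
TRANS2 = {
    1: (0, 0, 0), 2: (1, 1, 1), 3: (1, 1, 1), 4: (1, 0, 0), 5: (0, 0, 0),
    6: (1, 0, 1), 7: (0, 1, 0), 8: (1, 1, 1), 9: (0, 0, 1), 10: (1, 0, 1),
}
rows = [
    (case_id, f"id_{k}", trans2)
    for case_id, values in TRANS2.items()
    for k, trans2 in enumerate(values, start=1)
]

# select if trans2 = 1, then sort by id (the sort is stable).
selected = [r for r in rows if r[2] == 1]
selected.sort(key=lambda r: r[0])

# Equivalent of /LAST = PrimaryLast: 1 on the last case of each id group.
flags = []
for pos, row in enumerate(selected):
    is_last = pos == len(selected) - 1 or selected[pos + 1][0] != row[0]
    flags.append(int(is_last))

freq = Counter(flags)
print(freq[0], freq[1])  # 8 8
```

The counts (8 duplicate cases, 8 primary cases out of 16) agree with the FREQUENCIES table in Appendix I.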
Rather than running some version of this code 500 times, the solution
was to group the records by "trans2" and identify duplicates within
groups - 'using SPSS's basic loop' through the records in a file, in
order. The code is very little changed, though note that the
FREQUENCIES is replaced by CROSSTABS to report the duplicate counts
by "trans2" group:

* Identify Duplicate Cases.
***trans2 is x variable.
. SELECT IF /* REVISED */
NOT MISSING (trans2) /* REVISED */.

SORT CASES BY trans2 (A) /* "trans2" ADDED */
id_check(A) .
MATCH FILES
/FILE = *
/BY trans2 /* "trans2" ADDED */
id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

VARIABLE LEVEL PrimaryLast (ORDINAL).

CROSSTABS
/TABLES= trans2 BY PrimaryLast
/FORMAT= AVALUE TABLES
/CELLS= COUNT ROW
/COUNT ROUND CELL .
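
The grouped version can be sketched the same way: sort once by (trans2, id_check), treat each distinct pair as a group, and count one primary case plus size-minus-one duplicates per group. This is a Python illustration under the same assumptions as the test data listing, not part of the SPSS solution; `TRANS2`, `group_key` and `crosstab` are illustrative names.

```python
from collections import Counter
from itertools import groupby

# trans2 values for Index1 = id_1, id_2, id_3, keyed by id (from the test data).
TRANS2 = {
    1: (0, 0, 0), 2: (1, 1, 1), 3: (1, 1, 1), 4: (1, 0, 0), 5: (0, 0, 0),
    6: (1, 0, 1), 7: (0, 1, 0), 8: (1, 1, 1), 9: (0, 0, 1), 10: (1, 0, 1),
}
rows = [(case_id, trans2) for case_id, values in TRANS2.items() for trans2 in values]


def group_key(row):
    """Sort and group key: trans2 first, then id, as in SORT CASES BY trans2 id_check."""
    case_id, trans2 = row
    return (trans2, case_id)


# crosstab[(trans2, flag)] counts cases; flag 1 = primary, 0 = duplicate.
crosstab = Counter()
for (trans2, _), group in groupby(sorted(rows, key=group_key), key=group_key):
    size = len(list(group))
    crosstab[(trans2, 1)] += 1         # last case of the group is the primary
    crosstab[(trans2, 0)] += size - 1  # every other case is a duplicate

for trans2 in (0, 1):
    print(trans2, crosstab[(trans2, 0)], crosstab[(trans2, 1)])
```

One pass over the sorted data yields the counts for every trans2 value at once, which is why no explicit loop over the 500 sets is needed.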
============================
APPENDIX I: Output from run
============================
* ................................................................. .
* ..... VERSION 3A: Run with one value of 'trans2' per pass ..... .
* (For this run, trans2=1) .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY Original WINDOW=FRONT.
DATASET ACTIVATE Original.
* Identify Duplicate Cases.

***trans2 is x variable.

select if trans2=1.

SORT CASES BY id_check(A) .
MATCH FILES
/FILE = *
/BY id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

. /**/ LIST /*-*/.

List
|-----------------------------|---------------------------|
|Output Created |30-AUG-2008 13:56:35 |
|-----------------------------|---------------------------|
[Original]

id Index1 id_check trans2 PrimaryLast

2 id_1 2 1 0
2 id_2 2 1 0
2 id_3 2 1 1
3 id_1 3 1 0
3 id_2 3 1 0
3 id_3 3 1 1
4 id_1 4 1 1
6 id_1 6 1 0
6 id_3 6 1 1
7 id_2 7 1 1
8 id_1 8 1 0
8 id_2 8 1 0
8 id_3 8 1 1
9 id_3 9 1 1
10 id_1 10 1 0
10 id_3 10 1 1

Number of cases read: 16 Number of cases listed: 16

FREQUENCIES VARIABLES = PrimaryLast .

Frequencies
|-----------------------------|---------------------------|
|Output Created |30-AUG-2008 13:56:36 |
|-----------------------------|---------------------------|
[Original]

Statistics [suppressed -- no descriptives or missing data]

PrimaryLast Indicator of each last matching case as Primary
|-----|---------------|---------|-------|-------------|---------------|
| | |Frequency|Percent|Valid Percent|Cumulative |
| | | | | |Percent |
|-----|---------------|---------|-------|-------------|---------------|
|Valid|0 Duplicate |8 |50.0 |50.0 |50.0 |
| |Case | | | | |
| |---------------|---------|-------|-------------|---------------|
| |1 Primary Case|8 |50.0 |50.0 |100.0 |
| |---------------|---------|-------|-------------|---------------|
| |Total |16 |100.0 |100.0 | |
|-----|---------------|---------|-------|-------------|---------------|

* ................................................................. .
* ..... VERSION 2B: Run with all values of 'trans2' in one ..... .
* pass .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY NewLogic WINDOW=FRONT.
DATASET ACTIVATE NewLogic.
* Identify Duplicate Cases.
***trans2 is x variable.
. SELECT IF /* REVISED */
NOT MISSING (trans2) /* REVISED */.

SORT CASES BY trans2 (A) /* "trans2" ADDED */
id_check(A) .
MATCH FILES
/FILE = *
/BY trans2 /* "trans2" ADDED */
id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

VARIABLE LEVEL PrimaryLast (ORDINAL).

. /**/ LIST /*-*/.

List
|-----------------------------|---------------------------|
|Output Created |30-AUG-2008 13:56:38 |
|-----------------------------|---------------------------|
[NewLogic]

id Index1 id_check trans2 PrimaryLast

1 id_1 1 0 0
1 id_2 1 0 0
1 id_3 1 0 1
4 id_2 4 0 0
4 id_3 4 0 1
5 id_1 5 0 0
5 id_2 5 0 0
5 id_3 5 0 1
6 id_2 6 0 1
7 id_1 7 0 0
7 id_3 7 0 1
9 id_1 9 0 0
9 id_2 9 0 1
10 id_2 10 0 1
2 id_1 2 1 0
2 id_2 2 1 0
2 id_3 2 1 1
3 id_1 3 1 0
3 id_2 3 1 0
3 id_3 3 1 1
4 id_1 4 1 1
6 id_1 6 1 0
6 id_3 6 1 1
7 id_2 7 1 1
8 id_1 8 1 0
8 id_2 8 1 0
8 id_3 8 1 1
9 id_3 9 1 1
10 id_1 10 1 0
10 id_3 10 1 1

Number of cases read: 30 Number of cases listed: 30

CROSSTABS /* Replaces previous FREQUENCIES REVISED */
/TABLES= trans2 BY PrimaryLast
/FORMAT= AVALUE TABLES
/CELLS= COUNT ROW
/COUNT ROUND CELL .

Crosstabs
|-----------------------------|---------------------------|
|Output Created |30-AUG-2008 13:56:38 |
|-----------------------------|---------------------------|
[NewLogic]

Case Processing Summary [suppressed - no missing cases]

trans2
* PrimaryLast Indicator of each last matching case as Primary Crosstabulation
|------|-|---------------|-------------------------------|------|
| | | |PrimaryLast Indicator of each |Total |
| | | |last matching case as Primary | |
| | | |---------------|---------------| |
| | | |0 Duplicate |1 Primary Case| |
| | | |Case | | |
|------|-|---------------|---------------|---------------|------|
|trans2|0|Count |7 |7 |14 |
| | |% within trans2|50.0% |50.0% |100.0%|
| |-|---------------|---------------|---------------|------|
| |1|Count |8 |8 |16 |
| | |% within trans2|50.0% |50.0% |100.0%|
|------|-|---------------|---------------|---------------|------|
|Total |Count |15 |15 |30 |
| |% within trans2|50.0% |50.0% |100.0%|
|--------|---------------|---------------|---------------|------|
=============================
APPENDIX II: Test data & code
=============================
* C:\Documents and Settings\Richard\My Documents .
* \Technical\spssx-l\Z-2008c .
* \2008-08-23 Watson - Writing syntax for 500 times-V3.SPS .

DATA LIST LIST /
id Index1 id_check trans2
(F2, A4, F3, F3).
BEGIN DATA
1 id_1 1 0
1 id_2 1 0
1 id_3 1 0
2 id_1 2 1
2 id_2 2 1
2 id_3 2 1
3 id_1 3 1
3 id_2 3 1
3 id_3 3 1
4 id_1 4 1
4 id_2 4 0
4 id_3 4 0
5 id_1 5 0
5 id_2 5 0
5 id_3 5 0
6 id_1 6 1
6 id_2 6 0
6 id_3 6 1
7 id_1 7 0
7 id_2 7 1
7 id_3 7 0
8 id_1 8 1
8 id_2 8 1
8 id_3 8 1
9 id_1 9 0
9 id_2 9 0
9 id_3 9 1
10 id_1 10 1
10 id_2 10 0
10 id_3 10 1
END DATA.
DATASET NAME TestData WINDOW=FRONT.
LIST.

* ................................................................. .
* ..... VERSION 3A: Run with one value of 'trans2' per pass ..... .
* (For this run, trans2=1) .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY Original WINDOW=FRONT.
DATASET ACTIVATE Original.

* Identify Duplicate Cases.

***trans2 is x variable.

select if trans2=1.

SORT CASES BY id_check(A) .
MATCH FILES
/FILE = *
/BY id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

. /**/ LIST /*-*/.

FREQUENCIES VARIABLES = PrimaryLast .

* ................................................................. .
* ..... VERSION 2B: Run with all values of 'trans2' in one ..... .
* pass .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY NewLogic WINDOW=FRONT.
DATASET ACTIVATE NewLogic.

* Identify Duplicate Cases.
***trans2 is x variable.
. SELECT IF /* REVISED */
NOT MISSING (trans2) /* REVISED */.

SORT CASES BY trans2 (A) /* "trans2" ADDED */
id_check(A) .
MATCH FILES
/FILE = *
/BY trans2 /* "trans2" ADDED */
id_check
/LAST = PrimaryLast.

VARIABLE LABELS PrimaryLast
'Indicator of each last matching case as Primary' .

VALUE LABELS PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
FORMATS PrimaryLast (F2).

VARIABLE LEVEL PrimaryLast (ORDINAL).

. /**/ LIST /*-*/.

CROSSTABS /* Replaces previous FREQUENCIES REVISED */
/TABLES= trans2 BY PrimaryLast
/FORMAT= AVALUE TABLES
/CELLS= COUNT ROW
/COUNT ROUND CELL .
