spssinc compare datasets


Cathie Atkinson
I guess I need some help with this. I'm trying to run the compare extension command, but I can't get it to work. The syntax is below, and the error message I get is:
Warning: the following requested variables to be compared are not present in both datasets and will be omitted.
result

However, result is definitely in both files, so I'm not sure what I'm doing wrong. Also, no log file is generated, so nothing is compared. I have regular syntax to do this, but if I can figure out how this works it might be simpler.

spssinc compare datasets ds1 = x ds2 = y VARIABLES=subject CollectionDate result /DICTIONARY none
/DATA   logfile = 'x:\diflog'.

The syntax below is what works. What I really need to do is find differences in result for cases that match on subject, CollectionDate, Assay, label, and SampleNo (OldFile is redundant). I tried including these in the extension syntax, but you can only enter one ID, so I'm clearly not getting the logic. I thought that if I could see an example I would better understand what the extension command does, but I can't find much information. Thanks so much! - Cathie


* Stack the two files; OldFile = 1 marks the cases that came from x.
ADD FILES FILE = x /IN = OldFile /FILE = y.
DATASET NAME z.

* Ascending sort puts each y case (OldFile = 0) just before the matching x case.
SORT CASES BY subject(A) CollectionDate(A) Assay(A) label(A) SampleNo(A) OldFile(A).
DO IF (OldFile = 1) AND LAG(OldFile) = 0.
    IF result NE LAG(result) bad = 1.
END IF.

* Reverse the sort so the flag can be copied to the other case of each pair.
SORT CASES BY subject(D) CollectionDate(D) Assay(D) label(D) SampleNo(D) OldFile(D).
EXECUTE.
DO IF (OldFile = 0) AND LAG(OldFile) = 1.
    IF LAG(bad) = 1 bad = 1.
END IF.

* Export only the flagged cases to Excel.
TEMPORARY.
SELECT IF bad = 1.
SAVE TRANSLATE
    /CONNECT='DSN=Excel Files;DBQ=x:\data\fsh1\data\system\corelab compare.xls;DriverId=790;MaxBufferSize=2048;PageTimeout=5;'
    /TABLE="check"
    /KEEP = subject CollectionDate Assay label SampleNo result OldFile
    /TYPE=ODBC
    /REPLACE.

EXECUTE.



Cathie Atkinson, PhD
Clinical Psychologist
Institute for Behavioral Medicine Research
The Ohio State University
(614) 292-0033
[hidden email]


Re: spssinc compare datasets

Peck, Jon

One definite problem is that an ID variable present in both datasets is required when comparing cases.  So you need something like

/DATA id=idvar

 

If you don't have an ID variable, you can just compute one in each dataset with

COMPUTE ID = $casenum.
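
Since the ID has to exist in both datasets, the COMPUTE would be run once per dataset. The following is a minimal sketch, assuming the dataset names x and y from the original post and a new variable named `id` (a name chosen here for illustration). Note that $CASENUM is just the row position, so the IDs only pair up the right cases when both datasets hold the same cases in the same order. The sort on the matching variables from the original post is meant to help with that, but it does not help if either dataset has extra or missing cases.

```spss
* Sketch only: x and y are the dataset names used in this thread.
* Sort both datasets the same way first so that case numbers line up.
DATASET ACTIVATE x.
SORT CASES BY subject CollectionDate Assay label SampleNo.
COMPUTE id = $CASENUM.
EXECUTE.

DATASET ACTIVATE y.
SORT CASES BY subject CollectionDate Assay label SampleNo.
COMPUTE id = $CASENUM.
EXECUTE.
```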

 

The output you are requesting, once it runs, will be a detailed text log of all the differences.  You might also want a variable or variables in the dataset that flag differences.  You can add something like

DIFFCOUNT = differences

to the DATA subcommand to create a variable that counts the number of differences in each case.

 

The error message is issued before it gets that far, however.  My first guess is that the variable name "result" does not match the case in your dataset(s).  Variable names are case sensitive for this command, so if your variable is named RESULT in SPSS, you must write it in capitals in this command.
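
Putting these points together, the original call might be corrected along the lines of the sketch below. It assumes an ID variable named `id` already exists in both datasets, and that the variable is stored as `Result`. That capitalization is only a guess for illustration, so copy the exact spelling from Variable View. `differences` is just an example name for the DIFFCOUNT variable.

```spss
* Sketch only: Result stands for however the variable is actually spelled.
* The id variable must already exist in both x and y.
SPSSINC COMPARE DATASETS DS1=x DS2=y VARIABLES=subject CollectionDate Result
    /DICTIONARY NONE
    /DATA ID=id DIFFCOUNT=differences LOGFILE='x:\diflog'.
```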

 

HTH,

Jon Peck

 


(Reply to Cathie Atkinson's SPSSX-L post "spssinc compare datasets", sent Friday, February 20, 2009, 9:34 AM.)

 


Re: spssinc compare datasets

Cathie Atkinson
Thanks, Jon! The case was indeed an issue, and with the addition of the $casenum ID, I have a log file. I'll play with it a bit to see what other features would be useful.


(Reply to Jon Peck's message of 2/20/2009, 11:43 AM.)