SPSSX Discussion

Estimating and Adjusting Cutoff Scores

Johnny Amora

Estimating and Adjusting Cutoff Scores

Hi all,
(My apologies for cross-posting.)

Can you recommend a more recent article that covers ground similar to this paper?

Mills and Melican (1988). Estimating and Adjusting Cutoff Scores: Features of Selected Methods. Applied Measurement in Education, 1(3), 261-275.

Thank you.
Johnny

Albert-Jan Roskam

comparing two datasets

Hi all,

I need to compare two datasets, D1 and D2. These sets are the result of two linkage strategies, so they're each comprised of two datasets, A and B. A and B each have two id variables.

I want to evaluate to what extent the two linkage strategies have led to different linkage pairs. In other words: what is the intersection between D1 and D2, what is the difference, and what do the differences look like.

I've been thinking about using either UPDATE or MATCH FILES for this, something like:
UPDATE FILE = d1 / RENAME (id_b = id_b_d1) / FILE = d2 / RENAME (id_b = id_b_d2) / BY = id_a .
COMPUTE intersection = ( CONCAT (id_a, id_b_d1) = CONCAT (id_a, id_b_d2) ).

Does this make sense? I'm getting kinda confused.

Oh, and unfortunately I can't use SPSSINC COMPAREDATASETS. :-( *grinding his teeth* ;-)

Cheers!!
Albert-Jan

Clive Downs

Re: comparing two datasets

Hi Albert-Jan,

Can you post small examples of both datasets, please, as I can't quite
envisage what you describe.

.. meanwhile, have you considered using Access for this, using standard
queries to select intersecting and non-intersecting records?

Thanks

Regards

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Handel, Richard W.

expoff.bat

Hi All,

My SPSS 16.0 site license is in the process of being renewed and I'm now
getting the annoying pop-up message about the license expiring soon.
Although the message indicates that running expoff.bat will stop the
message, I cannot locate this file in the spss directory. Also, trying
to search for this file in my whole system yields nothing even when I
include hidden files and folders. Does anyone know how to turn this
pop-up message off?

Thanks,
Rick

Maguin, Eugene

Re: comparing two datasets

In reply to this post by Albert-Jan Roskam

Albert-jan,

I agree with Clive that some sample data from both D1 and D2 would be
helpful. In addition, I'd like to see a definition of the two linkage
strategies. I'm imagining that you are not doing a one-to-one match--as
would be done with a match files command--and have some sort of probability
match. Lastly, I'm guessing that you'd want a summary statistic that is the
proportion of A records that link to the same B record in D1 as in D2. True?
If this is true, then (and without knowing anything else), I think I'd
match (match files) D1 and D2 by A file ID and rename the B file Id in D2 so
that one doesn't overwrite the other. I'm assuming that every A file record is
on both D1 and D2. If so, then a match files will work. If not, then I think
I'd match files D1 and D2 to file A because A is the union of A records in
D1 and D2. Then you could compare the values of the D1 B file id and the D2
B file id.

Gene Maguin
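
The matching idea in the message above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two dictionaries are hypothetical stand-ins for D1 and D2 (A-file ID mapped to B-file ID, using the values from the sample data posted later in this thread), and the function name compare_links is made up for this example. It lines the two files up on the A-file ID, keeps the two B-file IDs side by side, and reports the proportion of A records linked to the same B record in both.

```python
# Hypothetical stand-ins for D1 and D2: A-file ID -> B-file ID.
d1 = {"1": "23", "2": "45", "3": "56", "4": "22", "5": "88", "7": "10", "9": "100"}
d2 = {"1": "23", "2": "45", "3": "56", "4": "1", "5": "99", "6": "88", "7": "10"}


def compare_links(first, second):
    """Return one row per A-file ID found in either file.

    Each row is (id_a, id_b_first, id_b_second, same_link). A missing
    link is represented by None, and same_link is True only when the
    A record is linked to the same B record in both files.
    """
    rows = []
    for id_a in sorted(set(first) | set(second)):
        b_first = first.get(id_a)
        b_second = second.get(id_a)
        same_link = b_first is not None and b_first == b_second
        rows.append((id_a, b_first, b_second, same_link))
    return rows


rows = compare_links(d1, d2)
n_same = sum(1 for row in rows if row[3])
proportion_same = n_same / len(rows)

for row in rows:
    print(row)
print("proportion of A records with the same link:", proportion_same)
```

Taking the union of the A-file IDs plays the role of matching both files to file A, so A records that appear in only one of the two files still show up, with None in the other column.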

Peck, Jon

Re: comparing two datasets

In reply to this post by Albert-Jan Roskam

One reasonably easy strategy would be to read the case IDs of the data into Python, creating a Python set for each. Then you can use the standard set operators to compute all the differences. In particular:

a.intersection(b)

a - b

b - a

a.symmetric_difference(b)

Too bad you can't use SPSSINC COMPARE DATASETS. That means you can't use my fuzzy matching module, FUZZY, to do the linking either. :-(
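
A minimal sketch of the set approach, using only built-in Python sets. The pairs are typed in by hand here (they mirror the sample data posted elsewhere in this thread); reading them from the open SPSS datasets is left out, and the variable names are just illustrative.

```python
# Each link is an (id_a, id_b) tuple, so a whole linked file is a set of pairs.
d1_pairs = {("1", "23"), ("2", "45"), ("3", "56"), ("4", "22"),
            ("5", "88"), ("7", "10"), ("9", "100")}
d2_pairs = {("1", "23"), ("2", "45"), ("3", "56"), ("4", "1"),
            ("5", "99"), ("6", "88"), ("7", "10")}

both = d1_pairs.intersection(d2_pairs)               # pairs found by both strategies
only_d1 = d1_pairs - d2_pairs                        # pairs found only in D1
only_d2 = d2_pairs - d1_pairs                        # pairs found only in D2
disagree = d1_pairs.symmetric_difference(d2_pairs)   # pairs in exactly one file

print("in both:", sorted(both))
print("only in D1:", sorted(only_d1))
print("only in D2:", sorted(only_d2))
print("share of all pairs found by both:", len(both) / len(d1_pairs | d2_pairs))
```

Using the (id_a, id_b) tuple as the set element means two links count as equal only when both IDs agree, which is the same comparison the CONCAT expression in the original question is aiming at.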

SPSS Support

Re: expoff.bat

In reply to this post by Handel, Richard W.

Hello Rick,
The expoff.bat suggestion is out of date for SPSS 16, as noted in the resolution below. SPSS 15 was the last version where that file was included and effective. Renewing the license is the only way to turn off the message in SPSS 16.

David Matheson
SPSS Statistical Support
*****************
Resolution number: 76257 Created on: Mar 10 2008 Last Reviewed on: Jan 12 2009

Problem Subject: Unable to turn off "Your license renewal date has passed" alert -- expoff.bat does not exist in SPSS 16.0.x

Problem Description: When launching SPSS 16.0.x I am getting a message, 'Your license renewal date has passed. SPSS will stop working if a new license is not installed soon. If you don't want to see this message again, run expoff.bat in the SPSS directory'

I have looked in my SPSS program installation directory and I do not see expoff.bat. I am aware that if I click OK on this message that SPSS will continue to function properly until my license expires, however, I would prefer the convenience of turning off this alert. What can I do?

Resolution Subject: At present there is not a way to turn off this alert in SPSS 16.0.x other than updating your license.

Resolution Description:
At present there is not a way to turn off this alert in SPSS 16.0.x other than updating your license. This issue has been filed with SPSS Development and we apologize for the inconvenience.

Albert-Jan Roskam

Re: comparing two datasets (2)

In reply to this post by Maguin, Eugene

Hello all, thanks for responding,

Let me tell you something about the background to the question in my previous post. The two datasets to be compared are the result of two linkage projects. One project used a probabilistic linkage technique, while the other used an (n-1) deterministic technique. The resulting datasets differ both quantitatively and qualitatively. The former refers to the percentage of linked records (probably higher with the probabilistic technique), while the latter refers to the different linkages (pairs) that each technique produces.

Probabilistic linkage is fairly laborious, whereas deterministic linkage is a routine task. We want to know whether the latter is practically as good as the former, despite its slightly lower linkage percentage. One outcome measure we use to evaluate this is mortality. It would be a bad sign, for example, if the mortality in the non-linkages differed from one group to the other. I hope the syntax below illustrates things somewhat.

Basically, the question I would like to answer is: are the outcome measures, especially the mortality rate, technique-independent?

Thanks in advance for your replies!

Best wishes,
Albert-Jan

* sample syntax.
data list free / id_a (a4) id_b (a4) mort_d1 (f1).
begin data
1 23 1
2 45 1
3 56 1
4 22 0
5 88 0
7 10 0
9 100 0
end data.
dataset name d1.

data list free / id_a (a4) id_b (a4) mort_d2 (f1).
begin data
1 23 1
2 45 1
3 56 1
4 1 1
5 99 0
6 88 1
7 10 0
end data.
dataset name d2.

UPDATE FILE = d1 / RENAME (id_b = id_b_d1) / IN = in_d1 / FILE = d2 / RENAME (id_b = id_b_d2) / BY = id_a .
COMPUTE intersection = ( CONCAT (id_a, id_b_d1) = CONCAT (id_a, id_b_d2) ).
variable label mort_d1 'mortality (D1)' / mort_d2 'mortality (D2)'.
value labels intersection 0 'record pair not in both files' 1 'record pair in both files'
/ mort_d1 mort_d2 0 'alive' 1 'dead'
/ in_d1 1 'probabilistically linked data' 0 '(n-1)-deterministically linked data'.
crosstabs mort_d1 by intersection / cells = col.
crosstabs mort_d2 by intersection / cells = col.

dataset close all.
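
For readers who want to check the idea outside SPSS, here is a rough Python sketch of the same question: within each file, is mortality different for pairs that both techniques agree on than for pairs found by only one technique? It is not a translation of the syntax above (it compares whole (id_a, id_b) pairs rather than updating by id_a), the function name mortality_by_overlap is invented for this example, and the records are simply the sample data retyped.

```python
# Records are (id_a, id_b, dead) with dead coded 0 = alive, 1 = dead.
d1 = [("1", "23", 1), ("2", "45", 1), ("3", "56", 1), ("4", "22", 0),
      ("5", "88", 0), ("7", "10", 0), ("9", "100", 0)]
d2 = [("1", "23", 1), ("2", "45", 1), ("3", "56", 1), ("4", "1", 1),
      ("5", "99", 0), ("6", "88", 1), ("7", "10", 0)]


def mortality_by_overlap(own, other):
    """Mortality rate in `own`, split by whether the pair also occurs in `other`.

    Returns a dict with keys "shared" and "unshared"; a rate is None
    when the corresponding group is empty.
    """
    other_pairs = {(id_a, id_b) for id_a, id_b, _ in other}
    groups = {"shared": [], "unshared": []}
    for id_a, id_b, dead in own:
        key = "shared" if (id_a, id_b) in other_pairs else "unshared"
        groups[key].append(dead)
    return {key: (sum(values) / len(values) if values else None)
            for key, values in groups.items()}


rates_d1 = mortality_by_overlap(d1, d2)
rates_d2 = mortality_by_overlap(d2, d1)
print("D1:", rates_d1)
print("D2:", rates_d2)
```

With only seven records per file the rates mean nothing statistically; the point is just the shape of the comparison. If the unshared rates differ a lot between the two files, that would be the "bad sign" described in the message.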

Richard Ristow

Re: comparing two datasets

In reply to this post by Albert-Jan Roskam

Going back to the beginning -- at 06:49 AM 2/6/2009, Albert-jan Roskam wrote:

I need to compare two datasets, D1 and D2. These sets are the result of two linkage strategies, so they're each comprised of two datasets, A and B. A and B each have two id variables.

Right. A and B are not datasets of linked elements; they are datasets of the links themselves.

[I need to know] what is the intersection between D1 and D2, what is the difference, and what do the differences look like.

Supposing that in each file the two IDs are called ID_L and ID_R (for 'left' and 'right' member of the link). Then, what's wrong with (untested)

* Both files must already be sorted by ID_L and ID_R.
MATCH FILES
 /FILE=A /IN=IN_A
 /FILE=B /IN=IN_B
 /BY ID_L ID_R.

Complications that may arise include,
. If a link can occur in both orders -- X as ID_L and Y as ID_R, or Y as ID_L and X as ID_R. The above won't recognize those as the same link. If that's a problem, change all links to 'canonical' order, where ID_L < ID_R.
. Transitive closure: if X is linked to Y and Y is linked to Z, does this mean X is, by definition, linked to Z? If so, those implicit links need to be calculated and inserted as explicit links; that's a bit of a bother.

-Good luck to all,
Richard
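
The two complications above can be sketched in Python. This is an illustration only, and it assumes both members of a link come from the same ID space (which may not hold when one ID is from file A and the other from file B). The function names canonical and transitive_closure are made up for this example; the closure uses a small union-find, which is one common way to do it, not necessarily what anyone on the list had in mind.

```python
def canonical(links):
    """Put every link in canonical order (smaller ID first) and drop duplicates."""
    return {tuple(sorted(pair)) for pair in links}


def transitive_closure(links):
    """Return all pairs implied by treating linkage as transitive.

    IDs are grouped into connected components with union-find, then
    every pair inside a component is emitted in canonical order.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    components = {}
    for member in list(parent):
        components.setdefault(find(member), []).append(member)

    closed = set()
    for members in components.values():
        members.sort()
        for i, first in enumerate(members):
            for second in members[i + 1:]:
                closed.add((first, second))
    return closed


links = {("X", "Y"), ("Y", "X"), ("Y", "Z"), ("P", "Q")}
print(sorted(canonical(links)))
print(sorted(transitive_closure(links)))
```

After canonical ordering, the two link files can be compared as plain sets of pairs. Whether the closure should be applied at all depends on what a link means in the data at hand.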

Art Kendall

Re: comparing two datasets

Is one of these an accurate interpretation?
1)
You have two case-by-case matrices (entities by entities).
The entry in each cell is dichotomous: either the two cases are linked
or they are not.
You wish to compare and contrast the two matrices.
2)
You have two case-by-case matrices (entities by entities).
Each cell is the distance (# of links, Euclidean, etc.) between the cases.
You wish to compare and contrast the two matrices.
3)
You have two hierarchical trees and you wish to compare and contrast them.

Art Kendall
Social Research Consultants

Clive Downs

Re: comparing two datasets

In reply to this post by Albert-Jan Roskam

Hi

I, too, have two datasets that I want to compare to see if they are
identical.

While I can see a way to do this in SPSS syntax (by merging files, adding
variables, and looping thru each pair of variables, testing if they are
the same), "SPSSINC COMPAREDATASETS" sounds as though it might be a smarter
way to do this job.

Is this procedure easily available please?

Thanks

Clive.

Peck, Jon

Re: comparing two datasets

If you have Version 17, you can use the SPSSINC COMPARE DATASETS extension command. If you have Version 16, you can use the COMPDS extension command, which is similar but lacks a few features and does not have a dialog box interface.

These can be downloaded from SPSS Developer Central (www.spss.com/devcentral). They require the Python programmability plug-in. Installation instructions are in the download.

The commands can compare the variable dictionaries and/or the cases in two datasets (you must open the data files and name them in SPSS before calling either of these commands). I must warn you that these commands are slow with wide datasets, so be patient. We are working on improving performance in the underlying Dataset class used by these commands.

HTH,
Jon Peck
