Encrypting question


Encrypting question

Jeff-125
...hard to figure out how to word this one concisely.

I have some data I'm collecting with software other than SPSS, but
after collection it will be analyzed in SPSS.

I have one variable that is a numeric identifier that is somewhat of
a sensitive nature, because it could potentially be used to link back
to a person in the unlikely event of some type of improper disclosure.

I think, but still have to confirm, that the identifier is completely
numeric, but it may contain characters - possibly even a hyphen or similar.

We need this number to match the new data that I'm collecting with
existing data that also contains the number.

What I thought I would do is to have the other software that will be
used for data collection alter the identifier in some known way (it
really doesn't have to be all that complex), and then do the same for
the existing data. Both sets can then be stored with the altered
identifier and merged as necessary, but neither will contain the
original identifier.

As long as the identifier in both data sets is altered using the same
function/algorithm, the merge/match will work fine and anyone (e.g.,
students/staff) working with me will be completely unaware of the
original number.

...but of course, I will need to know the function/algorithm so that
I can repeat the process and/or return the identifier to its original
form if necessary.

This can all be done manually, of course, but I was wondering if
there is some type of function built into SPSS that could do this
more easily, or if there are other standardized methods to
accomplish what I've described. The encrypting algorithm really does
not need to be all that complex since the data won't be used
publicly, but complexity wouldn't hurt as long as I can easily
program the procedure into different software types outside of SPSS.

Ideas?

Thanks

Jeff

Re: Encrypting question

Albert-Jan Roskam
Hi,

I asked myself a similar question recently, when I
heard about time-consuming procedures with Trusted
Third Parties (TTPs) at my work. Using some code I
found on spss-l I came up with this syntax. I never
got to use it, so please test it.

** generate some fake secret numbers.
set seed = 5463711.
input program.
numeric secret (N12).
+ loop #i = 1 to 10E3.
+ compute secret = trunc(rv.uniform(10E0,10E10)).
+ end case.
+ end loop.
+ end file.
end input program.

** The actual program.
set seed = 5463711.
numeric encrypted (N12).
compute encrypted = trunc(rv.uniform(10E0,10E10)).
** This file is highly confidential and can be used to
link anonymized and unanonymized data.
save outfile = 'd:\temp\master.sav'
    / keep = secret encrypted.
** This file is 'public' = anonymized.
sort cases by encrypted (a).
* save outfile = 'd:\temp\encrypted.sav' / drop =
secret.

** Verify that the anonymized key contains no duplicates.
get file = 'd:\temp\master.sav'.
compute dummy = 1.
aggregate outfile = * / break = encrypted / dummy =
sum (dummy) / n = n.
* dummy and n are both case counts per key and each should equal 1 if no key is duplicated.
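
The same master-file idea can be sketched outside SPSS as well. The following is a rough Python version using only the standard library, not a tested production routine. Each distinct identifier gets one random pseudonym that is stored in a lookup table, so the result does not depend on the order of the cases. The function names `build_crosswalk` and `pseudonymize` and the example identifiers are made up for illustration; the crosswalk plays the role of the confidential master file.

```python
import secrets

def build_crosswalk(ids, crosswalk=None, digits=12):
    """Assign each distinct identifier a random, unique numeric pseudonym.

    `crosswalk` maps original id -> pseudonym; pass in an existing one to
    extend it when new data arrive. It must be stored as securely as the
    original identifiers, because it reverses the pseudonyms.
    """
    crosswalk = dict(crosswalk or {})
    used = set(crosswalk.values())
    for original in ids:
        if original in crosswalk:
            continue  # already has a pseudonym, keep it stable
        while True:
            candidate = secrets.randbelow(10 ** digits)
            if candidate not in used:
                break
        crosswalk[original] = candidate
        used.add(candidate)
    return crosswalk

def pseudonymize(ids, crosswalk):
    """Replace identifiers by their pseudonyms (the 'public' key)."""
    return [crosswalk[i] for i in ids]

# Hypothetical identifiers from an existing file and a new collection.
existing = ["1001-7", "2002-4"]
new = ["2002-4", "3003-9"]

master = build_crosswalk(existing)
master = build_crosswalk(new, master)   # extend with the new collection

# Reverse lookup, only for whoever holds the master file.
reverse = {pseudo: original for original, pseudo in master.items()}
```

Because the pseudonyms are random, the only way back to the original number is the stored crosswalk, which is the trade-off of this approach compared with a reversible algorithm.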

Btw, has anybody ever tried to implement RSA encryption
in SPSS? Based on some web article, I came up with
this, but computational limitations pose a problem.

** generate some fake secret numbers.
set seed = 5463711.
input program.
numeric secret (N12).
+ loop #i = 1 to 10E3.
+ compute secret = trunc(rv.uniform(10E0,10E10)).
+ end case.
+ end loop.
+ end file.
end input program.

** Encrypt variable 'secret' according to RSA method.
compute #prime1 = 10000000019.
compute #prime2 = 10000000033.
compute #public = #prime1 * #prime2.
compute #prime12 = (#prime1 - 1) * (#prime2 - 1).
compute encrypt = (secret**#prime12) *
(mod(#public,1)).
compute decrypt = (encrypt**#public) * (mod
(#public,1)).
exe.
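
For comparison, here is a toy sketch of textbook RSA in Python (3.8 or later, for the modular inverse via `pow`). The key step is modular exponentiation: the power is reduced modulo n as it is computed, which Python's three-argument `pow` does with exact integers, so nothing overflows. The tiny primes and the exponent 17 are illustrative assumptions only; unpadded RSA with keys this small offers no real protection.

```python
# Toy textbook RSA; all numbers are illustrative assumptions.
p, q = 61, 53                 # two small primes (far too small for real use)
n = p * q                     # public modulus, 3233
phi = (p - 1) * (q - 1)       # Euler's totient of n, 3120
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent: inverse of e modulo phi

def encrypt(m):
    """Deterministic: the same m always gives the same ciphertext."""
    if not 0 <= m < n:
        raise ValueError("message must be smaller than the modulus")
    return pow(m, e, n)       # modular exponentiation with exact integers

def decrypt(c):
    """Recover the original number using the private exponent."""
    return pow(c, d, n)

secret = 1234
encrypted = encrypt(secret)
assert decrypt(encrypted) == secret
```

An identifier has to be smaller than the modulus, so a 12-digit number would need a much larger n than this toy example uses.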

Cheers!!
Albert-Jan





Re: Encrypting question

ViAnn Beadle
This algorithm is sensitive to case order, so it is not a good way to create
an ID that has to be matched across files (if I understand the original
question). I think the solution sought must be some way to transform an
existing ID in a manner that can be repeated across files. Perhaps the OP is
looking for a way to obfuscate a social security number so that it doesn't
look like a social security number?
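
One repeatable, order-independent transformation is a keyed hash. Below is a minimal sketch in Python. It assumes the identifier can be treated as a string and that stripping everything except letters and digits is an acceptable normalization; both are assumptions, as are the function name `pseudonym` and the example identifiers. A keyed hash cannot be reversed, so getting back to the original number would still require a separately secured lookup table.

```python
import hashlib
import hmac

def pseudonym(identifier, key, length=16):
    """Map an identifier to a stable pseudonym using HMAC-SHA256.

    The same identifier and key always give the same result, whatever
    file or row order it appears in. `key` is a secret byte string.
    """
    # Normalize so '1001-7' and '10017' match (an assumed rule).
    cleaned = "".join(ch for ch in str(identifier) if ch.isalnum()).upper()
    digest = hmac.new(key, cleaned.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]    # truncated for convenience; longer means fewer collisions

key = b"replace-with-a-long-random-secret"   # placeholder; keep out of the data files

# Hypothetical records keyed by the sensitive identifier.
file_a = {"1001-7": "existing record"}
file_b = {"10017": "new record"}

keyed_a = {pseudonym(i, key): row for i, row in file_a.items()}
keyed_b = {pseudonym(i, key): row for i, row in file_b.items()}
matched = keyed_a.keys() & keyed_b.keys()    # pseudonyms present in both files
```

Anyone who has the key can recompute pseudonyms for guessed identifiers, so the key needs the same protection as the original numbers.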


Re: Encrypting question

Peck, Jon
In reply to this post by Jeff-125
The best way to do this would be to encrypt the identifier using public key cryptography.  The algorithm is deterministic, so you can encrypt new IDs as they appear and match against anything you already have.  Decrypting would require the secret key.
 
If you can use SPSS Python programmability, you can take advantage of the Python cryptography modules.  ezPyCrypto would probably be a good thing to look at.
(http://www.freenet.org.nz/ezPyCrypto)
 
I haven't used these myself, but ezPyCrypto is supposed to be, well, easy.  With just a few lines of code, you are ready to go.  You could hook this up with the trans module from SPSS Developer Central, which would apply the encryption function as a transformation in SPSS.
 
HTH,
Jon Peck
SPSS


Dyadic Analysis

Pirritano, Matthew
Hello all,

Is anyone out there familiar with dyadic analysis?  I'm reading the book by Kenny, Kashy, and Cook, "Dyadic Data Analysis", and I'm still unclear on what my dependent variable should be in my particular dyadic analysis. My analysis involves a between-dyads variable (4 groups of coping styles) and a within-dyads variable (biological sex).

There are 4 groups of couple coping style situations (between-dyads variable):

1) Both partners are high on a coping style. They use it a lot.

2) Both partners are low.

3) Males are high and females are low

4) Females are high and males are low


Kenny et al. state what your DV should be in the case of a between-dyads analysis, or a within-dyads analysis, and they give an example of the case in which you have one within and one between dyads variable, but they do not state what the DV should be in that case. Should it be an average across partners, or a difference, or something else?

Any help would be much appreciated.

Thanks
Matt

Matthew Pirritano, Ph.D.
Assistant Professor of Psychology
Smith Hall 116C
Chapman University
Department of Psychology
One University Drive
Orange, CA 92866
Telephone (714)744-7940
FAX (714)997-6780




Re: Dyadic Analysis

Dale Glaser
Matthew, one example in which I have conducted a dyadic analysis is a marital satisfaction survey where both husband and wife responded to the same survey, so that spouse is nested within couple/dyad. In that case the DV is a summated scale score from the survey. It has been a while since I did this analysis, but since n(i) = 2, in a slopes-and-intercepts model you aren't going to have much variation at the couple (j, i.e., macro) level. So (and please, someone on this list correct me) I believe you would only be interested in the stochastic parameter for the intercepts and not necessarily the slopes. In fact, to make sure I wasn't a victim of poor recall, I just checked p. 89 in Kenny et al., and indeed "slopes must be constrained to be equal across all dyads", but the stochastic parameter for the intercepts is of interest.

  Hope this helps. Dale

Dale Glaser, Ph.D.
Principal--Glaser Consulting
Lecturer/Adjunct Faculty--SDSU/USD/AIU
President-Elect, San Diego Chapter of
American Statistical Association
3115 4th Avenue
San Diego, CA 92103
phone: 619-220-0602
fax: 619-220-0412
email: [hidden email]
website: www.glaserconsult.com

Dyadic Analysis

Pirritano, Matthew
In reply to this post by Pirritano, Matthew
Dale and everybody,

I wound up doing two separate regressions: one with the difference
score on the outcome measure as the DV and one with the sum as the DV.
This is what my reading of Kenny, Kashy, and Cook (2006) recommends. The
regression on the difference score provides evidence of the
within-dyads effect (sex in my case), and each of the effect-coded
independent variables (three variables for my between-dyads variable,
which has 4 levels) provides information on whether the within-dyads
variable interacts with the grouping variable.
I'm pretty confident about this.
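
The data preparation for the two regressions might look roughly like this in Python. It is only a sketch with made-up scores. The variable names, the male-minus-female direction of the difference, and the choice of group 4 as the reference level for effect coding are all assumptions rather than something taken from Kenny, Kashy, and Cook.

```python
# One record per dyad: coping group (1-4) and each partner's outcome score.
# All values are invented for illustration.
dyads = [
    {"group": 1, "male": 12.0, "female": 10.0},
    {"group": 2, "male": 7.0, "female": 9.0},
    {"group": 3, "male": 11.0, "female": 6.0},
    {"group": 4, "male": 5.0, "female": 10.0},
]

def effect_codes(group, levels=(1, 2, 3, 4)):
    """Three effect-coded predictors for a 4-level factor.

    The last level is the reference and is coded -1 on every predictor.
    """
    reference = levels[-1]
    if group == reference:
        return [-1] * (len(levels) - 1)
    return [1 if group == level else 0 for level in levels[:-1]]

rows = []
for dyad in dyads:
    rows.append({
        "sum": dyad["male"] + dyad["female"],    # DV for the regression on the sum
        "diff": dyad["male"] - dyad["female"],   # DV for the regression on the difference
        "codes": effect_codes(dyad["group"]),    # predictors used in both regressions
    })
```

Each row then holds one dyad's sum score, difference score, and three effect-coded predictors, which can be passed to any regression routine.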

Any thoughts?

Thanks,
Matt

Matthew Pirritano, Ph.D.
Assistant Professor of Psychology
Smith Hall 116C
Chapman University
Department of Psychology
One University Drive
Orange, CA 92866
Telephone (714)744-7940
FAX (714)997-6780


Dyadic Analysis

Mcduff Pierre
In reply to this post by Pirritano, Matthew
I'm not sure that you need 2 separate regressions.
From the example in the following article,
you should be able to do it with a mixed model:
Campbell, L., & Kashy, D. A. (2002).
Estimating actor, partner, and interaction effects for dyadic
data using PROC MIXED and HLM: A user-friendly guide.
Personal Relationships, pp. 327-342.
 
Hope it helps
 

Re: Dyadic Analysis

paraplu
Dear friends,

I have lots of doubts about this dyadic analysis, but I'd like your help with at least two of them. The first is that I have a condition (aggressive versus non-aggressive behavior) that I assigned randomly per person within each dyad, so I now have three types of dyad: one with both members in the aggressive condition, one with both members in the non-aggressive condition, and one with one member in the aggressive condition and the other in the non-aggressive condition. I coded the first type as 1, 1, the second as -1, -1, and the third as 1, -1. When I computed the dyad mean of each variable, this condition (my IV) became 1, -1, 0, and when I computed the difference it became 0, 0, 2. How should I now interpret a positive or negative actor effect or partner effect? Do I have to compare the dyad types two at a time instead of all three at once? Did I code my IV correctly?

My second question, or rather problem, is how to interpret an interaction effect. First, I computed the cross product of the two variables, and then for each dyad I took the mean and the difference. After that, following Aiken & West, I entered the IV first, then the moderator, and third the cross product, in both the between and the within regression. When I ran the analyses with the unstandardized coefficients of the cross products, I got a significant partner effect of .357. How should I interpret this result? Do I need the other coefficients to calculate something? How can I find out what is happening at high or low levels of the moderator, and how do I calculate it?
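
For the question about high and low levels of the moderator, the usual Aiken and West approach is to compute simple slopes from the fitted equation Y = b0 + b1*X + b2*Z + b3*X*Z: the slope of X at a given moderator value Z is b1 + b3*Z. A minimal Python sketch follows. Only the .357 product-term coefficient comes from the message; the other coefficient and the moderator's standard deviation are invented placeholders, and the sketch assumes the moderator was centered.

```python
def simple_slope(b_x, b_xz, z):
    """Slope of the predictor X when the moderator is fixed at z.

    From Y = b0 + b_x*X + b_z*Z + b_xz*X*Z, dY/dX = b_x + b_xz*z.
    """
    return b_x + b_xz * z

b_x = 0.20       # hypothetical coefficient of the IV (not from the message)
b_xz = 0.357     # coefficient of the cross product reported in the message
sd_z = 1.5       # hypothetical standard deviation of the centered moderator

slopes = {
    "low (-1 SD)": simple_slope(b_x, b_xz, -sd_z),
    "mean": simple_slope(b_x, b_xz, 0.0),
    "high (+1 SD)": simple_slope(b_x, b_xz, sd_z),
}
for level, slope in slopes.items():
    print(f"{level}: {slope:.3f}")
```

So the product coefficient on its own only says how much the slope of the IV changes per unit of the moderator; the IV's own coefficient is needed to get the slope at any particular level.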

I hope one of you can help me with this. I have to finish my thesis, and no one at my faculty has a clue how to solve it!

Thank you very much just for reading this long message!

