Hi SPSSers,
I'm having trouble with *apparent* data discrepancies following a file restructuring via CASESTOVARS. I saved my unrestructured file, opened it and saved it as a new file, then did my restructuring on the new file. My unrestructured file had a structure in which each diagnosis received by a patient was a record. This is a case-control study, so I counted the number of cases and controls before restructuring, and re-counted the unduplicated diagnosis frequency (unduplicated for the same patient, that is).

This was the file structure (CA_CON indicates case or control status):

Pt_ID   CA_CON   undup_dx
10023   CA       headache
10023   CA       abd pain
10023   CA       burns
10036   CO       chest pain
10036   CO       headache
10047   CO       fx lower limb
10047   CO       asthma
10047   CO       fx upper limb
10049   CO       other GI
10049   CO       abd pain

I used this syntax to restructure my file:

CASESTOVARS
  /ID=Pt_ID
  /rename undup_dx=dx
  /fixed=CA_CON.
freq var CA_CON.

....and it seemed to work. I got, as intended, a string of variables -- dx.1 to dx.29 (one patient had 29 dx's):

Pt_ID   CA_CON   dx.1            dx.2       dx.3
10023   CA       headache        abd pain   burns
10036   CO       chest pain      headache
10047   CO       fx lower limb   asthma     fx upper limb
10049   CO       other GI        abd pain
....etc.

Next I set up a string of dummy variables, each corresponding to one of the diagnostic categories, that will hold dichotomous values (1/0) for presence or absence of the diagnosis for that patient. After initializing all these variables' values to 0, I used the syntax

if any(dx.1 TO dx.29,"abd pain") dx_abd=1.
if any(dx.1 TO dx.29,"chest pain") dx_cp=1.
[etc.]

After entering a few of these I ran crosstabs (e.g. dx_abd by CA_CON) to see if they matched the diagnostic frequencies I had (via crosstabs for undup_dx by CA_CON) before the CASESTOVARS restructure. They don't match - current n's are much lower. I already know I have the same number of cases and controls as before restructuring, so that's not the problem. Perhaps my "any" syntax is? I appreciate any insights on this. I'm stumped.
Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT: If you are not the intended recipient of this e-mail, you are prohibited from sharing, copying, or otherwise using or disclosing its contents. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and permanently delete this e-mail and any attachments without reading, forwarding or saving them. Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
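For readers who want to see the restructuring step outside SPSS, here is a minimal Python sketch of the long-to-wide reshaping described in the post, using only the sample rows shown there. It is not SPSS syntax and not how CASESTOVARS is implemented; the function name cases_to_vars and the dictionary layout are hypothetical, chosen only for illustration. Short patients are padded with empty strings, standing in for the blank dx.N string variables.

```python
from collections import defaultdict

# Sample rows copied from the post: (Pt_ID, CA_CON, undup_dx).
long_rows = [
    (10023, "CA", "headache"),
    (10023, "CA", "abd pain"),
    (10023, "CA", "burns"),
    (10036, "CO", "chest pain"),
    (10036, "CO", "headache"),
    (10047, "CO", "fx lower limb"),
    (10047, "CO", "asthma"),
    (10047, "CO", "fx upper limb"),
    (10049, "CO", "other GI"),
    (10049, "CO", "abd pain"),
]


def cases_to_vars(rows):
    """Collapse one-row-per-diagnosis data to one row per patient.

    Hypothetical helper: groups diagnoses by patient ID, keeps the
    case/control flag as a fixed value, and pads every patient's
    diagnosis list to the length of the longest one.
    """
    status = {}
    diagnoses = defaultdict(list)
    for pt_id, ca_con, dx in rows:
        status[pt_id] = ca_con
        diagnoses[pt_id].append(dx)

    width = max(len(dx_list) for dx_list in diagnoses.values())
    wide = {}
    for pt_id, dx_list in diagnoses.items():
        padded = dx_list + [""] * (width - len(dx_list))
        wide[pt_id] = {"CA_CON": status[pt_id], "dx": padded}
    return wide


wide_rows = cases_to_vars(long_rows)
for pt_id, row in wide_rows.items():
    print(pt_id, row["CA_CON"], row["dx"])
```

The number of patients is unchanged by the reshaping, which matches the observation in the post that the case and control counts agreed before and after the restructure.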
Ooops -- my bad. Bad syntax, that is. I realized (right after I posted my previous note) what I should've been using wasn't IF ANY at all. I tried

do repeat dxa=dx.1 TO dx.29.
if dxa="abdominal pain" dx_abd=1.
end repeat.
crosstabs tables dx_abd by CA_CON.

Much much better. I got the N's I expected, same as in unrestructured file.

Tanya Temkin

----- Forwarded by Tanya L TemKin/CA/KAIPERM on 10/25/2007 04:48 PM -----
Tanya L TemKin/CA/KAIPERM
10/25/2007 03:03 PM
To: [hidden email]
Subject: different frequencies before and after CASESTOVARS
In reply to this post by Tanya Temkin
ANY should have worked; however, it looks like you had the arguments backwards. You should have the test value (i.e. the diagnosis text) first, followed by the variable list. The syntax

if any(dx.1 TO dx.29,"abd pain") dx_abd=1.

looks for any value matching dx.1 among dx.2 to dx.29 and "abd pain". This would have worked better:

if any("abd pain",dx.1 TO dx.29) dx_abd=1.

Per SPSS help....

ANY. ANY(test,value[,value,...]). Logical. Returns 1 or true if the value of test matches any of the subsequent values; returns 0 or false otherwise. This function requires two or more arguments.

Melissa

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Tanya Temkin
Sent: Thursday, October 25, 2007 6:54 PM
To: [hidden email]
Subject: Re: [SPSSX-L] different frequencies before and after CASESTOVARS

PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND CONFIDENTIAL information and is intended only for the use of the addressee. If you are not the designated recipient, or an employee or agent authorized to deliver such transmittals to the designated recipient, you are hereby notified that any dissemination, copying or publication of this transmittal is strictly prohibited. If you have received this transmittal in error, please notify us immediately by replying to the sender and delete this copy from your system. You may also call us at (309) 827-6026 for assistance.
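The argument-order point can be checked on the four sample patients from the original post. What follows is a rough Python emulation, not SPSS syntax: spss_any is a hypothetical stand-in that follows only the help text quoted in this thread (first argument is the test, the rest are the values), and the wide dictionary is typed in by hand from the sample table. It ignores SPSS details such as blank padding of strings.

```python
def spss_any(test, *values):
    """Mimic ANY(test, value, ...): true if test equals any later argument.

    Hypothetical helper based only on the quoted help text.
    """
    return any(test == value for value in values)


# Restructured sample data from the post: Pt_ID -> (CA_CON, [dx.1, dx.2, dx.3]).
# Empty strings stand in for blank dx.N variables.
wide = {
    10023: ("CA", ["headache", "abd pain", "burns"]),
    10036: ("CO", ["chest pain", "headache", ""]),
    10047: ("CO", ["fx lower limb", "asthma", "fx upper limb"]),
    10049: ("CO", ["other GI", "abd pain", ""]),
}

# Reversed order, as in any(dx.1 TO dx.29, "abd pain"):
# dx.1 becomes the test; dx.2 onward and "abd pain" become the values.
reversed_hits = [
    pt_id for pt_id, (_, dx) in wide.items() if spss_any(*dx, "abd pain")
]

# Intended order, as in any("abd pain", dx.1 TO dx.29):
# the diagnosis text is the test; every dx variable is a candidate value.
correct_hits = [
    pt_id for pt_id, (_, dx) in wide.items() if spss_any("abd pain", *dx)
]

print("reversed order flags:", reversed_hits)
print("intended order flags:", correct_hits)
```

Under these assumptions the reversed call can only flag a patient whose first diagnosis happens to be the target (or repeats a later one), which would explain counts that come out much lower than the pre-restructure crosstabs.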
