SPSSX Discussion

Restructuring data

Classic

List

Threaded

6 messages Options

Jan Veenstra

Restructuring data

I have a very large data file (over 1 million records) that contains
information listed in multiple rows that I need to restructure so that the
rows for each ID are combined into a single row. Please note that duplicates
are also allowed in this file. I have included a sample below to see if
someone can help me as I am a beginner and have not been able to get the
CASETOVARS syntax to work. I also tried to aggregate the data and then use
CASETOVARS but it only moved the row number around and not the data. The
training I've had has been almost completely menu driven so am not familiar
with using syntax in SPSS.

Below is my sample.

Billing_ID Line No. HCPC Code Modifier_1 Modifer_2
Many other variables ->->->->->->->->->->->->->->->->->->->->->

7707 1 9567 03 01
7710 1 9567
7710 2 9658
7710 3 9567

What I need them to look like is as follows. It basically moves line numbers
2 and 3 to new variables. I also have multiple line counts for each ID -
some have 3 lines, some have 14 lines, some have less, some have more.

Billing_ID Line No. HCPC Code HCPC_2 HCPC_3 Modifier_1
Many other variables ->->->->->->->->->->->->->->->->->->->->->

7707 1 9567
03
7710 1 9567 9658 9567

Thank you for any help you can provide me.

> Division of Medical Economics
> TEXAS MEDICAL ASSOCIATION
> 401 West 15th Street
> Austin, TX 78701-1680
> 512/370-1461 Phone
> 512/370-1693 FAX
>
>

--------------------------------------------------------

Improving the Health of All Texans

Texas Medical Association
www.texmed.org
401 W. 15th Street
Austin, TX 78701

Richard Ristow

Re: Restructuring data

At 01:22 PM 8/2/2007, Jan Veenstra wrote:

>I have a data file that contains information listed in multiple rows
>that I need to restructure so that the rows for each ID are combined
>into a single row. Please note that duplicates are also allowed in
>this file. I have included a sample below. [I]have not been able to
>get the CASETOVARS syntax to work. I also tried to aggregate the data
>and then use CASETOVARS but it only moved the row number around and
>not the data.

In the following, I haven't addressed the problem of duplicates. I'm
not sure what is 'duplicated' - just the HCPC_Code, the whole record,
or what? Anyway,

Input data (SPSS 15 draft output):
|-----------------------------|---------------------------|
|Output Created |02-AUG-2007 14:31:28 |
|-----------------------------|---------------------------|
Billing_ID Line_No HCPC_Code Modifier_1 Modifier_2

7707 1 9567 03 01
7710 1 9567 . .
7710 2 9658 . .
7710 3 9567 . .

Number of cases read: 4 Number of cases listed: 4

>What I need them to look like is as follows. It basically moves line
>numbers 2 and 3 to new variables. I have multiple line counts for
>each ID - some have 3 lines, some have more.
>
>Billing_ID Line_No HCPC_Code HCPC_2 HCPC_3 Modifier_1 Modifier_2
>7707 1 9567 03 01
>7710 1 9567 9658 9567

So far, except for the 'Modifier' variables, and the "Many other
variables" that I'm ignoring, this is straightforward CASESTOVARS that
could be clicked up from the menus. I clicked this one up, then added
the /RENAME and /SEPARATOR subcommands. It isn't *quite* the above
format, but it's very close. (You can use a /KEEP= list on ADD FILES to
reorder the variables, if you like.) Continuing SPSS 15 draft output
(WRR:not saved separately):

SORT CASES BY Billing_ID Line_No .
CASESTOVARS
/ID = Billing_ID
/INDEX = Line_No
/RENAME HCPC_Code=HCPC
/SEPARATOR = '_'
/GROUPBY = VARIABLE .

Cases to Variables
|----------------------------|---------------------------|
|Output Created |02-AUG-2007 14:55:02 |
|----------------------------|---------------------------|
Generated Variables
|---------|-------|------|
|Original |Line_No|Result|
|Variable | |------|
| | |Name |
|---------|-------|------|
|HCPC_Code|1 |HCPC_1|
| |2 |HCPC_2|
| |3 |HCPC_3|
|---------|-------|------|

Processing Statistics
|---------------|---|
|Cases In |4 |
|Cases Out |2 |
|---------------|---|
|Cases In/Cases |2.0|
|Out | |
|---------------|---|
|Variables In |5 |
|Variables Out |6 |
|---------------|---|
|Index Values |3 |
|---------------|---|

LIST.

List
|-----------------------------|---------------------------|
|Output Created |02-AUG-2007 14:55:02 |
|-----------------------------|---------------------------|
Billing_ID Modifier_1 Modifier_2 HCPC_1 HCPC_2 HCPC_3

7707 03 01 9567 . .
7710 . . 9567 9658 9567

Number of cases read: 2 Number of cases listed: 2
...............

Now, what of Modifier_1, Modifier_2, and the "many other variables"? If
they appear on every line for each ID, and have the same values on
every line, this code will work as is, though always putting the HCPC
codes at the end of the records. If, say, they only appear on Line 1,
you need something more, possibly using AGGREGATE but I can't say at
this point.

-Good luck,
Richard
===================
APPENDIX: Test data
===================
(This code works, but gives warning messages.)

DATA LIST LIST SKIP=1
/Billing_ID Line_No HCPC_Code Modifier_1 Modifier_2
(F4, F2, F4, N2, N2).
BEGIN DATA
Billing_ID Line No. HCPC Code Modifier_1 Modifier_2
7707 1 9567 03 01
7710 1 9567
7710 2 9658
7710 3 9567
END DATA.
FORMATS Modifier_1 Modifier_2 (N2).
LIST.

Frommelt, Robert A.

Transforming blood pressure measures

In reply to this post by Jan Veenstra

I have blood pressure measures that are stored as a string variable
120/80. I would like to convert that measure to two numeric variables
so that I will sysbp 120 and diabp 80.

Is there anyway to do that in SPSS using syntax?

Thanks,

Allen

This email and all attachments are confidential and intended solely
for the use of the individual or entity to which they are addressed.
If you have received this email in error please notify the sender
by replying to this message. If you are not the intended recipient,
please delete this message and all attachments immediately. Do not
copy, disclose, use or act upon the information contained. Please
note that any views or opinions presented in this email are solely
those of the author and do not necessarily represent those of the
company. Finally, the recipient should check this email and any
attachments for the presence of viruses. While every attempt is made
to verify that the contents are safe, the company accepts no liability
for any damage caused by any virus transmitted by this email.

Richard Ristow

Re: Transforming blood pressure measures

At 03:54 PM 8/2/2007, Frommelt, Robert A. wrote:

>I have blood pressure measures that are stored as a string variable
>120/80. I would like to convert that measure to two numeric variables
>so that I will sysbp 120 and diabp 80.

What the heck. Not tested, but if the string variable is named "bp":

NUMERIC sysbp diabp(F3).
COMPUTE #Break = INDEX(bp,'/').
COMPUTE sysbp = NUMBER(substr(bp,1,#Break-1),F4).
COMPUTE diabp = NUMBER(substr(bp,#Break+1) ,F4).

JOSE MARIA GARCIA TULICH

go to in spss15 vs spss13

In reply to this post by Jan Veenstra

In spss 15 I cannot make (go to) as it made it in version 13, that is to
say, with the window of go to digit the variable to which queria to go and
takes to me automatically.

there is some recommendable option for this case

Hector Maletta

Re: go to in spss15 vs spss13

Jose Maria,
Your problem is not clear. It may or may not be the same as mine.
You seem to mean that in the DATA Editor window you want to FIND some
particular value of a variable. In that case, you should position yourself
in the corresponding column, click on Edit - Find (that is "Editar - Buscar"
in Spanish), and that works perfectly in my own V.15 (mine is in English).
Another possible meaning of your question is that you cannot go to a certain
target (a digit, a word) in a SYNTAX editor window. In this case you click
on Edit - Find (or Replace). A third different problem is finding a
particular VARIABLE in the data editor: open the variable information
dialog, find a variable, and click Go To. I am not clear which one is your
trouble. My own problem is limited to the syntax editor. The other two work
fine in my version.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
JOSE MARIA GARCIA TULICH
Sent: 02 August 2007 19:28
To: [hidden email]
Subject: go to in spss15 vs spss13

In spss 15 I cannot make (go to) as it made it in version 13, that
is to
say, with the window of go to digit the variable to which queria to
go and
takes to me automatically.

there is some recommendable option for this case