Restructuring Data

Michael Schuler
Dear Experts,

I have a question about restructuring a dataset. The data look like this:

Id      Varstr
1       Text1
1       Text2
1       Text3
1       Text4
2       Text5
2       Text2
2       Text6
3       Text3
4       Text1
4       Text5
...         ...



So I have multiple measurements for each case, but the number of measurements differs between cases.
The measurement variable ("Varstr") is in string format.


After restructuring, the data should look like this:

Id   Text1  Text2  Text3  Text4  Text5 ...
1    1      1      1      1      0
2    0      1      0      0      1
3    0      0      1      0      0
4    1      0      0      0      1


I want one line per Id. The names of the new (binary) variables should be the text values of
the old variable "Varstr".

I have hundreds of different measurements (e.g. "Text1" ... "Text820") and thousands of cases.
I don't want to create every new variable with hundreds of simple IF statements, but I can't figure out
how to solve the problem with short (macro) syntax.

Any help would be appreciated.

Thanks,
Michael





=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Restructuring Data

Spousta Jan
Hi Michael,

Try this:

* some training data: .
data list list /id (f1) text (a5) .
begin data.
1       Text1
1       Text2
1       Text3
1       Text4
2       Text5
2       Text2
2       Text6
3       Text3
4       Text1
4       Text5
end data.
exe.

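* The solution.
* aux is a helper value that lies between 0 and 0.5 and differs on every row.
compute aux = 1 / ($casenum+1).
SORT CASES BY id text .
* One row per id; each distinct value of text becomes a column holding aux.
CASESTOVARS /ID = id /INDEX = text .
* Filled cells (the aux values) become 1, empty cells become 0; id is copied unchanged.
recode all (0 thru 0.6 = 1)(missing = 0)(else = copy).
formats all (f2).
exe.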

Best regards,

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Michael Schuler
Sent: Wednesday, January 16, 2008 5:46 PM
To: [hidden email]
Subject: Restructuring Data

Re: Restructuring Data

Maguin, Eugene
In reply to this post by Michael Schuler
Michael,

What exactly are 'text1', 'text2', etc.? For instance, does 'text1' equal
'abcde1', 'text2' equal 'abcde2', ... 'text21' equal 'abcde21', etc., OR does
'text1' equal 'a1si6d', 'text2' equal 'xYw290', etc.? If the answer is the
second description, then working this problem is going to be somewhat to
very hard. If, however, the answer is the first description, then working
the problem is much, much easier, because the value of the variable Varstr
consists of a constant stem (the 'abcde' part) plus a number that can be
used to refer to the sequential position of the variable value. That is,
Varstr='abcde34' indicates that the 34th element in the vector abcde has a
value of 1. This is equivalent to saying that the variable abcde34 has a
value of 1.
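
A minimal sketch of the stem-plus-number case, using the sample rows from the original post. It assumes every Varstr value is the stem "Text" followed by a number, and that six indicators are enough for the sample; the helper variable pos and the vector size are assumptions made for illustration, not something stated in the thread.

```spss
* Sketch only, assuming each Varstr value is "Text" plus a number from 1 to 6.
DATA LIST LIST /id (F1) Varstr (A7).
BEGIN DATA
1 Text1
1 Text2
1 Text3
1 Text4
2 Text5
2 Text2
2 Text6
3 Text3
4 Text1
4 Text5
END DATA.

* Read the numeric suffix, which starts at character 5 after the stem.
COMPUTE pos = NUMBER(CHAR.SUBSTR(Varstr, 5), F3).

* Create Text1 to Text6 and flag the element that matches this row.
VECTOR Text(6, F1).
LOOP #i = 1 TO 6.
  COMPUTE Text(#i) = (pos = #i).
END LOOP.

* Collapse to one row per id, keeping a 1 wherever any row of that id had one.
AGGREGATE OUTFILE=*
  /BREAK=id
  /Text1 Text2 Text3 Text4 Text5 Text6 = MAX(Text1 TO Text6).
EXECUTE.
```

With real data, the vector size and the LOOP bound would have to match the largest suffix (820 in the original post), and the F3 read format would have to cover the number of digits in the suffix.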

So, which is it? Or is there another possibility?

Gene Maguin

Re: Restructuring Data

Katkowski, David
In reply to this post by Michael Schuler
Run the following syntax. It should restructure your data the way you want:

SORT CASES BY id Varstr .
CASESTOVARS
 /ID = id
 /INDEX = Varstr
 /GROUPBY = VARIABLE
 /VIND ROOT = ind.

The only difference is that each new variable name has the VIND root ("ind") added as a prefix to the Varstr value.
