Hi all
I have behaviour data where the same behaviour can be repeated. [Behaviours are 1=Browse 2=Touch 3=Buy]

Typical data can look like this:

id b1 b2 b3 b4 b5 b6 b7
 1  1  1  2  3  1  2  3

i.e. browse, browse, touch, buy, browse, touch, buy

I want to strip out a behaviour when it repeats the one immediately before it. The above would change to this:

id b1 b2 b3 b4 b5 b6 b7
 1  1  2  3  1  2  3

i.e. browse, touch, buy, browse, touch, buy

There are now 6 behaviours [was 7] and the repeated browse is removed.

Any suggestions on how to do this using syntax?

Regards
--
Mark Webb
Line +27 (21) 786 4379
Cell +27 (72) 199 1000 [Poor reception]
Fax +27 (86) 260 1946
Skype tomarkwebb
Email [hidden email]

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
Administrator
Variations on this question have appeared before, and two general types of solutions have been suggested. One is to use VECTOR and LOOP (you can probably find examples posted by David M). The other approach is to restructure the data from wide to long, use SELECT IF to get rid of unwanted records, and then restructure from LONG to WIDE to get back to the original structure. The first approach arguably has tighter code, but it is also (arguably) harder to understand.
Here's an example of the second approach using your data (plus a second row to test that it works with multiple IDs).

new file.
dataset close all.
data list list / id b1 to b7 (8f2.0).
begin data
1 1 1 2 3 1 2 3
2 1 2 2 2 3 1 3
end data.

VARSTOCASES
  /MAKE b FROM b1 to b7
  /INDEX=Index1(7)
  /KEEP=id
  /NULL=KEEP.

compute discard = (id EQ lag(id) and b EQ lag(b)).
if $casenum EQ 1 discard = 0.
execute.
select if NOT discard.
execute.

if $casenum EQ 1 OR id ne lag(id) index2 = 1.
if missing(index2) index2 = lag(index2) + 1.
formats index2(f2.0).
execute.

CASESTOVARS
  /ID=id
  /INDEX=index2
  /GROUPBY=VARIABLE
  /DROP index1 discard
  /SEPARATOR = "".
list.

OUTPUT:

id b1 b2 b3 b4 b5 b6
 1  1  2  3  1  2  3
 2  1  2  3  1  3  .

HTH.
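For readers who want to check the wide-to-long-to-wide logic outside SPSS, here is a minimal Python sketch of the same three steps. It is only an illustration, not a translation of the SPSS commands: the function name dedupe_via_long and the use of None for empty cells are assumptions, and the input is assumed to contain no missing values.

```python
def dedupe_via_long(rows):
    """rows: list of (id, [b1, b2, ...]) pairs. Hypothetical helper for illustration."""
    # Step 1, wide -> long: one (id, position, behaviour) record per cell.
    long_records = [
        (rid, pos, b)
        for rid, behaviours in rows
        for pos, b in enumerate(behaviours, start=1)
    ]

    # Step 2, drop a record when it has the same id and the same behaviour
    # as the record immediately before it (the role of the LAG comparison).
    kept = []
    prev = None
    for rec in long_records:
        if prev is None or prev[0] != rec[0] or prev[2] != rec[2]:
            kept.append(rec)
        prev = rec

    # Step 3, long -> wide: regroup by id, which renumbers positions 1, 2, ...
    wide = {}
    for rid, _, b in kept:
        wide.setdefault(rid, []).append(b)

    # Pad shorter rows with None so every id has the same number of columns.
    width = max(len(bs) for bs in wide.values())
    return [(rid, bs + [None] * (width - len(bs))) for rid, bs in wide.items()]


data = [(1, [1, 1, 2, 3, 1, 2, 3]), (2, [1, 2, 2, 2, 3, 1, 3])]
for rid, behaviours in dedupe_via_long(data):
    print(rid, behaviours)
```

With the two test rows used above, id 1 keeps six behaviours and id 2 keeps five plus one empty cell.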
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by Mark Webb-5
First, here is a solution in traditional Statistics syntax.

data list list/b1 to b7(7f1.0).
begin data
1 1 2 3 1 2 3
1 2 3 1 2 3 1
1 1 1 1 1 1 1
end data.
dataset name behavior.

vector v = b1 to b7.
compute #count = 7.
loop #y= 1 to 6.
loop if (v(#y) eq v(#y+1)).
loop #index = #y to #count-1.
compute v(#index) = v(#index+1).
end loop.
compute v(#count) = $sysmis.
compute #count = #count-1.
end loop.
end loop.
exec.

Second, here is a solution using the SPSSINC TRANS extension command.

data list list/b1 to b7(7f1.0).
begin data
1 1 2 3 1 2 3
1 2 3 1 2 3 1
1 1 1 1 1 1 1
end data.
dataset name behavior.

begin program.
def f(*args):
    ll = len(args)
    return [args[i] for i in range(ll) if i == (ll-1) or args[i] != args[i+1]]
end program.

spssinc trans result = b1 to b7
  /variables b1 to b7
  /formula "f(<>)".

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: Mark Webb <[hidden email]>
To: [hidden email],
Date: 09/04/2013 04:42 AM
Subject: [SPSSX-L] Multi-mention to single-mention within record - syntax
Sent by: "SPSSX(r) Discussion" <[hidden email]>
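The Python function in Jon's reply can also be tried on its own, outside SPSS, to see what it returns for the three test rows. In the sketch below, f is the function from the post; pad is a hypothetical helper added here only to show the rows at a fixed width of 7, with None standing in for an empty cell. It makes no claim about how SPSSINC TRANS itself fills unused result variables.

```python
def f(*args):
    # Logic from the post: keep a value if it is the last one or differs from
    # its successor, so each run of repeats contributes exactly one value.
    ll = len(args)
    return [args[i] for i in range(ll) if i == (ll - 1) or args[i] != args[i + 1]]


def pad(values, width, fill=None):
    # Hypothetical helper (not in the post): right-pad to a fixed width.
    return list(values) + [fill] * (width - len(values))


rows = [
    (1, 1, 2, 3, 1, 2, 3),
    (1, 2, 3, 1, 2, 3, 1),
    (1, 1, 1, 1, 1, 1, 1),
]
for row in rows:
    print(pad(f(*row), 7))
```

The first row loses its repeated browse, the second row is unchanged, and the third collapses to a single value.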
Administrator
In reply to this post by Mark Webb-5
Here is another version using VECTOR/LOOP.
data list list/b1 to b7(7f1.0).
begin data
1 1 2 3 1 2 3
1 2 3 1 2 3 1
1 1 1 1 1 1 1
end data.
dataset name behavior.

VECTOR b=b1 TO b7 / #(7).
RECODE #1 TO #7 (ELSE=0).
COMPUTE ##=2.
COMPUTE #(1)=b(1).
LOOP #=2 TO 7.
+ DO IF b(#) NE b(#-1).
+ COMPUTE #(##)=b(#).
+ COMPUTE ##=##+1.
+ END IF.
END LOOP.
DO REPEAT b=B1 TO B7 / #=#1 TO #7.
+ COMPUTE b=#.
END REPEAT.
LIST.

B1 B2 B3 B4 B5 B6 B7
 1  2  3  1  2  3  0
 1  2  3  1  2  3  1
 1  0  0  0  0  0  0

Number of cases read: 3 Number of cases listed: 3
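The idea behind this version is a single pass with a separate write position: copy a value into a scratch vector only when it differs from its predecessor, then copy the scratch vector back. A rough Python sketch of that idea follows; the name compact_runs and the fill parameter are illustrative assumptions, with fill=0 chosen to match the zeros in the listing above.

```python
def compact_runs(values, fill=0):
    """Collapse consecutive repeats, keeping the original length by padding with fill."""
    if not values:
        return []
    out = [fill] * len(values)  # scratch vector, pre-filled like the RECODE to 0
    out[0] = values[0]          # the first behaviour is always kept
    write = 1                   # next free slot in the scratch vector
    for read in range(1, len(values)):
        if values[read] != values[read - 1]:
            out[write] = values[read]
            write += 1
    return out


for row in ([1, 1, 2, 3, 1, 2, 3], [1, 2, 3, 1, 2, 3, 1], [1, 1, 1, 1, 1, 1, 1]):
    print(compact_runs(row))
```

Because each value is read once and written at most once, no shifting of later elements is needed, which is the main difference from the nested-loop version earlier in the thread.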
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |