LAG Question

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

LAG Question

DKUKEC
Dear Listers...

I have a dataset (SPSS V21, Win7 (64)) with accusedid (RPUID), arrestdates, individual charges, unique arrest ids, and most serious offense categoires.  I would like to count the specific number most serious offense categories attributed to one accusedid.  Where I am running into difficultly is the LAG function, where MSO categories from an accusedid above "creeps" into the next accusedid below.    

Here is a sample of the SYNTAX I am using

USE ALL.
SELECT IF MSO=6.
AGGREGATE
OUTFILE=* MODE=ADDVARIABLES /
BREAK=RPUID /
CHARGESNUMSO6=NU (RPUID).
* Fill in blank cells .
IF MISSING (CHARGESNUMSO6) CHARGESNUMSO6 = LAG (CHARGESNUMSO6).
VARIABLE LABEL CHARGESNUMSO6 'MSO6 Number of Charges for Study Period (SPSS)'.
EXECUTE.

I am running the above for each MSO category.  Probably not the best approach... any suggestions would be greatly appreciated.

Thank you in advance.

Damir



Reply | Threaded
Open this post in threaded view
|

Re: LAG Question

Andy W
I work with data like this everyday and I can't quite tell what you are doing, so more detail, an example of "creeps", example dataset with expected output etc. would be helpful.

From your description simply aggregating by accusedid *and* MSO after selecting only the top charges (and then CASESTOVARS) sounds like it will get you where you want to go, but I'm not sure.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Reply | Threaded
Open this post in threaded view
|

Re: LAG Question

David Marso
Administrator
In reply to this post by DKUKEC
DO IF! Think about it.  This has been solved and posted many times. Check the archives.
DKUKEC wrote
Dear Listers...

I have a dataset (SPSS V21, Win7 (64)) with accusedid (RPUID), arrestdates, individual charges, unique arrest ids, and most serious offense categoires.  I would like to count the specific number most serious offense categories attributed to one accusedid.  Where I am running into difficultly is the LAG function, where MSO categories from an accusedid above "creeps" into the next accusedid below.    

Here is a sample of the SYNTAX I am using

USE ALL.
SELECT IF MSO=6.
AGGREGATE
OUTFILE=* MODE=ADDVARIABLES /
BREAK=RPUID /
CHARGESNUMSO6=NU (RPUID).
* Fill in blank cells .
IF MISSING (CHARGESNUMSO6) CHARGESNUMSO6 = LAG (CHARGESNUMSO6).
VARIABLE LABEL CHARGESNUMSO6 'MSO6 Number of Charges for Study Period (SPSS)'.
EXECUTE.

I am running the above for each MSO category.  Probably not the best approach... any suggestions would be greatly appreciated.

Thank you in advance.

Damir
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: LAG Question

DKUKEC
In reply to this post by Andy W
Thank you Andy & David,

I was able to solve my issue by first aggregating across the MSO categories, then completing the CASETOVARS, and then matching the results back into the main dataset.

Take care,
Damir
Reply | Threaded
Open this post in threaded view
|

Re: LAG Question

David Marso
Administrator
This post was updated on .
Damir,
Perhaps post your resulting code here for the benefit of the class?
What I was hinting at with DO IF was something like:

SORT CASES BY MSO RPUID accusedid
AGGREGATE OUTFILE * MODE = ADDVARIABLES
               / BREAK=MSO RPUID
              / CHARGESNUM=NU (RPUID).

* Fill in blank cells .
DO IF (MSO EQ LAG(MSO) AND (RPUID EQ LAG(RPUID) AND (accusedid EQ LAG(accusedid )).
+  IF MISSING (CHARGESNUM) CHARGESNUM = LAG (CHARGESNUM).  
END IF.

DKUKEC wrote
Thank you Andy & David,

I was able to solve my issue by first aggregating across the MSO categories, then completing the CASETOVARS, and then matching the results back into the main dataset.

Take care,
Damir
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: LAG Question

DKUKEC
Thank you David...

What I did, was very UGLY compared to your tight syntax.  After your initial email and Andy's comment I did the following... it is quite lengthy:

AGGREGATE
OUTFILE=* MODE=ADDVARIABLES /
BREAK=RPUID MSO  /
CHARGESCAT=NU (RPUID).
* Fill in blank cells .
VARIABLE LABEL CHARGESCAT 'Number of Charge Categories by Study Period (SPSS)'.
EXECUTE.

*** IDENTIFY DUPS ***

* Identify Duplicate Cases.
SORT CASES BY RPUID(A) CHARGESCAT(A) MSO(A).
MATCH FILES
  /FILE=*
  /BY RPUID CHARGESCAT MSO
  /FIRST=PrimaryFirst
  /LAST=UNQRECS.
DO IF (PrimaryFirst).
COMPUTE  MatchSequence=1-UNQRECS.
ELSE.
COMPUTE  MatchSequence=MatchSequence+1.
END IF.
LEAVE  MatchSequence.
FORMATS  MatchSequence (f7).
COMPUTE  InDupGrp=MatchSequence>0.
SORT CASES InDupGrp(D).
MATCH FILES
  /FILE=*
  /DROP=PrimaryFirst InDupGrp MatchSequence.
VARIABLE LABELS  UNQRECS 'Indicator of each last matching case as Primary'.
VALUE LABELS  UNQRECS 0 'Duplicate Case' 1 'Primary Case'.
VARIABLE LEVEL  UNQRECS (ORDINAL).
FREQUENCIES VARIABLES=UNQRECS.
EXECUTE.

****************************************************************

SELECT IF UNQRECS.
EXECUTE.

*****************************************************************

SORT CASES BY RPUID CHARGESCAT MSO.
CASESTOVARS
   /ID=RPUID
   /INDEX = MSO
   /GROUPBY = VARIABLE / DROP ARRESTID F_ACCUSED F_ACCUSEDBYYEAR ARRESTFLAG ARRESTNU ARRESTNUBYYEAR
                        CHARGESNU1 CHARGESNU2 CHARGESBYYEAR JUD_ARREST_DT JUD_ARREST_YEAR JUD_ARR_SEQ_CD
                        JUD_CRT_LIT JUD_OFF_CHAR JUD_CRT_STATUTE JUD_PROB_LIT JUD_PROB_LEN JUD_SENT_LIT JUD_SENT_DOCKET
                        JUD_CONFINE_LEN JUD_ARR_CHG_DEG JUD_ARR_CHG_LVL JUD_ARR_CHG_NBR JUD_ARR_CHG_STAT JUD_ARR_CNT
                        JUD_CHG_DEG JUD_CHG_LVL JUD_CHG_QUAL JUD_CHG_STAT JUD_COMM_CTRL JUD_COUNSEL_TYP
                        JUD_CRT_COST JUD_CRT_FINE JUD_DISP_CD JUD_OFF_CD JUD_PLEA_TYP JUD_SENT_CONFINE
                        JUD_SENT_STAT JUD_TRIAL_TYP JUD_DISP_DT JUD_SENT_DT JUD_COUNTY DKNOTES JUD_ARR_CHG_LVLX
                        JUD_ARR_CHG_DEGX UNQRECS.
EXECUTE.

VARIABLE LABELS CHARGESCAT.1 "1D Life".
VARIABLE LABELS CHARGESCAT.2 "Capital".
VARIABLE LABELS CHARGESCAT.3 "Life".
VARIABLE LABELS CHARGESCAT.4 "Felony - First Degree".
VARIABLE LABELS CHARGESCAT.5 "Felony - Second Degree".
VARIABLE LABELS CHARGESCAT.6 "Felony - Third Degree".
VARIABLE LABELS CHARGESCAT.7 "Felony - Not Applicable / Unknown".
VARIABLE LABELS CHARGESCAT.8 "Misdemeanor - First Degree".
VARIABLE LABELS CHARGESCAT.9 "Misdemeanor - Second Degree".
VARIABLE LABELS CHARGESCAT.11 "Misdemeanor - Not Applicable / Unknown".
VARIABLE LABELS CHARGESCAT.9999 "Not Applicable / Unknown".
EXECUTE.

SORT CASES BY RPUID (A).

SAVE OUTFILE='c:\DR2012_23_JUD 30713.sav'
  /COMPRESSED.


**** OPEN AND MATCH ON RPUID WITH THE MAIN DR FILE.

SORT CASES BY RPUID (A).
MATCH FILES /FILE=*
 /TABLE='dataset5'
 /BY RPUID.
EXECUTE.