Hi,
I am trying to determine which course sections (e.g., units) have high numbers of co-enrolled students (e.g., IDs), so that I can remove as many sections as possible and reduce duplicate surveying. We know that certain courses are likely suspects (freshman English and Math), but not which exact sections; those patterns have more to do with scheduling. I did not find a similar query on this listserv after an hour of looking, though I am fairly certain the solution involves some combination of vectors, looping and macros. Or, better yet, the solution is painfully simple!

The data would look like this:

ID  Section#
1   1111
1   2222
1   3333
2   1111
2   2222
2   4444
3   1111
3   2222
3   3333
3   4444
etc.

What I would ultimately like is something like this, which ranks the co-enrolled sections for each section (N = frequency):

Section  CoEnrSec#1  CoEnrSec#1N  CoEnrSec#2  CoEnrSec#2N  etc.
1111     2222        3            3333        2
2222     1111        3            3333        2
3333     1111        2            2222        2
4444     1111        2            2222        2

I am really only interested in the first three or four combinations for the sake of elimination (and I recognize there will be duplication and some further aggregation needed).

Any assistance is greatly appreciated.

Sam Michalowski

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
Sam,
I'm curious--but just curious. How many total courses and how many total students?

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Sam Michalowski
Sent: Wednesday, February 13, 2013 12:37 PM
To: [hidden email]
Subject: Ranking co-enrolled course combinations
Administrator
|
In reply to this post by SamMichalowski
You could do this by first going LONG to WIDE with CASESTOVARS and then treating the new columns as variables in a MULT RESPONSE group.
Or you could do a simple MATRIX program.

* Data simulation **.
INPUT PROGRAM.
LOOP ID=1 TO 5000.
  LOOP C=1 TO 5.
    COMPUTE CNUM=TRUNC(UNIFORM(20))+1.
    LEAVE ID.
    END CASE.
  END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.

* Keep one row per ID/section pair (the random draws can repeat a section).
AGGREGATE OUTFILE * / BREAK ID CNUM / N=N.
MATCH FILES / FILE * / DROP N.

** Start here **.
AUTORECODE CNUM /INTO CNUMAR.
CASESTOVARS /ID = id.
* @ is a marker variable that closes the CNUMAR.1 TO @ variable range.
COMPUTE @=$SYSMIS.

* Multiple Response Tables.
TABLES
  /FORMAT BLANK MISSING('')
  /MRGROUP $m1 cnumar.1 TO @
  /MRGROUP $m2 cnumar.1 TO @
  /GBASE=CASES
  /TABLE=$m1 BY $m2 .

SET MXLOOPS 1000000.
MATRIX.
+ GET DATA / VAR CNUMAR.1 TO @ / FILE * / MISSING -9999.
+ COMPUTE N=MMAX(DATA).
+ COMPUTE NR= NROW(DATA).
+ COMPUTE NC= NCOL(DATA)-1.
+ COMPUTE D=MAKE(N,N,0).
+ LOOP #NR=1 TO NR.
+   LOOP #1=1 TO NC-1.
+     DO IF (DATA(#NR,#1) NE -9999) .
+       LOOP #2=#1+1 TO NC.
+         DO IF (DATA(#NR,#2) NE -9999) .
+           COMPUTE D(DATA(#NR,#1) ,DATA(#NR,#2) ) = D(DATA(#NR,#1) ,DATA(#NR,#2) ) + 1.
+         END IF.
+       END LOOP.
+     END IF.
+   END LOOP.
+ END LOOP.
PRINT D.
END MATRIX.
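For readers who want to check the pair-counting logic outside SPSS, here is a minimal sketch in Python (not SPSS syntax, and not part of the posted solution). It counts each pair of sections once per student, as the MATRIX loop above does, and then adds the ranking step the original question asks for. The function name `rank_coenrollment`, the `top` parameter, and the tie-break by section number are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rank_coenrollment(enrollments, top=3):
    """Rank co-enrolled sections per section.

    enrollments: iterable of (student_id, section) pairs (long format).
    Returns {section: [(partner_section, shared_students), ...]}.
    """
    # Collect each student's sections; a set drops duplicate rows.
    sections_by_student = defaultdict(set)
    for student_id, section in enrollments:
        sections_by_student[student_id].add(section)

    # Count every unordered pair of sections once per student.
    pair_counts = Counter()
    for sections in sections_by_student.values():
        for a, b in combinations(sorted(sections), 2):
            pair_counts[(a, b)] += 1

    # Mirror each pair so both sections list the other as a partner.
    partners = defaultdict(list)
    for (a, b), n in pair_counts.items():
        partners[a].append((b, n))
        partners[b].append((a, n))

    # Sort by descending count; ties broken by section number (an assumption).
    return {
        section: sorted(plist, key=lambda p: (-p[1], p[0]))[:top]
        for section, plist in sorted(partners.items())
    }


# Sample data from the original question.
sample = [
    (1, 1111), (1, 2222), (1, 3333),
    (2, 1111), (2, 2222), (2, 4444),
    (3, 1111), (3, 2222), (3, 3333), (3, 4444),
]
ranking = rank_coenrollment(sample, top=2)
for section, plist in ranking.items():
    print(section, plist)
```

On the sample data this gives the same top two partners and counts as the table in the original question. Only pairs that actually occur are stored, so memory grows with the number of observed pairs rather than with the square of the number of sections.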
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by SamMichalowski
That kind of data is very common in network analysis. In this case the nodes are course sections, and the value of the tie between Node1 and Node2 is the number of people who are enrolled in both sections. You are trying to find “clusters” of courses that concentrate high numbers of shared students. There are nice packages, such as Ucinet or Pajek, that do this very easily.

Another approach would be to use CASESTOVARS with course sections as items (rows) and people as variables (columns), and then compute a distance measure between course sections, as when someone compares ecological transects. That distance matrix is then fed to a clustering routine, from which you can get nice dendrograms. Of course, you may have too many people, and SPSS may not let you have that many columns.
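As a rough illustration of the distance idea, the sketch below (in Python rather than SPSS) treats each section as the set of students enrolled in it and computes a distance for every pair of sections. The post does not name a particular measure; Jaccard distance is an assumption chosen here because it works directly on sets, and the function name `jaccard_distances` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard_distances(enrollments):
    """Return {(section_a, section_b): distance} for every pair of sections.

    enrollments: iterable of (student_id, section) pairs (long format).
    Distance is 1 - |shared students| / |students in either section|,
    so 0.0 means identical rosters and 1.0 means no overlap.
    """
    # Build the roster (set of student IDs) for each section.
    students_by_section = defaultdict(set)
    for student_id, section in enrollments:
        students_by_section[section].add(student_id)

    distances = {}
    for a, b in combinations(sorted(students_by_section), 2):
        shared = len(students_by_section[a] & students_by_section[b])
        either = len(students_by_section[a] | students_by_section[b])
        distances[(a, b)] = 1 - shared / either
    return distances


# Sample data from the original question.
sample = [
    (1, 1111), (1, 2222), (1, 3333),
    (2, 1111), (2, 2222), (2, 4444),
    (3, 1111), (3, 2222), (3, 3333), (3, 4444),
]
distances = jaccard_distances(sample)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

Working from the long-format data this way avoids the wide sections-by-people layout, so the column limit mentioned above does not come into play. The resulting distances could then be passed to whatever clustering routine is available.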
In reply to this post by SamMichalowski
Gene,
There are 40k enrollments across 1.7k courses, so it's a pretty sizeable job. I certainly don't need all of the combinations, just those with the highest loads. Going over the solutions posted yesterday now with bated breath.

SM

On Wed, 13 Feb 2013 13:14:52 -0500, Maguin, Eugene <[hidden email]> wrote:
>Sam,
>
>I'm curious--but just curious. How many total courses and how many total students?
>
>Gene Maguin
Administrator
|
I would suggest going with the MATRIX code rather than the MR TABLE approach.