Ranking co-enrolled course combinations

Ranking co-enrolled course combinations

SamMichalowski
Hi,

I am currently trying to determine which course sections (i.e. units) have
high numbers of students (i.e. IDs) co-enrolled, so that I can remove as many
sections as possible and reduce duplicate surveying.  We know that certain
courses are likely suspects (freshman English and Math), but not which exact
sections; those patterns have more to do with scheduling.  I didn't find any
similar query posted to this listserv after an hour of looking, though I am
fairly certain the solution involves some combination of vectors, looping and
macros.  Or, better yet, it is painfully simple!

The data would look like this:

ID  Section#
1   1111
1   2222
1   3333
2   1111
2   2222
2   4444
3   1111
3   2222
3   3333
3   4444
etc.

What I would ultimately like is something like this, which ranks the
co-enrolled sections for each section (N = frequency):

Section  CoEnrSec#1  CoEnrSec#1N  CoEnrSec#2  CoEnrSec#2N  etc.
1111     2222        3            3333        2
2222     1111        3            3333        2
3333     1111        2            2222        2
4444     1111        2            2222        2

I am really only interested in the first three or four combinations for the
sake of elimination (and recognize there will be duplication and some
further aggregation needed).

Any assistance is greatly appreciated.

Sam Michalowski

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Ranking co-enrolled course combinations

Maguin, Eugene
Sam,

I'm curious--but just curious. How many total courses and how many total students?

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Sam Michalowski
Sent: Wednesday, February 13, 2013 12:37 PM
To: [hidden email]
Subject: Ranking co-enrolled course combinations


Re: Ranking co-enrolled course combinations

David Marso
Administrator
In reply to this post by SamMichalowski
You could do this by first restructuring the data from long to wide with CASESTOVARS and then treating the new columns as the variables of a multiple response group in TABLES.
Or you could write a simple MATRIX program. Both are shown below on simulated data.
* Data simulation **.
INPUT PROGRAM.
LOOP ID=1 TO 5000.
LOOP C=1 TO 5.
COMPUTE CNUM=TRUNC(UNIFORM(20))+1.
LEAVE ID.
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.

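* Keep one row per distinct ID-CNUM pair, since the random draws can repeat a course.
AGGREGATE OUTFILE * / BREAK ID CNUM / N=N.
MATCH FILES / FILE * / DROP N.

** Start here **.
* Recode the section numbers to consecutive integers so they can index the rows and columns of a matrix.
AUTORECODE CNUM /INTO CNUMAR.

* CASESTOVARS expects the file to be sorted by the ID variable.
SORT CASES BY ID CNUMAR.
CASESTOVARS /ID = id.
* @ is an all-missing placeholder that marks the end of the range CNUMAR.1 TO @.
COMPUTE @=$SYSMIS.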

* Multiple Response Tables.
TABLES
  /FORMAT BLANK MISSING('')  
  /MRGROUP  $m1  cnumar.1 TO @  
  /MRGROUP  $m2  cnumar.1 TO @
  /GBASE=CASES
  /TABLE=$m1 BY $m2 .

* D(i,j) counts the students enrolled in both section i and section j, using the autorecoded section codes.
* Each pair is counted once, in the cell given by the order of the codes within the student's row.
SET MXLOOPS 1000000.
MATRIX.
+  GET DATA / VAR CNUMAR.1 TO @ / FILE * / MISSING -9999.
+  COMPUTE N=MMAX(DATA).
+  COMPUTE NR= NROW(DATA).
+  COMPUTE NC= NCOL(DATA)-1.
+  COMPUTE D=MAKE(N,N,0).
+  LOOP #NR=1 TO NR.
+    LOOP #1=1 TO NC-1.
+      DO IF (DATA(#NR,#1) NE -9999) .
+        LOOP #2=#1+1 TO NC.
+          DO IF (DATA(#NR,#2) NE -9999) .
+            COMPUTE D(DATA(#NR,#1) ,DATA(#NR,#2) ) =  D(DATA(#NR,#1) ,DATA(#NR,#2) ) + 1.
+          END IF.
+        END LOOP.
+      END IF.
+    END LOOP.
+  END LOOP.
PRINT D.
END MATRIX.
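
For anyone who wants to cross-check the MATRIX result outside SPSS, here is a minimal Python sketch of the same pair counting, extended to the ranked layout Sam asked for. It assumes the long data are available as (ID, section) pairs. The function names co_enrollment_counts and rank_partners are made up for illustration, and breaking ties by section number is an arbitrary choice, not something specified in the thread.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_enrollment_counts(enrollments):
    """Count the students shared by each unordered pair of sections."""
    sections_by_student = defaultdict(set)
    for student_id, section in enrollments:
        sections_by_student[student_id].add(section)  # a set ignores duplicate rows

    pair_counts = Counter()
    for sections in sections_by_student.values():
        # sorted() makes every pair (a, b) satisfy a < b, so each pair has one key.
        for a, b in combinations(sorted(sections), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def rank_partners(pair_counts, top=3):
    """For each section, list its `top` co-enrolled sections, largest count first."""
    partners = defaultdict(list)
    for (a, b), n in pair_counts.items():
        partners[a].append((b, n))
        partners[b].append((a, n))
    return {
        section: sorted(others, key=lambda p: (-p[1], p[0]))[:top]
        for section, others in sorted(partners.items())
    }


# Sample data from the original post.
enrollments = [
    (1, 1111), (1, 2222), (1, 3333),
    (2, 1111), (2, 2222), (2, 4444),
    (3, 1111), (3, 2222), (3, 3333), (3, 4444),
]
ranking = rank_partners(co_enrollment_counts(enrollments), top=2)
for section, others in ranking.items():
    print(section, others)
```

On the sample data this reproduces the table in the original post, for example 1111 is paired with 2222 (N=3) and then 3333 (N=2).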

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Ranking co-enrolled course combinations

Zuluaga, Juan
In reply to this post by SamMichalowski

That kind of data is very common in Network Analysis. In this case the nodes are course sections, and the value of the tie between Node1 and Node2 is the number of people enrolled in both sections. You are trying to find "clusters" of sections that share high numbers of students.

There are nice packages like Ucinet or Pajek that do this very easily.

Another approach is to use CASESTOVARS with course sections as cases and people as variables, and then compute a distance measure between course sections, as one would when comparing ecological transects. The distances can be fed to a clustering routine, which gives you nice dendrograms. Of course, you may have too many people, and SPSS may not let you have that many columns.
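
As a rough illustration of the distance idea, the sketch below computes a Jaccard distance between every pair of sections from their student rosters. Jaccard is only one possible choice and is an assumption here, since no particular measure is named above; the function name jaccard_distances is likewise hypothetical. The resulting distances could then be handed to whatever clustering routine you prefer.

```python
from collections import defaultdict
from itertools import combinations


def jaccard_distances(enrollments):
    """Return {(section_a, section_b): 1 - |A & B| / |A | B|} for every pair."""
    roster = defaultdict(set)
    for student_id, section in enrollments:
        roster[section].add(student_id)

    distances = {}
    for a, b in combinations(sorted(roster), 2):
        shared = len(roster[a] & roster[b])   # students in both sections
        either = len(roster[a] | roster[b])   # students in at least one of them
        distances[(a, b)] = 1 - shared / either
    return distances


# Sample data from the original post.
enrollments = [
    (1, 1111), (1, 2222), (1, 3333),
    (2, 1111), (2, 2222), (2, 4444),
    (3, 1111), (3, 2222), (3, 3333), (3, 4444),
]
distances = jaccard_distances(enrollments)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

A distance of 0 means two sections have identical rosters (1111 and 2222 in the sample), and values near 1 mean almost no overlap.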


Re: Ranking co-enrolled course combinations

SamMichalowski
In reply to this post by SamMichalowski
Gene,

There are 40k enrollments across 1.7k courses.  So it's a pretty sizeable
job.  I certainly don't need all of the combinations, just those with the
highest loads.  I am going over the solutions posted yesterday now, with
bated breath.

SM
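
To get a feel for the size of the job and to keep only the heaviest combinations, a sketch along the following lines may help. It counts only the pairs that actually occur, reports how many increments the counting loop performs, and returns the n most frequent pairs overall. The function name top_pairs and the default n=10 are illustrative assumptions, and math.comb needs Python 3.8 or later.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import comb


def top_pairs(enrollments, n=10):
    """Return (work, pairs): counting effort and the n most frequent pairs."""
    sections_by_student = defaultdict(set)
    for student_id, section in enrollments:
        sections_by_student[student_id].add(section)

    # Each student with k sections contributes k*(k-1)/2 pair increments.
    work = sum(comb(len(s), 2) for s in sections_by_student.values())

    pair_counts = Counter()
    for sections in sections_by_student.values():
        pair_counts.update(combinations(sorted(sections), 2))
    return work, pair_counts.most_common(n)


# Sample data from the original post.
enrollments = [
    (1, 1111), (1, 2222), (1, 3333),
    (2, 1111), (2, 2222), (2, 4444),
    (3, 1111), (3, 2222), (3, 3333), (3, 4444),
]
work, pairs = top_pairs(enrollments, n=1)
print(work, pairs)
```

Taking 1.7k as exactly 1,700 for illustration, there are at most comb(1700, 2) = 1,444,150 distinct pairs, but only the pairs that some student actually holds are ever stored.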



On Wed, 13 Feb 2013 13:14:52 -0500, Maguin, Eugene <[hidden email]> wrote:

>Sam,
>
>I'm curious--but just curious. How many total courses and how many total students?
>
>Gene Maguin


Re: Ranking co-enrolled course combinations

David Marso
Administrator
I would suggest going with the MATRIX code rather than the MR TABLE approach.