Identify if column set is less than x and create columns
Posted by wsu_wright on Sep 17, 2022; 7:07pm
URL: http://spssx-discussion.165.s1.nabble.com/Identify-if-column-set-is-less-than-x-and-create-columns-tp5741057.html
Background:
I use SPSS (v28) to extract data on classes (some refer to these as class sections of a course) of courses offered by semester from our student information system. The SIS table is a non-repeating class section by semester by year oracle table with class information (type, method, size, number of students, etc.). I then pull data from another SIS table of instructors in which a class section may have several instructors; the instructor SIS table is a repeating instructor by class section by semester by year, the SIS allows up to 6 instructors for a class section. Once the instructor data are extracted I slot the instructor by class section by semester by year (CASESTOVARS), this give me a single record for a class section with all the instructors IDs list in columns. Finally, I join (match files) the instructor data to the class section data mentioned above, creating a single record for a class section within a given semester and year with class information and instructors in slotted columns. This table is uploaded daily using SPSS bulk insert into an Oracle database that our enterprising reporting tool pulls for reports. This is an automated spss syntax file that runs daily during the early morning hours.
The problem trying to solve:
While the SIS allows up to 6 instructors per class section, it’s possible that in any given daily upload there will be less than 6 max instructors across the class sections and thus less than 6 columns (via CASESTOVARS) which then causes an upload failure as the database is expecting 6 columns for instructors (whether they have data or nulls). So in the automated spss syntax file mentioned above, I need a syntax method to identify when I have less than 6 columns and if less than 6 columns exist to create the additional columns needed (these can via the String command and empty or null columns).
Listed below is the current logic (minimalized for display) that pulls the data from the two databases. Since this syntax is shared among other programmers whom I cannot assume they have python, python is not preferred.
dataset close all.
*obtain classection instructor data.
GET Capture ODBC
/CONNECT=########
/select instructorID"ID", term, classsection
from table X.
CACHE.
EXECUTE.
sort cases by term classsection.
CASESTOVARS /ID=term classsection /separator="" /GROUPBY=INDEX.
*prep for join.
sort cases by term classsection.
dataset name instructors.
*obtain class section data- non-repeating class section by term.
GET Capture ODBC
/CONNECT=#######
/select classsection, term, type, method, stu_count
from table Y.
CACHE.
EXECUTE.
*join instructor data.
sort cases by term classsection.
match files file=* /table=instructors /by term classsection.
dataset close instructors.
*identify column count and create missing nulled columns for upload.
???.
David, thanks for any suggestions.