Every year, Census releases new population estimates for the most recent year. The dataset they provide hard codes the year in the variable name (e.g., POPESTIMATE2013).
How can I extract the year element (e.g., 2013) from the variable name and put that value in a new variable, CensusYear? I can count on the year portion always being 4 characters (e.g., "2013") and always starting at the 12th character position (i.e., after "POPESTIMATE"). My understanding is that the char.substr function only extracts characters from data values, not variable names.
This would only make sense if the year
governs the entire dataset, but then what would be the point of putting
it in a new variable, where it would be a constant?
If you want to do it, though, it would require a few lines of Python code like this. It finds the POPESTIMATE<year> variable name, extracts the last four characters, and runs a compute on that value.

begin program.
import spss, spssaux
# Take the first variable named POPESTIMATE plus four digits and keep the digits.
yearpart = spssaux.VariableDict(pattern=r"POPESTIMATE\d\d\d\d").variables[0][-4:]
spss.Submit("""Compute CensusYear = %s""" % yearpart)
end program.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM

From: Fiveja <[hidden email]>
To: [hidden email]
Date: 05/04/2015 08:18 AM
Subject: [SPSSX-L] Extracting characters from variable name
Sent by: "SPSSX(r) Discussion" <[hidden email]>

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Extracting-characters-from-variable-name-tp5729482.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
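The name-matching step in Jon's program can be tried outside SPSS. The following is a minimal sketch in plain Python, assuming the variable names are already available as a list of strings. The function name extract_year and the sample names are hypothetical, and re.fullmatch only stands in for the pattern matching that spssaux.VariableDict does inside SPSS.

```python
import re


def extract_year(names, prefix="POPESTIMATE"):
    """Return the four-digit year from the first name of the form <prefix><year>.

    `names` is a plain list of variable names (an assumption for this sketch;
    inside SPSS the names would come from the active dataset).
    """
    pattern = re.compile(re.escape(prefix) + r"([0-9]{4})")
    for name in names:
        match = pattern.fullmatch(name)
        if match:
            return match.group(1)
    raise ValueError("no variable named %s<year> found" % prefix)


# Hypothetical variable names, for illustration only.
sample_names = ["STATE", "NAME", "POPESTIMATE2013"]
year = extract_year(sample_names)
print(year)  # prints 2013
```

Raising an error when nothing matches is a choice made for this sketch, so that a file without the expected variable fails loudly instead of producing a wrong constant.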
In reply to this post by Fiveja
You may already have considered and rejected this suggestion, but in case you have not:
If you have a single-year file with many variables, adding a year variable seems trivial to me: it is just a compute statement. The big problem is the rename operation. Somebody who knows Python (and I see that Jon has done so) will post code to strip the year digits from the variable names, and if you use Python, that is the way to go: a documented, reusable segment of code. The hard-work alternative is to use DISPLAY to list the variable names, use a text editor to find and replace the year, and then run a RENAME VARIABLES command to do the actual rename.

Gene Maguin
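The rename step Gene describes could also be generated rather than edited by hand. This is a rough sketch in plain Python, not code from the thread: build_rename and the sample names are hypothetical, and it assumes every year-stamped name ends in four digits, optionally preceded by an underscore. It only builds the RENAME VARIABLES text; inside SPSS that text would still have to be run, for example through spss.Submit.

```python
import re

# A name counts as year-stamped if it ends in four digits, optionally
# preceded by an underscore (an assumption about the naming scheme).
YEAR_SUFFIX = re.compile(r"([A-Za-z][A-Za-z0-9_]*?)_?([0-9]{4})")


def build_rename(names):
    """Return a RENAME VARIABLES command that strips the year suffix, or None."""
    old, new = [], []
    for name in names:
        match = YEAR_SUFFIX.fullmatch(name)
        if match:
            old.append(name)
            new.append(match.group(1))
    if not old:
        return None
    return "RENAME VARIABLES (%s = %s)." % (" ".join(old), " ".join(new))


# Hypothetical variable names, for illustration only.
sample_names = ["STATE", "POPESTIMATE2013", "NPOPCHG_2013"]
print(build_rename(sample_names))
# prints RENAME VARIABLES (POPESTIMATE2013 NPOPCHG_2013 = POPESTIMATE NPOPCHG).
```

The sketch does not check whether a stripped name collides with a variable that already exists, so that check would need to be added before using it on a real file.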
In reply to this post by Jon K Peck
Thank you, Jon. That worked. I had to add an execute statement for the transformation to run.
The year does govern the entire dataset. The reason for storing the year as a data value is that the cases for this year will later be appended to a file containing data for previous years (with variables for State, Population, CensusYear). This builds a historical data file of all years, which can be distinguished by CensusYear.
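The EXECUTE step mentioned above can be folded into the generated syntax. A minimal sketch, assuming the year has already been extracted as a string: build_year_syntax is a hypothetical helper, and inside SPSS the resulting text would be passed to spss.Submit as in Jon's program.

```python
import re


def build_year_syntax(year, target="CensusYear"):
    """Return SPSS syntax that stores `year` in `target` and forces a data pass."""
    year = str(year)
    if not re.fullmatch(r"[0-9]{4}", year):
        raise ValueError("expected a four-digit year, got %r" % year)
    # EXECUTE makes the pending COMPUTE run instead of waiting for a later procedure.
    return "COMPUTE %s = %s.\nEXECUTE." % (target, year)


syntax = build_year_syntax("2013")
print(syntax)
# prints:
# COMPUTE CensusYear = 2013.
# EXECUTE.
```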