I am asked to provide syntax to merge several datasets from a panel survey (same cases interviewed on several occasions). Variables and their names and labels are the same in all occasions, and there is an ID variable to match cases from one round to the next. In the resulting dataset I’d like the variables to have a suffix indicating the wave, so that a variable named VAR01 (existing in all rounds) should be renamed VAR01_2011_1 if corresponding to the first round in 2011, or VAR01_2011_2 if corresponding to the second round of that year, and so on. Up to now they are doing it with a long RENAME VARIABLES command, in which the whole detailed list of variables is included on both sides of the equal sign (there are about 150 variables). I think it should be possible to do it in a more elegant way, and I am sure I did it some time ago but somehow I cannot remember how. That is my question: how to add the same suffix (such as ‘_2011_1’) to the name of all variables, except for the ID variable which is the first one in the file. Thanks in advance. Hector |
Hi Hector,
have a look at this scripts on Ray's site: hope it helps, vlad
On Wed, Jul 18, 2012 at 10:48 PM, Hector Maletta <[hidden email]> wrote:
|
Dear Hector,
Alternatively, you can use a little Python block for this. It needs two tweeks: 1) after "exclude", you should specify (quoted) variable name(s) that you do not want to suffix and 2) after "suffix" you should specify the desired suffix. HTH, Ruben data list free/intnr gender age. begin data. end data. do repeat v=v1 to v10. compute v=0. end repeat. begin program. import spss,spssaux exclude=['intnr','gender','age'] suffix="_2011_1" vars=spssaux.GetVariableNamesList() for i in exclude: vars.remove(i) spss.Submit("rename variables ("+" ".join(vars)+"="+" ".join([i+suffix for i in vars])+").") end program. Date: Wed, 18 Jul 2012 23:27:20 +0300 From: [hidden email] Subject: Re: Add a suffix to all variable names To: [hidden email] Hi Hector, have a look at this scripts on Ray's site: hope it helps, vlad On Wed, Jul 18, 2012 at 10:48 PM, Hector Maletta <[hidden email]> wrote:
|
In reply to this post by Hector Maletta
Hector Off the top of my head and completely untested, assuming all variables are contiguous in alphanumeric order, what happens if you do something like this for each file [where n is highest varno]? This keeps both variables so you can check. do repeat x = var01 to var<n> /y = var01_2011_1 to var01_2011_<n>. compute y = x. end repeat. *check. corr var01 with var01_2011_1. [or] rename var (var01 to var<n> = var01_2011_1 to var01_2011_<n>). John F Hall (Mr) Email: [hidden email] Website: www.surveyresearch.weebly.com From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Hector Maletta I am asked to provide syntax to merge several datasets from a panel survey (same cases interviewed on several occasions). Variables and their names and labels are the same in all occasions, and there is an ID variable to match cases from one round to the next. In the resulting dataset I’d like the variables to have a suffix indicating the wave, so that a variable named VAR01 (existing in all rounds) should be renamed VAR01_2011_1 if corresponding to the first round in 2011, or VAR01_2011_2 if corresponding to the second round of that year, and so on. Up to now they are doing it with a long RENAME VARIABLES command, in which the whole detailed list of variables is included on both sides of the equal sign (there are about 150 variables). I think it should be possible to do it in a more elegant way, and I am sure I did it some time ago but somehow I cannot remember how. That is my question: how to add the same suffix (such as ‘_2011_1’) to the name of all variables, except for the ID variable which is the first one in the file. Thanks in advance. Hector |
In reply to this post by Ruben Geert van den Berg
Hi,
an alternative approach:
data list free/intnr gender age.
begin data.
end data.
vector v (10).
begin program.
import spss,spssaux
exclude=['intnr','gender','age']
renames = ["(" + v + "=" + v + "_2011_1)" for v in spssaux.GetVariableNamesList() if v not in exclude]
spss.Submit("rename variables %s." % " ".join(renames))
end program.
Regards,
Albert-Jan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
Administrator
|
In reply to this post by vlad simion
Feel free to adapt this:
Sub Main Dim varnames() As String Dim StrSyntax As String varnames=objSpssApp.Documents.GetDataDoc(0).GetVariables (False) StrSyntax="RENAME VARIABLES (" For i=0 To UBound(varnames) StrSyntax=StrSyntax & varnames(i) & " " Next StrSyntax=StrSyntax & "=" For i=0 To UBound(varnames) StrSyntax=StrSyntax & varnames(i) & "_" & objSpssApp.ScriptParameter(0) & " " Next StrSyntax=StrSyntax & ")." objSpssApp.ExecuteCommands StrSyntax,False End Sub GET FILE='C:\Program Files\SPSS\1991 U.S. General Social Survey.sav'. SCRIPT 'C:\Users\David Marso\Desktop\SPSSCode\RenameSuffix.sbs' ("2011"). Generates: . RENAME VARIABLES (sex race region happy life sibs childs age educ paeduc maeduc speduc prestg80 occcat80 tax usintl obey popular thnkself workhard helpoth hlth1 hlth2 hlth3 hlth4 hlth5 hlth6 hlth7 hlth8 hlth9 work1 work2 work3 work4 work5 work6 work7 work8 work9 prob1 prob2 prob3 prob4 =sex_2011 race_2011 region_2011 happy_2011 life_2011 sibs_2011 childs_2011 age_2011 educ_2011 paeduc_2011 maeduc_2011 speduc_2011 prestg80_2011 occcat80_2011 tax_2011 usintl_2011 obey_2011 popular_2011 thnkself_2011 workhard_2011 helpoth_2011 hlth1_2011 hlth2_2011 hlth3_2011 hlth4_2011 hlth5_2011 hlth6_2011 hlth7_2011 hlth8_2011 hlth9_2011 work1_2011 work2_2011 work3_2011 work4_2011 work5_2011 work6_2011 work7_2011 work8_2011 work9_2011 prob1_2011 prob2_2011 prob3_2011 prob4_2011 ).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
WTF was I thinking? ;-))
Sub Main With objSpssApp.Documents.GetDataDoc(0) objSpssApp.ExecuteCommands "RENAME VARIABLES ("& Join(.GetVariables (False)) & "=" & _ Join(.GetVariables (False),"_" & objSpssApp.ScriptParameter(0) & " ") & ").",False End With End Sub ---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Albert-Jan Roskam
Here’s another alternative, I believe from Ray’s site (spsstools.net): *Rename variables – add the same suffix to a list of variables. define !rename (vlist = !charend('/') /suffix=!cmdend ) !do !vname !in (!vlist)
!let !nname = !concat(!vname, !suffix) rename variables (!vname = !nname). !doend !enddefine. !rename vlist =
VAR01 to VARN /suffix =
‘_2011_1’. From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of Albert-Jan Roskam Hi, an alternative approach:
data list free/intnr gender age. begin data. end data. vector v (10). begin program. import spss,spssaux exclude=['intnr','gender','age'] renames = ["(" + v + "=" + v + "_2011_1)" for v in spssaux.GetVariableNamesList() if v not in exclude] spss.Submit("rename variables %s." % " ".join(renames)) end program.
Regards, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From: Ruben van den Berg <[hidden email]> Dear Hector,
Alternatively, you can use a little Python block for this. It needs two tweeks: 1) after "exclude", you should specify (quoted) variable name(s) that you do not want to suffix and 2) after
"suffix" you should specify the desired suffix. HTH, Ruben data list free/intnr gender age. begin data. end data. do repeat v=v1 to v10. compute v=0. end repeat. begin program. import spss,spssaux exclude=['intnr','gender','age'] suffix="_2011_1" vars=spssaux.GetVariableNamesList() for i in exclude: vars.remove(i) spss.Submit("rename variables ("+" ".join(vars)+"="+" ".join([i+suffix for i in vars])+").") end program. Date: Wed, 18 Jul 2012 23:27:20 +0300 have a look at this scripts on Ray's site: hope it helps, vlad On Wed, Jul 18, 2012 at 10:48 PM, Hector Maletta <[hidden email]> wrote: I am asked to provide syntax to merge several datasets from a panel survey (same cases interviewed on several occasions). Variables and their names and labels are the same
in all occasions, and there is an ID variable to match cases from one round to the next. In the resulting dataset I’d like the variables to have a suffix indicating the wave, so that a variable named VAR01 (existing in all rounds) should be renamed VAR01_2011_1
if corresponding to the first round in 2011, or VAR01_2011_2 if corresponding to the second round of that year, and so on. Up to now they are doing it with a long RENAME VARIABLES command, in which the whole detailed list of variables is included on both sides of the equal sign (there are about
150 variables). I think it should be possible to do it in a more elegant way, and I am sure I did it some time ago but somehow I cannot remember how. That is my question: how to add the same suffix (such as ‘_2011_1’) to the name of all variables, except for the ID variable which is the first one in the file. Thanks in advance. Hector |
Administrator
|
VAR01 to VARN ???
MACRO can NOT resolve the TO on the !DO list processing :-((((( That's why Scripting is the solution. You will end up with RENAME VARIABLES (var01 = var01_2011_1). RENAME VARIABLES (TO = TO_2011_1). RENAME VARIABLES (varN = varN_2011_1).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Just meant insert the listing of the variables - VAR01, VAR02, etc. not use exactly what was written...sorry for confusion.
-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso Sent: Thursday, July 19, 2012 10:21 AM To: [hidden email] Subject: Re: Add a suffix to all variable names VAR01 to VARN ??? MACRO can NOT resolve the TO on the !DO list processing :-((((( That's why Scripting is the solution. You will end up with RENAME VARIABLES (var01 = var01_2011_1). RENAME VARIABLES (TO = TO_2011_1). RENAME VARIABLES (varN = varN_2011_1). Veena Nambiar wrote > > Here’s another alternative, I believe from Ray’s site (spsstools.net): > > *Rename variables – add the same suffix to a list of variables. > > define !rename (vlist = !charend('/') > /suffix=!cmdend ) > > !do !vname !in (!vlist) > !let !nname = !concat(!vname, !suffix) rename variables (!vname = > !nname). > !doend > !enddefine. > > !rename vlist = > VAR01 to VARN > /suffix = ‘_2011_1’. > > From: SPSSX(r) Discussion [mailto:SPSSX-L@.UGA] On Behalf Of > Albert-Jan Roskam > Sent: Thursday, July 19, 2012 2:11 AM > To: SPSSX-L@.UGA > Subject: Re: Add a suffix to all variable names > > Hi, > > an alternative approach: > > data list free/intnr gender age. > begin data. > end data. > vector v (10). > begin program. > import spss,spssaux > exclude=['intnr','gender','age'] > renames = ["(" + v + "=" + v + "_2011_1)" for v in > spssaux.GetVariableNamesList() if v not in exclude] > spss.Submit("rename variables %s." % " ".join(renames)) end program. > > > > Regards, > Albert-Jan > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > All right, but apart from the sanitation, the medicine, education, > wine, public order, irrigation, roads, a fresh water system, and > public health, what have the Romans ever done for us? > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > From: Ruben van den Berg <ruben_van_den_berg@> > To: SPSSX-L@.UGA > Sent: Thursday, July 19, 2012 7:41 AM > Subject: Re: [SPSSX-L] Add a suffix to all variable names > > Dear Hector, > > Alternatively, you can use a little Python block for this. It needs > two > tweeks: 1) after "exclude", you should specify (quoted) variable > name(s) that you do not want to suffix and 2) after "suffix" you > should specify the desired suffix. > > HTH, > > Ruben > > data list free/intnr gender age. > begin data. > end data. > > do repeat v=v1 to v10. > compute v=0. > end repeat. > > begin program. > import spss,spssaux > exclude=['intnr','gender','age'] > suffix="_2011_1" > vars=spssaux.GetVariableNamesList() > for i in exclude: > vars.remove(i) > spss.Submit("rename variables ("+" ".join(vars)+"="+" ".join([i+suffix > for i in vars])+").") end program. > > Date: Wed, 18 Jul 2012 23:27:20 +0300 > From: vlad.simion@<mailto:vlad.simion@> > Subject: Re: Add a suffix to all variable names > To: SPSSX-L@.UGA<mailto:SPSSX-L@.UGA> > > Hi Hector, > > have a look at this scripts on Ray's site: > > http://spsstools.net/Scripts/Labels/AddExtensionToAllVariableNames.txt > > hope it helps, > vlad > On Wed, Jul 18, 2012 at 10:48 PM, Hector Maletta > <hmaletta@.com<mailto:hmaletta@.com>> wrote: > I am asked to provide syntax to merge several datasets from a panel > survey (same cases interviewed on several occasions). Variables and > their names and labels are the same in all occasions, and there is an > ID variable to match cases from one round to the next. In the > resulting dataset I’d like the variables to have a suffix indicating > the wave, so that a variable named VAR01 (existing in all rounds) > should be renamed VAR01_2011_1 if corresponding to the first round in > 2011, or VAR01_2011_2 if corresponding to the second round of that year, and so on. > > Up to now they are doing it with a long RENAME VARIABLES command, in > which the whole detailed list of variables is included on both sides > of the equal sign (there are about 150 variables). I think it should > be possible to do it in a more elegant way, and I am sure I did it > some time ago but somehow I cannot remember how. > > That is my question: how to add the same suffix (such as ‘_2011_1’) to > the name of all variables, except for the ID variable which is the > first one in the file. > > Thanks in advance. > > Hector > -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Add-a-suffix-to-all-variable-names-tp5714298p5714318.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
I suppose, but that requires pasting the list into each call of the macro.
I would build something out which assumes the following. Pseudo code, don't have time to build the !QUOTE(!CONCATS(blah blah blah)) etc... Incoming Files have common root and end in 2011_1, 2011_2 etc . DEFINE Something (List !CMDEND). !DO !I !IN (!LIST) GET FILE !I . SCRIPT 'marso's script from before' (!I) RENAME VARIABLES (!CONCAT(ID,!I)=ID). SAVE OUTFILE !QUOTE(!CONCAT(blah blah blah)). !DOEND !ENDDEFINE Something List=2011_1 2011_2 etc....
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Hector Maletta
At 03:48 PM 7/18/2012, Hector Maletta wrote:
>I am asked to provide syntax to merge several datasets from a panel >survey (same cases interviewed on several occasions). Variables and >their names and labels are the same in all occasions, and there is >an ID variable to match cases from one round to the next. In the >resulting dataset I'd like the variables to have a suffix indicating >the wave, so that a variable named VAR01 (existing in all rounds) >should be renamed VAR01_2011_1 if corresponding to the first round >in 2011, or VAR01_2011_2 if corresponding to the second round of >that year, and so on. You can also do it, perhaps most easily, with CASESTOVARS. I would first catenate all the datasets, with an identifying variable named WAVE (say), with values "2011_1", etc., something like this (code not tested): ADD FILES /FILE=<input for 2011_1>/IN=IN_2011_1 /FILE=<input for 2011_2>/IN=IN_2011_2 ... /FILE=<input for 2012_4>/IN=IN_2012_4. STRING WAVE (A6). DO IF IN_2011_1. . COMPUTE WAVE="2011_1". ELSE IF IN_2011_2. . COMPUTE WAVE="2011_2". ... ELSE IF IN_2012_4. . COMPUTE WAVE="2012_4". END IF. SORT CASES BY ID WAVE. Now you have the data in 'long' organization, with one record for each case for each round. I'd strongly consider leaving it this way, but for the 'wide' organization you're looking for, something like this (again, untested, and I'm a bit rusty), CASESTOVARS /ID =ID /INDEX =WAVE /SEPARATOR="_" /GROUPBY =INDEX ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Thanks to everybody who responded. Very useful suggestions.
One remark: In Richard Ristow response, he refers to "the 'long' organization, with one record for each case for each round" and adds: "I'd strongly consider leaving it this way". In fact that would not do in my case. The point of a panel study (besides having the same sample interviewed on repeated occasions) is the ability to observe changes at the individual level. For instance, consider a turnover table showing vote intention in July crosstabulated with vote intention in October, to detect who changed and who stayed the same. This would be impossible if the records for October are added "below" the records for July, for in that case the connection between the two interviews of the same individual would be lost. For this, I guess, the "wide" organization, where all occasions are added side by side for each individual, is the logical choice. Hector -----Mensaje original----- De: Richard Ristow [mailto:[hidden email]] Enviado el: Saturday, July 21, 2012 16:01 Para: Hector Maletta; [hidden email] Asunto: Re: Add a suffix to all variable names At 03:48 PM 7/18/2012, Hector Maletta wrote: >I am asked to provide syntax to merge several datasets from a panel >survey (same cases interviewed on several occasions). Variables and >their names and labels are the same in all occasions, and there is an >ID variable to match cases from one round to the next. In the resulting >dataset I'd like the variables to have a suffix indicating the wave, so >that a variable named VAR01 (existing in all rounds) should be renamed >VAR01_2011_1 if corresponding to the first round in 2011, or >VAR01_2011_2 if corresponding to the second round of that year, and so >on. You can also do it, perhaps most easily, with CASESTOVARS. I would first catenate all the datasets, with an identifying variable named WAVE (say), with values "2011_1", etc., something like this (code not tested): ADD FILES /FILE=<input for 2011_1>/IN=IN_2011_1 /FILE=<input for 2011_2>/IN=IN_2011_2 ... /FILE=<input for 2012_4>/IN=IN_2012_4. STRING WAVE (A6). DO IF IN_2011_1. . COMPUTE WAVE="2011_1". ELSE IF IN_2011_2. . COMPUTE WAVE="2011_2". ... ELSE IF IN_2012_4. . COMPUTE WAVE="2012_4". END IF. SORT CASES BY ID WAVE. Now you have the data in 'long' organization, with one record for each case for each round. I'd strongly consider leaving it this way, but for the 'wide' organization you're looking for, something like this (again, untested, and I'm a bit rusty), CASESTOVARS /ID =ID /INDEX =WAVE /SEPARATOR="_" /GROUPBY =INDEX ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
But note that some of the newer procedures *do* require a long file format even when there are repeated (or correlated) measures. When analyzing repeated measures via MIXED or GENLIN (with GEE), you need a long file format.
HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by Hector Maletta
At 06:06 PM 7/22/2012, Hector Maletta wrote:
>In Richard Ristow's response, he refers to "the 'long' organization, >with one record for each case for each round" and adds: "I'd >strongly consider leaving it this way". >The point of a panel study is to observe changes at the individual >level. For instance, a turnover table showing vote intention in July >crosstabulated with vote intention in October, to detect who changed >and who stayed the same, would be impossible if the records for >October are added "below" the records for July I recommended >CASESTOVARS > /ID =ID > /INDEX =WAVE > /SEPARATOR="_" > /GROUPBY =INDEX. That places all variables from the first wave together, then all variables from the second wave, etc. For the purpose Hector outlines, you might consider, instead, CASESTOVARS /ID =ID /INDEX =WAVE /SEPARATOR="_" /GROUPBY =VARIABLE. That groups all values of any quantity, like vote intention, from all waves, together in the resulting file. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Free forum by Nabble | Edit this page |