|
Dear all,
I've the weirdest problem: I'm running a syntax file with 174 DELETE VARIABLES commands. The datafile contains 3057 cases and 638 variables. At first everything goes fine, but after some 10 to 15 lines of DELETE VARIABLES, SPSS slows down incredibly and seems to choke completely (right now, it has been running for about an hour). The problem occurs on my PC (V14) as well as on my laptop (V15).

Also, the syntax is pasted into my viewer bit by bit: at first some 10-15 lines are pasted, then some 5 minutes later another 3 or 4 lines or so. And then it either stops completely or slows down so badly that any progress becomes invisible.

My PC and laptop 'choke' at different points in the syntax. Therefore, I've the feeling the problem is located within my data file. Also, a different syntax file with the same structure (only DELETE VARIABLES commands, but different ones) yields the same symptoms.

Does anybody have a clue what the problem may be? Is there any way to check whether the datafile is 'corrupted'? Does anybody have any similar experience?

All suggestions are more than welcome!
TIA!
Ruben van den Berg
|
Hello Ruben,
Have you tried to delete the variables using only
one single syntax command yet? For example, DELETE VARIABLES var1,
var2, var3,...,varx or DELETE VARIABLES var1 to varx. The variables don't
necessarily have to be deleted using 174 DELETE VARIABLES commands. I don't
know whether this will resolve the problem but you can always try. Good
luck!
Joost van Ginkel
Joost R. Van Ginkel, PhD
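A minimal sketch of the single-command idea, written as standalone Python 3 that only builds the syntax text (it does not talk to SPSS). The function name `delete_variables_command` and the wrapping width are assumptions made for illustration; the variable names are placeholders.

```python
import textwrap


def delete_variables_command(names, width=72):
    """Build ONE DELETE VARIABLES command that covers every name in `names`.

    Continuation lines are indented and the command ends with a period.
    """
    if not names:
        raise ValueError("need at least one variable name")
    body = textwrap.fill(
        " ".join(names),
        width=width,
        initial_indent="  ",
        subsequent_indent="  ",
        break_long_words=False,   # never split a variable name
        break_on_hyphens=False,
    )
    return "DELETE VARIABLES\n" + body + "."


if __name__ == "__main__":
    print(delete_variables_command(["var1", "var2", "var3"]))
```

The generated text can be pasted into a syntax window. Note that a single command of this kind fails as a whole if any listed variable does not exist.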
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Ruben van den Berg
Sent: 27 August 2009 10:51
To: [hidden email]
Subject: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
**********************************************************************
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager.
**********************************************************************
|
|
The other alternative might be to save the file and use either KEEP or DROP and I think that I have seen somewhere in the list you could use:
SAVE OUTFILE = *
/ DROP = var1, var3, etc. /
Best Wishes
John S. Lemon
Student Liaison Officer
Directorate of Information Technology (DIT) - University of Aberdeen
Edward Wright Building: Room G51
Tel: +44 1224 273350 Fax: +44 1224 273372
|
In reply to this post by Ruben Geert van den Berg
Hi!
This is a problem I have seen a lot. The best way to solve it is to save the data set with SAVE OUTFILE and include a DROP or KEEP subcommand.
SAVE OUTFILE='hh.sav'
 /DROP=VAR1 VAR2 VAR3
 /COMPRESSED.
All the best
--
Wilhelm Landerholm
Queue/STATB, BOX 92, 162 12 Vallingby, Sweden
+46-735-460000
http://www.qsweden.com http://www.statb.com
QUEUE - your partner in data analysis, data modeling and data mining. STATB - your Research agency.
|
In reply to this post by Ruben Geert van den Berg
Hi Ruben,
did you try the old fashioned method of kicking out variables
using the MATCH FILES command?
MATCH FILES
/FILE = *
/DROP variablelist .
EXECUTE .
This deletes the named variables in one data
pass.
HTH
Best regards
Georg Maubach
|
Dear Joost, John, Wilhelm, Georg (and other LISTers of course),
I've two overlapping lists of variables I'd like to delete. At first I tried MATCH FILES with a DROP subcommand. However, on the second run, MATCH FILES is asked to DROP variables that have already been dropped during the first run, causing the entire MATCH FILES command to break down.

I guess the same will happen with a single DELETE VARIABLES command per list: during the first run all variables are present, so it should work fine. But some variables that are removed during the first run appear again in the second DELETE VARIABLES command, causing the entire command to break down. I thought separate DEL VAR commands would yield many errors but, more importantly, also the desired result.

I really don't see why SPSS has to choke on such basic syntax. But since it does, I'll work around the problem by merging and deduplicating (ADD CASES, then AGGREGATE) my two lists of variables to be removed and writing a MATCH FILES command around the result. It will definitely work!

Thanks for your recommendations, all!
Ruben van den Berg

> Date: Thu, 27 Aug 2009 12:16:26 +0200
> Subject: AW: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> Hi Ruben, did you try the old fashioned method of kicking out variables using the MATCH FILES command? [...]
|
Dear Wilhelm and other listers,
I'm sorry about the confusion. I've two SPSS datafiles, each containing a column with variable names and a column 'remove' (whether to remove this variable or not). I first 'SELECT CASES' (i.e. variables) with Remove EQ 1 (should be removed), generating two lists of variables to be removed. To make syntax with the separate DELETE VARIABLES commands, I first used something like
STR syn(A200).
COMP syn=CON("DEL VAR ",RTR(Var1),".").
EXE.
SAV TRA OUT 'Strip_1.sps' /TYP TAB /REP /KEE Syn.
But alternatively, I can stack the two variable lists in a single column with ADD CASES and then AGGREGATE the result with the variable names as the break variable, in order to obtain a single list of variables to be removed in which no variable name appears twice. Then the MATCH FILES syntax can be made with
DO IF $CASENUM=1.
WRI OUT '$Var_sel_2.sps'/'MATC FIL FIL *'.
WRI OUT '$Var_sel_2.sps'/' ','DRO '.
END IF.
WRI OUT '$Var_sel_2.sps'/' 'Var1.
EXE.
Fortunately, this second approach worked fine. Marta G.G. taught me this last bit of syntax, by the way.
Kind regards,
Ruben van den Berg

> Date: Thu, 27 Aug 2009 14:16:30 +0200
> Subject: Re: AW: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> I do not understand you, Ruben. ADD CASES is a question about CASES (rows). DEL VAR is a question about VARIABLES (columns). If you want, you can send your syntax to me and I can have a look at it.
> All the best
> \WL
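The merge-and-deduplicate workaround described above can also be sketched outside SPSS. The following standalone Python 3 example is only an illustration under stated assumptions: the two lists are already available as Python lists, the helper names `merged_drop_list` and `match_files_drop` are hypothetical, and names are compared case-insensitively because SPSS does not distinguish variable names by case.

```python
def merged_drop_list(*lists):
    """Stack several name lists and keep the first spelling of each name."""
    seen = set()
    merged = []
    for names in lists:
        for name in names:
            cleaned = name.strip()
            key = cleaned.lower()      # case-insensitive comparison
            if key and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged


def match_files_drop(names):
    """Build one MATCH FILES /DROP command, one variable name per line."""
    if not names:
        raise ValueError("need at least one variable name")
    lines = ["MATCH FILES", "  /FILE=*", "  /DROP="]
    lines += ["    " + name for name in names]
    lines[-1] += "."                   # command terminator
    lines.append("EXECUTE.")
    return "\n".join(lines)


if __name__ == "__main__":
    list_1 = ["q1", "q2", "q3"]        # placeholder names
    list_2 = ["Q2", "q4"]
    print(match_files_drop(merged_drop_list(list_1, list_2)))
```

Because every name occurs only once in the merged list, the single DROP no longer trips over a variable that an earlier step has already removed.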
|
In reply to this post by Ruben Geert van den Berg
I want to yell out questions from the back bench.
To Ruben: Why are you trying to delete variables that no longer exist? (Although, hasn't everybody who has used DELETE VARIABLES noticed that the names of the deleted variables are simply 'grayed out' in the data file rather than removed as the data is removed? This seems pretty odd to me.)
To SPSS: When somebody tries to delete an already deleted variable, why doesn't the program generate an error? In theory the variables no longer exist, and I'd assume that the command processor checks the specified variable names against the variable name vector before commencing the command. Or do the names still exist somewhere?
Gene Maguin
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
Dear Gene (and other LISTers of course),
Even if I could, I wouldn't want to delete any variables that no longer exist. I just want to delete a number of variables that are not relevant. Unfortunately, the irrelevant variable names reside in two data sources which overlap.
I'm not sure which version you use, but neither in my V14 nor in my V15 are deleted variables greyed out; they really disappear, like magic!
If I convert my two lists of irrelevant variable names into two syntax files containing a separate DEL VAR command for each variable, duplicate DEL VAR commands generate error messages, but every variable I'd like to remove will nevertheless be removed from the data file. That's why I tried that first. In theory this was the fastest way to achieve my goal and it should work fine, but in practice SPSS just got stuck.
Fortunately, I've already worked around the problem.
Kind regards,
Ruben van den Berg
P.S. Difficulties like these may seem strange to some people, but I often have to work with large, messy datafiles, and cleaning these files up is actually quite a bit of work sometimes.

> Date: Thu, 27 Aug 2009 09:43:54 -0400
> Subject: Re: AW: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> I want to yell out questions from the back bench. [...]
|
|
In reply to this post by Joost van Ginkel
Joost may have put his finger on the problem here. I suspect that having 174 separate DELETE VARIABLES commands is akin to having 174 COMPUTE statements, each with its own EXECUTE. For the COMPUTE example, things certainly get done a lot more quickly if you have just one EXECUTE at the end. The most graphic example of this I've ever witnessed was taking a colleague's job that needed to run overnight (due to an EXECUTE for every COMPUTE), and reducing it to about 18 seconds run time, IIRC. I believe the response that came back was "Whoosh!" ;-)
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
|
Hello Bruce,
It sounds reasonable, but of course we can't be sure until my suggestion has been tried out. (Un)fortunately, I'm currently not faced with the problem of deleting 174 variables at once, so the only one who can find out is Ruben himself ;)
Joost
Joost R. Van Ginkel, PhD
Leiden University
Faculty of Social and Behavioural Sciences
Data Theory Group
PO Box 9555
2300 RB Leiden
The Netherlands
Tel: +31-(0)71-527 3620
Fax: +31-(0)71-527 1721
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bruce Weaver
Sent: 27 August 2009 17:58
To: [hidden email]
Subject: Re: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> Joost may have put his finger on the problem here. [...]
|
Dear Bruce and Joost,
I'm not sure whether I find this entirely plausible.
- The datafile is rather small: say, 3000 cases and 650 variables. If I select some 5 lines of DEL VAR and run them, it takes about one second. So then why can't I run 174 lines in 34.8 seconds? They are separate commands after all.
- If I run all lines (CTRL + A, CTRL + R), the first 10 or 15 lines run perfectly well and fast. Then things slow down and eventually break down. In theory, I'd expect the execution of such syntax to speed up instead of slow down, since the list of variables to be scanned gets shorter after every command.
- DEL VAR doesn't require a data pass, or does it? I'd expect the number of variables to be positively related to the processing time, but not the number of cases.
One experiment still to be carried out is constructing a comparable syntax file based on a different (but roughly comparable) .sav file in order to see whether the same symptoms occur. I'll let you know the outcome.
I appreciate the discussion. Have a nice weekend!

> Date: Fri, 28 Aug 2009 09:26:49 +0200
> Subject: Re: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> Hello Bruce, It sounds reasonable but of course we can't be sure until my suggestion has been tried out. [...]
|
In reply to this post by Joost van Ginkel
Dear Joost, Bruce and others,
I replicated the problem with a different data file, so apparently the data file is not the problem. 10 lines of DEL VAR run like a razor, but when I tried to run 15 lines, SPSS started to 'choke' after running line 8. Strange but true.
Also, Joost was right: a single DELETE VARIABLES for all variables does work, but it is slightly more fragile since the entire command breaks down as soon as any error is encountered.
Kind regards,
Ruben van den Berg
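One way to make a single command less fragile is to filter the deletion list against the names that actually exist before building the command. The sketch below is standalone Python 3 and assumes the existing variable names have been obtained by some other means (for example, exported from the data dictionary); the helper name `split_existing` is hypothetical.

```python
def split_existing(requested, existing):
    """Split `requested` into names that exist and names that do not.

    Comparison is case-insensitive; repeated requests are kept only once.
    Returns (to_delete, skipped), both in the order first requested.
    """
    known = {name.lower() for name in existing}
    seen = set()
    to_delete, skipped = [], []
    for name in requested:
        key = name.lower()
        if key in seen:
            continue                   # already handled this name
        seen.add(key)
        if key in known:
            to_delete.append(name)
        else:
            skipped.append(name)
    return to_delete, skipped


if __name__ == "__main__":
    to_delete, skipped = split_existing(["a", "B", "zz", "a"], ["A", "b", "c"])
    print("delete:", to_delete)
    print("skipped (not in file):", skipped)
```

Only the names in `to_delete` would then go into the single DELETE VARIABLES or MATCH FILES /DROP command.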
|
In reply to this post by Ruben Geert van den Berg
Hi Ruben:
Try inserting:
CACHE.
EXE.
every 20 DELETE commands or so.
HTH and happy weekend to you too,
Marta GG
--
For miscellaneous SPSS related statistical stuff, visit: http://gjyp.nl/marta/
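A rough sketch of how the CACHE / EXECUTE lines could be interleaved when the DELETE VARIABLES commands are generated by a script rather than typed by hand. This is standalone Python 3 that only produces syntax lines; the function name `delete_commands_with_cache` and its `every` parameter are assumptions for illustration, with the default of 20 taken from the suggestion above.

```python
def delete_commands_with_cache(names, every=20):
    """Return one DELETE VARIABLES line per name, with CACHE. and EXECUTE.
    inserted after every `every` delete commands."""
    if every < 1:
        raise ValueError("every must be at least 1")
    lines = []
    for count, name in enumerate(names, start=1):
        lines.append(f"DELETE VARIABLES {name}.")
        if count % every == 0:
            lines.append("CACHE.")
            lines.append("EXECUTE.")
    return lines


if __name__ == "__main__":
    # Placeholder names; a small interval is used to keep the output short.
    for line in delete_commands_with_cache(["v1", "v2", "v3", "v4", "v5"], every=2):
        print(line)
```

Keeping one command per variable preserves the behaviour Ruben wanted (a failing line does not stop the others), while the periodic CACHE and EXECUTE follow Marta's advice.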
|
In reply to this post by Ruben Geert van den Berg
Note that there is a function in the spssaux programmability
module named deleteVars that can do all the deletions with a single command AND
automatically eliminates variables that don't exist. Usage example:

begin program.
import spss, spssaux
spssaux.deleteVars("list-of-variable-names")
end program.

Requires the Python programmability module. If the list is very long, it could be written as

spssaux.deleteVars("""list line 1
list line2
....""")

HTH,
Jon Peck

From: SPSSX(r) Discussion
[mailto:[hidden email]] On Behalf Of Ruben van den Berg
Dear Joost, Bruce and others, |
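The single-command idea from Joost and Jon can be sketched in plain Python, so that it runs without SPSS installed. This is only an illustration under stated assumptions: the helper name build_delete_command and the example variable names are hypothetical, the function only builds the syntax text, and it imitates the "ignore names that don't exist" behaviour Jon describes by filtering against a list of existing names that you supply yourself.

```python
def build_delete_command(to_delete, existing, per_line=8):
    """Return ONE DELETE VARIABLES command for the names that exist.

    Hypothetical helper, not part of spss or spssaux.
    SPSS variable names are case-insensitive, so matching ignores case.
    Returns an empty string when nothing is left to delete.
    """
    existing_lower = {name.lower() for name in existing}
    seen = set()
    keep = []
    for name in to_delete:
        key = name.lower()
        # Skip unknown names and repeated names.
        if key in existing_lower and key not in seen:
            seen.add(key)
            keep.append(name)
    if not keep:
        return ""
    # Wrap the variable list over several lines for readability.
    chunks = [" ".join(keep[i:i + per_line])
              for i in range(0, len(keep), per_line)]
    return "DELETE VARIABLES\n  " + "\n  ".join(chunks) + "."


if __name__ == "__main__":
    existing = ["ID"] + ["Var%d" % i for i in range(1, 11)]
    print(build_delete_command(["Var2", "var5", "Nope", "Var2"], existing))
```

Inside a BEGIN PROGRAM block the resulting string could presumably be handed to spss.Submit, but that step is not shown or tested here.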
|
In reply to this post by Marta Garcia-Granero
Dear Marta,
Thanks, inserting CACHE. EXE. makes a huge difference! The syntax with CACHE. EXE. finished in reasonable time while the same syntax without it didn't finish at all. So a happy weekend after all ;-)

Best regards,
Ruben van den Berg

P.S. In the unlikely case anybody would like to replicate the findings from this experiment, see below for the syntax. It's a dirty job, but SPSS has got to do it.

***Syntax with CACHE.EXE.
SET SEED 12345.
SET TVA NAM.
DATAS CLO ALL.
INP PRO.
LOOP ID=1 to 1000.
END CAS.
END LOOP.
END FIL.
END INP PRO.
DO REP Vars=Var1 to Var1500.
COMP Vars=RV.BER(.5).
END REP.
DATAS NAM Data.
* OMS.
DATASET DECLARE Des.
OMS /SELECT TABLES /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics'] /DESTINATION FORMAT=SAV NUMBERED=TableNumber_ OUTFILE='Des'.
DES ALL.
OMSEND.
DATAS ACT Des.
COMP Delete=RV.BER(.5).
IF Var1 EQ "ID" Delete=0.
SEL IF Delete=1 AND Var1 NE "Valid N (listwise)".
COMP Temp=MOD($casenum,8).
STR Temp1(A10).
STR Temp2(A10).
DO IF Temp=0.
COMP Temp1="EXE.".
COMP Temp2="CACHE.".
END IF.
MATC FIL FIL * /KEE Var1 Temp1 Temp2.
VARSTOCASES /MAKE Syn FROM Var1 Temp1 Temp2.
STR Syn2(A100).
DO IF ( COMP Syn=CON("DEL VAR ",RTR(Syn),"." ).
END IF.
SAV TRA OUT 'C:\Temp\Delvars_a.sps' /TYP TAB /REP.
DATAS ACT Data.
DATAS CLO Des.
INS FIL='C:\Temp\Delvars_a.sps'.
EXE.

***Syntax without CACHE.EXE.
SET SEED 12345.
SET TVA NAM.
DATAS CLO ALL.
INP PRO.
LOOP ID=1 to 1000.
END CAS.
END LOOP.
END FIL.
END INP PRO.
DO REP Vars=Var1 to Var1500.
COMP Vars=RV.BER(.5).
END REP.
DATAS NAM Data.
* OMS.
DATASET DECLARE Des.
OMS /SELECT TABLES /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics'] /DESTINATION FORMAT=SAV NUMBERED=TableNumber_ OUTFILE='Des'.
DES ALL.
OMSEND.
DATAS ACT Des.
COMP Delete=RV.BER(.5).
IF Var1 EQ "ID" Delete=0.
SEL IF Delete=1 AND Var1 NE "Valid N (listwise)".
STR Syn(A200).
COMP Syn=CON("DEL VAR ",RTR(Var1),"." ).
SAV TRA OUT 'C:\Temp\Delvars_b.sps' /TYP TAB /KEE Syn /REP.
DATAS ACT Data.
DATAS CLO Des.
INS FIL='C:\Temp\Delvars_b.sps'.
EXE.
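For anyone who prefers to generate such a file outside SPSS, here is a minimal Python sketch of the same idea: one DELETE VARIABLES line per variable, with CACHE. and EXE. inserted at a fixed interval (the experiment above uses blocks of 8; Marta suggested roughly every 20). The function name delete_syntax_with_cache and the demo variable names are hypothetical, and the sketch only produces text; it makes no claim about how fast SPSS will run the result.

```python
def delete_syntax_with_cache(names, every=20):
    """Return syntax lines: one DELETE VARIABLES command per name,
    followed by CACHE. and EXE. after every `every` deletions.

    Hypothetical helper for illustration only.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    lines = []
    for count, name in enumerate(names, start=1):
        lines.append("DELETE VARIABLES %s." % name)
        if count % every == 0:
            # Periodic CACHE. + EXE. as suggested in the thread.
            lines.append("CACHE.")
            lines.append("EXE.")
    return lines


if __name__ == "__main__":
    demo = ["Var%d" % i for i in range(1, 6)]
    print("\n".join(delete_syntax_with_cache(demo, every=2)))
```

The returned lines can be written to a .sps file and run with INSERT FILE, as in the experiment above.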
> Date: Fri, 28 Aug 2009 15:54:01 +0200
> From: [hidden email]
> Subject: Re: 'DELETE VARIABLES' impossibly slow - is my datafile 'corrupted'?
> To: [hidden email]
>
> Hi Ruben:
>
> Try inserting:
>
> CACHE.
> EXE.
>
> every 20 DELETE commands or so.
>
> HTH and happy weekend to you too,
> Marta GG
>
> Ruben van den Berg wrote:
> > Dear Bruce and Joost,
> >
> > I'm not sure whether I find this entirely plausible.
> >
> > -The datafile is rather small, say, 3000 cases and 650 variables. If I
> > select some 5 lines of DEL VAR and run them, it takes about one
> > second. So then why can't I run 174 lines in 34.8 seconds? They are
> > separate commands after all.
> >
> > -If I run all lines (CTRL + A, CTRL + R), the first 10 or 15 lines run
> > perfectly well and fast. Then things slow down and eventually break
> > down. In theory, I'd expect the execution of such syntax to *speed up*
> > instead of slow down since the list of variables to be scanned gets
> > shorter after every command.
> >
> > -DEL VAR doesn't require a data pass or does it? I'd expect the number
> > of variables to be positively related with the processing time but not
> > the number of cases.
> >
> > One experiment still to be carried out, is constructing a comparable
> > syntax file based on a different (but roughly comparable) .sav file in
> > order to see whether the same symptoms occur. I'll let you know the
> > outcome.
> >
> > I appreciate the discussion. Have a nice weekend!
> >
> > > Date: Fri, 28 Aug 2009 09:26:49 +0200
> > > From: [hidden email]
> > > Subject: Re: 'DELETE VARIABLES' impossibly slow - is my datafile
> > > 'corrupted'?
> > > To: [hidden email]
> > >
> > > Hello Bruce,
> > >
> > > It sounds reasonable but of course we can't be sure until my suggestion
> > > has been tried out. (Un)fortunately I'm currently not faced with a
> > > problem of deleting 174 variables at once so the only one who can find
> > > out is Ruben himself ;)
> > >
> > > Joost
>
> --
> For miscellaneous SPSS related statistical stuff, visit:
> http://gjyp.nl/marta/ |
|
In reply to this post by Ruben Geert van den Berg
I have not read the entire thread, but the
simplest way to do this is to include /Keep in the GET command (in other cases
the /Drop subcommand may be preferable). This limits the variables loaded
into the active file for the current job. For example:

GET File='yourfilename.sav'
  /Keep=var01 var10 var45.

produces an active file with just the 3
specified variables.

Dennis Deck

From: Ruben van den Berg [mailto:[hidden email]]
Dear Gene (and other LISTers of course), |
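When the list of variables to keep or drop is long, the GET command can be assembled programmatically. The Python sketch below is an illustration only: build_get_command is a hypothetical helper, it accepts either a keep list or a drop list (not both, purely to keep the example small), and it refuses file names containing a quote character rather than guessing at escaping rules.

```python
def build_get_command(path, keep=None, drop=None):
    """Return a GET FILE command with a /KEEP or a /DROP subcommand.

    Hypothetical helper: builds text only, does not talk to SPSS.
    """
    if "'" in path or '"' in path:
        raise ValueError("quotes in the file name are not handled here")
    if keep and drop:
        raise ValueError("pass either keep or drop, not both")
    lines = ["GET FILE='%s'" % path]
    if keep:
        lines.append("  /KEEP=" + " ".join(keep))
    if drop:
        lines.append("  /DROP=" + " ".join(drop))
    # A single period terminates the whole command.
    return "\n".join(lines) + "."


if __name__ == "__main__":
    print(build_get_command("yourfilename.sav",
                            keep=["var01", "var10", "var45"]))
```

With the keep list from Dennis's example this reproduces his command; passing drop instead would cover the 174-variable case from the original question without any DELETE VARIABLES commands.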
