|
No, I did not intend to leave that part of the file spec in there. I was cannibalizing. In my defense, I did say NCY.
Of course I would not want to routinely copy the text n times (here n = 38). That is a very old way to do it. I did say I thought there was a Python way to do it. I meant to say that the example I was posting showed what a generalized approach would be doing.
However, on a one-time run, copying and pasting, using a fixed font and vertically eyeballing, would be what we had to do many years ago. It would work if you double-checked the syntax and its results.
BTW, does the Python method have built-in randomization before splitting into subsets?
If I were actually doing this on a job I would have checked Developer Works, but I thought it was down for a while.
Also, I still wonder why the OP was splitting the file into separate files.
MAYBE something like the example would do what the OP needed.
MAYBE bootstrapping would be what the OP would want.
MAYBE the OP would not need to randomize before the split.
compute RandomOrder = uniform(2**31).
sort cases by RandomOrder.
* Subtract 1 first so cases 1 to 10000 get SubSet 1 and the last 10000 get SubSet 38.
compute SubSet = trunc(($casenum-1)/10000) + 1.
frequencies variables = SubSet.
split file by SubSet.
. . .
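The subset arithmetic is easy to get off by one, so it can be worth checking outside SPSS. A minimal sketch in plain Python, assuming 1-based case numbers (as with $casenum) and the 380,000-case file from the original question. The function names are made up for illustration; it compares truncating casenum/10000 directly with shifting to 0-based first.

```python
from collections import Counter

CHUNK = 10000      # cases per subset
N_CASES = 380000   # size of the file in the original question

def subset_trunc(casenum, chunk=CHUNK):
    """Label from truncating casenum / chunk directly (labels start at 0)."""
    return casenum // chunk

def subset_shifted(casenum, chunk=CHUNK):
    """Label from shifting to 0-based first, then adding 1 (labels start at 1)."""
    return (casenum - 1) // chunk + 1

# Count how many cases land in each label under both formulas.
trunc_counts = Counter(subset_trunc(c) for c in range(1, N_CASES + 1))
shifted_counts = Counter(subset_shifted(c) for c in range(1, N_CASES + 1))

print(len(trunc_counts), trunc_counts[0], trunc_counts[38])
print(len(shifted_counts), set(shifted_counts.values()))
```

Under these assumptions the direct truncation yields 39 labels (a first group of 9,999 cases and a last group of 1), while the shifted form yields 38 groups of 10,000.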
Art Kendall
Social Research Consultants
On 4/8/2013 10:30 AM, Jon K Peck wrote:
I don't think Art meant to have those GET commands in there. But putting that aside, this is an example of how NOT to do a task even though it would work.
It is painful to write all that code, and, worse, the chances of getting it exactly right are not great - boredom will set in long before that many XSAVE commands are written, so careful testing and code review are required. Finally, it is very specific to these particular numbers, so it doesn't make a good model for a general solution.
Generalization + correctness + pain reduction = Python
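To illustrate the point about generalization (this is not Jon's actual solution, which is not shown in the thread), here is a rough sketch of how plain Python could generate the repetitive DO IF / XSAVE block for any number of files. The function name and the path template are assumptions for illustration, and the sketch only builds the syntax text; it does not run it.

```python
def build_xsave_syntax(n_files, path_template):
    """Return SPSS syntax text with one DO IF / ELSE IF branch per output file."""
    lines = []
    for i in range(1, n_files + 1):
        # The first branch opens the DO IF structure; the rest are ELSE IF.
        keyword = "do if" if i == 1 else "else if"
        lines.append(f"{keyword} WhichFile eq {i}.")
        lines.append(f"xsave outfile = '{path_template.format(i)}'.")
    lines.append("end if.")
    lines.append("execute.")
    return "\n".join(lines)

# Hypothetical output location, modeled on the path used in the thread.
syntax = build_xsave_syntax(38, r"D:\sbec\tea ids\Master ID List subset {}.sav")
print(syntax.splitlines()[0])
print(syntax.splitlines()[-3])
```

Changing 38 to another number regenerates the whole block, so nothing has to be copied by hand. Inside SPSS with the Python integration plug-in, text like this would presumably be handed to spss.Submit, but that step is left out here.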
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
From: Art Kendall [hidden email]
To: [hidden email]
Date: 04/08/2013 08:16 AM
Subject: Re: [SPSSX-L] Dividing file into 10,000 case chunks
Sent by: "SPSSX(r) Discussion" [hidden email]
See the archive for writing out separate files. IIRC there is a Python method.
UNTESTED. Not sure if you need the +1 on the mod. You may not need to randomize the order of cases.
NCY -- NO Coffee Yet.
This is what you would want a macro or Python to do, or you can just write the 38 sets of XSAVEs.
compute RandomOrder = uniform(31**2).
sort cases by RandomOrder.
compute WhichFile = mod($casenum, 10000)+1.
do if WhichFile eq 1.
xsave outfile = 'j:Get file='D:\sbec\tea ids\Master ID List subset 1.sav'.
else if WhichFile eq 2.
xsave outfile = 'j:Get file='D:\sbec\tea ids\Master ID List subset 2.sav'.
else if WhichFile eq 3.
xsave outfile = 'j:Get file='D:\sbec\tea ids\Master ID List subset 3.sav'.
. . .
else if WhichFile eq 38.
xsave outfile = 'j:Get file='D:\sbec\tea ids\Master ID List subset 38.sav'.
else.
print /'oops WhichFile is ' WhichFile.
frequencies variables = WhichFile.
However, why are you doing this? There may be other approaches.
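On the "+1 on the mod" question: mod($casenum, k)+1 produces values from 1 to k, so the modulus decides how many distinct WhichFile values appear. A small plain-Python check, assuming 380,000 cases numbered from 1; the function name is hypothetical.

```python
from collections import Counter

N_CASES = 380000  # size of the file in the original question

def which_file(casenum, modulus):
    """Python equivalent of mod(casenum, modulus) + 1 for positive casenum."""
    return casenum % modulus + 1

# Distribution of WhichFile for the modulus in the posted syntax and for 38.
by_10000 = Counter(which_file(c, 10000) for c in range(1, N_CASES + 1))
by_38 = Counter(which_file(c, 38) for c in range(1, N_CASES + 1))

print(len(by_10000), set(by_10000.values()))
print(len(by_38), min(by_38), max(by_38), set(by_38.values()))
```

With a modulus of 10000 there are 10,000 distinct values of 38 cases each, so most cases would fall through to the ELSE branch. A modulus of 38 gives values 1 through 38 with 10,000 cases each, which appears to be what the 38 XSAVE branches expect.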
Art Kendall
Social Research Consultants
On 4/7/2013 8:55 PM, Van Overschelde, Jim [via SPSSX Discussion] wrote:
Hey folks,
I have tried for many hours to figure out how to write a macro to divide a 380,000 case file into 38 files with 10,000 cases.
My most recent attempt gives error: "A macro expansion required more storage than was available. Try running with more memory."
Suggestions for fixing this code or another method that should work would be greatly appreciated!!
Thanks,
Jim
DEFINE !Looper ()
!DO !i=1 !to 38.
Get file='D:\sbec\tea ids\Master ID List.sav'.
dataset name SSNList.
/* Define Ending point.*/
!let !temp=!blanks(0).
!do !cnt=1 !to !i
!Let !temp=!concat(!temp,!blanks(10000))
!doEnd.
!Let !EndNum=!Length(!temp).
/* Define start point.*/
!Let !j=!length(!substr(!blanks(!temp),9999)).
!Let !StartNum=!length(!concat(!blanks(!j),!blanks(1))).
Select if $casenum<=!StartNum & $casenum >=!EndNum.
SAVE TRANSLATE OUTFILE=!QUOTE(!CONCAT("d:\sbec\tea ids\newIDs\ID",!i,".txt"))
/TYPE=CSV
/MAP
/REPLACE
/CELLS=VALUES.
!DOEND.
!ENDDEFINE.
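For comparison, the start and end case numbers that the macro tries to build from string lengths are ordinary arithmetic in a language with integer math. A sketch in Python, assuming 1-based case numbers and that chunk i should cover cases (i-1)*10000+1 through i*10000. The helper names are hypothetical, and the output path pattern is copied from the macro.

```python
def chunk_bounds(i, chunk_size=10000):
    """Return (first_case, last_case) for the i-th chunk, counting from 1."""
    start = (i - 1) * chunk_size + 1
    end = i * chunk_size
    return start, end

def output_path(i):
    """Output file name for chunk i, following the pattern in the macro."""
    return rf"d:\sbec\tea ids\newIDs\ID{i}.txt"

# Show the first two chunks and the last one.
for i in (1, 2, 38):
    start, end = chunk_bounds(i)
    print(i, start, end, output_path(i))
```

Note that the lower bound is the smaller number: a case belongs to chunk i when its case number is at least start and at most end.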
|