Repeated file opening and saving

Repeated file opening and saving

Sjoerd van den Berg
Dear all,

A second plea for help from my side of the PC is due today.

This time, the reason is the generation of an extremely large syntax for
batch processing files.

As our machine spits out CSV files, and we do our analysis in SPSS, we
wondered if we could manage a "syntax shortening procedure".

Below is a classic example of repeatedly opening one file, saving it,
opening the next, saving that, and so on. As you can see, basically only the
number 0101 in /FILE = 'C:\CLAMS\Metabolism\0101.csv' changes: it becomes
0102 in the next block, 0103 in the one after that, and so on.

The problem is not that it takes a lot of time, just that the syntax we use
for this spans over 2000 lines before the real analysis part begins, which
does not bode well for researchers, unfortunately.

Could you give me a helping hand with this?

Thanks in advance,

Sjoerd

syntax now:

GET DATA  /TYPE = TXT
/FILE = 'C:\CLAMS\Metabolism\0101.csv'
/DELCASE = LINE
/DELIMITERS = " ;"
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 25
/IMPORTCASE = ALL
/VARIABLES =
Interval F2.1
Channel F4.2
Date EDATE10
Time TIME11.2
V5 1X
VO2_ml_kg_hr F4.2
VO2_in F5.2
VO2_out F5.2
Delta_vo2 F4.2
Acc_VO2_L F3.2
VCO2_ml_kg_hr F4.2
CO2_in F5.2
CO2_out F5.2
Delta_CO2 F5.2
Acc_CO2_L F3.2
RER F5.2
Heat F3.2
V18 4X
V19 7X
Feeding 4X
Accumulated_Feeding_gr F4.2
V22 5X
Accumulated_Drinking_ml F5.2
Total_X_Activity F3.2
V25 2X
Total_Z_Activity F1.0
.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

[the repeat starts here :) ]

SAVE OUTFILE='C:\CLAMS\Metabolism\0101.sav'
/COMPRESSED.

GET DATA  /TYPE = TXT
/FILE = 'C:\CLAMS\Metabolism\0102.csv'
/DELCASE = LINE
/DELIMITERS = " ;"
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 25
/IMPORTCASE = ALL
/VARIABLES =
Interval F2.1
Channel F4.2
Date EDATE10
Time TIME11.2
V5 1X
VO2_ml_kg_hr F4.2
VO2_in F5.2
VO2_out F5.2
Delta_vo2 F4.2
Acc_VO2_L F3.2
VCO2_ml_kg_hr F4.2
CO2_in F5.2
CO2_out F5.2
Delta_CO2 F5.2
Acc_CO2_L F3.2
RER F5.2
Heat F3.2
V18 4X
V19 7X
Feeding 4X
Accumulated_Feeding_gr F4.2
V22 5X
Accumulated_Drinking_ml F5.2
Total_X_Activity F3.2
V25 2X
Total_Z_Activity F1.0
.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

SAVE OUTFILE='C:\CLAMS\Metabolism\0102.sav'
/COMPRESSED.

Re: Repeated file opening and saving

Albert-Jan Roskam
Hi Sjoerd,

You're looking for a loop. Try this (untested!):

define mymacro (begin = !charend ('/') / end = !tokens(1))

set mprint = on.

!do !cnt = !begin !to !end
GET DATA  /TYPE = TXT
/FILE = !quote(!concat('C:\CLAMS\Metabolism\',!cnt,'.csv'))
/DELCASE = LINE
/DELIMITERS = " ;"
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 25
/IMPORTCASE = ALL
/VARIABLES =
Interval F2.1
Channel F4.2
Date EDATE10
Time TIME11.2
V5 1X
VO2_ml_kg_hr F4.2
VO2_in F5.2
VO2_out F5.2
Delta_vo2 F4.2
Acc_VO2_L F3.2
VCO2_ml_kg_hr F4.2
CO2_in F5.2
CO2_out F5.2
Delta_CO2 F5.2
Acc_CO2_L F3.2
RER F5.2
Heat F3.2
V18 4X
V19 7X
Feeding 4X
Accumulated_Feeding_gr F4.2
V22 5X
Accumulated_Drinking_ml F5.2
Total_X_Activity F3.2
V25 2X
Total_Z_Activity F1.0
.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

save outfile = !quote(!concat('C:\CLAMS\Metabolism\',!cnt,'.sav')).
!doend.
set mprint = off.
!enddefine.

mymacro begin = 0101 / end = 2050.
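
One thing worth checking with the macro above is whether the loop counter keeps the leading zero of names such as 0101, and whether every number between the first and last file actually exists. As a possible alternative, here is a minimal Python sketch that writes out the repeated syntax instead of looping inside SPSS. The names build_syntax and numbered_stems are made up for this illustration, and the variable list is abbreviated; the full list from the original post would go in its place.

```python
from pathlib import PureWindowsPath

# Command skeleton copied from the thread; {csv}, {sav} and {variables}
# are the only parts filled in per file.
TEMPLATE = """GET DATA  /TYPE = TXT
 /FILE = '{csv}'
 /DELCASE = LINE
 /DELIMITERS = " ;"
 /ARRANGEMENT = DELIMITED
 /FIRSTCASE = 25
 /IMPORTCASE = ALL
 /VARIABLES =
{variables}
 .
SAVE OUTFILE='{sav}'
 /COMPRESSED.
"""


def build_syntax(folder, stems, variables):
    """Return one GET DATA / SAVE block per file stem, e.g. '0101'."""
    base = PureWindowsPath(folder)
    blocks = []
    for stem in stems:
        blocks.append(TEMPLATE.format(
            csv=str(base / (stem + ".csv")),
            sav=str(base / (stem + ".sav")),
            variables=variables,
        ))
    return "\n".join(blocks)


def numbered_stems(first, last, width=4):
    """Zero-padded names for first..last inclusive: 101 becomes '0101'."""
    return [str(n).zfill(width) for n in range(first, last + 1)]


if __name__ == "__main__":
    # Abbreviated variable list (assumption: paste the full one here).
    variables = " Interval F2.1\n Channel F4.2"
    print(build_syntax(r"C:\CLAMS\Metabolism", numbered_stems(101, 103), variables))
```

The printed text can be saved as a syntax file and run in SPSS. If the file numbers are not contiguous, the list of stems could be taken from a directory listing rather than from numbered_stems.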

Cheers!!
Albert-Jan




~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did you know that 87.166253% of all statistics claim a precision of results that is not justified by the method employed? [HELMUT RICHTER]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Re: Repeated file opening and saving

Art Kendall-2
In reply to this post by Sjoerd van den Berg
One workaround is to include a variable in each input file that holds the
"0101", "0102", etc. You might call it something like source file, so you
can trace cases back during QA procedures. Put all the files you want to
concatenate in a single source directory. Create a result directory.
Then, if you are on a Windows system, use something similar to this
untested syntax. Other OSs will have some other way to do the same thing.

host command=['copy c:\CLAMS\Metabolism\*.csv c:\CLAMS\Metabolism\onebigfile\combined.csv'].

GET DATA  /TYPE = TXT
   /FILE = 'C:\CLAMS\Metabolism\onebigfile\combined.csv'
. . .
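
Note that copy joins the files whole, so the header lines of the second and later files end up in the middle of the combined file, and nothing in the data says which file a row came from. A minimal Python sketch of the same idea that handles both points is shown here. It assumes each file has 24 header lines (inferred from /FIRSTCASE = 25 in the original syntax) and uses ";" as the delimiter; the function name combine_csv is made up for this illustration.

```python
from pathlib import Path


def combine_csv(source_dir, combined_path, header_lines=24, delimiter=";"):
    """Append the data lines of every .csv in source_dir into one file.

    Each data line gets the file stem (e.g. '0101') prepended as an extra
    first field so rows can be traced back to their source file.
    Returns the number of data lines written.
    """
    combined_path = Path(combined_path)
    written = 0
    with combined_path.open("w") as out:
        for csv_file in sorted(Path(source_dir).glob("*.csv")):
            if csv_file.resolve() == combined_path.resolve():
                continue  # never read the output file back in
            with csv_file.open() as src:
                for line_no, line in enumerate(src, start=1):
                    if line_no <= header_lines or not line.strip():
                        continue  # skip the header block and blank lines
                    out.write(csv_file.stem + delimiter + line.rstrip("\n") + "\n")
                    written += 1
    return written
```

With a file combined this way, the GET DATA command would need /FIRSTCASE = 1 and one extra string variable at the start of the /VARIABLES list (for example a hypothetical SourceFile A4) to hold the file number.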



Another workaround is to have the program that writes the .csv data open
its output file with a parameter that means "append" and just append new
output to the old file.
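
For illustration only, this is what "append" looks like when the writing program is a script you control; whether the CLAMS software offers such an option is not established in this thread. The function name append_records is hypothetical.

```python
def append_records(path, lines):
    """Add lines to the end of path, creating the file if it does not exist."""
    # Mode "a" opens for appending: existing content is kept and new
    # output goes after it, unlike mode "w", which starts the file over.
    with open(path, "a") as out:
        for line in lines:
            out.write(line + "\n")
```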


Art Kendall
Social Research Consultants.
