problem with large datafile and macros

vlad simion
hi,

I'm using SPSS 13.

There aren't 25000 macros; there are 6 macros, which I call for each
variable in the dataset.
3 of them are for processing lists of variables and values. E.g., suppose I
have a multiple answer variable q1_1_1, q1_1_2, ..., q1_50_10: instead of
listing all these variables, I use a macro which expands the list (call of
the macro: !xpand2 q1_ dim1=[1 50] dim2=[1 10] sep=[,]). Or, for long lists
of values, instead of enumerating 1,2,3,...,100, I use a macro: !enum_to
values=[1 to 100] except=[50] sep=[,]
The other 3 macros are for validating each variable in terms of missing
values, value ranges, logical conditions and exclusivity conditions. Instead
of writing the specs for each of these checks, I've made a macro for each
type of variable (single answer, multiple answer and open-end) that
encapsulates them all in one.
E.g., suppose I have 5 single answer questions which I want to check
for missing values, ranges and logical conditions. I call a macro: !valid_sa
listvar=!xpand q 1 5 / skip=logical condition / vallist=!enum_to values=[1
to 15] except=[99] sep=[,].
These macros are in a macro library, which I use in every syntax file
via INSERT FILE.
The problem is that after many calls of these macros, SPSS crashes.

many thanks once again,

vlad




--
Vlad Simion
Data Analyst
Tel:      +40 720130611

Re: problem with large datafile and macros

Richard Ristow
At 04:33 AM 12/13/2006, vlad simion wrote:

>i'm using SPSS 13.

Well, as far as I know, that's as stable as any release.

>there aren't 25000 macros, there are 6 macros, which i call for each
>variable in the dataset.

No, I didn't figure there were 25,000 macros. But 25,000 macro CALLS is
a lot. 6*25,000=150,000 macro calls, is even more.

>3 of them are for processing lists of variables and values, e.g.
>suppose i have a multiple answer variable q1_1_1, q1_1_2, ...,
>q1_50_10, instead of listing all these variables, i use a macro which
>expands this list (call of the macro: !xpand2 q1_ dim1=[1 50]
>dim2=[1 10] sep=[,]), or long lists of values, instead of enumerating
>1,2,3,....,100, i use a macro: !enum_to values=[1 to 100] except=[50]
>sep=[,]
>
>the other 3 macros are for validating each variable in terms of
>missing values, value ranges, logical conditions and exclusivity
>conditions. instead of writing the specs for each of this process,
>i've made a macro for each type of variable(single answer, multiple
>answer and open-end), that encapsulate all in one.
>e.g. : suppose i have 5 single answer questions for which i want to
>check for missing values, ranges, logical conditions, i call a macro:
>!valid_sa listvar=!xpand q 1 5 / skip=logical condition /
>vallist=!enum_to values=[1 to 15] except=[99] sep=[,].
>
>the problem is that after many calls of these macros, SPSS crashes.

Well, generally the problem isn't the macros, but the code the
macros expand into. Macros just make it harder, because
. It's harder to look at the expanded code - at best, it's badly
formatted - so you may not do it carefully;
. With macros, you can generate a lot of code, more than it's easy to
write by hand, with correspondingly more opportunities for things to go
wrong.

You don't say anything about the logic of your macros. I assume you use
logic within transformation programs. Do you generate thousands of
transformation programs, or one transformation program with many
thousands of lines? SPSS is supposed to be able to handle either, but I
could imagine trouble either way.

What happens if you run for the first 250 variables, the first 10%? Or
for the first 25, the first 1%?

Of course, run with SET MPRINT ON. I suppose that'll be several tens of
thousands of lines of code, but it's necessary. You can use ECHO
statements in your macros, to have them note the beginnings of the code
they emit.
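
As a rough illustration of that suggestion, here is a minimal sketch in SPSS syntax. The macro name !chk_miss, its argument var, and the variable q1 are hypothetical and not part of the poster's library; the ECHO line follows the same !QUOTE(!CONCAT(...)) pattern the posted macros use.

```spss
* Show the expanded macro code in the output log.
SET MPRINT=ON.

* Hypothetical macro: the ECHO line marks where its emitted code begins.
DEFINE !chk_miss (var = !TOKENS(1))
ECHO !QUOTE(!CONCAT('>>> chk_miss begins for ', !var)).
COUNT nmiss = !var (MISSING).
!ENDDEFINE.

!chk_miss var = q1.
EXECUTE.
```

With a marker like this in each macro, the last marker printed before a crash should point at the call whose expansion was being processed.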

That, I'm afraid, is about as far as I can go without knowing anything
specific.

-Good luck to you,
  Richard Ristow

Re: problem with large datafile and macros

Richard Ristow
Reply to an off-list response.

At 03:37 AM 12/14/2006, vlad simion wrote:

>i took the liberty to write only to you because i have attached 2
>print screens with the errors that occurred and i know that on the
>forum it is not allowed to attach files, i hope it is not a problem.

In both cases, the errors are "assertion failures." Assertions are
debugging tools internal to a program; an assertion failure is, *ipso
facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
only if it's replicable in 15 would it (and should it) get attention.

-----------------------
To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
first message gives
-----------------------
Program: ...\spsswin.exe  [I've elided the path name]
files: Z:\cs_source\Datasource\src\dictnry.cpp
line: 183

expression:fpIterator
-----------------------
The second gives
-----------------------
Program: ...\spsswin.exe  [I've elided the path name]
files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
line: 200

expression:iInput <_my._xdsData._inputs.size()
-----------------------
Since this hasn't hit everybody, it's a fair guess these are (or this
is a) size-dependent bug, i.e. one that shows up only with input of a
certain size or complexity. They're quite common.


>i think it has something to do with the transformation programs, but
>i can't figure out what. i've managed to go a little further, but
>still... it crashes, giving the error: "an error occurred while
>attempting to write a transformation file"
>
>and here are the macros that i use:

Remarks follow. But this is a LOT of code; you should be debugging it,
and I'm not going to try a complete job. How many of the suggestions I
made in the last posting, have you applied?

And you haven't said a word about what each macro does. Nor put in any
annotations or comments in your definitions. Those are extremely
important for your understanding and quality control; and a minimal
courtesy, for anybody else you ask to look at your code.

>set mprint=on printback=on mexpand=on.
>
>define !enum_to(values= !enclose('[',']') /except= !enclose('[',']') /
>sep= !default ('') !enclose('[',']'))
>  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals= !null
>  !do !j= !head(!except) !to !tail(!tail(!except))
>  !let !exvals= !concat(!exvals,' ',!j)
>  !doend
>  !else !let !exvals=!except
>  !ifend
>  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
>  !do !i= !head(!values) !to !tail(!tail(!values))
>  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
>  !let !vals= !concat(!vals,' ',!i,!sep)
>  !ifend
>  !doend
>  !let !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)),
> 2)))
>  !let !vals= !concat(!head(!values),!tail(!vals))
>  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
> !concat(!vals,!tail(!tail(!values))) !ifend
>  !else !let !vals= !values
>  !ifend
>  !vals
>  !enddefine.
>*===========================================================.
I can't see what code this emits, with the effort I'm willing to put
out. It has a lot of looping. Does it call a lot of macros in those
loops, and hence emit, many times over, the code that they emit? If
so, which macros does it call?

>*===========================================================.
>define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos !tokens(1) /
>sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
>  !do !i=!2 !to !3
>  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
>  !else !concat(!1,!i, !unquote(!sufix),!sep)
>  !ifend
>  !doend
>  !enddefine.
>*===========================================================.
Again, I can't see what this does, except it seems to invoke macro
!sufix. To do what? (Yes, I could look, but saying what is an
elementary courtesy.)


>*===========================================================.
>define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
>dim2=!encl("[","]") / sufix=!default("_") !tokens(1) /sep=!default
>("") !enclose('(',')'))
>  !do !i=!head(!dim1) !to !tail(!dim1)
>  !do !j=!head(!dim2) !to !tail(!dim2)
>  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
> !concat(!1,!i,!unquote(!sufix),!j)
>  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
>  !ifend
>  !doend
>  !doend
>  !enddefine.
>*===========================================================.
This appears to be another looping macro, that calls other macros many
times. It looks like it's mostly !sufix.

Here's a macro that seems to be doing something:

>*===========================================================.
>define valid_sa (listvar=!charend('/') /  skip= !default ("None")
>!charend('/')  / vallist= !default ("") !cmdend )
>  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
> !conditie=!skip !ifend
>echo
>'--------------------------------------------------------------------------'.
>echo !quote(!concat(' Validating SA variables: ', !listvar)).
>echo
>'--------------------------------------------------------------------------'.
>echo 'VALIDATING MISSING VALUES'.
>string validerr(A200).
>  !do !oe !IN (!listvar)
>if (miss(!oe) & !conditie ) validerr=concat(ltrim(rtrim(validerr)),
>',', !quote(!oe)).
>  !doend
>if length(ltrim(rtrim(validerr)))>0
>validerr=substr(validerr,2,length(validerr)-1).
>echo !quote(!concat('ERROR: the following variables should NOT be
>missing unless SKIP condition: ', !skip)).
>exec.
>do if length(ltrim(rtrim(validerr)))>0.
>print / 'id=' id ' variables: ' validerr.
>end if.
>exec.
>del var validerr .
>string validerr(A200).
>  !do !oe !IN (!listvar)
>if (~miss(!oe) & ~!conditie ) validerr=concat(ltrim(rtrim(validerr)),
>',', !quote(!oe)).
>  !doend
>if length(ltrim(rtrim(validerr)))>0
>validerr=substr(validerr,2,length(validerr)-1).
>echo !quote(!concat('ERROR: The following variables SHOULD be missing
>for SKIP condition: ',!skip)).
>exec.
>do if length(ltrim(rtrim(validerr)))>0.
>print / 'id=' id ' variables: ' validerr.
>end if.
>exec.
>del var validerr .
>echo
>'--------------------------------------------------------------------------'.
>echo 'VALIDATING VALUES/RANGES'.
>string validerr(A200).
>  !do !oe !IN (!listvar)
>if ((~any(!oe,!vallist)) & !conditie )
>validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
>ltrim(rtrim(string(!oe , F15)))).
>  !doend
>if length(ltrim(rtrim(validerr)))>0
>validerr=substr(validerr,2,length(validerr)-1).
>echo !quote(!concat('ERROR: The following variables do not fit the
>requested Range for SKIP condition: ',!skip)).
>exec.
>do if length(ltrim(rtrim(validerr)))>0.
>print / 'id=' id ' variables: ' validerr.
>end if.
>exec.
>del var validerr .
>echo '----------------END VALIDATION
>--------------------------------------------'.
>  !enddefine.

Each of your tests (missing values and ranges) makes two passes through
the data, once to generate your error-message string 'validerr' for
each case, and once to print it. (It would be quite easy to have the
PRINT statements in the same transformation programs that do the tests.
Why don't you do it that way?)
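
A minimal sketch of what that single-pass version might look like, written out by hand rather than generated by a macro. The variables q1, q2 and id are assumed to exist in the active file, and the leading comma in the message is left in place for brevity.

```spss
* Sketch only: q1, q2 and id are assumed to exist in the active file.
STRING validerr (A200).
IF (MISSING(q1)) validerr = CONCAT(RTRIM(validerr), ',q1').
IF (MISSING(q2)) validerr = CONCAT(RTRIM(validerr), ',q2').
* Test and report in the same transformation program, then one data pass.
DO IF (validerr NE '').
PRINT / 'id=' id ' variables: ' validerr.
END IF.
EXECUTE.
```

Because the PRINT sits in the same transformation program as the IF statements, a single EXECUTE covers both the test and the report.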

You delete and re-declare 'validerr' each time for each new test:
>del var validerr .
>string validerr(A200).
Better simply to set it to blank, at the beginning of each set of tests
for a new record. This may easily be something that strains SPSS, if
you do it often.

And, you're generating your tests in macro loops. How long are the
resulting transformation programs? And how often is this macro called,
at four transformation programs each call?


>*===========================================================.
>*===========================================================.
>*===========================================================.
>define valid_oe (listvar=!charend('/') / skip=!default ("None")
>!cmdend)
>  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
> !conditie=!skip  !ifend
>echo
>'--------------------------------------------------------------------------'.
>echo !quote(!concat(' Validating OE variables: ', !listvar)).
>string validerr(A200).
>  !do !oe !IN (!listvar)
>if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
>validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>  !doend
>if length(ltrim(rtrim(validerr)))>0
>validerr=substr(validerr,2,length(validerr)-1).
>echo !quote(!concat('ERROR: the following variables should NOT be
>missing unless SKIP condition: ', !skip)).
>exec.
>do if length(ltrim(rtrim(validerr)))>0.
>print / 'id=' id ' variables: ' validerr.
>end if.
>exec.
>del var validerr .
>string validerr(A200).
>  !do !oe !IN (!listvar)
>if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
>validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>  !doend
>if length(ltrim(rtrim(validerr)))>0
>validerr=substr(validerr,2,length(validerr)-1).
>echo !quote(!concat('ERROR: The following variables SHOULD be missing
>for SKIP condition: ',!skip)).
>exec.
>do if length(ltrim(rtrim(validerr)))>0.
>print / 'id=' id ' variables: ' validerr.
>end if.
>exec.
>del var validerr .
>echo '----------------END VALIDATION OE
>--------------------------------------------'.
>  !enddefine.
>*===========================================================.
Same remarks as for previous.


>*===========================================================.
>*===========================================================.
>define valid_ma (listvar=!charend('/') /interval=!charend('/') / skip=
>!default ("None") !charend('/') / exclusiv= !default ("None") !cmdend)
>  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
> !conditie=!skip !ifend
>echo
>'-----------------------------------------------------------------------------------------------------------'.
>
>echo !quote(!concat(' Validating MA variables: ', !listvar)).
>comp validerr=0.
>count qq=!listvar  (!interval).
>if ((qq=0 ) & ~!conditie) validerr=1.
>exe.
>echo !quote(!concat('ERROR: the following variables should NOT be
>missing unless SKIP condition: ', !skip)).
>do if validerr>0.
>print / 'id=' id .
>end if.
>exe.
>del var validerr .
>echo
>'-----------------------------------------------------------------------------------------------------------'.
>if ((qq<>0 ) & (!conditie)) validerr=1.
>exe.
>echo !quote(!concat('ERROR: the following variables should be missing
>when NOT SKIP condition: ', !skip)).
>do if validerr>0.
>print / 'id=' id .
>end if.
>exe.
>del var validerr .
>echo
>'-----------------------------------------------------------------------------------------------------------'.
>
>  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
> !test=!exclusiv !ifend
>do if !test & qq>1.
>comp validerr=1.
>end if.
>  !let !flag=0
>do if validerr>0.
>echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY CONDITION'.
>print / 'id=' id .
>end if.
>exe.
>del var validerr .
>del var qq .
>echo '----------------------------END
>VALIDATION-----------------------------------------------------------------'.
>!enddefine.
>*===========================================================.
Same remarks again.


>*===========================================================.
>thank you very much and once again sorry for any trouble,

Good luck, and I think I've given you something to go on with.
Richard

Re: problem with large datafile and macros

Peck, Jon
This kind of discussion is nearly hopeless for SPSS to do anything about.  It is much better to work with SPSS Technical Support, which can check the bug database and ask the questions that developers need answered in order to track down the problem - and are in a position to track the problem and follow up.

The critical thing needed to fix a bug is reproducibility in our development setup and on a current version, where we have tools to probe into the cause.  Sometimes a detailed description is enough, but it is much better when we can get actual syntax or an exact description of gui actions and some (nonconfidential) data that show the problem.  Even this may not be enough if the failure depends on aspects of the user's hardware and software environment that may be hard to ferret out, but it is the best place to start.  Resource exhaustion problems such as running out of memory can be hard to reproduce exactly.  And there can be problems with operating system versions released after the corresponding SPSS version, which could obviously cause unanticipatable conditions.

It helps a lot if the problem can be narrowed down to the simplest failure case that can be isolated.  Assertions are great clues, but we generally need to know exactly what was happening when the assertion was generated.

No matter how much we test - and our testbed contains many thousands of jobs and scenarios run in many different computer hardware and software configurations, some things will be found only in the field.  They will usually - ideally always - be rare combinations of circumstances that require a lot of information to pin down.  So, again, the more you can work with Technical Support the better.

Regards,
Jon Peck
(who is not in Technical Support :-))


Re: problem with large datafile and macros

Richard Ristow
I want to agree with Jon here, though not to take back what I said.
(That is, an assertion failure is *ipso facto* a bug: that doesn't mean
it's traceable from the information given, especially in a pre-current
release.) No apologies for passing on the messages, though; they might
be useful, and if not can be ignored.

To repeat, and emphasize, something else I mentioned:

I'd start with dropping

.  del var validerr .

throughout. Retain

.  string validerr(A200).

where it's needed, i.e. the first pass through the data after GET FILE
or whatever; otherwise, replace it with

.  COMPUTE validerr = ''.
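
Put together, the pattern might look like the following sketch. The variable q1, the valid values 1, 2, 3 and the message texts are placeholders chosen for illustration; id is assumed to exist, as in the posted macros.

```spss
* Declare once, on the first pass after the file is opened.
STRING validerr (A200).

* First test.
IF (MISSING(q1)) validerr = 'q1'.
DO IF (validerr NE '').
PRINT / 'id=' id ' should not be missing: ' validerr.
END IF.
EXECUTE.

* Second test: blank the existing variable instead of deleting it.
COMPUTE validerr = ''.
IF (NOT ANY(q1, 1, 2, 3)) validerr = 'q1'.
DO IF (validerr NE '').
PRINT / 'id=' id ' out of range: ' validerr.
END IF.
EXECUTE.
```

The dictionary is changed only once this way, however many tests follow.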


Below, notes (Jon's) about what bug-chasing in a large program actually
entails. At 03:19 PM 12/14/2006, Peck, Jon wrote:

>The critical thing needed to fix a bug is reproducibility in our
>development setup and on a current version, where we have tools to
>probe into the cause.  Sometimes a detailed description is enough, but
>it is much better when we can get actual syntax or an exact
>description of gui actions and some (nonconfidential) data that show
>the problem.  Even this may not be enough if the failure depends on
>aspects of the user's hardware and software environment that may be
>hard to ferret out, but it is the best place to start.  Resource
>exhaustion problems such as running out of memory can be hard to
>reproduce exactly.  And there can be problems with operating system
>versions released after the corresponding SPSS version, which could
>obviously cause unanticipatable conditions.
>
>It helps a lot if the problem can be narrowed down to the simplest
>failure case that can be isolated.  Assertions are great clues, but we
>generally need to know exactly what was happening when the assertion
>was generated.
>
>No matter how much we test - and our testbed contains many thousands
>of jobs and scenarios run in many different computer hardware and
>software configurations, some things will be found only in the
>field.  They will usually - ideally always - be rare combinations of
>circumstances that require a lot of information to pin down.  So,
>again, the more you can work with Technical Support the better.

Re: problem with large datafile and macros

vlad simion
In reply to this post by Richard Ristow
thank you very much, Richard, for your time and kindness in looking into the
code I sent you, and sorry for not giving all the explanations about the
macros. I didn't mean to be disrespectful; it's just that I was under
pressure to resolve the issue and ... got carried away.

I have tried all the suggestions you gave me, but with no better results.
I have also installed the patches that SPSS provides, and still it does not
work.

About the macros: the first 3 of them are, as Ray Levesque calls them, "macro
gems", just some useful tools to ease my work. I call them in other
macros instead of listing a whole lot of numbers and variables.

The first one, !enum_to, expands a list of numbers, optionally separated by
some character (a semicolon or anything else). Instead of writing something
like 1;2;3;4;5;.....;100, I just call the macro: !enum_to values=[1 to
100] except=[99] sep=[;].
Here values is the numbers to be listed, except excludes the value 99 from
the list, and sep is the separator.

The second one, !xpand, is a macro from Ray's site which I have adapted to my
needs. It expands a list of variables separated by some character; sufix
is just for flagging the variables, it is not a call to another
macro. So, instead of listing the variables, I call !xpand q1 1
100 sufix=_ sep=, which results in something like: q1_1,q1_2,...,q1_100.

The third one, !xpand2, does the same thing as the second, but across 2
dimensions. Suppose I have a grid and I want to use its variables,
something like q1_1_1,q1_1_2,...,q1_1_15,q1_2_1,q1_2_2,....,q1_10_15:
instead of listing all the variables I call the macro !xpand2 q1_ dim1=[1
10] dim2=[1 15] sep=[,].
Here q1_ is the root of the variable names, dim1 is the first dimension and
dim2 the second; sufix is "_" by default, but it can be changed to any
other character, and sep is the separator.

I never use these 3 macros alone; I just call them in other
macros to get a list of variables or numbers, as needed.

The other 3 macros are for checking logical conditions on variables.
The number of transformation programs depends on the list of variables in
the macro call: if there is only one variable, yes, there are four
transformation programs; otherwise there are 4*number of variables, so the
code might end up being very long.

thank you very much once again and i hope i've been more explicit now :)

vlad




On 12/14/06, Richard Ristow <[hidden email]> wrote:

>
> Reply to an off-list response.
>
> At 03:37 AM 12/14/2006, vlad simion wrote:
>
> >i took the liberty to write only to you because i have attached 2
> >print screens with the errors occured and i know that on the forum it
> >is not allowed to attach files, i hope it is not a problem.
>
> In both cases, the errors are "assertion failures." Assertions are
> debugging tools internal to a program; an assertion failure is, *ipso
> facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
> only if it's replicable in 15 would it (and should it) get attention.
>
> -----------------------
> To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
> first message gives
> -----------------------
> Program: ...\spsswin.exe  [I've elided the path name]
> files: Z:\cs_source\Datasource\src\dictnry.cpp
> line: 183
>
> expression:fpIterator
> -----------------------
> The second gives
> -----------------------
> Program: ...\spsswin.exe  [I've elided the path name]
> files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
> line: 200
>
> expression:iInput <_my._xdsData._inputs.size()
> -----------------------
> Since this hasn't hit everybody, it's a fair guess these are (or this
> is a) size-dependent bug, i.e. one that shows up only with input of a
> certain size or complexity. They're quite common.
>
>
> >i think it has something to do about the transformations program, but
> >i can't figure out what, i've manage to go a little further, but
> >still... it crash giving error: "an error occured while atempting to
> >write a transformation file"
> >
> >and here are the macros that i use:
>
> Remarks follow. But this is a LOT of code; you should be debugging it,
> and I'm not going to try a complete job. How many of the suggestions I
> made in the last posting, have you applied?
>
> And you haven't said a word about what each macro does. Nor put in any
> annotations or comments in your definitions. Those are extremely
> important for your understanding and quality control; and a minimal
> courtesy, for anybody else you ask to look at your code.
>
> >set mprint=on printback=on mexpand=on.
> >
> >define !enum_to(values= !enclose('[',']') /except= !enclose('[',']') /
> >sep= !default ('') !enclose('[',']'))
> >  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals= !null
> >  !do !j= !head(!except) !to !tail(!tail(!except))
> >  !let !exvals= !concat(!exvals,' ',!j)
> >  !doend
> >  !else !let !exvals=!except
> >  !ifend
> >  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
> >  !do !i= !head(!values) !to !tail(!tail(!values))
> >  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
> >  !let !vals= !concat(!vals,' ',!i,!sep)
> >  !ifend
> >  !doend
> >  !let !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)),
> > 2)))
> >  !let !vals= !concat(!head(!values),!tail(!vals))
> >  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
> > !concat(!vals,!tail(!tail(!values))) !ifend
> >  !else !let !vals= !values
> >  !ifend
> >  !vals
> >  !enddefine.
> >*===========================================================.
> I can't see what code this emits, with the effort I'm willing to put
> out. It has lot of looping. Does it call a lot of macros in those
> loops, and hence emit, many times, the code that they emit?  a lot of
> macro calls? If so, which macros does it call?
>
> >*===========================================================.
> >define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos !tokens(1) /
> >sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
> >  !do !i=!2 !to !3
> >  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
> >  !else !concat(!1,!i, !unquote(!sufix),!sep)
> >  !ifend
> >  !doend
> >  !enddefine.
> >*===========================================================.
> Again, I can't see what this does, except it seems to invoke macro
> !sufix. To do what? (Yes, I could look, but saying what is an
> elementary courtesy.)
>
>
> >*===========================================================.
> >define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
> >dim2=!encl("[","]") / sufix=!default("_") !tokens(1) /sep=!default
> >("") !enclose('(',')'))
> >  !do !i=!head(!dim1) !to !tail(!dim1)
> >  !do !j=!head(!dim2) !to !tail(!dim2)
> >  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
> > !concat(!1,!i,!unquote(!sufix),!j)
> >  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
> >  !ifend
> >  !doend
> >  !doend
> >  !enddefine.
> >*===========================================================.
> This appears to be another looping macro, that calls other macros many
> times. It looks like it's mostly !sufix.
>
> Here's a macro that seems to be doing something:
> >*===========================================================.
> >define valid_sa (listvar=!charend('/') /  skip= !default ("None")
> >!charend('/')  / vallist= !default ("") !cmdend )
> >  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
> > !conditie=!skip !ifend
> >echo
> >'--------------------------------------------------------------------------'.
> >echo !quote(!concat(' Validating SA variables: ', !listvar)).
> >echo
> >'--------------------------------------------------------------------------'.
> >echo 'VALIDATING MISSING VALUES'.
> >string validerr(A200).
> >  !do !oe !IN (!listvar)
> >if (miss(!oe) & !conditie ) validerr=concat(ltrim(rtrim(validerr)),
> >',', !quote(!oe)).
> >  !doend
> >if length(ltrim(rtrim(validerr)))>0
> >validerr=substr(validerr,2,length(validerr)-1).
> >echo !quote(!concat('ERROR: the following variables should NOT be
> >missing unless SKIP condition: ', !skip)).
> >exec.
> >do if length(ltrim(rtrim(validerr)))>0.
> >print / 'id=' id ' variables: ' validerr.
> >end if.
> >exec.
> >del var validerr .
> >string validerr(A200).
> >  !do !oe !IN (!listvar)
> >if (~miss(!oe) & ~!conditie ) validerr=concat(ltrim(rtrim(validerr)),
> >',', !quote(!oe)).
> >  !doend
> >if length(ltrim(rtrim(validerr)))>0
> >validerr=substr(validerr,2,length(validerr)-1).
> >echo !quote(!concat('ERROR: The following variables SHOULD be missing
> >for SKIP condition: ',!skip)).
> >exec.
> >do if length(ltrim(rtrim(validerr)))>0.
> >print / 'id=' id ' variables: ' validerr.
> >end if.
> >exec.
> >del var validerr .
> >echo
> >'--------------------------------------------------------------------------'.
> >echo 'VALIDATING VALUES/RANGES'.
> >string validerr(A200).
> >  !do !oe !IN (!listvar)
> >if ((~any(!oe,!vallist)) & !conditie )
> >validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
> >ltrim(rtrim(string(!oe , F15)))).
> >  !doend
> >if length(ltrim(rtrim(validerr)))>0
> >validerr=substr(validerr,2,length(validerr)-1).
> >echo !quote(!concat('ERROR: The following variables do not fit the
> >requested Range for SKIP condition: ',!skip)).
> >exec.
> >do if length(ltrim(rtrim(validerr)))>0.
> >print / 'id=' id ' variables: ' validerr.
> >end if.
> >exec.
> >del var validerr .
> >echo '----------------END VALIDATION
> >--------------------------------------------'.
> >  !enddefine.
>
> Each of your tests (missing values and ranges) makes two passes through
> the data, once to generate your error-message string 'validerr' for
> each case, and once to print it. (It would be quite easy to have the
> PRINT statements in the same transformation programs that do the tests.
> Why don't you do it that way?)
>
> You delete and re-declare 'validerr' each time for each new test:
> >del var validerr .
> >string validerr(A200).
> Better simply to set it to blank, at the beginning of each set of tests
> for a new record. This may easily be something that strains SPSS, if
> you do it often.
>
> And, you're generating your tests in macro loops. How long are the
> resulting transformation programs? And how often is this macro called,
> at four transformation programs each call?
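
A minimal sketch, in SPSS syntax, of what the two remarks above could look
like once a validation macro has expanded: each test and its PRINT share one
data pass, and validerr is blanked between tests instead of being deleted and
re-declared. The variables q1 to q3 and the value list 1 to 5 are hypothetical
stand-ins for what the macro arguments would supply; id is the case
identifier used in the posted macros. This is an illustration of the
suggestion, not a tested replacement for valid_sa.

```spss
* Sketch only with hypothetical variables q1 q2 q3 and id.
STRING validerr (A200).

* Test 1 flags missing values and prints in the same data pass.
COMPUTE validerr = ' '.
IF (MISSING(q1)) validerr = CONCAT(RTRIM(validerr), ',q1').
IF (MISSING(q2)) validerr = CONCAT(RTRIM(validerr), ',q2').
IF (MISSING(q3)) validerr = CONCAT(RTRIM(validerr), ',q3').
* Drop the leading comma before printing.
IF (validerr NE ' ') validerr = SUBSTR(validerr, 2).
DO IF (validerr NE ' ').
PRINT / 'id=' id ' missing: ' validerr.
END IF.
EXECUTE.

* Test 2 checks the value list after blanking validerr again.
COMPUTE validerr = ' '.
IF (NOT ANY(q1, 1, 2, 3, 4, 5)) validerr = CONCAT(RTRIM(validerr), ',q1').
IF (NOT ANY(q2, 1, 2, 3, 4, 5)) validerr = CONCAT(RTRIM(validerr), ',q2').
IF (NOT ANY(q3, 1, 2, 3, 4, 5)) validerr = CONCAT(RTRIM(validerr), ',q3').
IF (validerr NE ' ') validerr = SUBSTR(validerr, 2).
DO IF (validerr NE ' ').
PRINT / 'id=' id ' out of range: ' validerr.
END IF.
EXECUTE.

* A single clean-up at the very end.
DELETE VARIABLES validerr.
```

Written this way, two tests cost two data passes and one DELETE VARIABLES
instead of four passes and two delete/re-declare cycles. Whether that is
enough to avoid the assertion failures reported in this thread is not
something the sketch can show.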
>
>
> >*===========================================================.
> >*===========================================================.
> >*===========================================================.
> >define valid_oe (listvar=!charend('/') / skip=!default ("None")
> >!cmdend)
> >  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
> > !conditie=!skip  !ifend
> >echo
> >'--------------------------------------------------------------------------'.
> >echo !quote(!concat(' Validating OE variables: ', !listvar)).
> >string validerr(A200).
> >  !do !oe !IN (!listvar)
> >if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
> >  !doend
> >if length(ltrim(rtrim(validerr)))>0
> >validerr=substr(validerr,2,length(validerr)-1).
> >echo !quote(!concat('ERROR: the following variables should NOT be
> >missing unless SKIP condition: ', !skip)).
> >exec.
> >do if length(ltrim(rtrim(validerr)))>0.
> >print / 'id=' id ' variables: ' validerr.
> >end if.
> >exec.
> >del var validerr .
> >string validerr(A200).
> >  !do !oe !IN (!listvar)
> >if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
> >  !doend
> >if length(ltrim(rtrim(validerr)))>0
> >validerr=substr(validerr,2,length(validerr)-1).
> >echo !quote(!concat('ERROR: The following variables SHOULD be missing
> >for SKIP condition: ',!skip)).
> >exec.
> >do if length(ltrim(rtrim(validerr)))>0.
> >print / 'id=' id ' variables: ' validerr.
> >end if.
> >exec.
> >del var validerr .
> >echo '----------------END VALIDATION OE
> >--------------------------------------------'.
> >  !enddefine.
> >*===========================================================.
> Same remarks as for previous.
>
>
> >*===========================================================.
> >*===========================================================.
> >define valid_ma (listvar=!charend('/') /interval=!charend('/') / skip=
> >!default ("None") !charend('/') / exclusiv= !default ("None") !cmdend)
> >  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
> > !conditie=!skip !ifend
> >echo
> >'-----------------------------------------------------------------------------------------------------------'.
> >
> >echo !quote(!concat(' Validating MA variables: ', !listvar)).
> >comp validerr=0.
> >count qq=!listvar  (!interval).
> >if ((qq=0 ) & ~!conditie) validerr=1.
> >exe.
> >echo !quote(!concat('ERROR: the following variables should NOT be
> >missing unless SKIP condition: ', !skip)).
> >do if validerr>0.
> >print / 'id=' id .
> >end if.
> >exe.
> >del var validerr .
> >echo
> >'-----------------------------------------------------------------------------------------------------------'.
> >if ((qq<>0 ) & (!conditie)) validerr=1.
> >exe.
> >echo !quote(!concat('ERROR: the following variables should be missing
> >when NOT SKIP condition: ', !skip)).
> >do if validerr>0.
> >print / 'id=' id .
> >end if.
> >exe.
> >del var validerr .
> >echo
> >'-----------------------------------------------------------------------------------------------------------'.
> >
> >  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
> > !test=!exclusiv !ifend
> >do if !test & qq>1.
> >comp validerr=1.
> >end if.
> >  !let !flag=0
> >do if validerr>0.
> >echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY CONDITION'.
> >print / 'id=' id .
> >end if.
> >exe.
> >del var validerr .
> >del var qq .
> >echo '----------------------------END
> >VALIDATION-----------------------------------------------------------------'.
> >!enddefine.
> >*===========================================================.
> Same remarks again.
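
A rough sketch, in SPSS syntax, of how the three multiple-answer checks could
run in a single transformation program, so that the data are read once and no
flag variable has to be created and deleted between checks. The names q5_1 to
q5_3, the counted value 1, the skip condition (q4 = 2) and the exclusive
answer q5_3 are all hypothetical; in the posted macro they would come from
listvar, interval, skip and exclusiv. Treat it as an illustration of the
remark, not as a drop-in replacement for valid_ma.

```spss
* Sketch only with hypothetical names q5_1 q5_2 q5_3 q4 and id.
COUNT qq = q5_1 q5_2 q5_3 (1).
DO IF (qq = 0 AND NOT (q4 = 2)).
PRINT / 'id=' id ' no answer given although the block was not skipped'.
END IF.
DO IF (qq > 0 AND (q4 = 2)).
PRINT / 'id=' id ' answers present although the block should be skipped'.
END IF.
DO IF (q5_3 = 1 AND qq > 1).
PRINT / 'id=' id ' exclusive answer combined with other answers'.
END IF.
EXECUTE.
DELETE VARIABLES qq.
```

One caveat under these assumptions: if q4 is missing for a case, the
conditions that reference it evaluate to missing and that case is not printed,
so a real version would need to decide how missing skip variables are handled.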
>
>
> >*===========================================================.
> >thank you very much and once again sorry for any trouble,
>
> Good luck, and I think I've given you something to go on with.
> Richard
>
>


--
Vlad Simion
Data Analyst
Tel:      +40 720130611

Re: problem with large datafile and macros

vlad simion
In reply to this post by Richard Ristow
Thank you, Jon.

I don't have SPSS 14 or 15 to try running the same syntax and see whether it
works.
I wrote to the list about the issue because I saw that a lot of people from
SPSS support give useful suggestions here :), including you,
even if you are not in Technical Support :-))

Many thanks,

vlad



--
Vlad Simion
Data Analyst
Tel:      +40 720130611

Re: problem with large datafile and macros

Art Kendall
It is possible that there are easier ways to accomplish what you are
trying to do.

It is also possible that the process can be broken into sections.

Is your syntax a production job that will be run on a regular basis, or
is it a one-time data analysis?

I didn't follow your thread carefully, but I have the impression that
you are mainly checking to see whether cases have only legitimate values?
Please describe what you are trying to do, without at this time getting
into how you are trying to do it.


Art Kendall
Social Research Consultants


vlad simion wrote:

> thank you Jon,
>
> i don't have spss 14 or 15 to try to run the same syntax and see if it
> works
> or not.
> i wrote the issue to the list because i saw that there are a lot of guys
> from spss support that give a lot of usefull suggestions :), including
> you,
> even if you are not in Technical Support :-))
>
> many thanks,
>
> vlad
>
> On 12/14/06, Richard Ristow <[hidden email]> wrote:
>
>>
>> Reply to an off-list response.
>>
>> At 03:37 AM 12/14/2006, vlad simion wrote:
>>
>> >i took the liberty to write only to you because i have attached 2
>> >print screens with the errors occured and i know that on the forum it
>> >is not allowed to attach files, i hope it is not a problem.
>>
>> In both cases, the errors are "assertion failures." Assertions are
>> debugging tools internal to a program; an assertion failure is, *ipso
>> facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
>> only if it's replicable in 15 would it (and should it) get attention.
>>
>> -----------------------
>> To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
>> first message gives
>> -----------------------
>> Program: ...\spsswin.exe  [I've elided the path name]
>> files: Z:\cs_source\Datasource\src\dictnry.cpp
>> line: 183
>>
>> expression:fpIterator
>> -----------------------
>> The second gives
>> -----------------------
>> Program: ...\spsswin.exe  [I've elided the path name]
>> files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
>> line: 200
>>
>> expression:iInput <_my._xdsData._inputs.size()
>> -----------------------
>> Since this hasn't hit everybody, it's a fair guess these are (or this
>> is a) size-dependent bug, i.e. one that shows up only with input of a
>> certain size or complexity. They're quite common.
>>
>>
>> >i think it has something to do about the transformations program, but
>> >i can't figure out what, i've manage to go a little further, but
>> >still... it crash giving error: "an error occured while atempting to
>> >write a transformation file"
>> >
>> >and here are the macros that i use:
>>
>> Remarks follow. But this is a LOT of code; you should be debugging it,
>> and I'm not going to try a complete job. How many of the suggestions I
>> made in the last posting, have you applied?
>>
>> And you haven't said a word about what each macro does. Nor put in any
>> annotations or comments in your definitions. Those are extremely
>> important for your understanding and quality control; and a minimal
>> courtesy, for anybody else you ask to look at your code.
>>
>> >set mprint=on printback=on mexpand=on.
>> >
>> >define !enum_to(values= !enclose('[',']') /except= !enclose('[',']') /
>> >sep= !default ('') !enclose('[',']'))
>> >  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals= !null
>> >  !do !j= !head(!except) !to !tail(!tail(!except))
>> >  !let !exvals= !concat(!exvals,' ',!j)
>> >  !doend
>> >  !else !let !exvals=!except
>> >  !ifend
>> >  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
>> >  !do !i= !head(!values) !to !tail(!tail(!values))
>> >  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
>> >  !let !vals= !concat(!vals,' ',!i,!sep)
>> >  !ifend
>> >  !doend
>> >  !let !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)),
>> > 2)))
>> >  !let !vals= !concat(!head(!values),!tail(!vals))
>> >  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
>> > !concat(!vals,!tail(!tail(!values))) !ifend
>> >  !else !let !vals= !values
>> >  !ifend
>> >  !vals
>> >  !enddefine.
>> >*===========================================================.
>> I can't see what code this emits, with the effort I'm willing to put
>> out. It has lot of looping. Does it call a lot of macros in those
>> loops, and hence emit, many times, the code that they emit?  a lot of
>> macro calls? If so, which macros does it call?
>>
>> >*===========================================================.
>> >define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos !tokens(1) /
>> >sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
>> >  !do !i=!2 !to !3
>> >  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
>> >  !else !concat(!1,!i, !unquote(!sufix),!sep)
>> >  !ifend
>> >  !doend
>> >  !enddefine.
>> >*===========================================================.
>> Again, I can't see what this does, except it seems to invoke macro
>> !sufix. To do what? (Yes, I could look, but saying what is an
>> elementary courtesy.)
>>
>>
>> >*===========================================================.
>> >define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
>> >dim2=!encl("[","]") / sufix=!default("_") !tokens(1) /sep=!default
>> >("") !enclose('(',')'))
>> >  !do !i=!head(!dim1) !to !tail(!dim1)
>> >  !do !j=!head(!dim2) !to !tail(!dim2)
>> >  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
>> > !concat(!1,!i,!unquote(!sufix),!j)
>> >  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
>> >  !ifend
>> >  !doend
>> >  !doend
>> >  !enddefine.
>> >*===========================================================.
>> This appears to be another looping macro, that calls other macros many
>> times. It looks like it's mostly !sufix.
>>
>> Here's a macro that seems to be doing something:
>> >*===========================================================.
>> >define valid_sa (listvar=!charend('/') /  skip= !default ("None")
>> >!charend('/')  / vallist= !default ("") !cmdend )
>> >  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
>> > !conditie=!skip !ifend
>> >echo
>> >'--------------------------------------------------------------------------'.
>> >echo !quote(!concat(' Validating SA variables: ', !listvar)).
>> >echo
>> >'--------------------------------------------------------------------------'.
>> >echo 'VALIDATING MISSING VALUES'.
>> >string validerr(A200).
>> >  !do !oe !IN (!listvar)
>> >if (miss(!oe) & !conditie ) validerr=concat(ltrim(rtrim(validerr)),
>> >',', !quote(!oe)).
>> >  !doend
>> >if length(ltrim(rtrim(validerr)))>0
>> >validerr=substr(validerr,2,length(validerr)-1).
>> >echo !quote(!concat('ERROR: the following variables should NOT be missing unless SKIP condition: ', !skip)).
>> >exec.
>> >do if length(ltrim(rtrim(validerr)))>0.
>> >print / 'id=' id ' variables: ' validerr.
>> >end if.
>> >exec.
>> >del var validerr .
>> >string validerr(A200).
>> >  !do !oe !IN (!listvar)
>> >if (~miss(!oe) & ~!conditie ) validerr=concat(ltrim(rtrim(validerr)),
>> >',', !quote(!oe)).
>> >  !doend
>> >if length(ltrim(rtrim(validerr)))>0
>> >validerr=substr(validerr,2,length(validerr)-1).
>> >echo !quote(!concat('ERROR: The following variables SHOULD be missing for SKIP condition: ',!skip)).
>> >exec.
>> >do if length(ltrim(rtrim(validerr)))>0.
>> >print / 'id=' id ' variables: ' validerr.
>> >end if.
>> >exec.
>> >del var validerr .
>> >echo
>> >'--------------------------------------------------------------------------'.
>> >echo 'VALIDATING VALUES/RANGES'.
>> >string validerr(A200).
>> >  !do !oe !IN (!listvar)
>> >if ((~any(!oe,!vallist)) & !conditie )
>> >validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
>> >ltrim(rtrim(string(!oe , F15)))).
>> >  !doend
>> >if length(ltrim(rtrim(validerr)))>0
>> >validerr=substr(validerr,2,length(validerr)-1).
>> >echo !quote(!concat('ERROR: The following variables do not fit the requested Range for SKIP condition: ',!skip)).
>> >exec.
>> >do if length(ltrim(rtrim(validerr)))>0.
>> >print / 'id=' id ' variables: ' validerr.
>> >end if.
>> >exec.
>> >del var validerr .
>> >echo '----------------END VALIDATION --------------------------------------------'.
>> >  !enddefine.
>>
>> Each of your tests (missing values and ranges) makes two passes through
>> the data, once to generate your error-message string 'validerr' for
>> each case, and once to print it. (It would be quite easy to have the
>> PRINT statements in the same transformation programs that do the tests.
>> Why don't you do it that way?)
>>
>> You delete and re-declare 'validerr' each time for each new test:
>> >del var validerr .
>> >string validerr(A200).
>> Better simply to set it to blank, at the beginning of each set of tests
>> for a new record. Deleting and re-declaring a variable may easily be
>> something that strains SPSS, if you do it often.
>>
>> And, you're generating your tests in macro loops. How long are the
>> resulting transformation programs? And how often is this macro called,
>> at four transformation programs each call?
>>
>>
>> >*===========================================================.
>> >*===========================================================.
>> >*===========================================================.
>> >define valid_oe (listvar=!charend('/') / skip=!default ("None")
>> >!cmdend)
>> >  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
>> > !conditie=!skip  !ifend
>> >echo
>> >'--------------------------------------------------------------------------'.
>> >echo !quote(!concat(' Validating OE variables: ', !listvar)).
>> >string validerr(A200).
>> >  !do !oe !IN (!listvar)
>> >if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
>> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >  !doend
>> >if length(ltrim(rtrim(validerr)))>0
>> >validerr=substr(validerr,2,length(validerr)-1).
>> >echo !quote(!concat('ERROR: the following variables should NOT be missing unless SKIP condition: ', !skip)).
>> >exec.
>> >do if length(ltrim(rtrim(validerr)))>0.
>> >print / 'id=' id ' variables: ' validerr.
>> >end if.
>> >exec.
>> >del var validerr .
>> >string validerr(A200).
>> >  !do !oe !IN (!listvar)
>> >if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
>> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >  !doend
>> >if length(ltrim(rtrim(validerr)))>0
>> >validerr=substr(validerr,2,length(validerr)-1).
>> >echo !quote(!concat('ERROR: The following variables SHOULD be missing for SKIP condition: ',!skip)).
>> >exec.
>> >do if length(ltrim(rtrim(validerr)))>0.
>> >print / 'id=' id ' variables: ' validerr.
>> >end if.
>> >exec.
>> >del var validerr .
>> >echo '----------------END VALIDATION OE --------------------------------------------'.
>> >  !enddefine.
>> >*===========================================================.
>> Same remarks as for previous.
>>
>>
>> >*===========================================================.
>> >*===========================================================.
>> >define valid_ma (listvar=!charend('/') /interval=!charend('/') / skip=
>> >!default ("None") !charend('/') / exclusiv= !default ("None") !cmdend)
>> >  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
>> > !conditie=!skip !ifend
>> >echo
>> >'-----------------------------------------------------------------------------------------------------------'.
>> >
>> >echo !quote(!concat(' Validating MA variables: ', !listvar)).
>> >comp validerr=0.
>> >count qq=!listvar  (!interval).
>> >if ((qq=0 ) & ~!conditie) validerr=1.
>> >exe.
>> >echo !quote(!concat('ERROR: the following variables should NOT be missing unless SKIP condition: ', !skip)).
>> >do if validerr>0.
>> >print / 'id=' id .
>> >end if.
>> >exe.
>> >del var validerr .
>> >echo
>> >'-----------------------------------------------------------------------------------------------------------'.
>> >if ((qq<>0 ) & (!conditie)) validerr=1.
>> >exe.
>> >echo !quote(!concat('ERROR: the following variables should be missing when NOT SKIP condition: ', !skip)).
>> >do if validerr>0.
>> >print / 'id=' id .
>> >end if.
>> >exe.
>> >del var validerr .
>> >echo
>> >'-----------------------------------------------------------------------------------------------------------'.
>> >
>> >  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
>> > !test=!exclusiv !ifend
>> >do if !test & qq>1.
>> >comp validerr=1.
>> >end if.
>> >  !let !flag=0
>> >do if validerr>0.
>> >echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY CONDITION'.
>> >print / 'id=' id .
>> >end if.
>> >exe.
>> >del var validerr .
>> >del var qq .
>> >echo '----------------------------END VALIDATION-----------------------------------------------------------------'.
>> >!enddefine.
>> >*===========================================================.
>> Same remarks again.
>>
>>
>> >*===========================================================.
>> >thank you very much and once again sorry for any trouble,
>>
>> Good luck, and I think I've given you something to go on with.
>> Richard
>>
>>
>
>
> --
> Vlad Simion
> Data Analyst
> Tel:      +40 720130611
>
>
Art Kendall
Social Research Consultants

Re: problem with large datafile and macros

vlad simion
Hi Art,

For the moment it's just a one-time job, but if I can make it work properly,
it will become a production job :)
Indeed, the main purpose is to check that cases have valid values.
The easiest way to accomplish this would be plain syntax, but it would
be very long, with a lot of repeated tasks. That's why I tried to
do this with macros.

Many thanks for your insights,

Vlad


--
Vlad Simion
Data Analyst
Tel:      +40 720130611

Re: problem with large datafile and macros

Art Kendall
There are ways in syntax to apply the same commands to many variables:
  do repeat
  loop
  long variable lists.

Of course a lot depends on the substantive nature of your data and its
arrangement in the file.
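
As a rough illustration of the DO REPEAT route, here is a minimal sketch in plain Python that only builds syntax text; it does not talk to SPSS. The helper name `build_missing_check`, its parameters, the stand-in names `v` and `vname`, and the example variables `q1` to `q3` are all made up for this example; `id` and `validerr` follow the macros quoted in this thread. It assumes `validerr` does not yet exist in the file and that the checked variables are numeric. The flagging and the PRINT sit in the same transformation program, so a single EXECUTE should be enough.

```python
def build_missing_check(variables, condition="(1=1)", width=200):
    """Return SPSS syntax (as text) listing, per case, which variables are missing.

    Hypothetical helper: one DO REPEAT block stands in for one IF per
    variable, and the PRINT is part of the same transformation program,
    so the data are read once by the final EXECUTE.
    """
    if not variables:
        raise ValueError("need at least one variable name")
    names = " ".join(variables)
    # Parallel list of quoted names, so the message can say which variable failed.
    labels = " ".join(f"'{v}'" for v in variables)
    lines = [
        f"string validerr (a{width}).",
        f"do repeat v = {names} / vname = {labels}.",
        f"if (missing(v) & {condition}) "
        "validerr = concat(rtrim(validerr), ',', vname).",
        "end repeat.",
        "do if (validerr <> '').",
        "compute validerr = substr(validerr, 2).",  # drop the leading comma
        "print / 'id=' id ' variables: ' validerr.",
        "end if.",
        "execute.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_missing_check(["q1", "q2", "q3"]))
```

The generated text would still have to be pasted into a syntax window or written to a file and run with INSERT; how well it behaves on a very wide file is something to test, not something this sketch guarantees.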


There are ways in the current version to set up data validation rules
and save them to apply to other files.
In other words, that wheel has already been invented.

I would suggest that you compare the total costs (your time,
frustration, obsolescence of your efforts before you start, etc.) if you
become current in your SPSS version now, vs waiting until later to
become current.

Also, macros and scripts are not up-to-date programmability tools. I
would suggest looking at the programmability tools provided via Python,
which is free and works well with current SPSS versions.
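
To give a feel for what the Python route can look like, here is a small sketch of the name-list expansion that the `!xpand2` macro quoted in this thread performs. It is plain Python and deliberately does not use the SPSS Python plug-in; the function name `expand_names` and its parameters are invented for the illustration, and the result is just a string that could be dropped into a variable list.

```python
def expand_names(prefix, dim1, dim2, suffix="_", sep=","):
    """Build names such as q1_1_1, q1_1_2, ... from two inclusive ranges.

    Hypothetical stand-in for the !xpand2 macro: dim1 and dim2 are
    (first, last) pairs, suffix goes between the two indexes, and sep
    separates the finished names.
    """
    lo1, hi1 = dim1
    lo2, hi2 = dim2
    names = [
        f"{prefix}{i}{suffix}{j}"
        for i in range(lo1, hi1 + 1)   # outer index, as in the macro's !i loop
        for j in range(lo2, hi2 + 1)   # inner index, as in the macro's !j loop
    ]
    return sep.join(names)


if __name__ == "__main__":
    print(expand_names("q1_", (1, 2), (1, 3)))
```

Because the list is built once in Python, no macro has to be expanded for every variable when the syntax runs.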

Art Kendall
Social Research Consultants





vlad simion wrote:

> Hi Art,
>
> for the moment it's just a one time job, but if I can make it work properly,
> then it would become a production job :)
> indeed, the main purpose is to check that cases have valid values
> the easier way to accomplish this would be simple syntax code, but it would
> be very long and there are a lot of repeating tasks. That's why I tried to
> do this by macros.
>
> Many thanks for your insights,
>
> Vlad
>
> On 12/15/06, Art Kendall <[hidden email]> wrote:
>
>>
>> It is possible that there are easier ways to accomplish what you are
>> trying to do.
>>
>> It is also possible that the process can be broken into sections.
>>
>> Is your syntax  a production job that will be run on a regular basis or
>> is it a one time data analysis?
>>
>> I didn't follow your thread carefully, but I have the impression that
>> you are mainly checking to see whether cases have only legitimate values?
>> Please describe what you are trying to do, without at this time getting
>> into how you are trying to do it.
>>
>>
>> Art Kendall
>> Social Research Consultants
>>
>>
>> vlad simion wrote:
>>
>> > thank you Jon,
>> >
>> > i don't have spss 14 or 15 to try to run the same syntax and see if it
>> > works or not.
>> > i wrote the issue to the list because i saw that there are a lot of guys
>> > from spss support that give a lot of usefull suggestions :), including
>> > you, even if you are not in Technical Support :-))
>> >
>> > many thanks,
>> >
>> > vlad
>> >
>> > On 12/14/06, Richard Ristow <[hidden email]> wrote:
>> >
>> >>
>> >> Reply to an off-list response.
>> >>
>> >> At 03:37 AM 12/14/2006, vlad simion wrote:
>> >>
>> >> >i took the liberty to write only to you because i have attached 2
>> >> >print screens with the errors occured and i know that on the forum it
>> >> >is not allowed to attach files, i hope it is not a problem.
>> >>
>> >> In both cases, the errors are "assertion failures." Assertions are
>> >> debugging tools internal to a program; an assertion failure is, *ipso
>> >> facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
>> >> only if it's replicable in 15 would it (and should it) get attention.
>> >>
>> >> -----------------------
>> >> To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
>> >> first message gives
>> >> -----------------------
>> >> Program: ...\spsswin.exe  [I've elided the path name]
>> >> files: Z:\cs_source\Datasource\src\dictnry.cpp
>> >> line: 183
>> >>
>> >> expression:fpIterator
>> >> -----------------------
>> >> The second gives
>> >> -----------------------
>> >> Program: ...\spsswin.exe  [I've elided the path name]
>> >> files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
>> >> line: 200
>> >>
>> >> expression:iInput <_my._xdsData._inputs.size()
>> >> -----------------------
>> >> Since this hasn't hit everybody, it's a fair guess these are (or this
>> >> is a) size-dependent bug, i.e. one that shows up only with input of a
>> >> certain size or complexity. They're quite common.
>> >>
>> >>
>> >> >i think it has something to do about the transformations program, but
>> >> >i can't figure out what, i've manage to go a little further, but
>> >> >still... it crash giving error: "an error occured while atempting to
>> >> >write a transformation file"
>> >> >
>> >> >and here are the macros that i use:
>> >>
>> >> Remarks follow. But this is a LOT of code; you should be debugging it,
>> >> and I'm not going to try a complete job. How many of the suggestions I
>> >> made in the last posting, have you applied?
>> >>
>> >> And you haven't said a word about what each macro does. Nor put in any
>> >> annotations or comments in your definitions. Those are extremely
>> >> important for your understanding and quality control; and a minimal
>> >> courtesy, for anybody else you ask to look at your code.
>> >>
>> >> >set mprint=on printback=on mexpand=on.
>> >> >
>> >> >define !enum_to(values= !enclose('[',']') /except= !enclose('[',']') /
>> >> >sep= !default ('') !enclose('[',']'))
>> >> >  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals= !null
>> >> >  !do !j= !head(!except) !to !tail(!tail(!except))
>> >> >  !let !exvals= !concat(!exvals,' ',!j)
>> >> >  !doend
>> >> >  !else !let !exvals=!except
>> >> >  !ifend
>> >> >  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
>> >> >  !do !i= !head(!values) !to !tail(!tail(!values))
>> >> >  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
>> >> >  !let !vals= !concat(!vals,' ',!i,!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !let !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)),
>> >> > 2)))
>> >> >  !let !vals= !concat(!head(!values),!tail(!vals))
>> >> >  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
>> >> > !concat(!vals,!tail(!tail(!values))) !ifend
>> >> >  !else !let !vals= !values
>> >> >  !ifend
>> >> >  !vals
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> I can't see what code this emits, with the effort I'm willing to put
>> >> out. It has a lot of looping. Does it make a lot of macro calls in
>> >> those loops, and hence emit, many times, the code that they emit? If
>> >> so, which macros does it call?
>> >>
>> >> >*===========================================================.
>> >> >define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos !tokens(1) /
>> >> >sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
>> >> >  !do !i=!2 !to !3
>> >> >  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
>> >> >  !else !concat(!1,!i, !unquote(!sufix),!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> Again, I can't see what this does, except it seems to invoke macro
>> >> !sufix. To do what? (Yes, I could look, but saying what is an
>> >> elementary courtesy.)
>> >>
>> >>
>> >> >*===========================================================.
>> >> >define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
>> >> >dim2=!encl("[","]") / sufix=!default("_") !tokens(1) /sep=!default
>> >> >("") !enclose('(',')'))
>> >> >  !do !i=!head(!dim1) !to !tail(!dim1)
>> >> >  !do !j=!head(!dim2) !to !tail(!dim2)
>> >> >  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
>> >> > !concat(!1,!i,!unquote(!sufix),!j)
>> >> >  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !doend
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> This appears to be another looping macro, that calls other macros many
>> >> times. It looks like it's mostly !sufix.
>> >>
>> >> Here's a macro that seems to be doing something:
>> >> >*===========================================================.
>> >> >define valid_sa (listvar=!charend('/') /  skip= !default ("None")
>> >> >!charend('/')  / vallist= !default ("") !cmdend )
>> >> >  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
>> >> > !conditie=!skip !ifend
>> >> >echo
>> >> >'--------------------------------------------------------------------------'.
>> >> >echo !quote(!concat(' Validating SA variables: ', !listvar)).
>> >> >echo
>> >> >'--------------------------------------------------------------------------'.
>> >> >echo 'VALIDATING MISSING VALUES'.
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if (miss(!oe) & !conditie ) validerr=concat(ltrim(rtrim(validerr)),
>> >> >',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be missing unless SKIP condition: ', !skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if (~miss(!oe) & ~!conditie ) validerr=concat(ltrim(rtrim(validerr)),
>> >> >',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables SHOULD be missing for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo
>> >> >'--------------------------------------------------------------------------'.
>> >> >echo 'VALIDATING VALUES/RANGES'.
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((~any(!oe,!vallist)) & !conditie )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
>> >> >ltrim(rtrim(string(!oe , F15)))).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables do not fit the requested Range for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo '----------------END VALIDATION
>> >> >--------------------------------------------'.
>> >> >  !enddefine.
>> >>
>> >> Each of your tests (missing values and ranges) makes two passes
>> through
>> >> the data, once to generate your error-message string 'validerr' for
>> >> each case, and once to print it. (It would be quite easy to have the
>> >> PRINT statements in the same transformation programs that do the
>> tests.
>> >> Why don't you do it that way?)
>> >>
>> >> You delete and re-declare 'validerr' each time for each new test:
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> Better simply to set it to blank, at the beginning of each set of
>> tests
>> >> for a new record. This may easily be something that strains SPSS, if
>> >> you do it often.
>> >>
>> >> And, you're generating your tests in macro loops. How long are the
>> >> resulting transformation programs? And how often is this macro
>> called,
>> >> at four transformation programs each call?
>> >>
>> >>
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >define valid_oe (listvar=!charend('/') / skip=!default ("None")
>> >> >!cmdend)
>> >> >  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
>> >> > !conditie=!skip  !ifend
>> >> >echo
>> >>
>> >>
>> >'--------------------------------------------------------------------------'.
>>
>> >>
>> >> >echo !quote(!concat(' Validating OE variables: ', !listvar)).
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be
>> >> >missing unless SKIP condition: ', !skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables SHOULD be
>> missing
>> >> >for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo '----------------END VALIDATION OE
>> >> >--------------------------------------------'.
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> Same remarks as for previous.
>> >>
>> >>
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >define valid_ma (listvar=!charend('/') /interval=!charend('/') /
>> skip=
>> >> >!default ("None") !charend('/') / exclusiv= !default ("None")
>> !cmdend)
>> >> >  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
>> >> > !conditie=!skip !ifend
>> >> >echo
>> >>
>> >>
>> >'-----------------------------------------------------------------------------------------------------------'.
>>
>> >>
>> >> >
>> >> >echo !quote(!concat(' Validating MA variables: ', !listvar)).
>> >> >comp validerr=0.
>> >> >count qq=!listvar  (!interval).
>> >> >if ((qq=0 ) & ~!conditie) validerr=1.
>> >> >exe.
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be
>> >> >missing unless SKIP condition: ', !skip)).
>> >> >do if validerr>0.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >echo
>> >>
>> >>
>> >'-----------------------------------------------------------------------------------------------------------'.
>>
>> >>
>> >> >if ((qq<>0 ) & (!conditie)) validerr=1.
>> >> >exe.
>> >> >echo !quote(!concat('ERROR: the following variables should be
>> missing
>> >> >when NOT SKIP condition: ', !skip)).
>> >> >do if validerr>0.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >echo
>> >>
>> >>
>> >'-----------------------------------------------------------------------------------------------------------'.
>>
>> >>
>> >> >
>> >> >  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
>> >> > !test=!exclusiv !ifend
>> >> >do if !test & qq>1.
>> >> >comp validerr=1.
>> >> >end if.
>> >> >  !let !flag=0
>> >> >do if validerr>0.
>> >> >echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY
>> CONDITION'.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >del var qq .
>> >> >echo '----------------------------END
>> >>
>> >>
>> >VALIDATION-----------------------------------------------------------------'.
>>
>> >>
>> >> >!enddefine.
>> >> >*===========================================================.
>> >> Same remarks again.
>> >>
>> >>
>> >> >*===========================================================.
>> >> >thank you very much and once again sorry for any trouble,
>> >>
>> >> Good luck, and I think I've given you something to go on with.
>> >> Richard
>> >>
>> >>
>> >
>> >
>> > --
>> > Vlad Simion
>> > Data Analyst
>> > Tel:      +40 720130611
>> >
>> >
>>
>>
>
>
> --
> Vlad Simion
> Data Analyst
> Tel:      +40 720130611
>
>
Art Kendall
Social Research Consultants

Re: problem with large datafile and macros

vlad simion
Hi Art,

even with do repeat or loop, the code would be very long; the database has
approximately 25 thousand variables or more.
I use SPSS 13, so I don't have access to the programmability feature :(

Greetings,

Vlad

On 12/15/06, Art Kendall <[hidden email]> wrote:

>
> There are ways in syntax to apply the same syntax to to many variables.
> do repeat
> loop
> long variable lists.
>
> Of course a lot depends on the substantive nature of your data and its
> arrangement in the file.
>
>
> There are ways in the current version to set up data validation rules
> and save them to apply to other files.
> In other words, that wheel has already been invented.
>
> I would suggest that you compare the total costs (your time,
> frustration, obsolescence of your efforts before you start, etc.) if you
> become current in your SPSS version now, vs waiting until later to
> become current.
>
> Also, macros and scripts are not  up-to-date programmability tools.  I
> would suggest looking at the programmability tools provided via Python
> which is free and works well with current SPSS versions.
>
> Art Kendall
> Social Research Consultants


--
Vlad Simion
Data Analyst
Tel:      +40 720130611

Re: problem with large datafile and macros

Richard Ristow
In reply to this post by vlad simion
(Jon Peck: of most interest to you, see suggestion E., below)

At 07:53 AM 12/15/2006, vlad simion wrote:

>thank you very much Richard for your time and kindness to look into
>the codes i sent you and sorry for not giving all the explanations
>about the macros, i didn't mean to be dissrespectfull, it's just that
>i was under presure of resolving the issue and ... got carried away

Thank you - and sorry to be testy. And thank you for the explanations
in this latest note; I'm not quoting them here.

>i have tried all the suggestions you gave me, but with no better
>results

OK, then
. Removing the "del var validerr." statements, and adjusting
accordingly (i.e., COMPUTE rather than re-declaring) doesn't help
. It won't run on 250 of the variables, or on 25 of them.

Further thoughts:

A. Will it run against 5 of the variables? 1 of them? (When you have a
complicated program that won't run, first you try to see what's wrong
and fix it. If you can't, you try simpler and simpler versions, until
you find one that will run; then you restore features until you find a
point where it fails.)
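
The shrink-until-it-runs search in A. can be organized as a bisection.
What follows is only a sketch in Python, not something SPSS 13 can run
itself: `runs_ok` is a hypothetical stand-in for "run the validation
syntax on this subset of variables and report whether it completed", and
the sketch assumes that once a prefix of the variable list fails, every
longer prefix fails too.

```python
from typing import Callable, Optional, Sequence


def first_failing_prefix(
    variables: Sequence[str],
    runs_ok: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the smallest n for which the job fails on variables[:n].

    Returns None if the job runs on the full list. `runs_ok` is a
    hypothetical callback supplied by the caller; in practice it would
    mean running the job by hand on that subset and noting the result.
    """
    if runs_ok(variables):
        return None
    # Invariant: the prefix of length `lo` runs (the empty prefix is
    # assumed to be fine); the prefix of length `hi` fails.
    lo, hi = 0, len(variables)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(variables[:mid]):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    names = ["q{}".format(i) for i in range(1, 101)]
    # Pretend the job breaks as soon as 37 or more variables are checked.
    print(first_failing_prefix(names, lambda subset: len(subset) < 37))
```

With 2500 variables this needs about a dozen trial runs rather than
thousands, which matters when each failed run means restarting SPSS.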


B. Your main macros announce their beginnings with ECHO statements,
like

echo !quote(!concat(' Validating SA variables: ', !listvar)).
echo '----------------------------------------------------------'.
echo 'VALIDATING MISSING VALUES'.

That's good. You might trace further, having the transformation
programs announce *their* beginnings:

echo !quote(!concat(' Validating SA variables: ', !listvar)).
echo
'--------------------------------------------------------------------------'.
echo 'VALIDATING MISSING VALUES'.
DO IF $CASENUM EQ 1.
.  PRINT / 'VALIDATING MISSING VALUES-begin validation run'.
END IF.
string validerr(A200).


C. By way of cleaning up code and logic, I said you could generate and
print your error messages in one transformation program instead of two.
That is, in syntax like

exec.
do if length(ltrim(rtrim(validerr)))>0.
print / 'id=' id ' variables: ' validerr.
end if.
exec.

eliminate the first "exec." Alternatively, use LIST: replace the above
lines (all five of them) by

TEMPORARY.
SELECT IF length(ltrim(rtrim(validerr)))>0.
LIST /VARIABLES=id validerr.


D. You're running with SET MPRINT ON - good for you. You're running, or
should be, with draft output, which makes it much easier to see the
(long) code you're generating and running.

If you take the generated code from the listing and run it, i.e.
eliminate the macro expansions, does it run? You'll probably want to do
this for a small number of your variables, so the size of the code is
manageable. This is to separate the problem, distinguishing macro
expansion problems from native SPSS problems (an important point).


E. After the above tests and clean-ups, send me syntax that calls the
macros but doesn't work, plus some data. (If you send some data, it
should have all 2500 variables; it must be NON-CONFIDENTIAL; and few
enough records to be of modest size, say 5 megabytes or less.) Send the
latest, cleaned macro definitions.

I'll look into trying it with SPSS 15 - no promises when, no promises
I'll ever manage it, but it's worth a look. If it still fails, and I
can get a clean, reliable failure, I'll submit to Tech Support.


F. And I know this is late in your project to suggest it, but 2500
variables is an awful lot of variables. Often, datasets with that many
can be reorganized to a 'longer' organization with fewer variables, and
that can simplify working with them quite a bit. But you're probably
pretty far into your project to look into this.
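
To make the 'longer' organization in F. concrete, here is a small
sketch in plain Python rather than SPSS syntax (within SPSS the
corresponding restructuring command is VARSTOCASES). The field names
"id", "variable" and "value" are assumptions chosen for illustration;
the point is only that one wide record with many answer variables
becomes many short records, each of which can be checked by the same
rule.

```python
def wide_to_long(records, id_field="id"):
    """Turn one-row-per-case records into one-row-per-answer records.

    `records` is a list of dicts, one per case, keyed by variable name.
    Each output row keeps the case id and carries a single variable
    name together with its value.
    """
    long_rows = []
    for record in records:
        for name, value in record.items():
            if name == id_field:
                continue  # the id is copied onto every row, not unpivoted
            long_rows.append(
                {id_field: record[id_field], "variable": name, "value": value}
            )
    return long_rows


if __name__ == "__main__":
    wide = [
        {"id": 1, "q1": 3, "q2": None},
        {"id": 2, "q1": 1, "q2": 4},
    ]
    for row in wide_to_long(wide):
        print(row)
```

In the long form a missing-value or range check is a single test on
"value", possibly with a lookup keyed on "variable", instead of one
generated IF statement per variable.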

-Good luck,
  Richard

Re: problem with large datafile and macros

Art Kendall-2
In reply to this post by vlad simion
With current versions, to check for unlabelled values might take
execution time, but depending on the kind of data might not take much
syntax.

One point I might not have been clear on is that newer versions can save
a lot of staff time, which may end up costing a lot less in total than
the price of an upgrade.

Of course it makes a difference what value the decision maker puts on
your time, and the cost of maintainability of the process over the next
few years.

The time spent learning legacy macros and scripting is very expensive in
light of the need to learn programmability features when your organization
does get around to upgrading.  At that time there will be a new effort
to learn Python.


To check for unlabelled values might take execution time, but depending
on the kind of data might not take much syntax, even without getting
into programmability, or the data validation section of the GUI.
Iff there is some structure to your data such as would occur with test
scores, a more current version would allow something like this:
do repeat checkvar = math000001 to math000200/ item= 1 to 200.
do if valuelabel(checkvar) eq "".
print / '**** in math items'
         /caseid item
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.
do repeat checkvar = mmpi001 to mmpi583/ item= 1 to 583.
do if valuelabel(checkvar) eq "".
print / '**** in MMPI items'
         /caseid item
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.
etc.

With Python you could read the data dictionary, create 2 lists of the
variables that have value labels, 1 with the variable names and 1 with
the variable names in quotes,
and create syntax like this:
do repeat checkvar =
 {first list produced by python} /
          varid =
 {second list produced by python}.
do if valuelabel(checkvar) eq "".
print / '****  for variable' varid
         /'case ' caseid
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.

[hint to Jon Peck] Once someone wrote the python code to read the
dictionary and produce the syntax with the two lists,  this could be
applied to any file. A possible contribution for developer central?
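
As a rough illustration of the second half of that idea, the sketch
below builds the DO REPEAT block from a list of variable names using
plain Python string handling. Reading the names from the data dictionary
needs the SPSS programmability plug-in and is not shown; the function
name `build_unlabelled_check` and the default id variable `caseid` are
assumptions made for this example, and the generated syntax has not
been run against any SPSS release.

```python
def build_unlabelled_check(var_names, id_var="caseid"):
    """Return SPSS syntax that prints cases with unlabelled values.

    `var_names` is a list of variable names that have value labels.
    The first DO REPEAT list holds the bare names, the second the same
    names in quotes so the PRINT output can say which variable it was.
    """
    if not var_names:
        raise ValueError("need at least one variable name")
    plain = " ".join(var_names)
    quoted = " ".join("'{}'".format(name) for name in var_names)
    lines = [
        "do repeat checkvar = " + plain + " /",
        "          varid = " + quoted + ".",
        'do if valuelabel(checkvar) eq "".',
        "print / '**** for variable ' varid",
        "      / 'case ' " + id_var,
        "      / 'has unlabelled value ' checkvar.",
        "end if.",  # every DO IF is closed before END REPEAT
        "end repeat.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_unlabelled_check(["math000001", "math000002"]))
```

The output is ordinary syntax text, so it can be inspected or pasted
into a syntax window before anything is run.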


Of course, if routine quality assurance procedures have been followed
(double keying) this procedure should find very few unlabelled values.

If you have a large number of cases in addition to a large number of
variables, you could run such a procedure on a small sample of the cases
to be sure that the value labels have been correctly entered.


Art Kendall
Social Research Consultants



vlad simion wrote:

> Hi Art,
>
> even with do repeat or loop, the code would be very long, the database
> has aproximately 25 thousands variables or more.
> I use Spss 13, so I don't have acces to the programability feature :(
>
> Greetings,
>
> Vlad
>
> On 12/15/06, Art Kendall <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     There are ways in syntax to apply the same syntax to to many
>     variables.
>     do repeat
>     loop
>     long variable lists.
>
>     Of course a lot depends on the substantive nature of your data and its
>     arrangement in the file.
>
>
>     There are ways in the current version to set up data validation rules
>     and save them to apply to other files.
>     In other words, that wheel has already been invented.
>
>     I would suggest that you compare the total costs (your time,
>     frustration, obsolescence of your efforts before you start, etc.)
>     if you
>     become current in your SPSS version now, vs waiting until later to
>     become current.
>
>     Also, macros and scripts are not  up-to-date programmability
>     tools.  I
>     would suggest looking at the programmability tools provided via Python
>     which is free and works well with current SPSS versions.
>
>     Art Kendall
>     Social Research Consultants
>
>
>
>
>
>     vlad simion wrote:
>
>     > Hi Art,
>     >
>     > for the moment it's just a one time job, but if I can make it work
>     > properly,
>     > then it would become a production job :)
>     > indeed, the main purpose is to check that cases have valid values
>     > the easier way to accomplish this would be simple syntax code,
>     but it
>     > would
>     > be very long and there are a lot of repeating tasks. That's why I
>     > tried to
>     > do this by macros.
>     >
>     > Many thanks for your insights,
>     >
>     > Vlad
>     >
>     > On 12/15/06, Art Kendall <[hidden email]
>     <mailto:[hidden email]>> wrote:
>     >
>     >>
>     >> It is possible that there are easier ways to accomplish what
>     you are
>     >> trying to do.
>     >>
>     >> It is also possible that the process can be broken into sections.
>     >>
>     >> Is your syntax  a production job that will be run on a regular
>     basis or
>     >> is it a one time data analysis?
>     >>
>     >> I didn't follow your thread carefully, but I have the
>     impression that
>     >> you are mainly checking to see whether cases have only legitimate
>     >> values?
>     >> Please describe what you are trying to do, without at this time
>     getting
>     >> into how you are trying to do it.
>     >>
>     >>
>     >> Art Kendall
>     >> Social Research Consultants
>     >>
>     >>
>     >> vlad simion wrote:
>     >>
>     >> > thank you Jon,
>     >> >
>     >> > i don't have spss 14 or 15 to try to run the same syntax and see if it
>     >> > works or not.
>     >> > i wrote the issue to the list because i saw that there are a lot of
>     >> > guys from spss support that give a lot of useful suggestions :),
>     >> > including you, even if you are not in Technical Support :-))
>     >> >
>     >> > many thanks,
>     >> >
>     >> > vlad
>     >> >
>     >> > On 12/14/06, Richard Ristow <[hidden email]> wrote:
>     >> >
>     >> >>
>     >> >> Reply to an off-list response.
>     >> >>
>     >> >> At 03:37 AM 12/14/2006, vlad simion wrote:
>     >> >>
>     >> >> >i took the liberty to write only to you because i have attached 2
>     >> >> >print screens with the errors that occurred and i know that on the
>     >> >> >forum it is not allowed to attach files, i hope it is not a problem.
>     >> >>
>     >> >> In both cases, the errors are "assertion failures." Assertions are
>     >> >> debugging tools internal to a program; an assertion failure is, *ipso
>     >> >> facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
>     >> >> only if it's replicable in 15 would it (and should it) get attention.
>     >> >>
>     >> >> -----------------------
>     >> >> To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
>     >> >> first message gives
>     >> >> -----------------------
>     >> >> Program: ...\spsswin.exe  [I've elided the path name]
>     >> >> files: Z:\cs_source\Datasource\src\dictnry.cpp
>     >> >> line: 183
>     >> >>
>     >> >> expression:fpIterator
>     >> >> -----------------------
>     >> >> The second gives
>     >> >> -----------------------
>     >> >> Program: ...\spsswin.exe  [I've elided the path name]
>     >> >> files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
>     >> >> line: 200
>     >> >>
>     >> >> expression:iInput <_my._xdsData._inputs.size()
>     >> >> -----------------------
>     >> >> Since this hasn't hit everybody, it's a fair guess these are (or this
>     >> >> is a) size-dependent bug, i.e. one that shows up only with input of a
>     >> >> certain size or complexity. They're quite common.
>     >> >>
>     >> >>
>     >> >> >i think it has something to do with the transformations program, but
>     >> >> >i can't figure out what, i've managed to go a little further, but
>     >> >> >still... it crashes giving error: "an error occured while atempting to
>     >> >> >write a transformation file"
>     >> >> >
>     >> >> >and here are the macros that i use:
>     >> >>
>     >> >> Remarks follow. But this is a LOT of code; you should be debugging it,
>     >> >> and I'm not going to try a complete job. How many of the suggestions I
>     >> >> made in the last posting, have you applied?
>     >> >>
>     >> >> And you haven't said a word about what each macro does. Nor put in any
>     >> >> annotations or comments in your definitions. Those are extremely
>     >> >> important for your understanding and quality control; and a minimal
>     >> >> courtesy, for anybody else you ask to look at your code.
>     >> >>
>     >> >> >set mprint=on printback=on mexpand=on.
>     >> >> >
>     >> >> >define !enum_to(values= !enclose('[',']') /except= !enclose('[',']') /
>     >> >> >sep= !default ('') !enclose('[',']'))
>     >> >> >  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals= !null
>     >> >> >  !do !j= !head(!except) !to !tail(!tail(!except))
>     >> >> >  !let !exvals= !concat(!exvals,' ',!j)
>     >> >> >  !doend
>     >> >> >  !else !let !exvals=!except
>     >> >> >  !ifend
>     >> >> >  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
>     >> >> >  !do !i= !head(!values) !to !tail(!tail(!values))
>     >> >> >  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
>     >> >> >  !let !vals= !concat(!vals,' ',!i,!sep)
>     >> >> >  !ifend
>     >> >> >  !doend
>     >> >> >  !let !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)), 2)))
>     >> >> >  !let !vals= !concat(!head(!values),!tail(!vals))
>     >> >> >  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
>     >> >> > !concat(!vals,!tail(!tail(!values))) !ifend
>     >> >> >  !else !let !vals= !values
>     >> >> >  !ifend
>     >> >> >  !vals
>     >> >> >  !enddefine.
>     >> >> >*===========================================================.
>     >> >> I can't see what code this emits, with the effort I'm willing to put
>     >> >> out. It has a lot of looping. Does it call a lot of macros in those
>     >> >> loops, and hence emit, many times, the code that they emit? A lot of
>     >> >> macro calls? If so, which macros does it call?
>     >> >>
>     >> >> >*===========================================================.
>     >> >> >define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos
>     !tokens(1) /
>     >> >> >sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
>     >> >> >  !do !i=!2 !to !3
>     >> >> >  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
>     >> >> >  !else !concat(!1,!i, !unquote(!sufix),!sep)
>     >> >> >  !ifend
>     >> >> >  !doend
>     >> >> >  !enddefine.
>     >> >> >*===========================================================.
>     >> >> Again, I can't see what this does, except it seems to invoke macro
>     >> >> !sufix. To do what? (Yes, I could look, but saying what is an
>     >> >> elementary courtesy.)
>     >> >>
>     >> >>
>     >> >> >*===========================================================.
>     >> >> >define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
>     >> >> >dim2=!encl("[","]") / sufix=!default("_") !tokens(1)
>     /sep=!default
>     >> >> >("") !enclose('(',')'))
>     >> >> >  !do !i=!head(!dim1) !to !tail(!dim1)
>     >> >> >  !do !j=!head(!dim2) !to !tail(!dim2)
>     >> >> >  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
>     >> >> > !concat(!1,!i,!unquote(!sufix),!j)
>     >> >> >  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
>     >> >> >  !ifend
>     >> >> >  !doend
>     >> >> >  !doend
>     >> >> >  !enddefine.
>     >> >> >*===========================================================.
>     >> >> This appears to be another looping macro, that calls other macros many
>     >> >> times. It looks like it's mostly !sufix.
>     >> >>
>     >> >> Here's a macro that seems to be doing something:
>     >> >> >*===========================================================.
>     >> >> >define valid_sa (listvar=!charend('/') /  skip= !default ("None")
>     >> >> >!charend('/')  / vallist= !default ("") !cmdend )
>     >> >> >  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
>     >> >> > !conditie=!skip !ifend
>     >> >> >echo '--------------------------------------------------------------------------'.
>     >> >> >echo !quote(!concat(' Validating SA variables: ', !listvar)).
>     >> >> >echo '--------------------------------------------------------------------------'.
>     >> >> >echo 'VALIDATING MISSING VALUES'.
>     >> >> >string validerr(A200).
>     >> >> >  !do !oe !IN (!listvar)
>     >> >> >if (miss(!oe) & !conditie )
>     validerr=concat(ltrim(rtrim(validerr)),
>     >> >> >',', !quote(!oe)).
>     >> >> >  !doend
>     >> >> >if length(ltrim(rtrim(validerr)))>0
>     >> >> >validerr=substr(validerr,2,length(validerr)-1).
>     >> >> >echo !quote(!concat('ERROR: the following variables should
>     NOT be
>     >> >> >missing unless SKIP condition: ', !skip)).
>     >> >> >exec.
>     >> >> >do if length(ltrim(rtrim(validerr)))>0.
>     >> >> >print / 'id=' id ' variables: ' validerr.
>     >> >> >end if.
>     >> >> >exec.
>     >> >> >del var validerr .
>     >> >> >string validerr(A200).
>     >> >> >  !do !oe !IN (!listvar)
>     >> >> >if (~miss(!oe) & ~!conditie )
>     >> validerr=concat(ltrim(rtrim(validerr)),
>     >> >> >',', !quote(!oe)).
>     >> >> >  !doend
>     >> >> >if length(ltrim(rtrim(validerr)))>0
>     >> >> >validerr=substr(validerr,2,length(validerr)-1).
>     >> >> >echo !quote(!concat('ERROR: The following variables SHOULD be
>     >> missing
>     >> >> >for SKIP condition: ',!skip)).
>     >> >> >exec.
>     >> >> >do if length(ltrim(rtrim(validerr)))>0.
>     >> >> >print / 'id=' id ' variables: ' validerr.
>     >> >> >end if.
>     >> >> >exec.
>     >> >> >del var validerr .
>     >> >> >echo
>     >> >>
>     >> >>
>     >>
>     >'--------------------------------------------------------------------------'.
>     >>
>     >> >>
>     >> >> >echo 'VALIDATING VALUES/RANGES'.
>     >> >> >string validerr(A200).
>     >> >> >  !do !oe !IN (!listvar)
>     >> >> >if ((~any(!oe,!vallist)) & !conditie )
>     >> >> >validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
>     >> >> >ltrim(rtrim(string(!oe , F15)))).
>     >> >> >  !doend
>     >> >> >if length(ltrim(rtrim(validerr)))>0
>     >> >> >validerr=substr(validerr,2,length(validerr)-1).
>     >> >> >echo !quote(!concat('ERROR: The following variables do not
>     fit the
>     >> >> >requested Range for SKIP condition: ',!skip)).
>     >> >> >exec.
>     >> >> >do if length(ltrim(rtrim(validerr)))>0.
>     >> >> >print / 'id=' id ' variables: ' validerr.
>     >> >> >end if.
>     >> >> >exec.
>     >> >> >del var validerr .
>     >> >> >echo '----------------END VALIDATION
>     >> >> >--------------------------------------------'.
>     >> >> >  !enddefine.
>     >> >>
>     >> >> Each of your tests (missing values and ranges) makes two passes through
>     >> >> the data, once to generate your error-message string 'validerr' for
>     >> >> each case, and once to print it. (It would be quite easy to have the
>     >> >> PRINT statements in the same transformation programs that do the tests.
>     >> >> Why don't you do it that way?)
>     >> >>
>     >> >> You delete and re-declare 'validerr' each time for each new test:
>     >> >> >del var validerr .
>     >> >> >string validerr(A200).
>     >> >> Better simply to set it to blank, at the beginning of each set of tests
>     >> >> for a new record. This may easily be something that strains SPSS, if
>     >> >> you do it often.
>     >> >>
>     >> >> And, you're generating your tests in macro loops. How long are the
>     >> >> resulting transformation programs? And how often is this macro called,
>     >> >> at four transformation programs each call?
>     >> >>
>     >> >>
>     >> >> >*===========================================================.
>     >> >> >*===========================================================.
>     >> >> >*===========================================================.
>     >> >> >define valid_oe (listvar=!charend('/') / skip=!default
>     ("None")
>     >> >> >!cmdend)
>     >> >> >  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
>     >> >> > !conditie=!skip  !ifend
>     >> >> >echo
>     >> >>
>     >> >>
>     >>
>     >'--------------------------------------------------------------------------'.
>     >>
>     >> >>
>     >> >> >echo !quote(!concat(' Validating OE variables: ', !listvar)).
>     >> >> >string validerr(A200).
>     >> >> >  !do !oe !IN (!listvar)
>     >> >> >if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
>     >> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>     >> >> >  !doend
>     >> >> >if length(ltrim(rtrim(validerr)))>0
>     >> >> >validerr=substr(validerr,2,length(validerr)-1).
>     >> >> >echo !quote(!concat('ERROR: the following variables should
>     NOT be
>     >> >> >missing unless SKIP condition: ', !skip)).
>     >> >> >exec.
>     >> >> >do if length(ltrim(rtrim(validerr)))>0.
>     >> >> >print / 'id=' id ' variables: ' validerr.
>     >> >> >end if.
>     >> >> >exec.
>     >> >> >del var validerr .
>     >> >> >string validerr(A200).
>     >> >> >  !do !oe !IN (!listvar)
>     >> >> >if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
>     >> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>     >> >> >  !doend
>     >> >> >if length(ltrim(rtrim(validerr)))>0
>     >> >> >validerr=substr(validerr,2,length(validerr)-1).
>     >> >> >echo !quote(!concat('ERROR: The following variables SHOULD be
>     >> missing
>     >> >> >for SKIP condition: ',!skip)).
>     >> >> >exec.
>     >> >> >do if length(ltrim(rtrim(validerr)))>0.
>     >> >> >print / 'id=' id ' variables: ' validerr.
>     >> >> >end if.
>     >> >> >exec.
>     >> >> >del var validerr .
>     >> >> >echo '----------------END VALIDATION OE
>     >> >> >--------------------------------------------'.
>     >> >> >  !enddefine.
>     >> >> >*===========================================================.
>     >> >> Same remarks as for previous.
>     >> >>
>     >> >>
>     >> >> >*===========================================================.
>     >> >> >*===========================================================.
>     >> >> >define valid_ma (listvar=!charend('/')
>     /interval=!charend('/') /
>     >> skip=
>     >> >> >!default ("None") !charend('/') / exclusiv= !default ("None")
>     >> !cmdend)
>     >> >> >  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
>     >> >> > !conditie=!skip !ifend
>     >> >> >echo
>     >> >>
>     >> >>
>     >>
>     >'-----------------------------------------------------------------------------------------------------------'.
>
>     >>
>     >> >>
>     >> >> >
>     >> >> >echo !quote(!concat(' Validating MA variables: ', !listvar)).
>     >> >> >comp validerr=0.
>     >> >> >count qq=!listvar  (!interval).
>     >> >> >if ((qq=0 ) & ~!conditie) validerr=1.
>     >> >> >exe.
>     >> >> >echo !quote(!concat('ERROR: the following variables should
>     NOT be
>     >> >> >missing unless SKIP condition: ', !skip)).
>     >> >> >do if validerr>0.
>     >> >> >print / 'id=' id .
>     >> >> >end if.
>     >> >> >exe.
>     >> >> >del var validerr .
>     >> >> >echo
>     >> >>
>     >> >>
>     >>
>     >'-----------------------------------------------------------------------------------------------------------'.
>     >>
>     >> >>
>     >> >> >if ((qq<>0 ) & (!conditie)) validerr=1.
>     >> >> >exe.
>     >> >> >echo !quote(!concat('ERROR: the following variables should be
>     >> missing
>     >> >> >when NOT SKIP condition: ', !skip)).
>     >> >> >do if validerr>0.
>     >> >> >print / 'id=' id .
>     >> >> >end if.
>     >> >> >exe.
>     >> >> >del var validerr .
>     >> >> >echo
>     >> >>
>     >> >>
>     >>
>     >'-----------------------------------------------------------------------------------------------------------'.
>     >>
>     >> >>
>     >> >> >
>     >> >> >  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
>     >> >> > !test=!exclusiv !ifend
>     >> >> >do if !test & qq>1.
>     >> >> >comp validerr=1.
>     >> >> >end if.
>     >> >> >  !let !flag=0
>     >> >> >do if validerr>0.
>     >> >> >echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY
>     >> CONDITION'.
>     >> >> >print / 'id=' id .
>     >> >> >end if.
>     >> >> >exe.
>     >> >> >del var validerr .
>     >> >> >del var qq .
>     >> >> >echo '----------------------------END
>     >> >>
>     >> >>
>     >>
>     >VALIDATION-----------------------------------------------------------------'.
>
>     >>
>     >> >>
>     >> >> >!enddefine.
>     >> >> >*===========================================================.
>     >> >> Same remarks again.
>     >> >>
>     >> >>
>     >> >> >*===========================================================.
>     >> >> >thank you very much and once again sorry for any trouble,
>     >> >>
>     >> >> Good luck, and I think I've given you something to go on with.
>     >> >> Richard
>     >> >>
>     >> >>
>     >> >
>     >> >
>     >> > --
>     >> > Vlad Simion
>     >> > Data Analyst
>     >> > Tel:      +40 720130611
>     >> >
>     >> >
>     >>
>     >>
>     >
>     >
>     > --
>     > Vlad Simion
>     > Data Analyst
>     > Tel:      +40 720130611
>     >
>     >
>
>
>
>
> --
> Vlad Simion
> Data Analyst
> Tel:      +40 720130611
Reply | Threaded
Open this post in threaded view
|

Re: checking for unlabelled values

Peck, Jon
I actually wrote an example of checking for unlabeled values for one of the classes I gave on programmability.  I am posting the example code below.  It is for only a single variable in a particular file, but it is easy to generalize.

 

However, I’d like to point out that checking for unlabeled values is a built-in rule, along with many others, that you can apply to variables by just clicking a checkbox in the Validate Data procedure (Data>Validation>Validate Data on the menus), which is part of the Data Validation option for SPSS.

 

The Data Validation option was introduced in SPSS 14 and provides an extensive set of checks and reports for this kind of work.

 

Here is the Python code, which works with SPSS 14.0.1 or later.  Explanations below.

 

BEGIN PROGRAM.
# find all values of origin in cars1.sav that
# are not labeled and display all occurring or labeled values

import spss, spssaux, spssdata

spssaux.OpenDataFile("c:/progclass/cars1.sav")
d = spssaux.VariableDict()
originlabelstr = d['origin'].ValueLabels
originlabels = set()
for vl in originlabelstr:
    originlabels.add(int(vl))
data = spssdata.Spssdata(indexes=['origin'])
ovalues = set()
for case in data:
    ovalues.add(case.origin)

data.close()
print "The Set of Origin values\n", ovalues
print "The Set of Unlabeled Origin Values\n",\
    ovalues.difference(originlabels)
print "The Set of labels and occurring Values\n",\
    ovalues.union(originlabels)
END PROGRAM.

 

d is created as a variable dictionary containing all variables in the active dataset.

originlabelstr is created as a Python dictionary of all the values and labels for the variable named origin. It is then converted into the set originlabels.

The loop "for case in data" runs over all the cases, adding each distinct value of origin to the set ovalues.

Then the program prints the set of values that occur, the difference of the two sets (the values that occur but have no label), and their union.

 

(I really like the built-in set datatype in Python.)
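
As a small illustration of the set operations used above, here is a self-contained sketch in plain Python that does not need the spss modules. The dictionary value_labels and the list observed are made-up stand-ins for what d['origin'].ValueLabels and the pass over the data would supply, and the helper name split_values is hypothetical as well.

```python
# Made-up stand-ins: in the real program these come from the SPSS
# dictionary and from the pass over the cases.
value_labels = {"1": "first label", "2": "second label", "3": "third label"}
observed = [1, 2, 3, 2, 4, 1]


def split_values(labels, values):
    """Return (unlabeled, everything) as two sets.

    unlabeled  -- values that occur in the data but have no label
    everything -- all values that either occur or carry a label
    """
    labeled = set(int(key) for key in labels)  # label keys converted to ints
    occurring = set(values)                    # distinct values in the data
    return occurring - labeled, occurring | labeled


unlabeled, everything = split_values(value_labels, observed)
print("Unlabeled values:", sorted(unlabeled))
print("Labels and occurring values:", sorted(everything))
```

The minus and bar operators are the operator forms of the difference and union methods used in the program above.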

 

-Jon Peck

________________________________

From: Art Kendall [mailto:[hidden email]]
Sent: Friday, December 15, 2006 12:31 PM
To: vlad simion; Peck, Jon
Cc: [hidden email]
Subject: Re: problem with large datafile and macros

 

With current versions, to check for unlabelled values might take execution time, but depending on the kind of data might not take much syntax.

One point I might not have been clear on is that newer versions can save a lot of staff time, and those savings may well exceed the price of an upgrade.

Of course it makes a difference what value the decision maker puts on your time, and the cost of maintainability of the process over the next few years.

The time spent learning legacy macros and scripting is very expensive in light of the need to learn the programmability features when your organization does get around to upgrading.  At that time there will be a new effort to learn Python.
 

To check for unlabelled values might take execution time, but depending on the kind of data might not take much syntax, even without getting into programmability, or the data validation section of the GUI.
If there is some structure to your data, such as would occur with test scores, a more current version would allow something like this:
do repeat checkvar = math000001 to math000200/ item= 1 to 200.
do if valuelabel(checkvar) eq "".
print / '**** in math items'
         /caseid item
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.
do repeat checkvar = mmpi001 to mmpi583/ item= 1 to 583.
do if valuelabel(checkvar) eq "".
print / '**** in MMPI items'
         /caseid item
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.
etc.

With Python you could read the data dictionary, build two lists of the variables that have value labels (one with the variable names, one with the variable names in quotes), and create syntax like this:
do repeat checkvar =
 {first list produced by python} /
          varid =
 {second list produced by python}/.
do if valuelabel(checkvar) eq "".
print / '****  for variable' varid
         /'case ' caseid
        / 'has unlabelled value' checkvar
        /.
end if.
end repeat.
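
The list-building step could be sketched roughly as follows. This is plain Python that only assembles the syntax text. It assumes the names of the labelled variables have already been read from the dictionary (for example with spssaux.VariableDict, as in Jon Peck's example). The function name build_check_syntax and the id variable name caseid are illustrative assumptions. The generated text mirrors the DO REPEAT pattern above and has not been run against SPSS.

```python
def build_check_syntax(var_names, id_var="caseid"):
    """Assemble DO REPEAT syntax that flags unlabelled values.

    var_names -- names of variables that have value labels (assumed to be
                 collected from the data dictionary beforehand)
    id_var    -- name of the case identifier variable (an assumption)
    """
    if not var_names:
        raise ValueError("need at least one variable name")
    plain = " ".join(var_names)                            # first list
    quoted = " ".join("'%s'" % name for name in var_names)  # second list
    lines = [
        "do repeat checkvar = %s /" % plain,
        "          varid = %s." % quoted,
        'do if valuelabel(checkvar) eq "".',
        "print / '****  for variable ' varid",
        "      / 'case ' %s" % id_var,
        "      / 'has unlabelled value ' checkvar.",
        "end if.",
        "end repeat.",
    ]
    return "\n".join(lines)


# Example with three hypothetical variable names.
example = build_check_syntax(["q1", "q2", "q3"])
print(example)
```

The returned text could then be written to a syntax file or submitted from within a program block. With many thousands of variables it would probably be sensible to call the function on batches of names rather than on the whole list at once.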

[hint to Jon Peck] Once someone wrote the python code to read the dictionary and produce the syntax with the two lists,  this could be applied to any file. A possible contribution for developer central?


Of course, if routine quality assurance procedures have been followed (double keying) this procedure should find very few unlabelled values.

If you have a large number of cases in addition to a large number of variables, you could run such a procedure on a small sample of the cases to be sure that the value labels have been correctly entered.


Art Kendall
Social Research Consultants



vlad simion wrote:



Hi Art,

even with do repeat or loop, the code would be very long; the database has approximately 25 thousand variables or more.
I use SPSS 13, so I don't have access to the programmability feature :(

Greetings,

Vlad

On 12/15/06, Art Kendall <[hidden email]> wrote:

There are ways in syntax to apply the same syntax to many variables:
do repeat
loop
long variable lists.

Of course a lot depends on the substantive nature of your data and its
arrangement in the file.


There are ways in the current version to set up data validation rules
and save them to apply to other files.
In other words, that wheel has already been invented.

I would suggest that you compare the total costs (your time,
frustration, obsolescence of your efforts before you start, etc.) if you
become current in your SPSS version now, vs waiting until later to
become current.

Also, macros and scripts are not  up-to-date programmability tools.  I
would suggest looking at the programmability tools provided via Python
which is free and works well with current SPSS versions.

Art Kendall
Social Research Consultants





vlad simion wrote:

> Hi Art,
>
> for the moment it's just a one time job, but if I can make it work
> properly, then it would become a production job :)
> indeed, the main purpose is to check that cases have valid values
> the easier way to accomplish this would be simple syntax code, but it
> would be very long and there are a lot of repeating tasks. That's why I
> tried to do this by macros.
>
> Many thanks for your insights,
>
> Vlad
>
> On 12/15/06, Art Kendall <[hidden email]> wrote:
>
>>
>> It is possible that there are easier ways to accomplish what you are
>> trying to do.
>>
>> It is also possible that the process can be broken into sections.
>>
>> Is your syntax  a production job that will be run on a regular basis or
>> is it a one time data analysis?
>>
>> I didn't follow your thread carefully, but I have the impression that
>> you are mainly checking to see whether cases have only legitimate
>> values?
>> Please describe what you are trying to do, without at this time getting
>> into how you are trying to do it.
>>
>>
>> Art Kendall
>> Social Research Consultants
>>
>>
>> vlad simion wrote:
>>
>> > thank you Jon,
>> >
>> > i don't have spss 14 or 15 to try to run the same syntax and see if it
>> > works or not.
>> > i wrote the issue to the list because i saw that there are a lot of
>> > guys from spss support that give a lot of useful suggestions :),
>> > including you, even if you are not in Technical Support :-))
>> >
>> > many thanks,
>> >
>> > vlad
>> >
>> > On 12/14/06, Richard Ristow <[hidden email]> wrote:
>> >
>> >>
>> >> Reply to an off-list response.
>> >>
>> >> At 03:37 AM 12/14/2006, vlad simion wrote:
>> >>
>> >> >i took the liberty to write only to you because i have attached 2
>> >> >print screens with the errors that occurred and i know that on the
>> >> >forum it is not allowed to attach files, i hope it is not a problem.
>> >>
>> >> In both cases, the errors are "assertion failures." Assertions are
>> >> debugging tools internal to a program; an assertion failure is, *ipso
>> >> facto*, a bug in SPSS. Not that there's reason to be debugging SPSS 13;
>> >> only if it's replicable in 15 would it (and should it) get attention.
>> >>
>> >> -----------------------
>> >> To readers at SPSS, Inc. - to the accuracy I can transcribe them, the
>> >> first message gives
>> >> -----------------------
>> >> Program: ...\spsswin.exe  [I've elided the path name]
>> >> files: Z:\cs_source\Datasource\src\dictnry.cpp
>> >> line: 183
>> >>
>> >> expression:fpIterator
>> >> -----------------------
>> >> The second gives
>> >> -----------------------
>> >> Program: ...\spsswin.exe  [I've elided the path name]
>> >> files: Z:\cs_source\Datasource\src\TransformDataStore.cpp
>> >> line: 200
>> >>
>> >> expression:iInput <_my._xdsData._inputs.size()
>> >> -----------------------
>> >> Since this hasn't hit everybody, it's a fair guess these are (or this
>> >> is a) size-dependent bug, i.e. one that shows up only with input of a
>> >> certain size or complexity. They're quite common.
>> >>
>> >>
>> >> >i think it has something to do with the transformations program, but
>> >> >i can't figure out what, i've managed to go a little further, but
>> >> >still... it crashes giving error: "an error occured while atempting to
>> >> >write a transformation file"
>> >> >
>> >> >and here are the macros that i use:
>> >>
>> >> Remarks follow. But this is a LOT of code; you should be debugging it,
>> >> and I'm not going to try a complete job. How many of the suggestions I
>> >> made in the last posting, have you applied?
>> >>
>> >> And you haven't said a word about what each macro does. Nor put in any
>> >> annotations or comments in your definitions. Those are extremely
>> >> important for your understanding and quality control; and a minimal
>> >> courtesy, for anybody else you ask to look at your code.
>> >>
>> >> >set mprint=on printback=on mexpand=on.
>> >> >
>> >> >define !enum_to(values= !enclose('[',']') /except=
>> !enclose('[',']') /
>> >> >sep= !default ('') !enclose('[',']'))
>> >> >  !if (!upcase(!head(!tail(!except)))='TO') !then !let !exvals=
>> !null
>> >> >  !do !j= !head(!except) !to !tail(!tail(!except))
>> >> >  !let !exvals= !concat(!exvals,' ',!j)
>> >> >  !doend
>> >> >  !else !let !exvals=!except
>> >> >  !ifend
>> >> >  !if (!upcase(!head(!tail(!values)))='TO') !then !let !vals= !null
>> >> >  !do !i= !head(!values) !to !tail(!tail(!values))
>> >> >  !if (!index(!concat(' ',!exvals,' '),!concat(' ',!i,' '))=0) !then
>> >> >  !let !vals= !concat(!vals,' ',!i,!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !let
>> !vals=!substr(!vals,1,!length(!substr(!blanks(!length(!vals)),
>> >> > 2)))
>> >> >  !let !vals= !concat(!head(!values),!tail(!vals))
>> >> >  !if (!i<>!head(!tail(!tail(!values)))) !then !let !vals=
>> >> > !concat(!vals,!tail(!tail(!values))) !ifend
>> >> >  !else !let !vals= !values
>> >> >  !ifend
>> >> >  !vals
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> I can't see what code this emits, with the effort I'm willing to put
>> >> out. It has a lot of looping. Does it call a lot of macros in those
>> >> loops, and hence emit, many times, the code that they emit? A lot of
>> >> macro calls? If so, which macros does it call?
>> >>
>> >> >*===========================================================.
>> >> >define !xpand(!pos !tokens(1) / !pos !tokens(1) / !pos !tokens(1) /
>> >> >sufix=!default("") !tokens(1) /sep=!default ("") !tokens(1))
>> >> >  !do !i=!2 !to !3
>> >> >  !if (!i=!3) !then !concat(!1,!i,!unquote(!sufix))
>> >> >  !else !concat(!1,!i, !unquote(!sufix),!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> Again, I can't see what this does, except it seems to invoke macro
>> >> !sufix. To do what? (Yes, I could look, but saying what is an
>> >> elementary courtesy.)
>> >>
>> >>
>> >> >*===========================================================.
>> >> >define !xpand2(!pos !tokens(1) / dim1= !encl("[","]") /
>> >> >dim2=!encl("[","]") / sufix=!default("_") !tokens(1) /sep=!default
>> >> >("") !enclose('(',')'))
>> >> >  !do !i=!head(!dim1) !to !tail(!dim1)
>> >> >  !do !j=!head(!dim2) !to !tail(!dim2)
>> >> >  !if (!i=!head(!tail(!dim1)) & !j=!head(!tail(!dim2))) !then
>> >> > !concat(!1,!i,!unquote(!sufix),!j)
>> >> >  !else !concat(!1,!i,!unquote(!sufix),!j,!sep)
>> >> >  !ifend
>> >> >  !doend
>> >> >  !doend
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> This appears to be another looping macro, that calls other macros many
>> >> times. It looks like it's mostly !sufix.
>> >>
>> >> Here's a macro that seems to be doing something:
>> >> >*===========================================================.
>> >> >define valid_sa (listvar=!charend('/') /  skip= !default ("None")
>> >> >!charend('/')  / vallist= !default ("") !cmdend )
>> >> >  !if (!skip="None") !then !let !conditie="(1=1)" !else !let
>> >> > !conditie=!skip !ifend
>> >> >echo
>> >>
>> >>
>> >'--------------------------------------------------------------------------'.
>>
>> >>
>> >> >echo !quote(!concat(' Validating SA variables: ', !listvar)).
>> >> >echo
>> >>
>> >>
>> >'--------------------------------------------------------------------------'.
>>
>> >>
>> >> >echo 'VALIDATING MISSING VALUES'.
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if (miss(!oe) & !conditie ) validerr=concat(ltrim(rtrim(validerr)),
>> >> >',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be
>> >> >missing unless SKIP condition: ', !skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if (~miss(!oe) & ~!conditie )
>> validerr=concat(ltrim(rtrim(validerr)),
>> >> >',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables SHOULD be
>> missing
>> >> >for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo
>> >>
>> >>
>> >'--------------------------------------------------------------------------'.
>>
>> >>
>> >> >echo 'VALIDATING VALUES/RANGES'.
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((~any(!oe,!vallist)) & !conditie )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',' , !quote(!oe),'=' ,
>> >> >ltrim(rtrim(string(!oe , F15)))).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables do not fit the
>> >> >requested Range for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo '----------------END VALIDATION
>> >> >--------------------------------------------'.
>> >> >  !enddefine.
>> >>
>> >> Each of your tests (missing values and ranges) makes two passes through
>> >> the data, once to generate your error-message string 'validerr' for
>> >> each case, and once to print it. (It would be quite easy to have the
>> >> PRINT statements in the same transformation programs that do the tests.
>> >> Why don't you do it that way?)
>> >>
>> >> You delete and re-declare 'validerr' each time for each new test:
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> Better simply to set it to blank, at the beginning of each set of tests
>> >> for a new record. This may easily be something that strains SPSS, if
>> >> you do it often.
>> >>
>> >> And, you're generating your tests in macro loops. How long are the
>> >> resulting transformation programs? And how often is this macro called,
>> >> at four transformation programs each call?
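A rough sketch of what the two suggestions above could look like when combined: one transformation program that runs both missing-value tests and prints in the same data pass, blanking 'validerr' between tests instead of deleting and re-declaring it. The variables q0, q1, q2 and the skip condition (q0 = 2) are made-up placeholders, not taken from the real data, and in the macro the IF lines would still be generated by the !do loop.

```spss
* Sketch only: q0, q1, q2 and the skip condition (q0 = 2) are assumed examples.
string validerr (A200).

* Test 1: should NOT be missing unless the skip condition holds.
compute validerr = ''.
if (missing(q1) & ~(q0 = 2)) validerr = concat(rtrim(validerr), ',q1').
if (missing(q2) & ~(q0 = 2)) validerr = concat(rtrim(validerr), ',q2').
do if (validerr <> '').
compute validerr = substr(validerr, 2).
print / 'id=' id ' should not be missing: ' validerr.
end if.

* Test 2: SHOULD be missing when the skip condition holds.
compute validerr = ''.
if (~missing(q1) & (q0 = 2)) validerr = concat(rtrim(validerr), ',q1').
if (~missing(q2) & (q0 = 2)) validerr = concat(rtrim(validerr), ',q2').
do if (validerr <> '').
compute validerr = substr(validerr, 2).
print / 'id=' id ' should be missing: ' validerr.
end if.

* A single EXECUTE forces one pass for both tests.
execute.
```

The trade-off is that the PRINT lines for the two tests come out interleaved case by case rather than grouped under separate ECHO headings, which is why each printed line carries its own label here.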
>> >>
>> >>
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >define valid_oe (listvar=!charend('/') / skip=!default ("None")
>> >> >!cmdend)
>> >> >  !if (!skip="None")  !then  !let !conditie="(1=2)" !else  !let
>> >> > !conditie=!skip  !ifend
>> >> >echo '--------------------------------------------------------------------------'.
>> >>
>> >> >echo !quote(!concat(' Validating OE variables: ', !listvar)).
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((length(ltrim(rtrim(!oe)))=0) & ~(!conditie) )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be
>> >> >missing unless SKIP condition: ', !skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >string validerr(A200).
>> >> >  !do !oe !IN (!listvar)
>> >> >if ((length(ltrim(rtrim(!oe)))<>0) & (!conditie) )
>> >> >validerr=concat(ltrim(rtrim(validerr)), ',', !quote(!oe)).
>> >> >  !doend
>> >> >if length(ltrim(rtrim(validerr)))>0
>> >> >validerr=substr(validerr,2,length(validerr)-1).
>> >> >echo !quote(!concat('ERROR: The following variables SHOULD be missing
>> >> >for SKIP condition: ',!skip)).
>> >> >exec.
>> >> >do if length(ltrim(rtrim(validerr)))>0.
>> >> >print / 'id=' id ' variables: ' validerr.
>> >> >end if.
>> >> >exec.
>> >> >del var validerr .
>> >> >echo '----------------END VALIDATION OE
>> >> >--------------------------------------------'.
>> >> >  !enddefine.
>> >> >*===========================================================.
>> >> Same remarks as for previous.
>> >>
>> >>
>> >> >*===========================================================.
>> >> >*===========================================================.
>> >> >define valid_ma (listvar=!charend('/') /interval=!charend('/') /
>> >> >skip= !default ("None") !charend('/') / exclusiv= !default ("None")
>> >> >!cmdend)
>> >> >  !if (!skip="None") !then !let !conditie="(1=2)" !else !let
>> >> > !conditie=!skip !ifend
>> >> >echo '-----------------------------------------------------------------------------------------------------------'.
>> >>
>> >> >
>> >> >echo !quote(!concat(' Validating MA variables: ', !listvar)).
>> >> >comp validerr=0.
>> >> >count qq=!listvar  (!interval).
>> >> >if ((qq=0 ) & ~!conditie) validerr=1.
>> >> >exe.
>> >> >echo !quote(!concat('ERROR: the following variables should NOT be
>> >> >missing unless SKIP condition: ', !skip)).
>> >> >do if validerr>0.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >echo '-----------------------------------------------------------------------------------------------------------'.
>> >>
>> >> >if ((qq<>0 ) & (!conditie)) validerr=1.
>> >> >exe.
>> >> >echo !quote(!concat('ERROR: the following variables should be missing
>> >> >when NOT SKIP condition: ', !skip)).
>> >> >do if validerr>0.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >echo '-----------------------------------------------------------------------------------------------------------'.
>> >>
>> >> >
>> >> >  !if (!exclusiv="None") !then !let !test="(1=2)" !else !let
>> >> > !test=!exclusiv !ifend
>> >> >do if !test & qq>1.
>> >> >comp validerr=1.
>> >> >end if.
>> >> >  !let !flag=0
>> >> >do if validerr>0.
>> >> >echo 'THE FOLLOWING IDS DOES NOT HOLD FOR THE EXCLUSIVITY CONDITION'.
>> >> >print / 'id=' id .
>> >> >end if.
>> >> >exe.
>> >> >del var validerr .
>> >> >del var qq .
>> >> >echo '----------------------------END VALIDATION-----------------------------------------------------------------'.
>> >>
>> >> >!enddefine.
>> >> >*===========================================================.
>> >> Same remarks again.
>> >>
>> >>
>> >> >*===========================================================.
>> >> >thank you very much and once again sorry for any trouble,
>> >>
>> >> Good luck, and I think I've given you something to go on with.
>> >> Richard
>> >>
>> >>




--
Vlad Simion
Data Analyst
Tel:      +40 720130611