Dear all,
I would like to create a weight variable, based on the values of the 3 variables type product area. In this case since the values are 2,3 and 4 the possible combinations are only 24, but in real life I have more variables and much more combinations. So for each combination I would like to assign a weight value to the variable weight_var. I created some DO loops that create the right combinations between the 3 variables type product area ant their values, but I cannot figure out a way to introduce in a proper way my list of weights: here in the macro call I put some values in the loop4, but it did not work since simply overwrite the sentence. I need something with these logic: If var=1 and var2=1 and var3=1 weight_var=40. If var=1 and var2=2 and var3=1 weight_var=22. Any suggestion? Am I following a proper type of macro code or should check other commands? Thanks data list list /type product area. begin data 1 1 1 2 3 2 1 2 4 end data. DEFINE my_weight (!POS !CHAREND ('/') /!POS !CHAREND ('/') /!POS !CHAREND ('/') /loop1=!charend ('/') /loop2=!charend ('/') /loop3=!charend ('/') /loop4= !CMDEND). !do !I !IN (!loop1) !DO !J !IN (!loop2) !DO !z !IN (!loop3) !DO !y !IN (!loop4) if !1=!i and !2=!j and !3=!z weight_var=!y. exe. !DOEND !DOEND !DOEND !DOEND !ENDDEFINE. my_weight type / product / area / loop1=1 2 / loop2= 1 2 3 / loop3= 1 2 3 4 / loop4 = 40 22 35 . |
One approach would be to create a table file for use with MATCH files.
The new file would have a column for each variable plus a column for the weight. It is possible that a "backbone" could be created via INPUT Program to enumerate all of the combinations. However, it appears that you may end up with something very tedious and hard to understand. Please provide much more detail. What is the context in which this question arises? What is the goal of the effort? How many variables are there and how many values can they take on? How are the weights decided? What will the weights be used for? for variables is 1 2 3 the same as 3 2 1 or 2 1 3?
Art Kendall
Social Research Consultants |
What is the context in which this question arises? Questionnaire data
What is the goal of the effort? Instead of writing 400 times: If var=1 and var2=1 and var3=1 weight_var=40. If var=1 and var2=2 and var3=1 weight_var=22, etc. I would like a macro doing it. How many variables are there and how many values can they take on? Usually 4-5 variable with 3-4 value each How are the weights decided? I have the N for each combination in the real population. I can calculate the sample size for each combinations doing an aggregate in the sample (questionnaire data) What will the weights be used for? To weight cases… for variables is 1 2 3 the same as 3 2 1 or 2 1 3? The aim is to cover all the posible matematical combinations, 1 2 3 vs 3 2 1 it’s the same |
Administrator
|
In reply to this post by raw
Do as Art stated, build another dataset containing the values of the variables driving the weighting and MATCH as a TABLE to your SORTED raw data.
What you are doing here is completely illogical WRT the macro code. For insight add the following in the innermost part of the macro (where you compute weight_var. ECHO !QUOTE(!CONCAT('I=',!I,", J=",!J," ,Z=", !Z, " ,Y= ", !Y ). Also, you DON'T want to pass the data 24 times (GET RID OF THE EXECUTE COMMAND). In fact. Go through ALL of your code and BANISH EXECUTE. The only time you actually require EXECUTE is when you are building SELECT from $CASENUM or dealing with LAG variables. Even then you can sometimes get away without using it. If you don't want to go with the table look up then your only recourse is to create a boatload of IF statements. Macro is NOT going to be of much use here at all since you have no way of systematically building the target values and associating them with the desired values of the weight driving variables. Any such scheme is bound to be quite hideous, clumsy and a big waste of time (to say nothing of a complete NIGHTMARE to properly maintain).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
In reply to this post by raw
"How are the weights decided? I have the N for each combination in the real population. I can calculate the sample size for each combinations doing an aggregate in the sample (questionnaire data) "
Please be specific! Are these weights governed by a FUNCTION of these sample sizes? If so then why tediously assign them with IF statements at all if they can simply be computed off of the aggregates? "I would like a macro doing it." Clearly that is not going to happen.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
The weight is assignated by this ratio: real population cluster size/ sample cluster size. Samle is my questionnaire data. If the weight factor for cluster 1 is 10, 10 people in the cluster 1 of my questionnaire weight as 100.
Unfortunately I don’t have a DB for population, but only the size of each combination, for instance type 2 from area 2 and product 1: 1000 cases. That will be my cluster 1. I have to compare this cluster with my questionnaire data and say that for the same combination I have 10 cases: the weight factor will be 100. (1000/10). In the output I can copy and paste all the generated combinations and manually input only the weight_var factor, I thought that there was a way to fix the final step: adding the proper weight_var, because as you can see SPSS has a predictible logic when making all the posible combinations. Knowing it I thought that it was possible putting in the macro call the large list of desired weight_var values. if type = 1 and product = 1 and area = 1 weight_var= 40. if type = 1 and product = 1 and area = 1 weight_var= 22. if type = 1 and product = 1 and area = 1 weight_var= 35. if type = 1 and product = 1 and area = 2 weight_var= 40. if type = 1 and product = 1 and area = 2 weight_var= 22. if type = 1 and product = 1 and area = 2 weight_var= 35. if type = 1 and product = 1 and area = 3 weight_var= 40. if type = 1 and product = 1 and area = 3 weight_var= 22. if type = 1 and product = 1 and area = 3 weight_var= 35. if type = 1 and product = 1 and area = 4 weight_var= 40. if type = 1 and product = 1 and area = 4 weight_var= 22. if type = 1 and product = 1 and area = 4 weight_var= 35. if type = 1 and product = 2 and area = 1 weight_var= 40. Etc. Etc. |
Administrator
|
Am I misunderstanding something? Doesn't the following produce the same result as your series of IF statements (as you have them ordered)?
IF ANY(type,1,2) and ANY(product,1,2,3) and ANY(area,1,2,3,4) weight_var = 35. If it can be assumed that variables type, product and area all have valid, within-range values, you could even reduce it to a simple COMPUTE, as follows: COMPUTE weight_var = 35. But I suspect this is not what you want!
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
|
Seems like we have a case of COPY/PASTE without modifying the appropriate code which most likely would vary for each line.
BITE THE BULLET and simply do a bloody MATCH with a table. Much easier to maintain than all of that syntax. OTOH, OP should specify the EXACT steps he/she is doing to calculate these weights.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by raw
"Usually 4-5 variable with 3-4 value each"
Then you can get another TABLE that contains the obtained sample sizes to use in MATCH FILES by doing a four or five way CROSSTAB using the /WRITE option. Then read that output back in.
Art Kendall
Social Research Consultants |
Administrator
|
Why would one opt to do that over a simple AGGREGATE?
--
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Bruce Weaver
Yes, my goal is different!
The aim of the copy & past was only showing that the macro can do exactly what I want, except computing the right weight_var value. I mean, if we can predict the logic of SPSS in writing the loops (it's evident looking to my copy and past), the only missing link was adding a proper weight_var value. I will check other methods as you all suggested, such as matching, aggregating, or crosstabs: these concepts are easy and I want to preserve your time, so don't worry about these options. the interesting thing was knowing IF SPSS macros could be more flexible in DO loops, for instance assigning a prearranged value to a variable, for each loop combination. I guess they aren't. |
Administrator
|
Allow me to demonstrate. Note that this will only work for 2. You would need to create slightly different macros for 3, 4, ... variables or design a much more elaborate parser. Won't do that for free. I can't think of any good reason to NOT use the table match approach (unless you are into S& M) . Think of the person having to maintain your code after you leave the project for whatever reason. Compare to your original and note the error of your ways. Your logic wouldn't work in any computer language due to incorrect indexing into the desired vector of weights. -- Compare and take note of the macro names ;-). DATA LIST FREE / a b other. BEGIN DATA 1 1 110 1 2 120 1 3 130 2 1 210 2 2 220 2 3 230 END DATA DEFINE !PleaseDontDoThisForNumerousReasons (Var1 !TOKENS(1) / Var2 !TOKENS(1) / WtVar !TOKENS(1) / Vals1 !CHAREND('/') / Vals2 !CHAREND('/') / Wts !CMDEND ) !LET !CpyWts=!Wts !DO !V1 !IN (!Vals1) !DO !V2 !IN (!VALS2) IF (!Var1 EQ !V1 AND !Var2 EQ !V2) !WtVar=!HEAD(!CpyWts). !LET !CpyWts=!TAIL(!CpyWts) !DOEND !DOEND !ENDDEFINE . PRESERVE. SET PRINTBACK ON MPRINT ON . !PleaseDontDoThisForNumerousReasons Var1 =a Var2=b WtVar = WgtAB Vals1= 1 2 / Vals2= 1 2 3 / Wts=11 12 13 21 22 23. LIST. RESTORE. *Result * . /* PleaseDontDoThisForNumerousReasons /* Var1 =a Var2=b WtVar = WgtAB Vals1= 1 2 / Vals2= 1 2 3 / Wts=11 12 13 21 22 23. /* 278 0 M> /* 279 0 M> . /* 280 0 M> IF ( a EQ 1 AND b EQ 1 ) WgtAB = 11. /* 281 0 M> IF ( a EQ 1 AND b EQ 2 ) WgtAB = 12. /* 282 0 M> IF ( a EQ 1 AND b EQ 3 ) WgtAB = 13. /* 283 0 M> IF ( a EQ 2 AND b EQ 1 ) WgtAB = 21. /* 284 0 M> IF ( a EQ 2 AND b EQ 2 ) WgtAB = 22. /* 285 0 M> IF ( a EQ 2 AND b EQ 3 ) WgtAB = 23. /* . DEFINE !DoThisInstead (Vars !CHAREND('/') / Raw !TOKENS(1) / WtTable !TOKENS(1) ). DATASET ACTIVATE !Raw. SORT CASES BY !Vars . DATASET ACTIVATE !WtTable . SORT CASES BY !Vars . MATCH FILES FILE !Raw / TABLE !WtTable / BY !Vars . !ENDDEFINE . /* Original Raw data */. DATA LIST FREE / a b other. BEGIN DATA 1 1 110 1 2 120 1 3 130 2 1 210 2 2 220 2 3 230 END DATA DATASET NAME RawData . /* Weight Table*/. DATA LIST FREE / a b wt. BEGIN DATA 1 1 11 1 2 12 1 3 13 2 1 21 2 2 22 2 3 23 END DATA. DATASET NAME wts . PRESERVE. SET MPRINT ON . !DoThisInstead Vars a b / Raw=RawData WtTable= Wts . LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by David Marso
Of course AGGREGATE with several BREAK variables and MODE=ADDVARIABLES would be the thw way to go. That would eliminate for the MATCH FILES tables for sample sizes. So the OP would only have to create the table for the pop sizes.
Be sure to check that there are no samples size cells that do not exist in the set of pop cells. And that there are no pop cells that do not have any cases.
Art Kendall
Social Research Consultants |
In reply to this post by David Marso
sorry for uploading the topic, but I am thinking about a generalization of the problem.
Imagine I had the same 3 variables of the example, a b other, but instead of creating a weight, I want only create a different segment for each possible combinations. (an ascending ordinal value would be enough, that is segment=1 for the first possible combination, segment=2 for the 2nd possible combination) In a a real life situation the original dataset would have also duplicates of each combinations DATA LIST FREE / a b other. BEGIN DATA 1 1 110 1 2 120 1 3 130 1 3 130 1 1 110 2 1 210 2 2 220 2 2 220 2 3 230 2 3 230 END DATA I would delete the duplicates, generating an aggregated file, then doing this comp segment=$CASENUM. and matching the results to the original file. Do you think it's a good method or are they best ways or macros you suggest? |
To create these segment numbers you could use the RANK command on the aggregated combination variable other:
RANK VARIABLES=other (A) /RANK /PRINT=YES /TIES=CONDENSE. /PRogman
|
thanks this solution is great for a unique study, but in a recurring study with more variables the rank could change every time: I receive data each month, if in the next month the data do not contain a possible combination, the rank would change, and so the value labels that I define would not match.
(the same problems occurs also with autorecode) I would like a solution that allows me to control the target values, as in normal IF statements. |
Administrator
|
given {a,b,c,d} 1:9 .
COMPUTE f=a*1000+b*100+c*10+d. generalize from there. See TEMPLATE subcommand in AUTORECODE. See TABLE subcommand in MATCH FILES (YES, we are back to that!!!!!!!). --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |