Dear All,
Please suggest a way to deal with the following task. I have a database with cases from 5 different cities, and I need to run the same statistical procedures for each city. To keep it simple, let's say I want to run t-testA and t-testB for city 1, then for city 2, and so on. So I thought of doing something like this:
loop #n = 1 to 5.
filter = #n.
t-testA.
t-testB.
end loop.
(In this example the maximum for #n is 5 because the cities are coded 1 to 5.)
This specific approach would not work, because LOOP is used for transformation commands only. If I am right, LOOP goes from case to case in the database, repeating the commands for the current case only, and that is why it can be used only for transformations. In my case I need a command that will go through the entire dataset, select the cases from city 1 and run the two t-tests, then go through the entire dataset again, select the cases from city 2, and so on, 5 times. What command could you suggest for such a situation?
Sincerely, Eduard.
|
If you want the cities in one table:
sort cases by city.
split file layered by city.
Or, if you want each city in a separate table:
split file separate by city.
However, given your example, is it possible that you want a more complex ANOVA than just separate t-tests?
On 5/22/2011 5:44 AM, EduSaR wrote:
-- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Repeated-Statistical-Procedure-for-different-subsets-tp4416436p4416436.html Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants |
Dear Art,
Thank you for the prompt reply. Yes, the analysis is more complicated, which is why SPLIT alone does not satisfy me. Moreover, I have a set of dependent and independent variables. To avoid writing the same statistical procedures for each combination of variables, and then having to apply any correction to ALL of the copies, I wanted to write one procedure and apply it to different combinations of variables and to different subsets.
Actually, in my question I should have mentioned that I am looking for a COMMAND that will also substitute different variables from a list of variables, e.g. var1 var2 var3 ... var10:
t-test var1 by sex.
then
t-test var2 by sex.
and so on.
Thank you, Eduard.
|
or
anova variables = var1, var2, var37 to var41, var99 by sex(1,2) city(1,5), race(1,4)... with income, height, body_weight...
or some other kind of General Linear Model. Given that you have enough cases, which kind would depend on how many dependent variables you have and their level of measurement, and on how many independent variables you have and their level of measurement.
Are your subsets completely separate? Are you possibly thinking of using macros or Python to shortcut writing a lot of syntax?
It is difficult to be more specific in my suggestions without details such as how many cases you have, whether your study is experimental, quasi-experimental, or just reporting, what questions you are trying to answer, whether you are trying to set up a procedure for ongoing analysis by other people, what the nature and level of measurement of your variables are, etc.
SPLIT may be part of the answer, crossing or nesting may be part of the answer, repeated measures may be part of the answer, etc.
Art Kendall
Social Research Consultants |
In reply to this post by EduSaR
This is what SPLIT FILE is for. Besides the syntax, you will find it on the Data menu.
Jon Peck Senior Software Engineer, IBM [hidden email] new phone: 720-342-5621
From: EduSaR <[hidden email]> To: [hidden email] Date: 05/22/2011 03:47 AM Subject: [SPSSX-L] Repeated Statistical Procedure for different subsets Sent by: "SPSSX(r) Discussion" <[hidden email]>
|
In reply to this post by EduSaR
The extension commands SPSSINC SPLIT DATASET and SPSSINC PROCESS FILES can be used to partition a dataset and then apply a sequence of commands to each partition. These require the Python Essentials and can be obtained from the SPSS Community site at www.ibm.com/developerworks/spssdevcentral
Building sets of combinations of variables would be most easily done with a Python program, but I hope this is not just a fishing expedition.
Jon Peck Senior Software Engineer, IBM [hidden email] new phone: 720-342-5621
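For readers wondering what such a Python program might look like, here is a minimal sketch. It only builds the syntax text: one T-TEST per combination of dependent variable and grouping variable, wrapped in SPLIT FILE so that every test is repeated for each city. The function name build_ttest_syntax, the variable names var1 to var10, and the codes 1 and 2 for sex are assumptions borrowed from the examples in this thread, not facts about the actual data.

```python
from itertools import product


def build_ttest_syntax(dep_vars, group_vars, split_var="city"):
    """Return SPSS syntax text: one T-TEST per (dependent, grouping) pair,
    repeated for every value of split_var by means of SPLIT FILE.

    group_vars maps a grouping variable name to its two group codes.
    All names here are hypothetical placeholders.
    """
    lines = [
        f"SORT CASES BY {split_var}.",
        f"SPLIT FILE LAYERED BY {split_var}.",
    ]
    for dep, (grp, (low, high)) in product(dep_vars, group_vars.items()):
        lines.append(f"T-TEST GROUPS={grp}({low} {high}) /VARIABLES={dep}.")
    lines.append("SPLIT FILE OFF.")
    return "\n".join(lines)


# var1 ... var10 compared by sex, as in the question (codes 1 and 2 assumed).
dep_vars = [f"var{i}" for i in range(1, 11)]
syntax = build_ttest_syntax(dep_vars, {"sex": (1, 2)})
print(syntax)
```

Inside SPSS with the Python Essentials installed, the generated text could be run from a BEGIN PROGRAM ... END PROGRAM block with import spss followed by spss.Submit(syntax). Generating text first and submitting it afterwards makes it easy to inspect the syntax before anything is executed, and a correction to the procedure has to be made in one place only.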
|
In reply to this post by Art Kendall
Thank you, Art,
In the meantime I was searching the net and realized that the best way would be to learn more about macros or Python (I have never used them before). It is an experimental study with two groups, with repeated analyses in the middle of the data collection, and I am trying to develop the procedures while it is being conducted.
At this stage the data is arriving in portions (case by case or variable by variable), so I am entering it into the database right away and running the basic (so far) procedures. But my impression is that it is time to learn how to shorten some syntax procedures, especially when they are identical for different combinations, e.g.
1. compute the difference between measurements, absolute and in %
2. compute a set of index variables from the results of 1
3. then run independent- and paired-samples comparisons
So it seems that I must learn macros or Python now.
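As an illustration of how steps 1 and 3 might be shortened, here is a rough Python sketch that generates the repeated syntax from a list of measurement pairs. Everything specific in it is an assumption: the function name build_change_syntax, the variable names, the grouping variable group with codes 1 and 2, and the definition of the % difference as 100 * (second - first) / first. Step 2 is left out because the index variables are not described in the thread.

```python
def build_change_syntax(pairs, group_var="group", codes=(1, 2)):
    """Return SPSS syntax text for each (first, second) measurement pair:
    COMPUTE commands for the absolute and % difference, a paired-samples
    t-test on the two measurements, and an independent-samples t-test on
    the two difference variables. All names are hypothetical.
    """
    low, high = codes
    lines = []
    for first, second in pairs:
        diff = f"{second}_diff"
        pct = f"{second}_pct"
        # Step 1: absolute and percentage change (assumed definitions).
        lines.append(f"COMPUTE {diff} = {second} - {first}.")
        lines.append(f"COMPUTE {pct} = 100 * ({second} - {first}) / {first}.")
        # Step 3: paired comparison, then comparison between the two groups.
        lines.append(f"T-TEST PAIRS={first} WITH {second} (PAIRED).")
        lines.append(
            f"T-TEST GROUPS={group_var}({low} {high}) /VARIABLES={diff} {pct}."
        )
    return "\n".join(lines)


# Hypothetical measurement pairs; replace with the real variable names.
pairs = [("weight_pre", "weight_post"), ("score_pre", "score_post")]
change_syntax = build_change_syntax(pairs)
print(change_syntax)
```

Adding another measurement then means adding one pair to the list, and a change to the procedure is made once in the function rather than in every copy of the syntax.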
|
Python has pretty much made macros obsolete.
However, given that you have an experiment (i.e., have random assignment to treatment) and repeated measures, it is highly probable that a good general linear model that takes into account planned comparisons, crossing, nesting, missing data, and repeats would need things like t-tests only as post-hoc results.
You really should think through the overall analysis first and work from the top down rather than from the bottom up. The basic procedures at this time should most likely be limited to quality assurance purposes. "Findings" based on incomplete data are very likely to be spurious and just confuse thinking by inducing things like anchoring effects.
One thing that helps some people is to build a simulation that has the kind of data you expect in the long run and to develop your syntax for the overall model using that. Simulating also helps in getting a handle on the thought process of the analysis.
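A simulation of this kind can be quite small. The sketch below is one possible way to do it in plain Python, under assumptions that are entirely made up for illustration: two groups of 50 cases, three measurement waves, normally distributed scores, and a treatment effect that grows over the waves in group 2 only. The names simulate, write_csv, score1 to score3, and all numeric settings are hypothetical.

```python
import csv
import random


def simulate(n_per_group=50, n_waves=3, effect=0.5, seed=2011):
    """Return a list of dicts, one per simulated case, with an id,
    a group code (1 = control, 2 = treatment) and one score per wave."""
    rng = random.Random(seed)  # fixed seed so the data can be regenerated
    rows = []
    case_id = 0
    for group in (1, 2):
        for _ in range(n_per_group):
            case_id += 1
            baseline = rng.gauss(50, 10)  # subject-level starting score
            row = {"id": case_id, "group": group}
            for wave in range(1, n_waves + 1):
                # Only the treatment group drifts upward after wave 1.
                shift = effect * 10 * (wave - 1) if group == 2 else 0.0
                row[f"score{wave}"] = round(baseline + shift + rng.gauss(0, 5), 2)
            rows.append(row)
    return rows


def write_csv(rows, path):
    """Write the simulated cases to a CSV file that SPSS can read."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


rows = simulate()
print(rows[0])
# write_csv(rows, "simulated.csv")  # uncomment to save the file
```

Because the true effect is known in simulated data, the syntax for the overall model can be checked against it before the real data collection is complete.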
Art Kendall
Social Research Consultants |
Art,
I used the preliminary analysis just to get an early impression of the quality of the data and to identify cases that look strange. I do not share it with the researchers in any way, to avoid confusion or other unwanted effects.
Thank you very much for your advice! I should probably think about using a simulation at this stage too. Many thanks.
|