Hello,
I've been struggling with this problem: I need to know how many values a variable has across all cases in a file. The variable (Dest_code) is defined as scale, though I think it's really nominal because it's not continuous. Since it's not defined as nominal, there are no values/labels assigned in Variable View, which means I can't open the Values column there to see how many values there are. So I ran Frequencies for this variable, and now I can see how many values there are by counting them manually (see attached picture). There are 28 values, including 0. <http://spssx-discussion.1045642.n5.nabble.com/file/t341427/Screen_Shot_2018-04-12_at_10.png>
But I want to find a way to calculate this, because I need this number for a few hundred subgroups, and counting manually for a few hundred subgroups is impossible! Is there a function that allows me to do that? Many thanks for your attention to this!
-- Sent from: http://spssx-discussion.1045642.n5.nabble.com/
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
You are asking how many DISTINCT (UNIQUE) values there are, aren't you?
I kind of think this will work, but I haven’t tried it (or ever needed to do it), and others may have better ideas. My idea is to use AGGREGATE, as in

Aggregate outfile=* /break=group dest_code /ncount=nu.
* That will give you, for each group, what you got from Frequencies.
* Next, the recode marks that a value is present one or more times.
Recode ncount (lo thru hi=1).
Aggregate outfile=* /break=group /nvalues=sum(ncount).

Gene Maguin
In reply to this post by saharazh
Are you after something like this?
data list list /v1 gr.
begin data
7 1
6 2
4 1
6 2
7 1
3 2
6 3
5 3
8 3
7 2
7 1
8 2
4 3
7 2
3 1
1 2
4 2
7 1
3 1
8 1
7 1
3 2
7 3
7 2
4 1
0 2
7 3
6 2
8 1
3 2
7 3
8 2
7 1
7 2
4 2
3 2
1 1
6 2
5 3
3 1
7 1
3 2
8 1
5 2
4 3
1 2
6 1
5 2
3 3
7 1
end data.
dataset name z.

* First pass: one row per gr x v1 combination.
DATASET DECLARE zz.
AGGREGATE /OUTFILE='zz' /BREAK=gr V1 /N_BREAK=N.
DATASET ACTIVATE zz.

* Second pass: count the rows (distinct v1 values) within each gr.
DATASET DECLARE zzz.
AGGREGATE /OUTFILE='zzz' /BREAK=gr /N_BREAK=N.
DATASET ACTIVATE zzz.
list.

gr N_BREAK
 1       6
 2       8
 3       6
Here is another way.

data list list /v1 gr.
begin data
7 1
6 2
4 1
6 2
7 1
3 2
6 3
5 3
8 3
7 2
7 1
8 2
4 3
7 2
3 1
1 2
4 2
7 1
3 1
8 1
7 1
3 2
7 3
7 2
4 1
0 2
7 3
6 2
8 1
3 2
7 3
8 2
7 1
7 2
4 2
3 2
1 1
6 2
5 3
3 1
7 1
3 2
8 1
5 2
4 3
1 2
6 1
5 2
3 3
7 1
end data.

* Rank v1 within each gr, with tied values sharing one rank and ranks kept consecutive.
RANK VARIABLES=v1 (A) BY gr /RANK /PRINT=YES /TIES=CONDENSE.
formats v1 gr rv1 (f2).
* The highest rank within a group is then its number of distinct values.
sort cases by gr.
split file by gr.
descriptives variables = rv1 /statistics = max.

-----
Art Kendall
Social Research Consultants
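The idea behind a rank-based count can be checked outside SPSS: when tied values share a rank and the ranks are consecutive (a "condensed" or dense rank), the largest rank within a group equals the number of distinct values in that group. The following is only a plain-Python sketch of that reasoning, not SPSS code; the helper name dense_rank_max and the toy rows are made up for illustration.

```python
from collections import defaultdict

def dense_rank_max(rows):
    """Return {group: largest dense rank of the values in that group}.

    rows is an iterable of (value, group) pairs.
    Hypothetical helper, for illustration only.
    """
    by_group = defaultdict(list)
    for value, group in rows:
        by_group[group].append(value)

    result = {}
    for group, values in by_group.items():
        # Dense ranking: the i-th smallest distinct value gets rank i + 1,
        # so tied values share a rank and no rank is skipped.
        rank_of = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
        ranks = [rank_of[v] for v in values]
        result[group] = max(ranks)
    return result

# Toy data (assumed, not from the thread): (value, group) pairs.
rows = [(7, 1), (4, 1), (7, 1), (3, 1), (6, 2), (6, 2), (3, 2)]
print(dense_rank_max(rows))
```

Group 1 holds the values 7, 4, 7, 3 (three distinct values) and group 2 holds 6, 6, 3 (two distinct values), so the largest dense ranks are 3 and 2 respectively.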
In reply to this post by Kirill Orlov
AGGREGATE is one way. Another, which may be more convenient if the subgroups are sets of cases defined by logical expressions, is this (explanation at the end).

* Run this code once to define the function named counts. Be sure to preserve the indentation.
* Note that it creates a variable named condition, which should not already exist in the dataset.
begin program.
import spss, spssdata

def counts(condition, varname):
    """count number of distinct values for variable varname
    among cases satisfying condition.
    condition is a logical expression"""

    spss.Submit("""compute condition = %s.""" % condition)
    # Collect the values of varname for cases where condition is true (1).
    values = set(item[0] for item in spssdata.Spssdata([varname, "condition"], names=False)
        if item[1] == 1)
    print("***** %s : %d" % (condition, len(values)))
end program.

* Use this function. For example, among cases where jobcat is not 1, how many distinct values does jobcat have.
begin program.
counts("jobcat ne 1", "jobcat")
end program.

Since you have "hundreds of subgroups", the most convenient method will depend a lot on how those groups are defined.

Explanation of the Python code: it uses the Spssdata class in the spssdata module to read the cases and keeps those where the condition is satisfied. The values of jobcat are stored as a set, which means there will be only one set member for any particular value. Then the len function calculates the size of the set.
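The set-and-len idea in the explanation above can be tried without SPSS. This is a minimal plain-Python sketch under stated assumptions: the cases are ordinary dictionaries, each subgroup is given as a Python predicate rather than an SPSS logical expression, and the names count_distinct, cases, and conditions are invented for illustration.

```python
def count_distinct(cases, condition, varname):
    """Count distinct values of varname among cases where condition(case) is true.

    Hypothetical stand-in for the SPSS-based counts function: a set keeps
    one member per distinct value, and len gives the size of the set.
    """
    return len({case[varname] for case in cases if condition(case)})

# Toy cases (assumed data, not from the thread).
cases = [
    {"jobcat": 1, "gender": "f"},
    {"jobcat": 2, "gender": "m"},
    {"jobcat": 3, "gender": "m"},
    {"jobcat": 2, "gender": "f"},
    {"jobcat": 1, "gender": "m"},
    {"jobcat": 3, "gender": "f"},
]

# Many subgroups can be handled by looping over a table of labelled conditions.
conditions = {
    "jobcat ne 1": lambda c: c["jobcat"] != 1,
    "gender eq 'f'": lambda c: c["gender"] == "f",
}
for label, condition in conditions.items():
    print("*****", label, ":", count_distinct(cases, condition, "jobcat"))
```

Looping over a dictionary of conditions is one way to scale to hundreds of subgroups, provided each subgroup can be written as a condition.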
In reply to this post by saharazh
Thank you for replying to me. I really appreciate it. I can't fully follow your solutions, and I've now learnt that giving you a sample dataset might be the most efficient way, so here you go, plus a more complete description of the problem I have.

Sample dataset:

Group  Dest_code (destination code)
1      102
2      102
3      104
4      106
3      203
3      204
7      206
4      211
2      213
7      102

This is a simplified version of my data. In reality the variable "Group" has a few hundred subgroups, and my ultimate goal is to calculate how many destinations each subgroup of Group goes to. Could you be so kind as to have another look and see whether the solutions you've given still apply? Thanks so much!!!
Please explain how these subgroups are defined.
In reply to this post by saharazh
It's a scale variable, although it should be a nominal one since it's not intrinsically continuous. But I suppose you are asking whether the variable Group is defined by logical expressions, and I haven't figured out what that means yet. I'm pulling my hair out...
In reply to this post by saharazh
A big thank you to all who have replied to me. Really appreciate it. I'm not proficient in using syntax or Python yet (though I'm keen on learning), so it'll take me a while to figure out what you guys mean.
In reply to this post by saharazh
Yes, that's correct.
In reply to this post by saharazh
The previously posted AGGREGATE solutions still apply. Have you bothered to try them?

AGGREGATE OUTFILE=* /BREAK=Group Dest_code /Count=N.
AGGREGATE OUTFILE=* /BREAK=Group /Count=N.

The first AGGREGATE simply accumulates counts for each Group x Dest_code combination, leaving one case per combination. The second AGGREGATE counts those cases within each Group, which is the number of distinct Dest_code values in that group. RTFM about AGGREGATE and verify this for yourself.

-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
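For readers who want to trace the two AGGREGATE passes by hand, here is a rough plain-Python sketch of the same logic applied to the sample dataset posted earlier in the thread. It is not SPSS code; the variable names cases, combos, and distinct_per_group are made up for illustration.

```python
from collections import Counter

# (Group, Dest_code) pairs from the sample dataset posted in the thread.
cases = [
    (1, 102), (2, 102), (3, 104), (4, 106), (3, 203),
    (3, 204), (7, 206), (4, 211), (2, 213), (7, 102),
]

# Pass 1 (like BREAK=Group Dest_code): one entry per distinct
# Group x Dest_code combination, with how often it occurs.
combos = Counter(cases)

# Pass 2 (like BREAK=Group): count how many combination entries each
# Group has, i.e. how many distinct Dest_code values it contains.
distinct_per_group = Counter(group for group, _dest in combos)

for group in sorted(distinct_per_group):
    print(group, distinct_per_group[group])
```

On this sample, Group 1 has one destination, Groups 2, 4, and 7 have two each, and Group 3 has three.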
In reply to this post by saharazh
Thanks so much! I'll try it.
In reply to this post by saharazh
It worked! Super! Your explanation about the double aggregate really helps.
Thank you all for contributing your ideas. Much appreciated. I'll try the Python solution and report here if it works.