Hi. I'd greatly appreciate some help with this issue.

I'm organising a litter count where litter items are categorised into 83 types (e.g. small plastic bottles (SPB), large plastic bottles (LPB), cigarette packets (CP), etc). Furthermore, I need to record brands within each litter type and the number of items of each brand (e.g. 11 small plastic bottles with the Coke brand (SPB_COKE), 6 small plastic bottles with the Pepsi brand (SPB_PEPSI), 4 large plastic bottles with the Coke brand (LPB_COKE), 3 cigarette packets with the Camel brand (CP_CAMEL) and so forth). The litter count is conducted at 983 sites around the country.

This data is a bit difficult to represent using SPSS's 'flat' structure. I presume that the best way of doing so would be to have each case as a site where litter was counted. Then, each brand within each litter type is a separate variable.

For example:

SITE  SPB_COKE  SPB_PEPSI  LPB_COKE  LPB_PEPSI  ...  CP_CAMEL
1     11        6          4         0               3
2     9         4          0         2               0
3     0         1          0         5               2
...

The problem with this is that it will result in a potentially huge number of variables, and it seems like there must be a better way of representing this information. During the last litter count we recorded hundreds of brands. Multiplying these by 83 different litter types results in a data array that will be difficult to manage and work with.

Furthermore, when I am asked to provide totals by brand, I'll have to trawl through the data file, find all variables with the brand 'COKE' appended to the item type name, and create new variables such as 'Total_COKE' which are the computed sum of all values in all of the 'COKE' variables (i.e. Total_COKE = SPB_COKE + LPB_COKE and so forth).

I just want to know if there's a better way of representing this information. As mentioned, I need to be able to provide totals, not just of brands (e.g. total COKE regardless of item type) but also of item types (e.g. total SPBs regardless of brand). There's got to be a simpler way of storing this information.

I'd be very grateful if anyone could help me, or even let me know if there ISN'T a better way so I will stop puzzling over it and just accept my fate!

Thanks very much.
|
At 06:14 PM 5/4/2007, Ben wrote:
>I'm organising for a litter count where litter items are categorised
>into 83 types (e.g. small plastic bottles (SPB), large plastic bottles
>(LPB), cigarette packets (CP), etc). Furthermore, I need to record
>brands within each litter type and the number of items of each brand
>(e.g. 11 small plastic bottles with the Coke brand (SPB_COKE), 6 small
>plastic bottles with the Pepsi brand (SPB_PEPSI), 4 large plastic
>bottles with the Coke brand (LPB_COKE), 3 cigarette packets with the
>camel brand (CP_CAMEL) and so forth). The litter count is conducted at
>983 sites around the country.

So, you're studying litter in Australia? I could find you plenty here in Providence, Rhode Island, if you need a comparison group.

>This data is a bit difficult to represent using SPSS's 'flat'
>structure. I presume that the best way of doing so would be to have
>each case as a site where litter was counted. Then, each brand within
>each litter type is a separate variable.
>
>For example:
>
>SITE  SPB_COKE  SPB_PEPSI  LPB_COKE  LPB_PEPSI  ...  CP_CAMEL
>1     11        6          4         0               3
>2     9         4          0         2               0
>3     0         1          0         5               2
>...
>
>The problem with this is that it will result in a potentially huge
>number of variables, and it seems like there must be a better way of
>representing this information. During the last litter count we
>recorded hundreds of brands. Multiplying these by 83 different litter
>types results in a data array that will be difficult to manage and
>work with.

What you're talking about is called 'wide' data organization, and you've correctly listed its disadvantages. I recommend - many would recommend - 'long' data organization, spreading over many records rather than many variables. In your case, to represent what you've given above would take four variables:

Site  Litter_Cat  Brand  Count
1     SPB         Coke   11
1     SPB         Pepsi  6
1     LPB         Coke   4
1     LPB         Pepsi  0   <probably, no record>
...
1     CP          Camel  3
2     SPB         Coke   9

('Litter_Cat' is your litter categories, not to be confused with kitty-litter.)
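The reshaping from 'wide' to 'long' described above can be sketched roughly as follows. This is an illustration in Python rather than SPSS syntax, using only the example counts from the thread. The function name `wide_to_long`, the `drop_zero` option, and the assumption that every variable name splits into type and brand at the first underscore are all hypothetical choices made for this sketch.

```python
# Wide layout: one record per site, one column per litter-type/brand pair.
# The counts are the illustrative values from the example table.
wide_rows = [
    {"SITE": 1, "SPB_COKE": 11, "SPB_PEPSI": 6, "LPB_COKE": 4, "LPB_PEPSI": 0, "CP_CAMEL": 3},
    {"SITE": 2, "SPB_COKE": 9, "SPB_PEPSI": 4, "LPB_COKE": 0, "LPB_PEPSI": 2, "CP_CAMEL": 0},
    {"SITE": 3, "SPB_COKE": 0, "SPB_PEPSI": 1, "LPB_COKE": 0, "LPB_PEPSI": 5, "CP_CAMEL": 2},
]


def wide_to_long(rows, drop_zero=True):
    """Return (site, litter_cat, brand, count) tuples, one per nonzero cell.

    Assumes each column name is TYPE_BRAND, split at the first underscore.
    """
    long_rows = []
    for row in rows:
        for column, count in row.items():
            if column == "SITE":
                continue
            if drop_zero and count == 0:
                continue  # a zero count simply gets no record
            litter_cat, brand = column.split("_", 1)
            long_rows.append((row["SITE"], litter_cat, brand.title(), count))
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(*record)
```

Skipping zero counts keeps the long file much smaller than sites x types x brands, since most brands will not turn up at most sites.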
>Furthermore, when I am asked to provide totals by brand, I'll have to
>trawl through the data file and find all variables with the brand
>'COKE' appended to the item type name, and create new variables such
>as 'Total_COKE' which are the computed sum of all values in all of the
>'COKE' variables (i.e. Total_COKE = SPB_COKE + LPB_COKE and so forth).

And this would be a piece of cake with the structure I've outlined above, and AGGREGATE.

>I'd be very grateful if anyone could help me, or even let me know if
>there ISN'T a better way so I will stop puzzling over it and just
>accept my fate!

I think this should do you just fine. It'll give you a huge explosion in number of cases; but SPSS handles a great many cases without trouble.

Go for it!
Richard
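To show why totals become easy once the data are long, here is a minimal sketch of the kind of grouped sum that AGGREGATE would produce, written in Python for illustration only. The helper `total_by` and its field names are hypothetical; the rows are the ones listed in the long-format example, with the zero-count record left out.

```python
from collections import defaultdict

# Long layout: (site, litter_cat, brand, count), taken from the example rows.
long_rows = [
    (1, "SPB", "Coke", 11),
    (1, "SPB", "Pepsi", 6),
    (1, "LPB", "Coke", 4),
    (1, "CP", "Camel", 3),
    (2, "SPB", "Coke", 9),
]

# Position of each grouping field within a row tuple.
FIELD_INDEX = {"site": 0, "litter_cat": 1, "brand": 2}


def total_by(rows, field):
    """Sum the count column within each distinct value of `field`."""
    index = FIELD_INDEX[field]
    totals = defaultdict(int)
    for row in rows:
        totals[row[index]] += row[3]
    return dict(totals)


brand_totals = total_by(long_rows, "brand")      # totals regardless of item type
type_totals = total_by(long_rows, "litter_cat")  # totals regardless of brand
print(brand_totals)
print(type_totals)
```

The same function serves both requests, totals by brand and totals by item type, because the grouping variable is just a column rather than something encoded in variable names.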
|
On Sat, 5 May 2007 02:34:03 -0400, Richard Ristow wrote:

>I recommend - many would recommend - 'long' data organization,
>spreading over many records rather than many variables.
>[...]
>And this would be a piece of cake with the structure I've outlined
>above, and AGGREGATE.

Richard, that's great. So simple! I'm kicking myself for not thinking of it.

In fact I've been playing around with this sort of data, and found that if I weight cases by 'Count', frequency tables and crosstabulation (e.g. brand by litter type) provide pretty much all the information I need.

Thanks so much.
Ben.
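The effect of weighting cases by 'Count' before crosstabulating can be sketched as below, again in Python purely for illustration. Each long record contributes its count, rather than 1, to the brand-by-type cell. The rows are the nonzero cells of the three example sites; the variable names are assumptions made for this sketch.

```python
from collections import Counter

# Long layout: (site, litter_cat, brand, count) for the three example sites.
long_rows = [
    (1, "SPB", "Coke", 11), (1, "SPB", "Pepsi", 6), (1, "LPB", "Coke", 4), (1, "CP", "Camel", 3),
    (2, "SPB", "Coke", 9), (2, "SPB", "Pepsi", 4), (2, "LPB", "Pepsi", 2),
    (3, "SPB", "Pepsi", 1), (3, "LPB", "Pepsi", 5), (3, "CP", "Camel", 2),
]

# Weighted crosstab: add `count` to the cell instead of counting records.
crosstab = Counter()
for _site, litter_cat, brand, count in long_rows:
    crosstab[(brand, litter_cat)] += count

brands = sorted({brand for brand, _ in crosstab})
types = sorted({cat for _, cat in crosstab})

# Print brand rows against litter-type columns, with a row total.
print("Brand".ljust(8), *(t.rjust(5) for t in types), "Total".rjust(6))
for brand in brands:
    cells = [crosstab[(brand, t)] for t in types]  # missing cells read as 0
    print(brand.ljust(8), *(str(c).rjust(5) for c in cells), str(sum(cells)).rjust(6))
```

Without the weighting, each cell would only say how many site records mention that brand and type, not how many items were found.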
