SPSSX Discussion

Usage Data


Perkins, Tiffany

Usage Data

Hello, I am trying to figure out the easiest way to handle this data:

I have 6 variables that represent different aspects of website usage for 32,000 people.

For example, one variable is health. Another is lifestyle. Another is family. Usage on a variable can range from 0 (no clicks on that category) to thousands of clicks on a category.

However, what I'd like to show is the different variable combination groups...totaling to the N of the sample.

For example, how many people launched health only? Lifestyle only? Family only? Health and lifestyle only? Health and family only? Health, lifestyle and family only? And so on, until I have isolated all 32,000 users within one of the groups.

What is the best way to do this analysis?

Thanks in advance!

Tiffany

Tiffany Perkins-Munn, Ph.D.
Adjunct Professor of Psychology
William Paterson University
Wayne, New Jersey
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Rich Ulrich

Re: Usage Data

The fancy way is to use log-linear modeling for 6 variables. Otherwise:

With only 6 variables, there are only 64 ( = 2^6 ) exclusive categories,
so you can create a variable with 64 categories and tabulate them.

I would probably look at the most frequent 3 or 4 variables in order
to focus on 8 or 16 categories. Then I would look closely at whether
there is anything interesting happening with the ones left out.

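This assumes var1, var3 and var6 have already been dichotomized to 0/1 (0 = no clicks, 1 = any clicks):

COMPUTE top3 = 100*var1 + 10*var3 + var6.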
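A minimal sketch of that idea in SPSS syntax, assuming the click counts are first dichotomized to 0/1. The names health, lifestyle and family come from the original post; health_any, lifestyle_any, family_any and pattern are hypothetical names introduced here for illustration. The three remaining categories are not named in the thread; they could be added the same way with further powers of 10.

```spss
* Flag any use of a category: 0 clicks stays 0, 1 or more clicks becomes 1.
RECODE health lifestyle family (0=0) (1 THRU HI=1)
  INTO health_any lifestyle_any family_any.
* Pack the flags into one code: hundreds = health, tens = lifestyle, units = family.
COMPUTE pattern = 100*health_any + 10*lifestyle_any + family_any.
* Show leading zeros so that family only prints as 001.
FORMATS pattern (N3).
* One row per exclusive combination.
FREQUENCIES VARIABLES=pattern.
```

With this coding, a value of 101 would mean health and family but not lifestyle, and 000 would mean none of the three. Every case with non-missing counts falls into exactly one value of pattern.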

--
Rich Ulrich

Date: Wed, 16 Jul 2014 18:48:48 +0000
From: [hidden email]
Subject: Usage Data
To: [hidden email]


Steve Peck

Re: Usage Data

How about a cluster analysis?

On 7/16/2014 7:28 PM, Rich Ulrich wrote:


-- 
Stephen C. Peck
Assistant Research Scientist
Achievement Research Lab
Research Center for Group Dynamics
Institute for Social Research
University of Michigan
426 Thompson Street, # 5136
Ann Arbor, MI  48109-1290
(734) 647-3683; fax (734) 936-7370
http://www.rcgd.isr.umich.edu/garp/
[hidden email]

Ruben Geert van den Berg

Re: Usage Data

In reply to this post by Perkins, Tiffany

I think there are two steps involved here:

1) Dichotomize the numbers of clicks. You proposed 0 versus 1(+) clicks per category, but you could also consider median splits, which can easily be created with RANK. For some examples, see http://www.spss-tutorials.com/rank/.

2) If you apply suitable value labels to these dichotomous variables, you can combine them by concatenating their value labels (via the VALUELABEL function) into a new, long string variable. Optionally, AUTORECODE that. For an example and some background on this approach, see http://www.spss-tutorials.com/combine-dichotomous-variables/.
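The two steps above might look roughly like this in syntax. This is only a sketch under assumptions: it uses the 0 versus 1(+) split rather than a median split, and the variable names health_any, lifestyle_any, family_any, combo and combo_code, as well as the label texts, are made up for illustration.

```spss
* Step 1: dichotomize clicks (0 = none, 1 = one or more).
RECODE health lifestyle family (0=0) (1 THRU HI=1)
  INTO health_any lifestyle_any family_any.
* Step 2: label the flags so that the labels can be pasted together.
VALUE LABELS health_any 0 'no health' 1 'health'
  /lifestyle_any 0 'no lifestyle' 1 'lifestyle'
  /family_any 0 'no family' 1 'family'.
* Concatenate the value labels into one long string variable.
STRING combo (A60).
COMPUTE combo = CONCAT(RTRIM(VALUELABEL(health_any)), ' + ',
  RTRIM(VALUELABEL(lifestyle_any)), ' + ',
  RTRIM(VALUELABEL(family_any))).
* Optionally convert the string into a labelled numeric variable.
AUTORECODE VARIABLES=combo /INTO combo_code.
FREQUENCIES VARIABLES=combo_code.
```

The frequency table then lists each combination by a readable label such as "health + no lifestyle + family" rather than by a numeric code.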

HTH,

Ruben

Art Kendall

Re: Usage Data

In reply to this post by Perkins, Tiffany

I am not sure exactly what you are looking for, but on one reading of your post, you could coarsen your measures and use MULT RESPONSE or CTABLES.

One way to coarsen is to do something like:
RECODE health lifestyle family (0=0) (1 THRU HI=1) (ELSE=COPY) INTO health2 lifestyle2 family2.

Other recodes would give you different ways to coarsen your measures, e.g.:
RECODE health lifestyle family (0=0) (1 THRU 10=1) (11 THRU HI=2) (ELSE=COPY) INTO health2 lifestyle2 family2.
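Once the 0/1 versions exist, the tabulation itself could be sketched as follows. MULT RESPONSE counts how many people used each category, so its groups overlap. A layered CROSSTABS gives the mutually exclusive combinations asked about in the original post. CROSSTABS is swapped in here as a simpler stand-in for CTABLES, and the group name $used and its label are hypothetical.

```spss
* Assumes health2, lifestyle2 and family2 come from the first RECODE above (0/1).
* Overlapping counts: how many people used each category at least once.
MULT RESPONSE GROUPS=$used 'Categories used' (health2 lifestyle2 family2 (1))
  /FREQUENCIES=$used.
* Exclusive counts: each cell of the layered table is one combination.
CROSSTABS /TABLES=health2 BY lifestyle2 BY family2 /CELLS=COUNT.
```

With all six categories the layered table gets unwieldy, so a single combined pattern variable may be easier to read.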

Art Kendall
Social Research Consultants
