Detecting response patterns

Detecting response patterns

Snuffy Dog
Dear Good Listers,
 
I am looking for code to calculate a value for variables Solution1, Solution2 and Solution3.
An input program has been included.  The way in which each solution variable is computed is
described below as are two rules which need to be followed.
 
I would be grateful for any assistance which can be provided. This is another tricky problem that is far beyond my humble skills.
 
Kind regards,
 
Jonahtan
 
**************************************************************************************************************************.
DATA LIST / ID 1 v1 TO v34 2-35 solution1 TO solution3 36-38.
BEGIN DATA
11111511111111223334414541254325444103
21111511111155222224414541254325444000
31111511113111222224414541254325444000
41111511211111225224414541254325444000
51111444444111222224444431254444444020
61111443444111222224444431254444444010
71111511511111222224414541254325444000
81111511333111223334414541333325444003
END DATA.
 
**************************************************************************************************************************.
*   There are three solution variables and each records a specific response pattern which may occur more than once, with multiple response
*   patterns possible for the same ID.
*
*   COMPUTE SOLUTION 1 - If there are more than 7 consecutive integers of value '1' repeated in succession across v1-v34 solution1 = 1
*   ELSE solution1 = 0.
*   COMPUTE SOLUTION 2 - If there are more than 5 consecutive integers of value '4' repeated in succession across v1-v34 solution2 = 1
*   ELSE solution2 = 0.
*   COMPUTE SOLUTION 3 - If there are more than 2 consecutive integers of value '3' repeated in succession across v1-v34 solution3 = 1
*   ELSE solution3 = 0.

*   RULE A: IF any of these consecutive sets of strings (e.g., 444444) occurs more than once for the same ID then the count of these response
*   sets should also be recorded as the solution variable value which is, in essence, a count of the occurrences of each response pattern. For
*   example, for ID=8 there are 3 sets of consecutive integers of value '3' in separate successions across v1-v34 and this is why solution3 = 3.
*  
*   RULE B: Also note that  it is possible for there to be more than one type of response pattern for each ID, for example ID=1 has two
*   response patterns.
**************************************************************************************************************************.

Re: Detecting response patterns

David Marso
Administrator
In reply to this post by Snuffy Dog
See VECTOR.
See LOOP.
--

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Detecting response patterns

David Marso
Administrator
OTOH:  The following will enumerate all sequences (since we've been down the VECTOR/LOOP path before).
I will leave it to you to implement the rather trivial logic to sniff out the logical conditions.
--
VARSTOCASES  /MAKE long FROM v1 TO v34.
SPLIT FILE BY ID long.
COMPUTE @=1.
CREATE Counter=CSUM(@).
DO IF ID=LAG(ID) AND Counter LE LAG(COUNTER).
COMPUTE @=LAG(@)+1.
ELSE.
COMPUTE @=LAG(@).
END IF.
AGGREGATE OUTFILE * / BREAK ID @ long/ NC=MAX(Counter).
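
For readers who want to see what "enumerate all sequences" amounts to, here is a rough Python sketch of the same idea outside SPSS. It is not a translation of the syntax above: the function name `enumerate_runs` and the choice to hold v1-v34 as one 34-character string are assumptions made purely for illustration.

```python
from itertools import groupby

def enumerate_runs(responses):
    """Return (value, run_length) pairs for each maximal run, in order."""
    return [(value, len(list(group))) for value, group in groupby(responses)]

# v1-v34 for ID=1: columns 2-35 of the first data line in the original post.
row1 = "11111511111111223334414541254325444103"[1:35]
runs = enumerate_runs(row1)

for value, length in runs:
    print(value, length)
```

Once every run is listed with its value and length, the conditions from the original post reduce to filtering that list (for example, runs of '1' whose length exceeds 7) and counting what is left.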

Re: Detecting response patterns

Andy W
In reply to this post by Snuffy Dog
Here is how I would go about it. In a nutshell, I concatenate all the values into one string variable, use the INDEX string function to find each pattern, and remove the matching substring whenever it is found. You only need to loop as many times as the pattern can possibly occur (I believe each loop except the last has one extra iteration).

The code is repetitive enough that it could certainly be wrapped up in a macro. As far as I can tell, the solutions you posted in the data list are not correct, unless there are other stipulations. E.g., the first row has only 1 instance of two 3s in a row (counting mutually exclusive matches), and the fifth row has 3 instances of five consecutive 4s. I added an additional row to show that it will count multiple instances of 7 ones in a row.

*****************************************************.
DATA LIST / ID 1 v1 TO v34 2-35 solution1 TO solution3 36-38.
BEGIN DATA
11111511111111223334414541254325444103
21111511111155222224414541254325444000
31111511113111222224414541254325444000
41111511211111225224414541254325444000
51111444444111222224444431254444444020
61111443444111222224444431254444444010
71111511511111222224414541254325444000
81111511333111223334414541333325444003
11111111111111123334414541254325444201
END DATA.

*SOLUTION 1.

string vall (A34).
compute vall = " ".
do repeat v = v1 to v34.
compute vall = CONCAT(RTRIM(vall),STRING(v,F1.0)).
end repeat.
exe.

string tempv (A34).
compute tempv = vall.
compute sol1 = 0.
*Search for 7 consecutive ones - then take out if found, then search again.
loop #i = 1 to 5.
    compute #find = INDEX(tempv,"1111111").
    if #find > 0 sol1 = sol1 + 1.
    if #find > 0 tempv = CONCAT(substr(tempv,1,#find - 1),RTRIM(substr(tempv,#find+7))).
end loop.
exe.


*SOLUTION 2.

compute tempv = vall.
compute sol2 = 0.
*Search for 5 consecutive 4s - then take out if found, then search again.
loop #i = 1 to 7.
    compute #find = INDEX(tempv,"44444").
    if #find > 0 sol2 = sol2 + 1.
    if #find > 0 tempv = CONCAT(substr(tempv,1,#find - 1),RTRIM(substr(tempv,#find+5))).
end loop.
exe.

*SOLUTION 3.

compute tempv = vall.
compute sol3 = 0.
*Search for 2 consecutive threes - then take out if found, then search again.
loop #i = 1 to 17.
    compute #find = INDEX(tempv,"33").
    if #find > 0 sol3 = sol3 + 1.
    if #find > 0 tempv = CONCAT(substr(tempv,1,#find - 1),RTRIM(substr(tempv,#find+2))).
end loop.
exe.
*****************************************************.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Detecting response patterns

David Marso
Administrator
ID    s1    s2    s3
 1  1.00   .00  1.00
 2   .00   .00   .00
 3   .00   .00   .00
 4   .00   .00   .00
 5   .00  2.00   .00
 6   .00  1.00   .00
 7   .00   .00   .00
 8   .00   .00  3.00
Personally, I opted to not jump down the rabbit hole (this time).
In my own production code, rather than butcher the data with VARSTOCASES (V2C), I would just build counters and run through the vector without the concatenation.  OTOH: Such threads can linger and suck up my time.
BUT! YMMV.
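
The "build counters and run through the vector" idea can be sketched in Python as a single pass per respondent. This is only an illustration, not the production code referred to above: the function name `count_long_runs` is hypothetical, and it assumes that "more than 7 / 5 / 2 consecutive" means one unbroken run of at least 8 / 6 / 3 identical values, with each such run counted once. Under that reading the eight data lines from the original post give the same s1-s3 values as the table above.

```python
def count_long_runs(responses, target, min_len):
    """Count maximal runs of `target` that are at least `min_len` long."""
    count = 0
    run = 0
    for value in responses:
        if value == target:
            run += 1
            if run == min_len:  # count each run once, when it first qualifies
                count += 1
        else:
            run = 0  # any other value breaks the run
    return count

# Data lines copied from the original post (ID in column 1, v1-v34 in 2-35).
DATA = [
    "11111511111111223334414541254325444103",
    "21111511111155222224414541254325444000",
    "31111511113111222224414541254325444000",
    "41111511211111225224414541254325444000",
    "51111444444111222224444431254444444020",
    "61111443444111222224444431254444444010",
    "71111511511111222224414541254325444000",
    "81111511333111223334414541333325444003",
]

table = {}
for line in DATA:
    case_id, responses = int(line[0]), line[1:35]
    table[case_id] = (
        count_long_runs(responses, "1", 8),  # s1: more than 7 ones
        count_long_runs(responses, "4", 6),  # s2: more than 5 fours
        count_long_runs(responses, "3", 3),  # s3: more than 2 threes
    )

for case_id, counts in sorted(table.items()):
    print(case_id, *counts)
```

The same logic maps onto VECTOR and LOOP in SPSS syntax: one scratch counter for the current run length and one result variable per pattern.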

Re: Detecting response patterns

Jon K Peck
In reply to this post by Andy W
Here is a simple Python solution using the SPSSINC TRANS extension command from the SPSS Community site (www.ibm.com/developerworks/spssdevcentral).
First, it reads the variables as a single string.  Then it uses a tiny function with SPSSINC TRANS to count the number of occurrences of the specified pattern in that string variable.
re.findall returns a list of all the non-overlapping matches of the pattern, and the len function returns the length of that list.


DATA LIST / ID 1 v1v34 (A34) solution1 TO solution3 36-38.
BEGIN DATA
11111511111111223334414541254325444103
21111511111155222224414541254325444000
31111511113111222224414541254325444000
41111511211111225224414541254325444000
51111444444111222224444431254444444020
61111443444111222224444431254444444010
71111511511111222224414541254325444000
81111511333111223334414541333325444003
11111111111111123334414541254325444201
END DATA.
dataset name data.

begin program.
import re
def countpattern(v, pattern):
   return len(re.findall(pattern, v))
end program.

spssinc trans result=pattcount
/formula "countpattern(v1v34, '1111111')".




Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Andy W <[hidden email]>
To:        [hidden email],
Date:        03/19/2013 07:35 AM
Subject:        Re: [SPSSX-L] Detecting response patterns
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: Detecting response patterns

David Marso
Administrator
Nifty Jon!
--
Thought of the day:
IBM needs to take all these nifty snake handler tricks (python extensions) and build them into the syntax language proper.  Perhaps that would derail development resources from vital revolutionary endeavors such as making the Model Viewer remotely useful.

Re: Detecting response patterns

Andy W
I concur that regexes within native SPSS code would be wonderful (and this is a perfect example of how powerful they are compared to native SPSS functionality).

I also agree about jumping down the rabbit hole (I certainly waste too much of my time here doing that ...)

Re: Detecting response patterns

David Marso
Administrator
Rabbit Hole -->
Upside: The Mad Hatter cooks up some really powerful tea!
Downside:  That kwazy wabbit is hard to catch and too tough for stew!

Re: Detecting response patterns

Rich Ulrich
In reply to this post by Snuffy Dog
I will point out that your counting rules alone are underspecified in
a particular way, by giving an example.

If there are six 3's in sequence, this could be considered (a) two sets
of three 3s; (b) four sets of three 3s (starting in positions 1, 2, 3
or 4); or (c) one set of "at least three 3s."   The example does rule
out my solution b, but not my solution c.

I wonder if the Python solution allows for selecting between a and b.

--
Rich Ulrich


Date: Tue, 19 Mar 2013 21:06:38 +1100
From: [hidden email]
Subject: Detecting response patterns
To: [hidden email]

Re: Detecting response patterns

Jon K Peck
I don't know why anyone would want (b).  The Python code produces (a).  It would also allow you to do (c) just by making the pattern be
'333+'
in the formula subcommand.
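
To make the three readings concrete, here is a small plain-Python sketch that can be run outside SPSS. The function names are hypothetical and are not part of SPSSINC TRANS; only `re.findall` and standard regular-expression syntax are used. Reading (b) relies on a lookahead so that overlapping start positions are each counted, which is an addition for illustration and not something proposed in the thread.

```python
import re

def count_disjoint(s, digit, n):
    """(a) Non-overlapping blocks of exactly n repeats."""
    return len(re.findall(digit * n, s))

def count_overlapping(s, digit, n):
    """(b) Every start position where n repeats begin (lookahead match)."""
    return len(re.findall("(?=" + digit * n + ")", s))

def count_runs(s, digit, n):
    """(c) Unbroken runs of at least n repeats, each counted once."""
    return len(re.findall(digit + "{" + str(n) + ",}", s))

six_threes = "333333"
print(count_disjoint(six_threes, "3", 3))     # reading (a)
print(count_overlapping(six_threes, "3", 3))  # reading (b)
print(count_runs(six_threes, "3", 3))         # reading (c)
```

For six 3s in a row these give 2, 4 and 1 respectively, matching the three cases described in the question. The pattern built in `count_runs` for n=3 is '3{3,}', which is equivalent to the '333+' suggested above.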


From:        Rich Ulrich <[hidden email]>
To:        [hidden email],
Date:        03/19/2013 11:51 AM
Subject:        Re: [SPSSX-L] Detecting response patterns
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




I will point out that your counting rules alone are underspecified in
a particular way, by giving an example.

If there are six 3's in sequence, this could be considered (a) two sets
of three 3s; (b) four sets of  three 3s (starting in positions 1, 2, 3
or 4); or (c) one set of "at least three 3s."   The example does rule
out my solution b, but not my solution c.

I wonder if the Python solution allows for selecting between a and b.

--
Rich Ulrich


Date: Tue, 19 Mar 2013 21:06:38 +1100
From: [hidden email]
Subject: Detecting response patterns
To: [hidden email]

... snip original


Re: Detecting response patterns

Rich Ulrich
Thanks for the info about finding "3 or more".

You would want (b) if your task was to do something
with "every variable having 3 followed by at least two more
of them."   I remember being annoyed by a word processor
(used on data) that always did its searches this way.

--
Rich Ulrich


To: [hidden email]
CC: [hidden email]
Subject: Re: [SPSSX-L] Detecting response patterns
From: [hidden email]
Date: Tue, 19 Mar 2013 12:24:55 -0600

... snip original

Re: Detecting response patterns

Snuffy Dog

Dear David, Andy, John and Rich,

 

I must say I'm surprised by the range of solutions provided. These were very useful indeed; I liked your solution, Andy. You were quite right, Andy: I did make a mistake on solution 3 on the first row. This problem that I posted, which is designed to identify problematic response patterns in surveys, highlights a problem that Rich refers to: depending on where you start the counting sequence, it's ambiguous how many instances there are in a parcelled string set. Does 3333 represent two strings of 3 or only one? Consistency artefacts in surveys are probably too infrequent to detect using latent variable methods, and the usual methods of detecting common method variance are problematic and probably statistically underidentified in a confirmatory factor analytic context; hence the more basic counting approach I'm trying to operationalize.
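
As a worked illustration of the basic counting approach, the sketch below scores one case under the "one run counts once" reading, so 3333 is a single instance. It is only a sketch, not any of the solutions posted to the list: `RULES`, `count_runs` and `score` are hypothetical names, the thresholds are my reading of "more than 7", "more than 5" and "more than 2" in the original post, and the responses are assumed to be available as a string of digits.

```python
from itertools import groupby

# (digit, minimum run length) per solution variable, read from the original post:
# "more than 7" ones -> 8+, "more than 5" fours -> 6+, "more than 2" threes -> 3+.
RULES = {"solution1": ("1", 8), "solution2": ("4", 6), "solution3": ("3", 3)}

def count_runs(responses, digit, min_len):
    """Count maximal runs of `digit` that are at least `min_len` long."""
    return sum(
        1
        for value, group in groupby(responses)
        if value == digit and len(list(group)) >= min_len
    )

def score(responses):
    """Return the three solution counts for one case."""
    return {name: count_runs(responses, d, n) for name, (d, n) in RULES.items()}

# v1 to v34 for ID=8, copied from the sample data (columns 2-35).
row8 = "1111511333111223334414541333325444"
print(score(row8))  # {'solution1': 0, 'solution2': 0, 'solution3': 3}
```

The three runs of 3s in this row are 333, 333 and 3333, which reproduces the posted value of 3 for ID=8.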

 

I did want to add that I feel very much obliged for the extraordinary help I receive from people on this list, and I don't know where many of you find the motivation and time. John and David have been helping me out with these sorts of problems for many years now. I don't want any of you to go to any trouble, and I'm quite happy for people to ignore these questions if they look too burdensome. For the assistance that is provided I am very grateful. I used to program frequently in SPSS and at the time I started to get quite good at it, but I reached a point about 5 years ago where I burnt out and couldn't write code anymore without going crazy. Now I find my skills deficient on those rare occasions when I can only get my work done by using a program like SPSS, and yet I have to push forward with it, trying to hold the craziness at bay. Please don't go down any rabbit warrens on my account.

Best wishes,

Jonahtan.


On Wed, Mar 20, 2013 at 12:16 PM, Rich Ulrich <[hidden email]> wrote:
... snip original


Re: Detecting response patterns

David Marso
Administrator
You are very welcome, Jonahtan (very interesting name; I initially parsed it as Jonathan).
BTW: There is no H in Jon (that's why I sometimes call him JoNoH ;-)
Anyway, as you have seen, there are usually many ways to skin a cat.
You would be amazed at how I might start with something coded a given way and end up with something completely different after a few iterations.  I have one mess which started as a SLOW 4-level nested loop in MATRIX and presently is one line of really FAST code.  Had to analyze the hell out of it, but then BAM!
Re time and motivation:  I have always had an instinctive desire to teach others.
Time?  My posts are typically 5-minute breaks from coding insanity (I can rarely stare at MATRIX and MACRO code for more than 2-3 hours without taking a break, so I drop by and throw down a few hints).
Yeah, it's definitely use it or lose it!  For a while (about 7 years ago) I didn't have access to a computer to run SPSS, and my skills became a bit rusty (that's after 10+ years of really intense mastery on a daily basis).  Now they are as solid as they ever were (maybe even better than when I was at SPSS consulting).
I suspect my scripting skills might need a bit of polish since I haven't needed to work with them much lately.
Looking forward to diving face first into Python and GPL in the coming weeks for another project.
BUT think of it this way.  It's like riding a bicycle.  You never really completely forget how to do it.
Rabbit holes can sometimes be interesting!
David

Snuffy Dog wrote
... snip original

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"