How to calculate missing values in a string with multiple responses and semi-colons

Re: How to calculate missing values in a string with multiple responses and semi-colons

huang jialin
Hi Rick,

Yes. What I want to do is count the missing values before the point at which the responses stop.

Thanks.

Sincerely,
Jialin 


On Mon, Nov 28, 2011 at 3:53 PM, Rick Oliver <[hidden email]> wrote:
I guess I misunderstood what you were trying to accomplish. I thought you wanted to know the point at which they stopped responding.

What is it that you want? The first solution I sent you returns total missing values. It could be adapted to count missing values prior to the point at which responses stop. Is that what you want?


Rick Oliver
Senior Information Developer
Business Analytics (SPSS)
E-mail: [hidden email]





From:        huang jialin <[hidden email]>
To:        Rick Oliver/Chicago/IBM@IBMUS
Cc:        [hidden email]
Date:        11/28/2011 03:49 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons




Hi Rick,

Sorry for the misunderstanding. I think you are right. But the count includes the missing values in the middle of the string. How can I count the missing values alone?

Thanks.

Sincerely,
Jialin


On Mon, Nov 28, 2011 at 3:43 PM, Rick Oliver <[hidden email]> wrote:
No, this syntax does not count them separately; it counts each contiguous sequence of numeric digits as a single value. Counting them separately would actually be easier.



From:        huang jialin <[hidden email]>
To:        [hidden email]
Date:        11/28/2011 03:23 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Rick,

Thanks for your email. I tried your solution, but it does not do exactly what I want. As some items have multiple responses, the syntax you sent breaks them down and counts them separately.

Thank you.

Sincerely,
Jialin 


On Mon, Nov 28, 2011 at 3:14 PM, Rick Oliver <[hidden email]> wrote:
data list list /stringvar (a120).

begin data

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

end data.

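* Replace each semicolon with a blank so trailing empty items become trailing blanks.
string #temp (a120).
compute #temp=replace(stringvar, ";"," ").
* Start at 1 if the string contains any response at all.
compute count=0.
if char.length(ltrim(#temp))>0 count=1.
* Add one for each blank that precedes the last response.
loop #i=1 to char.length(#temp).
if substr(#temp, #i, 1)=' ' count=count+1.
end loop.
execute.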
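
For readers who want to check the logic outside SPSS, here is a minimal Python sketch (not SPSS syntax) of what the string approach above appears to compute: the position of the last answered item, found by counting the delimiters in front of it. The function name last_answered_item and the shortened sample strings are assumptions made for illustration, and the sketch assumes the raw string contains no embedded blanks.

```python
def last_answered_item(raw: str) -> int:
    """Return the 1-based position of the last item that has a response."""
    # Turn delimiters into blanks, then drop the trailing blanks, so that
    # everything after the last response disappears.
    temp = raw.replace(";", " ").rstrip()
    if not temp.strip():
        return 0  # no response anywhere in the string
    # Each remaining blank is a delimiter in front of the last response.
    return 1 + temp.count(" ")


# Shortened versions of the first two sample rows (hypothetical test input).
print(last_answered_item("1;1;5;1;4;3;;;2;2;134;5;;;;"))
print(last_answered_item(";5;3;;1;3;145;2;3;;5;5;;5;;;;"))
```

A multi-response entry such as 134 contains no delimiter, so it still counts as a single item.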




Rick Oliver
Senior Information Developer
Business Analytics (SPSS)
E-mail: [hidden email]




From:        Rick Oliver/Chicago/IBM
To:        huang jialin <[hidden email]>, [hidden email]
Date:        11/28/2011 01:06 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons




Oops. That's not right. Never mind.


Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Rick Oliver/Chicago/IBM
To:        huang jialin <[hidden email]>
Cc:        [hidden email]
Date:        11/28/2011 12:53 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons




Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

****sample data***.

data list list /stringvar (a120).

begin data

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

end data.
write outfile='c:\temp\temp.txt' /stringvar.
execute.
***real code starts here. just read the original text data file this way***.
* Read each semicolon-delimited item into its own numeric variable.
data list list (";")  file='c:\temp\temp.txt' /numvar1 to numvar50.
* Count the items that have a valid (non-missing) value.
compute numvalid=nvalid(numvar1 to numvar50).
execute.
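
The same idea can be sketched in Python as a rough analogue, not a translation of the SPSS commands: parse each line with a semicolon delimiter and count the non-empty fields. Here Python's csv module stands in for reading with an explicit delimiter; the name count_valid and the n_items default of 50 (taken from the original question) are illustrative assumptions.

```python
import csv
import io


def count_valid(line: str, n_items: int = 50) -> int:
    """Count the non-empty fields among the first n_items fields of a line."""
    # Consecutive delimiters produce empty strings, which play the role of
    # missing values here.
    row = next(csv.reader(io.StringIO(line), delimiter=";"), [])
    # Pad short rows and cut long ones so every row has exactly n_items fields.
    row = (row + [""] * n_items)[:n_items]
    return sum(1 for field in row if field.strip() != "")


# Shortened version of the first sample row (hypothetical test input).
print(count_valid("1;1;5;1;4;3;;;2;2;134;5;;;;"))
```

Subtracting this result from 50 gives the total number of missing items, which includes the empty items after the last response as well as those before it.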





From:        huang jialin <[hidden email]>
To:        [hidden email]
Date:        11/28/2011 12:19 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hi Gene,

Thanks for your quick response. I will rephrase my question. That is, I only want to count the missing values before the last integer. Am I making sense now?

Sorry for the confusion.

Sincerely,
Jialin 

On Mon, Nov 28, 2011 at 12:08 PM, Gene Maguin <[hidden email]> wrote:
Ok, I missed accounting for the situation where the first item is missing. This will do that.

Compute nmissing=0.
If (substr(V,1,1) eq ';') nmissing=nmissing+1.
Loop #i=1 to 149.
Do If (substr(V,#i,2) eq ';;').
+  compute nmissing=nmissing+1.
Else if (substr(V,#i,2) eq '  ').
+  break.
End if.
End loop.

 

I don’t know what you mean by this: ‘Second, I do not want to count the missing values when there is no response at all’. How do you tell the difference between a missing value and ‘no response at all’?
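
One way to make the distinction concrete, as a Python sketch rather than SPSS syntax and with hypothetical names throughout: treat an empty item as "skipped" when some later item has a response, and as "trailing" when nothing follows it. The 50-item default comes from the original question; classify_items and the dictionary keys are assumptions for illustration.

```python
def classify_items(raw: str, n_items: int = 50) -> dict:
    """Split the n_items items into answered, skipped, and trailing counts."""
    fields = [field.strip() for field in raw.split(";")]
    # Pad short rows and cut long ones so every row has exactly n_items fields.
    fields = (fields + [""] * n_items)[:n_items]
    answered = [index for index, field in enumerate(fields) if field]
    if not answered:
        # Nothing was answered, so every item is trailing.
        return {"answered": 0, "skipped": 0, "trailing": n_items}
    last = answered[-1]  # 0-based index of the last response
    skipped = sum(1 for field in fields[:last] if not field)
    return {
        "answered": len(answered),
        "skipped": skipped,
        "trailing": n_items - last - 1,
    }


# Shortened version of the fourth sample row (hypothetical test input).
print(classify_items("1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;"))
```

The three counts always add up to the number of items, so the "skipped" count is the quantity that excludes the run of empty items at the end.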

 

 

 

From: huang jialin [mailto:[hidden email]]
Sent: Monday, November 28, 2011 12:54 PM
To: Gene Maguin
Cc: [hidden email]
Subject: Re: How to calculate missing values in a string with multiple responses and semi-colons

 

Hi Gene,

 

Thanks for your email. There are still two issues to be solved. First, if the first item is missing, there is only one semicolon at the beginning of the string, so it will not match the comparison against a pair of semicolons. Second, I do not want to count the missing values when there is no response at all. How can I do it?

 

Thank you very much.

 

Sincerely,

Jialin

 

 

On Mon, Nov 28, 2011 at 11:42 AM, Gene Maguin <[hidden email]> wrote:

It looks like adjacent semicolons (;;) mean a missing data value. I think all you need to do is count pairs of semicolons. Let’s say that the variable, V, is A150.

Compute nmissing=0.
Loop #i=1 to 149.
Do If (substr(V,#i,2) eq ';;').
+  compute nmissing=nmissing+1.
Else if (substr(V,#i,2) eq '  ').
+  break.
End if.
End loop.

 

>>If you copy this text, make sure the quotes are straight and not curly. I think SPSS does not like curly quotes.

 

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of huang jialin
Sent: Monday, November 28, 2011 12:23 PM
To: [hidden email]
Subject: How to calculate missing values in a string with multiple responses and semi-colons

 

Hello,

 

I have a variable in a dataset formatted as follows:

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

 

There are 50 items in total, separated by 49 semi-colons. Certain items contain multiple responses. I want to calculate how many items were missed before the responses stopped.

 

I tried to use length(trim(var)) to calculate the length of the string, but the result only turned out to be the number of responses. It is the same as using CHAR.LENGTH(var).

 

How can I get the number of missing items?

 

I would appreciate your help.

 

Sincerely,

Jialin

 

 



Re: How to calculate missing values in a string with multiple responses and semi-colons

Rick Oliver-3
There's probably a better way...


data list list /stringvar (a120).
begin data
1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;4;1;4;2;1;5;;;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2
2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;
end data.
write outfile='c:\temp\temp.txt' /stringvar.
execute.
data list list (";") file='c:\temp\temp.txt' /var1 to var50.
execute.
compute #totalmiss=nmiss(var1 to var50).
vector v=var1 to var50.
loop #i=1 to 50.
if not(missing(v(#i))) #lastvar=#i.
end loop.
compute finalmiss=#lastvar-(50-#totalmiss).
execute.
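
The arithmetic in the last compute can be checked with a small Python sketch (an illustration under assumptions, not SPSS syntax): the position of the last valid item minus the number of valid items equals the number of missing items in front of the last response. The name final_missing and the shortened sample strings are hypothetical. The sketch resets last_valid for every string; in the SPSS version a scratch variable such as #lastvar keeps its value from case to case, so a case with no responses at all may be worth checking separately.

```python
def final_missing(raw: str, n_items: int = 50) -> int:
    """Missing items before the last response: last_valid - number of valid items."""
    fields = [field.strip() for field in raw.split(";")]
    # Pad short rows and cut long ones so every row has exactly n_items fields.
    fields = (fields + [""] * n_items)[:n_items]
    total_missing = sum(1 for field in fields if field == "")
    last_valid = 0
    for position, value in enumerate(fields, start=1):
        if value != "":
            last_valid = position  # ends up as the last non-missing position
    # n_items - total_missing is the number of valid items.
    return last_valid - (n_items - total_missing)


# Shortened versions of the first two sample rows (hypothetical test input).
print(final_missing("1;1;5;1;4;3;;;2;2;134;5;;;;"))
print(final_missing(";5;3;;1;3;145;2;3;;5;5;;5;;;;"))
```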


Rick Oliver
Senior Information Developer
Business Analytics (SPSS)
E-mail: [hidden email]






From:        huang jialin <[hidden email]>
To:        [hidden email]
Date:        11/28/2011 03:58 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hi Rick,

Yes. What I want to do is to count the missing values before the point the responses stop. 

Thanks.

Sincerely,
Jialin 




Re: How to calculate missing values in a string with multiple responses and semi-colons

Maguin, Eugene
In reply to this post by huang jialin

You may have already gotten a good solution to your question. If not, I offer a revised solution.

 

When I tested my proposal, I found that it did not work. I found the problem and offer a solution that looks to be correct. The problem was that I needed to trim trailing blanks first and then trailing semicolons, a double rtrim operation.

 

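* Trim trailing blanks first, then trailing semicolons.
String #v(a120).
Compute #v=rtrim(rtrim(V),';').
Compute nmissing=0.
* A leading semicolon means the first item is missing.
If (substr(#v,1,1) eq ';') nmissing=nmissing+1.
Compute #j=char.length(#v).
* Each adjacent pair of semicolons marks one more missing item.
Loop #i=1 to #j.
If (substr(#v,#i,2) eq ';;') nmissing=nmissing+1.
End loop.
execute.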
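
As a cross-check outside SPSS, the same counting logic can be sketched in Python (not SPSS syntax): strip trailing blanks and then trailing semicolons, count a leading semicolon as a missing first item, and count every adjacent pair of semicolons as one more missing item. The function name count_missing_before_last and the shortened sample strings are assumptions made for illustration.

```python
def count_missing_before_last(raw: str) -> int:
    """Count empty items that occur before the last response in raw."""
    # The "double rtrim": trailing blanks first, then trailing semicolons.
    trimmed = raw.rstrip(" ").rstrip(";")
    if not trimmed:
        return 0  # no response at all, so nothing is counted
    # A leading semicolon means the first item is empty.
    missing = 1 if trimmed.startswith(";") else 0
    # Overlapping pairs are intended: ";;;" holds two empty items.
    for index in range(len(trimmed) - 1):
        if trimmed[index:index + 2] == ";;":
            missing += 1
    return missing


# Shortened versions of the first three sample rows (hypothetical test input).
for row in (
    "1;1;5;1;4;3;;;2;2;134;5;;;;",
    ";5;3;;1;3;145;2;3;;5;5;;5;;;;",
    ";;1;3;1;5;1;4;4;2;1;;2;3;;4;;;",
):
    print(count_missing_before_last(row))
```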

 

Gene Maguin

 

 

From: huang jialin [mailto:[hidden email]]
Sent: Monday, November 28, 2011 1:15 PM
To: Gene Maguin
Cc: [hidden email]
Subject: Re: How to calculate missing values in a string with multiple responses and semi-colons

 

Hi Gene,

 

Thanks for your quick response. I will rephrase my question. That is, I only want to count the missing values before the last integer. Am I making sense now?

 

Sorry for the confusion.

 

Sincerely,

Jialin 


 


Re: How to calculate missing values in a string with multiple responses and semi-colons

huang jialin
In reply to this post by Rick Oliver-3
Hi Rick,

It works. Thanks a lot.

Sincerely,
Jialin 

On Mon, Nov 28, 2011 at 4:24 PM, Rick Oliver <[hidden email]> wrote:
There's probably a better way...


data list list /stringvar (a120).
begin data
1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;4;1;4;2;1;5;;;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2
2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;
end data.
write outfile='c:\temp\temp.txt' /stringvar.
execute.
data list list (";") file='c:\temp\temp.txt' /var1 to var50.
execute.
compute #totalmiss=nmiss(var1 to var50).
vector v=var1 to var50.
loop #i=1 to 50.
if not(missing(v(#i))) #lastvar=#i.
end loop.
compute finalmiss=#lastvar-(50-#totalmiss).
execute.


Rick Oliver
Senior Information Developer
Business Analytics (SPSS)
E-mail: [hidden email]






From:        huang jialin <[hidden email]>
To:        [hidden email]
Date:        11/28/2011 03:58 PM
Subject:        Re: How to calculate missing values in a string with multiple              responses and semi-colons
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hi Rick,

Yes. What I want to do is to count the missing values before the point the responses stop. 

Thanks.

Sincerely,
Jialin 


On Mon, Nov 28, 2011 at 3:53 PM, Rick Oliver <[hidden email]> wrote:
I guess I misunderstood what you were trying to accomplish. I thought you wanted to know the point at which they stopped responding.

What is it that you want? The first solution I sent you returns total missing values. It could be adapted to count missing values prior to the point at which responses stop. Is that what you want?



Rick Oliver
Senior Information Developer
Business Analytics (SPSS)
E-mail:
[hidden email]




From:        huang jialin <[hidden email]>
To:        
Rick Oliver/Chicago/IBM@IBMUS
Cc:        
[hidden email]
Date:        
11/28/2011 03:49 PM
Subject:        Re: How to calculate missing values in a string with multiple responses and semi-colons




Hi Rick,

Sorry for misunderstanding. I think you are right. But it includes the missing values in the middle of string, how to count the missing values alone?

Thanks.

Sincerely,
Jialin


On Mon, Nov 28, 2011 at 3:43 PM, Rick Oliver <
[hidden email]> wrote:
No, this syntax does not count them separately, it counts each contiguous sequence of numeric digits as a single value. Counting them separately would actually be easier.




From:        
huang jialin <[hidden email]>
To:        
[hidden email]
Date:        
11/28/2011 03:23 PM
Subject:        
Re: How to calculate missing values in a string with multiple              responses and semi-colons
Sent by:        
"SPSSX(r) Discussion" <[hidden email]>




Rick,

Thanks for your email. I tried your solution, but it does not do the exact things I want. As some items have multiple responses, the syntax you have break them down and count them separately.

Thank you.

Sincerely,
Jialin 


On Mon, Nov 28, 2011 at 3:14 PM, Rick Oliver <
[hidden email]> wrote:
data list list /stringvar (a120).

begin data

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

end data.

string #temp (a120).

compute #temp=replace(stringvar, ";"," ").

compute count=0.

if char.length(ltrim(#temp))>0 count=1.

loop #i=1 to char.length(#temp).

if substr(#temp, #i, 1)=' ' count=count+1.

end loop.

execute.




Rick Oliver
Senior Information Developer

Business Analytics (SPSS)
E-mail:
[hidden email]




From:        
Rick Oliver/Chicago/IBM
To:        
huang jialin <[hidden email]>, [hidden email]
Date:        
11/28/2011 01:06 PM
Subject:        
Re: How to calculate missing values in a string with multiple              responses and semi-colons




Oops. That's not right. never mind.


Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail:
[hidden email]




From:        
Rick Oliver/Chicago/IBM
To:        
huang jialin <[hidden email]>
Cc:        
[hidden email]
Date:        
11/28/2011 12:53 PM
Subject:        
Re: How to calculate missing values in a string with multiple              responses and semi-colons




Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail:
[hidden email]

****sample data***.

data list list /stringvar (a120).

begin data

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

end data.

write outfile='c:\temp\temp.txt' /stringvar.

execute.

***real code starts here. just read the original text data file this way***.

data list list (";")  file='c:\temp\temp.txt' /numvar1 to numvar50.

compute numvalid=nvalid(numvar1 to numvar50).

execute.





From:        
huang jialin <[hidden email]>
To:        
[hidden email]
Date:        
11/28/2011 12:19 PM
Subject:        
Re: How to calculate missing values in a string with multiple              responses and semi-colons
Sent by:        
"SPSSX(r) Discussion" <[hidden email]>




Hi Gene,

Thanks for your quick response. I will rephrase my question. That is, I only want to count the missing values before the last integer. Am I making sense now?

Sorry for the confusion.

Sincerely,
Jialin 

On Mon, Nov 28, 2011 at 12:08 PM, Gene Maguin <
[hidden email]> wrote:
Ok, I missed accounting for the situation where the first item is missing. This will do that.

 

Compute nmissing=0.

If (substr(V,1,1) eq ‘;’) nmissing=nmissing+1. 

Loop #i=1 to 149.

Do If (substr(V,#i,2) eq ‘;;’).

+  compute nmissing=nmissing+1.

Else if (substr(V,#i,2) eq ‘  ‘).

+  break.

End if.

End loop.

 

I don’t know what you mean by this: ‘Second, I do not want to count the missing values when there is no response at all’. How do you tell the difference between a missing value and ‘no response at all’.

 

 

 

From: huang jialin [mailto:[hidden email]]
Sent:
Monday, November 28, 2011 12:54 PM
To:
Gene Maguin
Cc:
[hidden email]
Subject:
Re: How to calculate missing values in a string with multiple responses and semi-colons

 

Hi Gene,

 

Thanks for your email. There are still two questions needed to be solved. First, if the first item is missing, there is only one semicolon in the beginning of the string. Thus, it may not fit the comparison of pair of semicolons. Second, I do not want to count the missing values when there is no response at all. How can I do it?

 

Thank you very much.

 

Sincerely,

Jialin

 

 

On Mon, Nov 28, 2011 at 11:42 AM, Gene Maguin <[hidden email]> wrote:

It looks like adjacent semicolons (;;) mean a missing data value. I think all you need to do is to count pairs of semicolons. Let’s say the that the variable, V, is A150.

 

Compute nmissing=0.

Loop #i=1 to 149.

Do If (substr(V,#i,2) eq ‘;;’).

+  compute nmissing=nmissing+1.

Else if (substr(V,#i,2) eq ‘  ‘).

+  break.

End if.

End loop.

 

>>If you copy this text, make sure the quotes are straight and not curly. I think spss does not like curly.

 

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of huang jialin
Sent: Monday, November 28, 2011 12:23 PM
To: [hidden email]
Subject: How to calculate missing values in a string with multiple responses and semi-colons

 

Hello,

 

I have a variable in a dataset formatted as follows:

1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;4;1;4;2;1;5;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2

2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;

 

There are 50 items in total, separated by 49 semicolons. Certain items contain multiple responses. I want to calculate how many items were missed before the responses stopped.

 

I tried to use length(trim(var)) to calculate the length of string, but it only turned out to be the number of responses. It is the same as using CHAR.LENGTH(var).

 

How can I get the number of missing items?

I would appreciate your help.

 

Sincerely,

Jialin

 

 



Re: How to calculate missing values in a string with multiple responses and semi-colons

huang jialin
In reply to this post by Maguin, Eugene
Hi Gene,

This solution works for me. Thanks for your help.

Sincerely,
Jialin 


On Mon, Nov 28, 2011 at 4:25 PM, Gene Maguin <[hidden email]> wrote:

You may have already gotten a good solution to your question. If not, I offer a revised solution.

 

When I tested my proposal, I found that it did not work. I found the problem and offer a solution that looks to be correct. The problem was that I needed to trim trailing blanks first and then trailing semicolons, a double rtrim operation.

 

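* Scratch copy of V (widen a120 if V is longer) with trailing blanks, then trailing semicolons, removed.
String #v(a120).
Compute #v=rtrim(rtrim(V),';').
Compute nmissing=0.
* A leading semicolon means the first item is missing.
If (substr(#v,1,1) eq ';') nmissing=nmissing+1.
Compute #j=char.length(#v).
* Each adjacent pair of semicolons marks one more missing item.
Loop #i=1 to #j.
If (substr(#v,#i,2) eq ';;') nmissing=nmissing+1.
End loop.
execute.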
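
For anyone who wants to cross-check the counts outside SPSS, the same double-trim-then-count-pairs logic can be sketched in Python. This is only an illustration, not SPSS syntax: the function name is made up, and the sample rows are the first three from the original question with the trailing run of semicolons shortened.

```python
def count_missing_before_last(v: str) -> int:
    """Count empty items that occur before the last response in v."""
    # Double trim: trailing blanks first, then the trailing run of semicolons.
    trimmed = v.rstrip(" ").rstrip(";")
    # A leading semicolon means the first item is empty.
    nmissing = 1 if trimmed.startswith(";") else 0
    # Every adjacent pair ";;" (overlapping pairs included) is one more empty item.
    nmissing += sum(
        1 for i in range(len(trimmed) - 1) if trimmed[i:i + 2] == ";;"
    )
    return nmissing


# First three sample rows from the thread, trailing semicolons shortened.
rows = [
    "1;1;5;1;4;3;;;2;2;134;5;;;;;;",
    ";5;3;;1;3;145;2;3;;5;5;;5;;;;;",
    ";;1;3;1;5;1;4;4;2;1;;2;3;;4;;;",
]
for row in rows:
    print(count_missing_before_last(row), row)
```

For these three rows the counts come out as 2, 4 and 4. A string with no responses at all trims down to nothing and gives 0.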

 

Gene Maguin

 

 



Re: How to calculate missing values in a string with multiple responses and semi-colons

Art Kendall
In reply to this post by Art Kendall
Side note: if you review the whole conversation, you can see the value of the list. As time went on, the group worked out an understanding of what your question is.

Why the earlier post? Rick Oliver built on my suggestion that the string be written out to a disk file. He pointed out that DATA LIST LIST could specify a separator when reading a text file, something I had forgotten. I pointed out that in fact there could be 2 different separator characters, and asked if the list knew of a way to specify a tab character as the separator character.

What I now think your question is:
There is a string of semicolon separated fields. The fields can contain 0 to 3 digits. Each field contains 3 variables in a multiple response set.
You actually have 50 sets of 3 variables.
You want to have 3 kinds of user missing values.
      1 kind e.g., -999 to indicate missing at the end, i.e., trailing
      another kind e.g., -998 to indicate a variable that is null but is within the set of variables with responses not trailing
      a third kind e.g., -997 to indicate a variable set that has some content but fewer than 3 responses in a set


Maybe your syntax would include something like
Missing values v1 to v150 (-999 thru -1).
value labels v1 to v150
      1 'something'
      2 'something else'
      3 'another label'
   -999 'did not get to question'
   -998 'read but not answered'
   -997 'partially answered question'.

Is this understanding correct?

Do you now have a working set of syntax?
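
To make the trailing versus embedded distinction concrete, here is a rough Python sketch (not SPSS syntax) that codes each item of one semicolon-separated string. It is a simplification under stated assumptions: each field is kept as a single number, so a multiple response such as 134 is not split into 3 variables, the -997 code is left out, and the function and constant names are invented for the illustration.

```python
TRAILING = -999  # item after the last response ("did not get to question")
SKIPPED = -998   # empty item before the last response ("read but not answered")


def code_items(v: str, n_items: int = 50, sep: str = ";") -> list:
    """Return n_items codes: the response itself, SKIPPED, or TRAILING."""
    fields = [f.strip() for f in v.split(sep)]
    # Pad short strings and drop anything beyond n_items.
    fields = (fields + [""] * n_items)[:n_items]
    # Zero-based index of the last non-empty field, or -1 if there is none.
    last = max((i for i, f in enumerate(fields) if f), default=-1)
    coded = []
    for i, f in enumerate(fields):
        if f:
            coded.append(int(f))
        elif i < last:
            coded.append(SKIPPED)
        else:
            coded.append(TRAILING)
    return coded


print(code_items("1;;3;;", n_items=5))
```

With 5 items, the string "1;;3;;" is coded as 1, -998, 3, -999, -999. Counting the -998 codes in a row then gives the number of items missed before the responses stopped.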

Art Kendall
Social Research Consultants
On 11/28/2011 4:00 PM, huang jialin wrote:
Hi Art,

Thanks for the reply. Would you explain how it relates to my question?

BTW, there are goofy characters in the email. Is it because of my email?

Thanks.

Jialin

On Mon, Nov 28, 2011 at 1:49 PM, Art Kendall <[hidden email]> wrote:
Nice one. I forgot that one can specify the delimiter. I can see how to use semicolon, comma, pipe or most likely any printing character.
Is there a way to specify a tab (Ctrl-I in ASCII)?


Lo and behold!!
This works even with 2 specified separators. This syntax is for tilde and x, but x and X works too.

new file.
set blanks 0.
data list list ("~","x") file='c:\project\tilda x separated.txt' /numvar1 to numvar3.
compute numvalid=nvalid(numvar1 to numvar3).
compute ID = $casenum.
list /variables = ID numvar1 to numvar3.
-----
the data
-----
123x456x789
xxx
123x456x789
123xx
123~456~789
~~~
123~456x789
123~x
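
As a rough illustration of what reading with two separators amounts to, here is a small Python sketch (not SPSS syntax) that splits each line on either character and keeps three numeric fields. The function name and defaults are made up for the example, and it does not try to reproduce every DATA LIST parsing rule.

```python
import re


def parse_fields(line: str, seps: str = "~x", n_vars: int = 3) -> list:
    """Split line on any character in seps; empty fields become None."""
    pattern = "[" + re.escape(seps) + "]"
    fields = re.split(pattern, line.strip())
    # Pad short lines and ignore anything beyond n_vars.
    fields = (fields + [""] * n_vars)[:n_vars]
    return [int(f) if f.strip() else None for f in fields]


for line in ["123x456x789", "xxx", "123xx", "123~456x789", "123~x"]:
    values = parse_fields(line)
    n_valid = sum(v is not None for v in values)
    print(line, values, n_valid)
```

Lines that mix the two separators, such as 123~456x789, split into the same three values as lines that use only one of them.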


Art Kendall
Social Research Consultants


On 11/28/2011 1:53 PM, Rick Oliver wrote:
Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]


****sample data***.
data list list /stringvar (a120).
begin data
1;1;5;1;4;3;;;2;2;134;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;5;3;;1;3;145;2;3;;5;5;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;1;3;1;5;1;4;4;2;1;;2;3;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;;3;;1;4;;4;2;3;5;;3;;4;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;5;1;3;1;;1;2;3;;2;2;;3;;;3;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
3;4;3;1;3;3;145;1;3;5;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;4;1;4;2;1;5;;3;234;2;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
1;1;5;1;4;2;1;1;2;2;234;4;5;4;1;145;2;2;3;125;3;5;1;2;2;5;234;3;2;3;2;2;5;3;4;;135;;4;4;135;;1;2;;234;;;;2
2;4;3;1;3;3;145;2;3;4;2;1;1;3;235;1;3;5;235;3;3;4;125;5;1;2;3;1;2;4;3;5;4;135;3;4;3;2;2;5;1;5;245;2;1;3;124;4;2;
end data.
write outfile='c:\temp\temp.txt' /stringvar.
execute.
***real code starts here. just read the original text data file this way***.
data list list (";") file='c:\temp\temp.txt' /numvar1 to numvar50.
compute numvalid=nvalid(numvar1 to numvar50).
execute.
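
The count of valid fields alone does not say how many items were skipped before the responses stopped; for that the position of the last response is needed as well. A minimal Python sketch of that arithmetic (not SPSS syntax, with a made-up function name, and with sample rows from the thread whose trailing semicolons are shortened):

```python
def item_summary(v: str, n_items: int = 50, sep: str = ";") -> tuple:
    """Return (n_valid, last_answered, missing_before_last) for one row."""
    fields = v.strip().split(sep)
    # Pad short rows so that positions run from 1 to n_items.
    fields += [""] * (n_items - len(fields))
    answered = [i for i, f in enumerate(fields, start=1) if f.strip()]
    n_valid = len(answered)
    last = answered[-1] if answered else 0
    # Items up to the last response that were left empty.
    return n_valid, last, last - n_valid


print(item_summary("1;1;5;1;4;3;;;2;2;134;5;;;;"))
print(item_summary(";;1;3;1;5;1;4;4;2;1;;2;3;;4;;;"))
```

The first row has 10 valid fields with the last response at item 12, so 2 items were missed before the responses stopped. The second has 12 valid fields with the last response at item 16, so 4 were missed.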




===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants

Re: How to calculate missing values in a string with multiple responses and semi-colons

huang jialin
Hi Art,

Thanks for sharing your thoughts. I have a working syntax now.

I think you are totally correct in terms of the three types of missing values, although I do not intend to distinguish the second (read but not answered) from the third (partially answered), because most items have a single response and only some items have three responses. There is no partial scoring for items with multiple responses; they are scored either 0 or 1.

Your comments are valuable as well. They give me a better understanding of the dataset.

Thank you again.

Sincerely,
Jialin 


On Tue, Nov 29, 2011 at 6:45 AM, Art Kendall <[hidden email]> wrote:
<SNIP>



Re: How to calculate missing values in a string with multiple responses and semi-colons

David Marso
Administrator
In reply to this post by Art Kendall
"is there a way to specify a tab (Ctrl-I in ASCII)?"

DATA LIST LIST (";" TAB) / A B C.
BEGIN DATA
1;2 3
3 1;4
end data.
list.


<quote author="Art Kendall">
Nice one. I forgot that one can specify the delimiter. I can see how to use semicolon, comma, pipe or most likely any printing character,
is there a way to specify a tab (Ctrl-I in ASCII)?
    <SNIP>
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: How to calculate missing values in a string with multiple responses and semi-colons

Art Kendall
When I put David's syntax into version 20, the editor recognizes and
colors the syntax correctly.
However, when I edit the data, delete the multiple spaces, and key in <tab>,
I get
>Warning # 1114
>An embedded blank has been found under a numeric format.  The result has been
>set to the system-missing value.
>Command line: 46  Current case: 1  Current splitfile group: 1
>Field contents: '2   3'
>Record number: 1  Starting column: 3  Record length: 7
Does this happen to other list members? With other versions?

Art




On 11/29/2011 11:12 AM, David Marso wrote:

> "is there a way to specify a tab (Ctrl-I in ASCII)?"
>
> DATA LIST LIST (";" TAB) / A B C.
> BEGIN DATA
> 1;2     3
> 3       1;4
> end data.
> list.
>
> Nice one. I forgot that one can specify the delimiter. I can see how to
> use semicolon, comma, pipe or most likely any printing character,
> is there a way to specify a tab (Ctrl-I in ASCII)?
> <SNIP>
Art Kendall
Social Research Consultants

Re: How to calculate missing values in a string with multiple responses and semi-colons

Rick Oliver-3
I would not recommend using tabs in the syntax editor, particularly for inline data. An external data source with tab characters should be read without any problems.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Art Kendall <[hidden email]>
To:        [hidden email]
Date:        11/29/2011 10:59 AM
Subject:        Re: How to calculate missing values in a string with multiple              responses and semi-colons
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




<SNIP>

