string aggregate problem


string aggregate problem

Maguin, Eugene

I think there’s something I don’t understand about how aggregate functions, so I need some help. A little background: I’m reading Excel sheets one after another, saving, concatenating the files, and massaging the data to restructure it into the SPSS-friendly structure shown below.

 

Example data (everything is strings ranging in width from A5 to A4000).

Proposal  people  role   dept  firstgrant  abstract
Grant 1
          Susan
                  PI
                         Med
                               no          Blah1, blah1, etc
Grant 2
          Kumar
                  Co-PI
                         Math
                               Yes         Blah2, blah2, etc

Here’s where I want to get to.

Grant 1   Susan   PI     Med   no          Blah1, blah1, etc
Grant 2   Kumar   Co-PI  Math  yes         Blah2, blah2, etc

 

I carried the proposal value down across its blank values so that the break variable would have a valid value for all records. Blank is a valid string value, which destroys things. So I loaded the blanks with a little four-character value, ‘abck’, and declared that value to be missing for all fields. Standard aggregate command: aggregate outfile=ss/break=proposal/people, etc = first(people, etc).

What I got back was

 

Grant 1   abck    abck   abck  no          abck
Grant 2   abck    abck   abck  yes         abck

 

So, what I take from this is that in an aggregate proc, string missing values work for an A5 string but not for an Ax string, x > some value. But as I understand the documentation, string missing values work for an Ax string of any length, and there are no caveats that I see in the aggregate documentation.

One last point: another way to do what I want is to read and massage the data from each Excel sheet and then apply the string missing value and aggregate. Doing so yields correct results with no problems, the difference being that there is no break variable.

 

I prefer the read and save then process and aggregate because I think that will fit better with the problem of reading multiple 50 or excel files.

 

Gene Maguin

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: string aggregate problem

Jon Peck
From the CSR (Command Syntax Reference):

Missing values for string variables cannot exceed 8 bytes. (There is no limit on the defined width of the string variable, but defined missing values cannot exceed 8 bytes.)

This is a general restriction.  Declaring a missing value longer than 8 bytes on MISSING VALUES will give an error message.


Re: string aggregate problem

Jon Peck
One way to solve this without regard to missing values is:
1 - propagate the Proposal (as you have already done)
2 - aggregate by Proposal, using MAX as the function for each variable. MAX (and MIN) work with strings as well as numerics, and blank compares low to any nonblank value.
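
The two steps above rest on one fact: a blank string sorts below any nonblank string, so taking the maximum within each Proposal keeps the real value. The following is a minimal Python sketch of that logic only, not SPSS syntax. The helper names `make_row` and `aggregate_max` are made up for illustration, and the rows mirror the example data from the original post after the Proposal value has been carried down.

```python
FIELDS = ("people", "role", "dept", "firstgrant", "abstract")


def make_row(proposal, **values):
    """Build one record; any field not given is blank, as in the worksheet."""
    row = {"Proposal": proposal}
    row.update({name: values.get(name, "") for name in FIELDS})
    return row


# Example data from the thread, with Proposal already propagated downward.
rows = [
    make_row("Grant 1"),
    make_row("Grant 1", people="Susan"),
    make_row("Grant 1", role="PI"),
    make_row("Grant 1", dept="Med"),
    make_row("Grant 1", firstgrant="no", abstract="Blah1, blah1, etc"),
    make_row("Grant 2"),
    make_row("Grant 2", people="Kumar"),
    make_row("Grant 2", role="Co-PI"),
    make_row("Grant 2", dept="Math"),
    make_row("Grant 2", firstgrant="Yes", abstract="Blah2, blah2, etc"),
]


def aggregate_max(rows, break_key="Proposal"):
    """Collapse rows to one record per break value, keeping each field's string maximum."""
    groups = {}  # dicts keep insertion order, so group order follows the data
    for row in rows:
        best = groups.setdefault(row[break_key], {name: "" for name in FIELDS})
        for name in FIELDS:
            # "" compares lower than any non-empty string, so a real value always wins.
            best[name] = max(best[name], row[name])
    return [{break_key: key, **best} for key, best in groups.items()]


result = aggregate_max(rows)
for record in result:
    print(record)
```

One design caveat: if a group held two different nonblank values for the same field, the maximum would silently keep the one that sorts last, so this approach assumes at most one nonblank value per field per Proposal.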


Re: string aggregate problem

David Marso-2
In reply to this post by Maguin, Eugene
One would normally use the MAX function for this within the AGGREGATE command.


Re: string aggregate problem

Maguin, Eugene
In reply to this post by Jon Peck

Jon, David: Thank you. Max works. I would never, ever, have thought of max/min as being able to process string vars. I think of them both as number functions only.

 

I think I need a little education on string missing values. I did read the two sentences in the Missing values documentation about 8 bytes and that is why I recoded blank to ‘abck’. Maybe wrongly, I figure a4 is 4 bytes, so with ‘abck’ I’m below the 8 byte limit. In the missing values statement itself, I declared <varlist>(‘abck’). Every variable showed ‘abck’ as the missing value designator. So I assumed I had fixed the problem.

 

But here’s the other curiosity, let’s say. The first time through I went sheet by sheet so I would have had this from sheet 1, for example.

Proposal  people  role   dept  firstgrant  abstract
Grant 1
          Susan
                  PI
                         Med
                               no          Blah1, blah1, etc

 

I did the recode, the missing values statement, and an aggregate with no break variable using the first function, and this worked, giving the single-record result. I don’t want to argue about this, but please help me understand why the missing-value recode works when there is no break variable and does not work when there is one.

 

Thanks, Gene Maguin

Re: string aggregate problem

Jon Peck
Yes, your missing code is legit, but if the variable value is more than 8 bytes, it will only consider the first 8 bytes in testing for missing.  But with MAX, you wouldn't need to use a missing code.

Missing values would be treated as regular values in the break variable(s), so I'm not clear about what does not work with a break variable.
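
Taken at face value, the 8-byte remark above means a long string value is judged missing by its first 8 bytes alone. The Python sketch below is only a model of that stated rule, under the assumption that both the value and the declared code are blank-padded to 8 bytes before comparison. It is not SPSS's actual implementation, the function names are hypothetical, and it does not by itself explain the AGGREGATE result reported in this thread.

```python
MISSING_CODE_LIMIT = 8  # bytes, the limit quoted from the CSR in this thread


def pad8(text):
    """Encode text and blank-pad it to exactly 8 bytes (longer input is cut to 8)."""
    return text.encode("utf-8").ljust(MISSING_CODE_LIMIT, b" ")[:MISSING_CODE_LIMIT]


def is_user_missing(value, missing_codes):
    """Assumed rule: only the first 8 bytes of the value are compared with each code."""
    for code in missing_codes:
        if len(code.encode("utf-8")) > MISSING_CODE_LIMIT:
            raise ValueError("declared missing values cannot exceed 8 bytes")
    prefix = pad8(value)
    return any(prefix == pad8(code) for code in missing_codes)


print(is_user_missing("abck", ["abck"]))                   # True
print(is_user_missing("Susan", ["abck"]))                  # False
print(is_user_missing("abck    trailing text", ["abck"]))  # True: first 8 bytes match
```

Under this model, the third call shows the practical consequence: a long value that merely begins with the code followed by blanks would also be counted as missing.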



Re: string aggregate problem

Maguin, Eugene

Ok.

After doing the proposal carrydown, blank to ‘abck’ recode, and ‘abck’ as missing, this computes [aggregate outfile=*/vars=first(vars).]

 

Proposal  people  role   dept  firstgrant  abstract
Grant 1   abck    abck   abck  abck        abck
Grant 1   Susan   abck   abck  abck        abck
Grant 1   abck    PI     abck  abck        abck
Grant 1   abck    abck   Med   abck        abck
Grant 1   abck    abck   abck  no          abck
Grant 1   abck    abck   abck  abck        Blah1, blah1, etc
Grant 2   abck    abck   abck  abck        abck
Grant 2   Kumar   abck   abck  abck        abck
Grant 2   abck    PI     abck  abck        abck
Grant 2   abck    abck   Med   abck        abck
Grant 2   abck    abck   abck  yes         abck
Grant 2   abck    abck   abck  abck        Blah1, blah1, etc

 

computes [aggregate outfile=*/break=proposal/vars=first(vars).] to this.

Grant 1   abck    abck   abck  no          abck
Grant 2   abck    abck   abck  yes         abck

 

The only difference that I can distinguish is that firstgrant is A5 and the other variables are wider than A8.

 

Thanks.

Re: string aggregate problem

Bruce Weaver
Administrator
In reply to this post by David Marso-2
Like this?  

* Generate Gene's sample data.
DATA LIST LIST / Proposal people role dept firstgrant (5A10) abstract (A20).
BEGIN DATA
"Grant 1"
"" Susan
"" "" PI
"" "" "" Med
"" "" "" ""  "no"    "Blah1, blah1, etc"
"Grant 2"
"" Kumar
"" "" Co-PI
"" "" ""  Math
"" "" "" ""  "Yes" "Blah2, blah2, etc"
END DATA.

* Carry each Proposal value down over the blank rows that follow it.
IF Proposal EQ "" Proposal = LAG(Proposal).

* Attach the per-Proposal maximum of each string variable to every row.
* Blank compares low, so the nonblank value wins.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES
  /BREAK=Proposal
  /people role dept firstgrant abstract=MAX(people role dept firstgrant abstract).

* Keep only the first row of each Proposal.
MATCH FILES FILE=* /BY Proposal /FIRST=Flag.
SELECT IF Flag.
LIST.

OUTPUT from LIST:

Proposal   people     role       dept       firstgrant abstract             Flag

Grant 1    Susan      PI         Med        no         Blah1, blah1, etc       1
Grant 2    Kumar      Co-PI      Math       Yes        Blah2, blah2, etc       1


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: string aggregate problem

David Marso
Administrator
Even more to the point:

* The break variable is named Proposal in the sample data.
DATASET DECLARE agg.
AGGREGATE OUTFILE=agg
  /BREAK=Proposal
  /people role dept firstgrant abstract=MAX(people role dept firstgrant abstract).
DATASET ACTIVATE agg.


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: string aggregate problem

Bruce Weaver
Administrator
Yep...I like it.  


