adjusting string width?

adjusting string width?

Hoover, Matthew
Hello SPSS experts,

 

I have about 10 different data sets that I need to merge into one file.
This merge is adding records and not variables, so each data file has
the exact same variable names.  The problem is that most of the
variables are string format and for some reason there are different
widths for these variables in the corresponding data sets.  As you know,
to "add cases" the widths of the string variables have to match when
doing the merge.  Since there are about 30 variables in each file, it
gets to be quite a pain in the derriere to have to manually change each
variable to match a "master" format.

 

Is there a way that I can set a particular file as the standard file and
it will automatically adjust the variable lengths of the other files
when I do the merge?  Is there some other way that you know to add cases
when the variable lengths don't match?

 

Any help would be much appreciated!

 

Matt

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: adjusting string width?

Maguin, Eugene
Matthew,

Opening each file, adjusting the widths in the Variable View window, and
resaving is the default method. Somebody else may have a better solution, but
doing it in syntax is not much easier: the drill is basically the same, just
a bit more involved.

Get file='xxx'.

Rename variables (x1 to x30=s1 to s30).
String x1(a10) x2(a5) etc.
Do repeat x=x1 to x30/s=s1 to s30.
+  compute x=s.
End repeat.

Save outfile='xxx'/drop=s1 to s30.

You could make this into a macro but to me it's spuds or potatoes.
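
If the per-file syntax gets tedious to type, it can be generated. The following is only a sketch of that idea, not something from the original post: it builds the rename / string / compute / save sequence above as text from a dictionary of master widths. The function name, the tmp_ prefix and the example widths are all hypothetical, and the generated syntax would still have to be run in SPSS.

```python
def widen_syntax(widths, infile, outfile):
    """Build SPSS syntax that recreates each string variable at a master width.

    widths maps variable name -> target width (hypothetical example values).
    """
    names = list(widths)
    temps = ["tmp_%s" % n for n in names]  # hypothetical temporary names
    lines = ["GET FILE='%s'." % infile]
    # Move the originals out of the way.
    lines.append("RENAME VARIABLES (%s = %s)." % (" ".join(names), " ".join(temps)))
    # Declare fresh string variables at the master widths.
    lines.append("STRING %s." % " /".join("%s (A%d)" % (n, widths[n]) for n in names))
    # Copy the old values across.
    for n, t in zip(names, temps):
        lines.append("COMPUTE %s = %s." % (n, t))
    lines.append("SAVE OUTFILE='%s' /DROP=%s." % (outfile, " ".join(temps)))
    return "\n".join(lines)


print(widen_syntax({"x1": 10, "x2": 5}, "in.sav", "out.sav"))
```

Note that copying into a narrower variable cuts values off, so the master widths should be at least as large as the widest width found in any file.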

Gene Maguin


Re: adjusting string width?

ViAnn Beadle
In reply to this post by Hoover, Matthew
First, is there some reason why the string lengths are different? That
sounds suspicious to me and I would do some basic frequencies on these
variables from different files to see what's going on here.

Second, is there some primary source from which these files are being
extracted? If so, I'd step back and look at the extraction process to see
whether the problem can be solved upstream.

Third, is there some reason why the original names must be retained? Why not
just create new variables from the existing variables in the same manner and
save the whole renaming process when the files are combined? That's a fairly
easy piece of syntax to write and execute against each file. It also has the
side effect of preserving the original data, since unless you go with the
maximum string length across all the files, you run the risk of losing data
through truncation.
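
To make the truncation point concrete, here is a small sketch, assuming you have already noted each file's string widths (for example from Variable View). The file names, variable names and widths are invented for illustration. It reports the widest width per variable and flags every file whose width exceeds that of a chosen "master" file, which is where data would be cut off.

```python
def widest(widths_by_file):
    """Return {variable: largest width seen in any file}."""
    result = {}
    for widths in widths_by_file.values():
        for name, w in widths.items():
            result[name] = max(w, result.get(name, 0))
    return result


def truncation_risks(widths_by_file, master):
    """List (file, variable, width) wherever a file is wider than the master."""
    ref = widths_by_file[master]
    return sorted(
        (f, n, w)
        for f, ws in widths_by_file.items()
        for n, w in ws.items()
        if w > ref.get(n, 0)
    )


# Hypothetical widths for three files.
widths_by_file = {
    "file_a": {"name": 20, "city": 12},
    "file_b": {"name": 25, "city": 10},
    "file_c": {"name": 18, "city": 15},
}
print(widest(widths_by_file))
print(truncation_risks(widths_by_file, "file_a"))
```

Using the output of widest() as the common format avoids truncation altogether; forcing everything to file_a's widths would not.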

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Hoover, Matthew
Sent: Tuesday, December 18, 2007 8:45 AM
To: [hidden email]
Subject: adjusting string width?


Re: adjusting string width?

hillel vardi
In reply to this post by Hoover, Matthew
Shalom

A simple way to do what you need is to export all your files and reread them.
One way of doing it is to use the WRITE command.
If all your files have the same variable names, you can write all of them
using the same WRITE command.
Because the variables are strings, you should use the widest string width in
the WRITE command, and consider using trim to ensure that all strings begin in
the first column.
Here is a general example using a macro.


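set      mprint yes.

* Write each file as fixed-width text, reread it so every string is A22, then save.
define write_str  (!positional !tokens(1)) .
get      file=!quote(!concat(!1,'.sav')) .
write outfile=!quote(!concat(!1,'_n.txt'))
        /str1 to str30(30a22) .
execute .
data list file=!quote(!concat(!1,'_n.txt'))
        /str1 to str30(30a22) .
save outfile=!quote(!concat(!1,'_n.sav')) .
!enddefine .
write_str file_a .
.
.
write_str file_z .
add files    file='file_a_n.sav' /
              .
              .
             file='file_z_n.sav' .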
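
The write-and-reread idea is not specific to SPSS. As a rough illustration only, and not part of the original reply, the sketch below does the same thing in plain Python: each value is left-trimmed, padded to one common width and written as a fixed-width record, then sliced back out at that width. The function names and sample rows are hypothetical.

```python
def write_fixed(rows, width):
    """Render each row as one fixed-width record.

    Values are left-trimmed so they start in the first column of their
    field, padded to `width`, and cut off at `width` if they are longer.
    """
    return ["".join(v.lstrip().ljust(width)[:width] for v in row) for row in rows]


def read_fixed(records, n_fields, width):
    """Slice each record back into n_fields values of the common width."""
    return [
        [rec[i * width:(i + 1) * width].rstrip() for i in range(n_fields)]
        for rec in records
    ]


rows = [["  ab", "c"], ["defg", ""]]
records = write_fixed(rows, 5)
print(records)
print(read_fixed(records, 2, 5))
```

As in the SPSS version, the chosen width has to be the widest one needed; anything longer is silently cut when the record is written.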


Hillel Vardi
BGU

Re: adjusting string width?

Katkowski, David
In reply to this post by ViAnn Beadle
If you paste this Python script and follow the instructions for setting the dir and length variables, you can set the length of all string variables to be equal for all SPSS files in a given directory. That should help your merge:


BEGIN PROGRAM.
"""This script sets the length of all string variables to the value of `length`
for all .sav files in the specified directory. It then saves each file with the
same name, but with "_rfmt" appended to it. This was done to preserve the
original files. The only difference between the original and modified file is
that all string variables will be at the end, rather than the beginning, of the
dataset.
You have to set the following variables:
dir - directory where your files reside
length - number of characters for all string variables
"""
import os, spss, spssaux

dir = "c:/"
length = "150"
for file in os.listdir(dir):
    if file.lower().endswith('.sav'):
        spss.Submit("get file '%s'." % os.path.join(dir, file))
        # Collect the names of the string variables (VariableType > 0).
        varDict = spssaux.VariableDict()
        strvarlist = [e.VariableName for e in varDict if e.VariableType > 0]
        for e in strvarlist:
            # Copy into a new variable of the target width, then swap names.
            spss.Submit("""String %s_rfmt (A%s).
            compute %s_rfmt = %s.
            exe.
            delete variables %s.
            rename variables (%s_rfmt = %s).""" %(e,length,e,e,e,e,e))
        spss.Submit("save outfile '%s_rfmt.sav'." % os.path.join(dir, file[0:-4]))
END PROGRAM.

-David


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of ViAnn Beadle
Sent: Tuesday, December 18, 2007 11:25 AM
To: [hidden email]
Subject: Re: adjusting string width?


Re: adjusting string width?

Katkowski, David
In reply to this post by ViAnn Beadle
Of course, to run the Python script, you must have the Python integration plug-in installed and have downloaded the spss and spssaux modules.

-David


Re: adjusting string width?

Peck, Jon
In reply to this post by Hoover, Matthew
SPSS 16 has a new ALTER TYPE command that makes it much easier to change string widths than it used to be.  And with multiple dataset capabilities, you can just open the other dataset, run ALTER TYPE, and do the merge with that open dataset.

You could also write a Python program to open the dataset, adjust the widths, and issue the merge syntax.
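
As a rough sketch of what such a program might generate (an untested illustration, not code from this thread), the function below builds the syntax text for opening each file as a named dataset, running ALTER TYPE on the string variables, and merging with ADD FILES. The dataset names ds0, ds1, ..., the file names and the widths are all hypothetical; inside SPSS the resulting text would be passed to spss.Submit.

```python
def alter_and_merge_syntax(files, widths):
    """Build SPSS syntax: open each file, set common string widths, merge.

    files  - list of .sav paths (hypothetical)
    widths - {variable name: target width}, ideally the widest needed
    """
    lines = []
    for i, path in enumerate(files):
        lines.append("GET FILE='%s'." % path)
        lines.append("DATASET NAME ds%d." % i)  # keep this dataset open
        for var, w in widths.items():
            lines.append("ALTER TYPE %s (A%d)." % (var, w))
    merge = " ".join("/FILE=ds%d" % i for i in range(len(files)))
    lines.append("ADD FILES %s." % merge)
    return "\n".join(lines)


print(alter_and_merge_syntax(["a.sav", "b.sav"], {"name": 22}))
```

Generating the text separately from submitting it makes it easy to inspect the syntax before anything touches the data.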

Regards,
Jon Peck


Re: adjusting string width?

Richard Ristow
In reply to this post by Hoover, Matthew
At 10:44 AM 12/18/2007, Hoover, Matthew wrote:

>I have different data sets that I need to merge into one file.
>This merge is adding records and not variables, so each data file
>has the same variable names.  The problem is that most of the
>variables are string format and there are different widths for these
>variables in the corresponding data sets.  As you know, to "add
>cases" the widths of the string variables have to match when doing the merge.

Well, you've had a number of solutions. They're workable; but, I'd
say, the best are clumsy.

So, to repeat an oft-repeated complaint:
I think that making different-length strings incompatible for ADD
FILES, et al, was a mistake, and I'm sorry it's never been corrected.

A natural alternative is to give the result variable the length of
the longest input. If there's worry this will be confusing (I think
it rarely will be), retain the present behavior by default, and add a
subcommand to make different-length strings compatible as described.
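
To spell out the proposal, here is a toy model in plain Python. It is purely illustrative and says nothing about how SPSS is implemented: by default a width mismatch is an error (the present behavior), and an opt-in flag gives the result variable the width of the longest input. The function name, the flag and the sample data are hypothetical.

```python
def add_cases(batches, allow_mixed_widths=False):
    """Stack batches of cases; each batch is (widths, rows).

    widths maps variable name -> declared string width; rows are dicts.
    With allow_mixed_widths=False a width mismatch raises ValueError.
    With allow_mixed_widths=True the result takes the longest width.
    """
    merged = dict(batches[0][0])
    for widths, _ in batches[1:]:
        for name, w in widths.items():
            if name in merged and merged[name] != w and not allow_mixed_widths:
                raise ValueError(
                    "width mismatch for %s: %d vs %d" % (name, merged[name], w)
                )
            merged[name] = max(w, merged.get(name, 0))
    # Pad every value to the merged width, as a fixed-width string would be.
    rows = [
        {n: row.get(n, "").ljust(merged[n]) for n in merged}
        for _, batch_rows in batches
        for row in batch_rows
    ]
    return merged, rows


a = ({"city": 5}, [{"city": "Paris"}])
b = ({"city": 8}, [{"city": "Helsinki"}])
print(add_cases([a, b], allow_mixed_widths=True))
```

Nothing is truncated under the longest-width rule, which is why it seems the natural choice.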

And it comes up pretty often.

At 11:24 AM 12/18/2007, ViAnn Beadle wrote:

>First, is there some reason why the string lengths are different?
>That sounds suspicious to me and I would do some basic frequencies
>on these variables from different files to see what's going on here.

Well, I'd check that myself, unless I knew all the data sources
pretty well. But I wouldn't be terribly suspicious.

Many data-entry programs and transmission routes (Excel, for one)
create SPSS string variables with the longest length observed in the
data. Very often that differs between batches of the data.

It's driven me bats.

Sometimes, in Excel for one, you can insert a 'template' line with
dummy values having the longest length expected, but it's a pain:
you're only too likely to guess too small for one or more of the
variables; and then, if you have to change the template, you have to
change it exactly the same way in every input spreadsheet.

Sigh. This has been an unpaid non-profit rant. We now return you to
your regularly scheduled list traffic.
