Some .sav files triple string vars


ChrisKeran
I use MATCH FILES often and haven't figured out why some of my .sav files
automatically triple the width of the "BY" variable, in my case MemberID (a
string var). None of the .sav files I'm using is more than a few years, or
several SPSS versions, old. I am on v26, but some of my colleagues are on v24
and v23.

The problem this creates, of course, is that the MATCH FILES command gives
an error: although MemberID had the same width (say A6) in all the .sav
files, by the time the error is reported some of them have MemberID at A18,
so it hoses the match.

FYI, as of right now, the status bar shows Unicode: ON and my syntax files
all get this comment (automatically) at the top "* Encoding: UTF-8" when I
save the syntax file. I know about this tripling for code page and UTF-16
text data files, but these are all .sav files.

Please enlighten my tired brain!



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Some .sav files triple string vars

Jon Peck
The sav files are not in Unicode. Because you are running in Unicode mode, they are converted to Unicode when they are opened, and the string widths have to be tripled to guarantee no data loss. ALTER TYPE can shrink them back to the minimum afterwards.
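
To see where the factor of three comes from, here is a rough sketch in Python. It is not what SPSS runs internally. It assumes the older files were written in the Windows-1252 code page, and the sample IDs are made up. String widths are counted in bytes, and a character that takes one byte in the code page can take up to three bytes in UTF-8, so an A6 variable needs A18 in the worst case.

```python
# Illustration only: compare the byte length of a string in a code page
# with its byte length in UTF-8.
# Assumption: the legacy code page is Windows-1252 (cp1252).

def byte_widths(value, codepage="cp1252"):
    """Return (bytes in the code page, bytes in UTF-8) for a string."""
    return len(value.encode(codepage)), len(value.encode("utf-8"))

# Hypothetical MemberID-like values, each six characters long.
samples = ["AB1234", "ÅB1234", "€€€€€€"]

for s in samples:
    cp_bytes, utf8_bytes = byte_widths(s)
    print(f"{s!r}: {cp_bytes} bytes in cp1252, {utf8_bytes} bytes in UTF-8")
```

Plain ASCII IDs stay at six bytes either way, which is why the tripled width can normally be reduced again once the file has been converted.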

If you have a bunch of such sav files, the easiest way to deal with this is to use the STATS ADJUST WIDTHS extension command to synchronize the widths. It takes a batch of files specified with a wildcard expression such as c:/mydata/*.sav and an adjustment method, and fixes all the files.

Alternatively, you can use ALTER TYPE to fix these manually, but that is error-prone and tedious.

The STAR JOIN procedure does not require equal widths for strings across files, but it has its own problems.

From the STATS ADJUST WIDTHS help...

STATS ADJUST WIDTHS VARIABLES=variables*
WIDTH=MAX** or MIN or FIRST
MAXWIDTH=integer
DSNAMEROOT=rootname
EXACTWIDTH=integer

/FILES list of file names and wildcards and dataset names*

/OUTFILE RESAVE=NO** or YES
SUFFIX="string"
DIRECTORY="directory"
OVERWRITE=NO** or YES
CLOSE=NO** or YES

/HELP


On Thu, Jan 30, 2020 at 1:22 PM ChrisKeran <[hidden email]> wrote:





--
Jon K Peck
[hidden email]
