DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Art Kendall
A client sent me data which was supposed to be the same in both .xlsx and .xls.

When I compared the two, I was surprised to find that there were "differences" for some long string variables. However, to the eye the strings appeared identical!

I then opened the .xlsx file in Excel and saved it as .xls.

When I compared the .xls file that was sent and the .xls file that I saved, no differences were found.

So, be aware there might be apparent differences from DATASET COMPARE when "identical" data comes from two versions of Excel.
Art Kendall
Social Research Consultants

Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Jon K Peck
Curious.  Try changing the string format to AHEX.  That will let you see non-printing character codes.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/DATASET-COMPARE-same-data-in-xlsx-and-xls-show-differences-that-eyeball-as-identical-tp5727403.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Art Kendall
In reply to this post by Art Kendall
The strings are padded with spaces (hex "20").
For SOME strings, the second padding character is a linefeed (hex "0A").
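
One way to see how many cases are affected is to flag the strings that contain a linefeed. This is only a sketch: Address stands in for one of the long string variables, and HasLF is a hypothetical flag name that is not part of the original data.

```spss
* Address is a stand-in variable name and HasLF is a hypothetical flag.
* HasLF is 1 when the string contains a linefeed (decimal 10) and 0 otherwise.
compute HasLF = (char.index(Address, string(10,PIB1)) > 0).
frequencies variables=HasLF.
```

The frequency table then shows how many strings carry the hidden character in each file.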
Art Kendall
Social Research Consultants

Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Art Kendall
One workaround for a single string variable would be

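* Build a one-character string holding the linefeed (decimal 10, hex 0A).
string LineFeed(a1).
compute LineFeed = string(10,PIB1).
* Replace each linefeed in Address with a space.
compute Address = replace(Address,LineFeed," ").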

This can easily be put in a do repeat or loop for a single unwanted character.
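
A do repeat version might look like the sketch below. It assumes LineFeed has already been created as above; City and Notes are hypothetical names standing in for whatever other long string variables need cleaning.

```spss
* City and Notes are hypothetical variable names used for illustration.
* LineFeed is assumed to exist already and to hold the linefeed character.
do repeat v = Address City Notes.
compute v = replace(v,LineFeed," ").
end repeat.
execute.
```

Each variable on the do repeat list gets the same replacement, so adding another variable only means extending that list.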

If I were not under a time crunch, I would use this as an opportunity to get up to speed with regex.



Art Kendall
Social Research Consultants

Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Jon K Peck
No need to compute a casewise variable.
compute Address = replace(Address, string(10, pib1),' ').

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Art Kendall
Yes, I considered doing it in one compute.

However, I am not sure I would remember what it was for.

Also, others who use the archives might find it more readable.

Art Kendall
Social Research Consultants

Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Richard Ristow
At 12:41 PM 9/26/2014, Art Kendall wrote:
>Yes I considered doing it in one compute.
>
>However, I am not sure I would remember what it was for.
>
>Also, others who use the archives might find it more readable.

Scratch variable?
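
A scratch-variable version might look like the sketch below. It keeps the readable name but, because scratch variables (names beginning with #) are not saved in the active dataset, leaves no extra column behind. The name #LineFeed is an assumption for illustration.

```spss
* #LineFeed is a hypothetical scratch variable and is not kept in the dataset.
string #LineFeed(a1).
compute #LineFeed = string(10,PIB1).
compute Address = replace(Address,#LineFeed," ").
execute.
```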


Re: DATASET COMPARE same data in xlsx and xls show differences that eyeball as identical

Art Kendall
A scratch variable would also work.
Art Kendall
Social Research Consultants