Merging datasets -


Merging datasets -

J.D. Haltigan
Does anyone know why, when you merge datasets in v25 (say, adding one or
more variables from another stored file), the engine always opens the keyed
file as another dataset? If memory serves, it used to be that just the
desired variables and/or cases were brought into the current working
dataset without the keyed dataset opening.

Thanks in advance for any insight.



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Merging datasets -

Jon Peck
In some cases, the UI generates code for the STAR JOIN command, or you can write this directly in syntax. STAR JOIN does not require the cases to be sorted, which means that it needs access to all the cases in each lookup file, so it would have to open them. STAR JOIN has another nice feature: string key variables do not have to be the same width. (The STATS ADJUST WIDTHS extension command can synchronize the widths for MATCH FILES and ADD FILES.)

You can, of course, still use MATCH FILES and ADD FILES to the same effect. They still work the way they always have and may give superior performance in some cases.
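
To make the sorting point concrete, here is a minimal Python sketch of the two general lookup strategies. It is a conceptual illustration only, not SPSS's actual implementation: the function names `sorted_merge_lookup` and `hash_lookup` and the sample data are invented for this example. The first function, in the spirit of a sorted table lookup, reads both inputs once in key order and never needs more than one lookup row at a time. The second accepts cases in any order, but to do so it first has to load every row of the lookup table.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Case = Tuple[int, str]                   # (key, value) from the active dataset
Merged = Tuple[int, str, Optional[str]]  # (key, value, looked-up value or None)


def sorted_merge_lookup(cases: Iterable[Case],
                        table: Iterable[Case]) -> Iterator[Merged]:
    """Table lookup that assumes BOTH inputs are sorted by key and that
    table keys are unique. The table is read once, one row at a time."""
    table_iter = iter(table)
    current = next(table_iter, None)
    for key, value in cases:
        # Advance the table until its key catches up with the case key.
        while current is not None and current[0] < key:
            current = next(table_iter, None)
        found = current[1] if current is not None and current[0] == key else None
        yield key, value, found


def hash_lookup(cases: Iterable[Case],
                table: Iterable[Case]) -> Iterator[Merged]:
    """Table lookup that works on unsorted input. The price is that the
    whole lookup table is read into memory before the first case is matched."""
    index: Dict[int, str] = dict(table)
    for key, value in cases:
        yield key, value, index.get(key)


if __name__ == "__main__":
    lookup = [(1, "x"), (2, "y"), (3, "z")]
    sorted_cases = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    unsorted_cases = [(4, "d"), (1, "a"), (2, "c"), (2, "b")]
    print(list(sorted_merge_lookup(sorted_cases, lookup)))
    print(list(hash_lookup(unsorted_cases, lookup)))
```

Feeding unsorted cases to `sorted_merge_lookup` would silently miss matches, which is why the sorted approach needs a sort step first.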


Re: Merging datasets -

Rick Oliver
I don't know if the UI generates STAR JOIN syntax anymore (v25). Generally speaking, performance with MATCH FILES is better, even when you take into account the time required to sort all the files by key variables and adjust the widths of string keys. But if you are writing your own syntax, and performance is not a major issue, STAR JOIN syntax can be simpler to write, particularly if you are familiar with SQL. It's not SQL, but it was designed to be similar to SQL.


Re: Merging datasets -

J.D. Haltigan
So it must be that if you do not request cases to be sorted before the
merge, the UI invokes STAR JOIN syntax? In all cases where I merge with the
UI (it's just a more rote method for me) and do NOT request that cases be
sorted, the keyed file is opened during the merge process. It would seem,
then, that to eliminate this behavior I would just need to request that
cases be sorted, which would presumably make the UI generate MATCH FILES
syntax? Whatever the case may be, I will try it nonetheless.




Re: Merging datasets -

David Marso-2
In reply to this post by J.D. Haltigan
Perhaps it's time to learn and consistently use syntax (beginning with the Paste button).
Otherwise you have no audit trail, and good luck reproducing your results.
Oh yeah, journal file LOL. Try rebuilding anything reliable from that.
Anyone following you is SOL.


Re: Merging datasets -

Jon Peck
Don't forget the log blocks in the Viewer and the Notes tables, which include all the procedure syntax.


Re: Merging datasets -

David Marso-2
In reply to this post by J.D. Haltigan
Is there a Syntax Harvester extension?


Re: Merging datasets -

Jon Peck
There is not, as far as I know, but this would be pretty easy to put together using the Python scripting APIs as long as everything is in a single Viewer window or file. (Error messages in the logs would require a little thought.) What would be the advantage over using the journal file?

The new UI as yet has no scripting capabilities, but it does integrate the GUI and syntax better than the old UI.


Re: Merging datasets -

David Marso-2
In reply to this post by J.D. Haltigan
"What would be the advantage over using the journal file?"
I looked at my jnl file and it starts in June 2015 ;-)


Re: Merging datasets -

Jon Peck
My current V25 journal file, which starts in late 2017, is over 53,000 lines long. But it captures everything, and it's possible that I would need to go that far back to reconstruct some file or analysis.

In theory I could select all the contents and re-execute them, except that the session dates are not properly commented out, although error messages are.

Of course, you could set preferences to overwrite rather than append.
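
As a rough illustration of working around the uncommented session dates, the Python sketch below turns such lines into syntax comments so the rest of the journal can be re-run. Everything specific here is an assumption: the default pattern `SESSION_MARKER` guesses that a session line starts with a weekday and month abbreviation (for example "Mon Sep 24 ..."), and the function name `comment_out_markers` is made up for this example. Check the pattern against your own journal file before relying on it.

```python
import re
from typing import Iterable, List, Pattern

# HYPOTHETICAL pattern: assumes session markers begin with a weekday and a
# month abbreviation, e.g. "Mon Sep 24 11:57:00 2018". Adjust to your journal.
SESSION_MARKER = re.compile(r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z]{2} ")


def comment_out_markers(lines: Iterable[str],
                        marker: Pattern[str] = SESSION_MARKER) -> List[str]:
    """Return the journal lines with every line matching `marker` turned
    into a syntax comment: a leading '*' and a closing period, so the
    comment ends on that line and does not swallow the next command."""
    cleaned = []
    for line in lines:
        text = line.rstrip("\n")
        if marker.match(text):
            text = "* " + text
            if not text.endswith("."):
                text += "."
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = [
        "Mon Sep 24 11:57:00 2018\n",
        "GET FILE='a.sav'.\n",
        "FREQUENCIES VARIABLES=x.\n",
    ]
    print("\n".join(comment_out_markers(sample)))
```

Lines that do not match the pattern pass through unchanged, so existing commands and already-commented error messages are left alone.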


Re: Merging datasets -

J.D. Haltigan
In reply to this post by Jon Peck
To be sure, I do use detailed syntax from the editor, for the reasons you
mention, for all substantive iterations of analyses. Sometimes the GUI is just
motorically more efficient...particularly if your corneas are tired of
gawking at syntax lines. Pasting is good wisdom in this case nonetheless.




Re: Merging datasets -

Jon Peck
For the benefit of your corneas, you can change the font size in the syntax editor via View > Fonts.
