Developing new features for IBM SPSS


Wilhelm Landerholm

I am currently developing a series of functions for IBM SPSS and am now looking for help identifying the features that you, as a user, are missing. Everything you wrestle with, from general things like search problems and data cleansing problems to more unusual troubles, is of interest.

 

Please, let me know what you're missing.

Thanks in advance.

 

Wilhelm Landerholm

[hidden email]

 

 


Re: Developing new features for IBM SPSS

Garry Gelade

Wilhelm

 

I’d like to see model specification in the regression procedure updated. In GLM and GEE you can specify factors without creating dummy variables, and you can specify interactions without having to compute them. The same ability in plain regression would be helpful.

 

Garry Gelade
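
To make the manual step concrete, here is a small Python sketch (not SPSS syntax) of the preparation that plain regression currently requires: dummy-coding a factor against a reference level and multiplying two columns to form an interaction term. The function names, variable names, and data are invented for illustration only.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per non-reference level of a factor."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


def interaction(column_a, column_b):
    """Return the element-wise product of two columns (an interaction term)."""
    return [a * b for a, b in zip(column_a, column_b)]


# Hypothetical data: a three-level factor and a numeric covariate.
region = ["north", "south", "west", "south"]
age = [31, 45, 52, 28]

# "north" is the reference level, so it gets no indicator column.
dummies = dummy_code(region, reference="north")

# The interaction column that would otherwise be computed by hand.
south_by_age = interaction(dummies["south"], age)
```

GLM and GEE do this expansion internally from the factor and interaction specification, which is the convenience being requested for plain regression.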

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Wilhelm Landerholm
Sent: 15 September 2011 08:18
To: [hidden email]
Subject: Developing new features for IBM SPSS

 


 

 


Re: Developing new features for IBM SPSS

Christopher Stride
A few things of varying complexity:
  - computation of icc(1) statistics within the mixed models options
  - ditto computation of rwg / rwg(j) statistics for agreement within subgroups
  - option to plot random effects in mixed models (like the plot(ranef)
option in R)
  - greater ease of producing stacked barcharts adding up to 100% split
by subgroups
  - remove the necessity of having variables of certain types to do
certain procedures that has started to creep in in the last few releases
e.g. graphing and particularly the new non-parametric tests. SPSS cannot
100% accurately assign measurement level, so all this does is create an
extra task for the user in having to reassign measurement level before
using these options
  - give generalised linear mixed models the old style SPSS output, or
at least the option to select it. I had been looking forward to not
having to migrate to other packages for this methodology, but I find it
unusable with the new style presentation of results in v19
  - make R compatibility not confined to a particular version of R
  - Bring back the right click facility in dialog boxes that gave brief
info/help - this was so useful for teaching students

Is that enough to keep you busy? :-)
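
For readers unfamiliar with the first two statistics in the list, the following Python sketch shows one common way they are computed. It assumes the one-way ANOVA form of ICC(1) with equal group sizes, and the single-item r_wg with a uniform null distribution whose expected variance is (A^2 - 1)/12 for A response options. The function names are hypothetical, and other definitions of both statistics exist.

```python
from statistics import mean, variance


def icc1(groups):
    """ICC(1) from one-way ANOVA mean squares; this sketch assumes equal group sizes."""
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group and within-group mean squares.
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def rwg(ratings, n_options):
    """Single-item r_wg: 1 minus observed variance over the uniform-null variance."""
    expected_variance = (n_options ** 2 - 1) / 12
    return 1 - variance(ratings) / expected_variance
```

For example, `icc1([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` treats each inner list as one group, and `rwg([3, 4, 4, 5], 5)` treats the list as one group's ratings on a five-point item.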

Garry Gelade said the following on 16/09/2011 14:56:



--
Dr Chris Stride, C. Stat, Statistician, Institute of Work Psychology,
University of Sheffield
Telephone: 0114 2223262
Fax: 0114 2727206

“Figure It Out”
Statistical Consultancy and Training Service for Social Scientists

Visit www.figureitout.org.uk for details of my consultancy services, and
forthcoming training courses, which are also available on an in-house basis:
- Data Management using SPSS syntax
- Multiple Regression using SPSS
- Multilevel Modelling using SPSS
- Structural Equation Modelling using MPlus

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Developing new features for IBM SPSS

Jon K Peck
Please bear in mind that Wilhelm Landerholm is not an IBM employee or part of the SPSS group in IBM, so he cannot make changes to the SPSS products themselves. That is not to say that he can't offer useful supplementary functions (the more the better), but several of the suggestions in this thread could only be implemented by IBM staff.

Suggestions directed to SPSS Development can be sent to [hidden email].  That address is also linked in the Using This Site section of the SPSS Community site (www.ibm.com/developerworks/spssdevcentral).

Jon Peck (no "h")
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621




From:        Dr C B Stride <[hidden email]>
To:        [hidden email]
Date:        09/16/2011 08:49 AM
Subject:        Re: [SPSSX-L] Developing new features for IBM SPSS
Sent by:        "SPSSX(r) Discussion" <[hidden email]>






Re: Developing new features for IBM SPSS

Paul Oosterveld
In reply to this post by Wilhelm Landerholm
Currently I am looking for a way to work around the 4000 files per session
limit. I was surprised to find there was such a limit.


Re: Developing new features for IBM SPSS

Art Kendall
I have not come across this. However, I have never needed that many files open at once. Is that what you mean: 4000 files open at once?
Please explain how you came across this. What kind of message did you get?


Also perhaps if you explain the approach that results in hitting this limit, some list members would be able to suggest a workaround.


Art Kendall
Social Research Consultants


On 9/23/2011 7:35 AM, Paul Oosterveld wrote:

Re: Developing new features for IBM SPSS

Paul Oosterveld
In reply to this post by Wilhelm Landerholm
I have an Excel file with Twitter messages, and for a text mining application I
need to write each text to a separate file. So I do not have 4000 files open
simultaneously; I write them one at a time.
I get a message something like: the SPSS processor is terminating due to a
catastrophic error. Too many files have been defined for one session. The
limit is 3999 files.


Re: Developing new features for IBM SPSS

Art Kendall
Are you using something other than SPSS for your text mining? Are separate files the only way some other software separates text entities?

Is this something you need to do occasionally or is it for a production environment?
What kind of files are you producing? .txt .sav ?

In many OSs a file is considered "in use" until the application (e.g., SPSS) actually releases it. Check whether this is a limitation imposed by the OS.
So you may have 4000 output (not listing output) files open until they are released.
Possibly a HOST command that closes each file, so that the system knows SPSS is finished with it, would work? How else would the OS "know" you were not going to continue writing to it?

In my examples folder I have a syntax file that shows an instance where an explicit EXECUTE was needed. I wrote a text file, but it was only an entry in the folder directory; it had no content when viewed in Windows until I added an EXECUTE.

Remember that WRITE and XSAVE do nothing until an implicit or explicit EXECUTE forces a data pass.

data list list/ mystring (a30).
begin data
"some text for testing"
end data.
dataset name original.
write outfile = 'c:\project\teststring.txt'
 /mystring.
*this is the oddball situation of needing an explicit EXECUTE.
execute.

GET DATA
  /TYPE=TXT
  /FILE="C:\project\teststring.txt"
  /DELCASE=LINE
  /DELIMITERS=""
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=1
  /IMPORTCASE=ALL
  /VARIABLES=
  V1 A30.
CACHE.
EXECUTE.
DATASET NAME readback WINDOW=FRONT.



HTH

Art Kendall
Social Research Consultants

On 9/23/2011 8:02 AM, Paul Oosterveld wrote:

Re: Developing new features for IBM SPSS

Paul Oosterveld
In reply to this post by Wilhelm Landerholm
On Fri, 23 Sep 2011 08:48:17 -0400, Art Kendall <[hidden email]> wrote:


It seems that my previous reply was not posted (if it was, my apologies). I
have reworked my syntax so that it can be run independently of my data file.
I want to write separate text files and do further analysis in R. I use a
macro for writing these files (I know, Jon, shame on me). I was aware of the
need for an EXECUTE after a WRITE. Maybe I need an additional command to
close file handles. I use SPSS 18 on Windows XP.
The following syntax makes my SPSS crash if the maximum in the loop is
increased from 10 to 4000.

*this example only works if c:\temp\textfiles\ exists.
INPUT PROGRAM.
LOOP #i=1 TO 10.
- COMPUTE id=#i.
- string textvar(a10).
- compute textvar="abcdefg".
- END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
save outfile "c:\temp\textfile.sav".

*macro to write individual files.
*got this trick from Raynald Levesque's site.
DEFINE !writefile (id=!TOKENS(1)).
GET FILE='c:\temp\textfile.sav'.
SELECT IF (Id=!id).
write outfile !quote(!concat("C:\temp\textfiles\", !id, ".txt"))/textvar.
exe.
!ENDDEFINE.

*make file with macro calls.
GET FILE='c:\temp\textfile.sav'.
WRITE OUTFILE='c:\temp\macrocalls.sps' /'!writefile Id=', Id, ".".
EXECUTE.
*example of the macro call file line:
!writefile Id=    1.00.

INCLUDE 'c:\temp\macrocalls.sps'.
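
For comparison, here is a minimal Python sketch of the same task done outside SPSS: each text goes to its own file, and every file is closed before the next one is opened, so open handles should not accumulate during the run. The function name `write_texts_to_files`, the `root_name` prefix, and the output directory are assumptions made for illustration; they are not part of the syntax above.

```python
from pathlib import Path
import tempfile


def write_texts_to_files(texts, out_dir, root_name="text_"):
    """Write each text to its own numbered .txt file and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for i, text in enumerate(texts, start=1):
        target = out_path / f"{root_name}{i}.txt"
        # The with-block closes the file as soon as the text is written.
        with target.open("w", encoding="utf-8") as handle:
            handle.write(text + "\n")
        written.append(target)
    return written
```

Calling `write_texts_to_files(messages, "c:/temp/textfiles")` with a list of 4000 strings would be the rough equivalent of the loop that fails above, assuming the messages have first been exported from the Excel file by some other means.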


Re: Developing new features for IBM SPSS

David Marso
Administrator
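It starts in Excel, correct?
Use VBA in Excel to do this instead of SPSS.
Pseudocode:
Get your data into an array named text (I don't have the details off the top of my head;
see Line Input etc.).

Dim F As Integer
Dim I As Long
' text() is assumed to already hold the messages (see above).
For I = 1 To 4000
  F = FreeFile
  ' Double quotes delimit VBA strings; a single quote starts a comment.
  Open "C:\whatever\RootName_" & I & ".txt" For Output As #F
  Print #F, text(I)
  Close #F
Next I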

That Macro from Raynald is painful to look at.  How terribly inefficient!!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"