STATS TEXTANALYSIS


STATS TEXTANALYSIS

Jon Peck
I have updated the beta version of STATS TEXTANALYSIS.  You can get it here

The updates include much more control over stemming, more language support for stopwords and stemming, the ability to produce variables with the stemmed text and some other items.

--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: STATS TEXTANALYSIS

Kirill Orlov
Does it do stemming/lemmatization for Russian?


15.04.2021 23:41, Jon Peck wrote:
I have updated the beta version of STATS TEXTANALYSIS.  You can get it here

The updates include much more control over stemming, more language support for stopwords and stemming, the ability to produce variables with the stemmed text and some other items.



Re: STATS TEXTANALYSIS

Jon Peck
Russian is supported for stopwords and stemming but not in the spell checker, although if you have a Russian file of words, you can have it use that.  I can't say how good a job it does.
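
To make the word-file option more concrete, here is a minimal sketch in plain Python of checking tokens against a custom word list. It is not the code STATS TEXTANALYSIS uses. The function names `load_wordlist` and `unknown_words` and the file layout (UTF-8, one word per line) are assumptions made for illustration.

```python
from pathlib import Path


def load_wordlist(path):
    """Read a UTF-8 file with one word per line into a lowercase set."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def unknown_words(tokens, wordlist):
    """Return the tokens, in their original order, that are not in the word list."""
    return [t for t in tokens if t.lower() not in wordlist]
```

With a Russian word file, `unknown_words(tokens, load_wordlist("words.txt"))` would list the tokens a dictionary-based checker would flag. The file name here is only a placeholder.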

On Fri, Apr 16, 2021 at 2:09 AM Kirill Orlov <[hidden email]> wrote:
Does it do stemming/lemmatization for Russian?


--
Jon K Peck
[hidden email]


Re: STATS TEXTANALYSIS

Art Kendall
Do you have syntax to get 1) nltk 2) help for nltk?

I tried this:
STATS TEXTANALYSIS /help.
The nltk module is required in order to use this module.
    install using
    pip -m install nltk
    Then use nltk.download() to add specific packages including
    at least stopwords and names

    If numpy errors occur, update the numpy module using
    pip install  numpy==1.19.5
Extension command  STATS_TEXTANALYSIS  could not be loaded. The module or a
module that it requires may be missing, or there may be syntax errors in it.

My attempt:
begin program Python3.
pip -m install nltk
end program.
  File "<string>", line 2
    pip -m install nltk
           ^
SyntaxError: invalid syntax




-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/


Re: STATS TEXTANALYSIS

Jon Peck
You run pip from a command prompt, outside Python, but it has to install into the Python installation you are using with Statistics.  See the dialog help.
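
As a rough sketch of what that means: pip is a command-line tool, so typing it as a statement inside BEGIN PROGRAM produces a SyntaxError. The Python code below only builds the command line that runs pip under one specific Python installation, so that the package lands in that installation. The helper name `pip_install_command` and the folder `C:\Python38` are hypothetical. Substitute the Python location that Statistics is configured to use.

```python
from pathlib import PureWindowsPath


def pip_install_command(python_home, package):
    """Build the argument list that runs pip with the interpreter in python_home."""
    interpreter = PureWindowsPath(python_home) / "python.exe"
    # "python -m pip" ties pip to this interpreter, regardless of
    # which pip happens to come first on PATH.
    return [str(interpreter), "-m", "pip", "install", package]


if __name__ == "__main__":
    # Hypothetical location; replace it with your own Python folder.
    cmd = pip_install_command(r"C:\Python38", "nltk")
    print(" ".join(cmd))
```

The printed line can be pasted into a command window. From a regular Python script, the same list could be passed to `subprocess.run(cmd, check=True)`.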

On Fri, Apr 16, 2021 at 9:16 AM Art Kendall <[hidden email]> wrote:
Do you have syntax to get 1) nltk 2) help for nltk?

--
Jon K Peck
[hidden email]


Re: STATS TEXTANALYSIS

Art Kendall
https://phoenixnap.com/kb/install-pip-windows
has us get Python 3.9.4 from the Windows Store. Is it an inconsistency if
SPSS has 3.7?

Then it says to download get-pip.py.

But I do not know what folder to put that file in.

Now it appears that I have Python 3 in three places:
1) wherever the Windows Store put it
2) C:\Python37
3) C:\Program Files\IBM\SPSS Statistics\Python3?

If it goes within SPSS, do I put it in
C:\Program Files\IBM\SPSS Statistics\Python3\Lib
or under one of the other folders under Lib?



-----
Art Kendall
Social Research Consultants

Re: STATS TEXTANALYSIS

Art Kendall
Also, I have a fourth Python 3 at
C:\Users\Art\AppData\Roaming\IBM\SPSS Statistics\One\Python39\Lib\site-packages
which has a pip.py.

From the date on that file, I think it came with the SPSS 28 beta.



-----
Art Kendall
Social Research Consultants

Re: STATS TEXTANALYSIS

Jon Peck
In reply to this post by Art Kendall
No, no.  Pip is part of the Python distribution for Windows.  You don't need to install it.  The V28 beta uses Python 3.9, but V27 uses 3.8, which is installed with Statistics.  However, with V27, it is still an unregistered Python.  V28 will have a new process that should make this a bit easier.

Here is what I said in the Help.
  • Make sure that you have a registered Python 3 distribution matching the Statistics version you are using. For Statistics version 27, that would be Python 3.8. If you don't have this, go to Python Software Foundation and install from there. Don't install this over the distribution installed with Statistics. After installing it, go to Edit > Options > Files in Statistics and set this location for Python 3.
  • Open a command window, cd to the location of the Python installation, and install nltk and pyspellchecker from the PyPI site:
    pip install nltk
    pip install pyspellchecker
  • Start Python from that location and run this code.
    import nltk
    nltk.download()
    This will display a table of items you can add to your installation. Select at least names and stopwords.
  • Optionally go to spelling dictionary
    as mentioned above and extract the words.txt file from words.zip.
  • Install the SPSSINC TRANS extension command via the Statistics Extensions > Extension Hub menu.
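
One way to check the result of these steps is the small Python sketch below, run with the same interpreter that Statistics points to. It reports the interpreter's folder and version and lists which of the required modules cannot be found. The helper names `missing_modules` and `version_matches` are made up for this example, and nothing here is part of STATS TEXTANALYSIS itself.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def version_matches(major, minor):
    """True if the running interpreter is exactly Python major.minor."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    print("Python folder:", sys.prefix)
    print("Is Python 3.8:", version_matches(3, 8))
    # pyspellchecker is imported under the module name "spellchecker".
    print("Missing:", missing_modules(["nltk", "spellchecker"]))
```

An empty "Missing" list suggests the pip installs went into the right Python. Once nltk is importable, `nltk.download("stopwords")` and `nltk.download("names")` fetch those two items without going through the selection window.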

On Fri, Apr 16, 2021 at 11:10 AM Art Kendall <[hidden email]> wrote:
https://phoenixnap.com/kb/install-pip-windows
has us get Python 3.9.4 from the Windows Store. Is it an inconsistency if
SPSS has 3.7?

--
Jon K Peck
[hidden email]
