sqlite

sqlite

Albert-Jan Roskam
I would like to write (large) SPSS files to an SQLite database so I can access them using the sqldf package from R. Is this possible? Or should I simply write CSV files and use those to populate the SQLite database with sqldf? The problem with .sav files and R's read.spss is that I cannot FIRST keep only the needed variables and THEN 'do stuff', because R crashes with large data files.
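[A minimal sketch of the CSV route, assuming the export ends up as a hypothetical survey.csv with a header row of variable names; the file, table, and column names below are made up for illustration. Python's standard csv and sqlite3 modules are enough to load only the needed columns into an SQLite file that sqldf/RSQLite can then query.]

```python
# Sketch: load only the needed columns of a CSV export into SQLite.
# "survey.csv", "survey.db", the table name and the column list are
# illustrative assumptions, not names from this thread.
import csv
import sqlite3

KEEP = ["id", "age", "region"]          # the variables you actually need

con = sqlite3.connect("survey.db")
con.execute("CREATE TABLE IF NOT EXISTS survey (id TEXT, age TEXT, region TEXT)")

with open("survey.csv", newline="") as f:
    reader = csv.DictReader(f)          # first row of the export = variable names
    rows = ([row[k] for k in KEEP] for row in reader)
    con.executemany("INSERT INTO survey VALUES (?, ?, ?)", rows)

con.commit()
con.close()
```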
 
Thank you in advance!

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Re: sqlite

Jon K Peck

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



 
1. SQLite is also part of Python's standard library (the sqlite3 module), so you might find it easier to populate the database that way.

2. If you are running within SPSS, you could use the GetDataFromSPSS function to select just the variables and cases you want for R and then populate the database with them, or save a native R file. That would let you easily handle value labels, missing-value mappings, and other such metadata, since, at least with V18, the get API will do all the mappings for you. That would likely be the easiest way.

3. SQLite does not have quite the same set of data types as SPSS, so you might need to convert some things first - dates come to mind - if you have that kind of data.
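[A rough illustration of point 3, assuming the values in question are standard SPSS dates: SPSS stores date/time values as the number of seconds since October 14, 1582, while SQLite has no native date type and is usually happiest with ISO-8601 text. A minimal conversion sketch; the function name is made up.]

```python
# Sketch: convert an SPSS date value (seconds since 1582-10-14) to ISO-8601
# text before writing it to SQLite. The function name is illustrative.
from datetime import datetime, timedelta

SPSS_EPOCH = datetime(1582, 10, 14)

def spss_date_to_iso(spss_seconds):
    """Return an ISO-8601 string such as '2010-04-07T00:00:00'."""
    return (SPSS_EPOCH + timedelta(seconds=spss_seconds)).isoformat()
```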

Regards,
Jon




Re: sqlite

Albert-Jan Roskam
Hi Jon,
 
Thank you for your reply. Yes, the Python way might be the easiest. I was just hoping I could simply use SAVE TRANSLATE to populate the SQLite database.
 
GetDataFromSPSS could still be tricky if I have a large dataset which, even with the minimum number of columns, leads to a data frame that does not fit into the computer's memory. So SQLite is more scalable, although that usually doesn't matter. Moreover, no R plugin exists for my fossil SPSS version ("sigh" ;-))
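[A small sketch of why the SQLite route scales, using the same illustrative table and column names as above: you can pull only the columns you need, one chunk at a time, instead of building a whole data frame in memory.]

```python
# Sketch: stream rows out of SQLite in chunks so the full result set never
# has to fit in memory. Table, column and filter values are illustrative.
import sqlite3

con = sqlite3.connect("survey.db")
cur = con.execute("SELECT age, region FROM survey WHERE region = ?", ("north",))

while True:
    chunk = cur.fetchmany(10_000)       # 10,000 rows per round trip
    if not chunk:
        break
    # ... process the chunk here ...

con.close()
```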
 
Re: dates, yep, that's always a problem (Oct 14, 1582 vs. Jan 1, 1970 is quite a gap). Converting them to strings, to numbers in ISO order (yyyymmdd), or to Julian dates seems easiest.

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
