Clementine - run stream on varying number of input files

Clementine - run stream on varying number of input files

Albert-Jan Roskam
Hi,

I'm using Clementine v11 to process data. I have a generic stream that needs to be run on all incoming csv files. The number of incoming files varies. Is there a way to loop over all files, similar to file globbing in Python?

I've recently started using Clementine. Actually, I was thinking/hoping that Python would be the "oil" of the program, but instead they use CLEMB. Can I still somehow invoke Python to automate certain tasks? I know that Python is used for the bulkloader in Clementine, so perhaps there are more possibilities.

Thanks!

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you criticize someone, walk a mile in their shoes, that way
when you do criticize them, you're a mile away and you have their shoes!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Clementine - run stream on varying number of input files

Edwin Meijdam
Hi Albert-Jan,

The key here is not automating Clementine from the inside (although this is possible!), but automating it from the outside. The Clementine stream files (.str) are zipped XML files, and these XML files are not hard to adjust. I used this method (in C#) to automatically convert SPSS and SAS file definitions to Clementine input nodes, which eliminated typing in fixed file definitions by hand. I'm sorry that I do not have an example at this moment, but I'm sure that if you unzip the .str files (just rename them to .zip) you will see what I mean.
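A quick way to look inside a stream file without renaming it is Python's standard zipfile module. The following is only a minimal sketch: the function name list_stream_members is made up for illustration, and the names of the files stored inside a real .str archive are not given in this thread, so the code simply reports whatever it finds.

```python
import zipfile


def list_stream_members(stream_path):
    """Return the names of the files stored inside a .str archive.

    Assumes, as described above, that the stream file is an ordinary
    zip archive; raises ValueError if it is not.
    """
    if not zipfile.is_zipfile(stream_path):
        raise ValueError(f"{stream_path} is not a zip archive")
    with zipfile.ZipFile(stream_path) as archive:
        return archive.namelist()
```

Calling list_stream_members on one of your own streams should show which member holds the XML that would need to be adjusted.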

Hope this helps,

Edwin Meijdam


On Mon, Oct 26, 2009 at 4:03 PM, Albert-Jan Roskam <[hidden email]> wrote:
> [...]



--
Edwin Meijdam
[hidden email]
http://www.linkedin.com/in/emeijdam



Re: Clementine - run stream on varying number of input files

Albert-Jan Roskam
Thanks!

Very useful to know that the .str files are simply zipped .xml files.
I could use Python to make a list of all files in a given directory, loop over them and unzip each file, parse its XML, modify the relevant elements, and have Clementine run the resulting modified .str files.
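One reading of this plan, sketched in Python under several loudly stated assumptions: there is a single template stream whose XML contains a literal placeholder path for the input file; the XML members end in .xml and are UTF-8 encoded; and a plain text replacement is enough (no XML escaping of special characters in the path is attempted). None of this is confirmed by the thread, and the names make_stream_for_csv, make_streams and the placeholder are hypothetical. Running the generated streams in Clementine is not covered.

```python
import glob
import os
import zipfile


def make_stream_for_csv(template_path, placeholder, csv_path, out_path):
    """Copy a template .str archive, pointing it at csv_path.

    Assumption: the input file path appears literally as `placeholder`
    inside the archive's .xml members, which are UTF-8 text.
    """
    with zipfile.ZipFile(template_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.lower().endswith(".xml"):
                text = data.decode("utf-8")
                data = text.replace(placeholder, csv_path).encode("utf-8")
            # Non-XML members are copied through unchanged.
            dst.writestr(info.filename, data)


def make_streams(template_path, placeholder, csv_dir, out_dir):
    """Create one modified stream per CSV file found in csv_dir."""
    os.makedirs(out_dir, exist_ok=True)
    created = []
    for csv_path in sorted(glob.glob(os.path.join(csv_dir, "*.csv"))):
        base = os.path.splitext(os.path.basename(csv_path))[0]
        out_path = os.path.join(out_dir, base + ".str")
        make_stream_for_csv(template_path, placeholder, csv_path, out_path)
        created.append(out_path)
    return created
```

Text replacement is used here instead of a real XML parser only because the element names inside a stream file are unknown; once the structure has been inspected, editing the relevant element with xml.etree.ElementTree would be the more robust choice.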

But it's quite strange that such a task isn't possible in Clementine as it is. I've only been using Clementine for a short while, but the version I'm using still contains quite a few peculiarities, bugs and other 'this could be better' kinds of things: simple copy-paste actions that don't work, maximize-screen buttons that are missing, problems reading csv files, no way to read (multi-sheet) xls files, etc.

Cheers!!
Albert-Jan



--- On Mon, 10/26/09, Edwin Meijdam <[hidden email]> wrote:

> From: Edwin Meijdam <[hidden email]>
> Subject: Re: Clementine - run stream on varying number of input files
> To: "Albert-Jan Roskam" <[hidden email]>
> Cc: [hidden email]
> Date: Monday, October 26, 2009, 9:43 PM
> [...]
