Request to register for Geostatistics and Geoprocessing Twikis & using WPSR for multiple concurrent connections to RServe ...


cuthillt

Hi,

 

My name is Tom Cuthill and I work for Landcare Research, a crown corporation, in New Zealand. We have developed a WPS with the 52 degrees North infrastructure and we are very happy with the results. I am just embarking on a new project which requires web access to R scripts, and I’m wondering about the suitability of using WPSR. I’ve already built a prototype standalone servlet (not using the 52 degrees North infrastructure) which successfully opens multiple simultaneous connections to RServe to run Monte Carlo simulations, merges the results, and serves back an .Rdata object so that the processing can be called remotely from the R environment. I’m wondering if using WPSR might be a more elegant approach.

 

Would I be able to register for the Geostatistics and Geoprocessing Services Web Twiki and get access to the WPSR source code? If I could also ask two quick questions, it would be a great help.

 

1. Can the call to the R script be configured for multiple connections? Our need for parallel processing on the server is paramount.

2. Can the results coming back from RServe be captured in the middleware (a bit like when you extend AbstractAlgorithm) so that further processing can be done before they are sent back to the client?

 

Thanks for your help!

Tom Cuthill

 

PS. My email address is [hidden email]




Please consider the environment before printing this email
Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails.
The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz

_______________________________________________
Geoprocessingservices mailing list
[hidden email]
http://list.52north.org/mailman/listinfo/geoprocessingservices
http://geoprocessing.forum.52north.org
Please respect our mailing list guidelines:
http://52north.org/resources/mailing-lists-and-forums/guidelines

Re: Request to register for Geostatistics and Geoprocessing Twikis & using WPSR for multiple concurrent connections to RServe ...

Daniel
Hi Tom!

Thanks for your interest in the 52°North WPS framework, or even better:
for being an active user already :-).

On 16/08/2015 at 23:30, Tom Cuthill wrote:

> My name is Tom Cuthill and I work for Landcare Research, a crown
> corporation, in New Zealand.  We have developed a WPS with the 52
> degrees North infrastructure and we are very happy with the results.  I
> am just embarking on a new project  which requires web access to R
> scripts and I’m wondering about the suitability of using WPSR.  I’ve
> already built a prototype standalone servlet (not using the 52 degrees
> North infrastructure) which successfully runs multiple simultaneous
> connections to RServe to run Monte Carlo simulations and then merge the
> results and to serve back an .Rdata object so that the processing can be
> called remotely from the R environment.  I’m wondering if using WPSR
> might be a more elegant approach.

It depends. If your solution works, I (despite being the author of a lot
of the WPS4R code) wouldn't recommend changing it :-).

A primary motivation for WPS4R is to enable scientists who work with R
to easily publish their analyses as WPS processes. Since you are a
software developer yourself and can implement a server component that
executes R on a server, you fall outside that category.

But I can certainly see a benefit if you already have a 52N WPS in your
organisation.

Would your app actually profit from being exposed through the
standardized interface?

> Would I be able to register to the Geostatistics and geoprocessing
> services Web Twiki and to get access to the WPSR source code?  If I
> could also ask two quick questions it would be a  great help.

WPS4R is part of the WPS codebase, in the module 52n-wps-r; see
https://github.com/52North/WPS/tree/dev/52n-wps-r

> 1. Can the call to the R script be configured for multiple connections?
> Our need for parallel processing on the server is paramount.

Do you mean multiple connections from WPS clients?

WPS4R currently does not have specific code to execute multiple parallel
processes, but we would certainly be interested to evaluate and test
this further in collaboration with you.

The only limitation that I am aware of is that Rserve supports only
one connection on Windows machines [1], so we currently deploy WPS4R
only on Linux-based infrastructures and leave the handling of
multiple parallel requests to Rserve.

> 2. Can the results coming back from RServe be captured in the middleware
> (a bit like it is when you extend AbstractAlgorithm) so that further
> processing can be done before it gets sent back to the client?

No, not out of the box. The class that executes the R process is
GenericRProcess [2], which is itself an abstract algorithm. You could
extend that class and add your own logic in the run method (i.e.
override the run method, call super.run(), and then do what you need
to do).
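
The "override run, call super.run(), then post-process" idea can be sketched in plain Java. This is only an illustration of the pattern: ScriptProcess is a hypothetical stand-in, not the real GenericRProcess, and its run signature, the output key "realizations", and the added "mean" output are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the framework class that runs the R script.
// It is NOT the 52°North API; it only mimics "run returns named outputs".
class ScriptProcess {
    Map<String, Object> run(Map<String, List<Object>> inputs) {
        Map<String, Object> outputs = new HashMap<>();
        // Pretend these values came back from R.
        outputs.put("realizations", List.of(1.0, 2.0, 3.0));
        return outputs;
    }
}

// Subclass that post-processes the outputs before they are returned.
class PostProcessingProcess extends ScriptProcess {
    @Override
    Map<String, Object> run(Map<String, List<Object>> inputs) {
        // 1. Let the parent run the script as usual.
        Map<String, Object> outputs = new HashMap<>(super.run(inputs));

        // 2. Post-process in the middleware: here, add the mean of the values.
        @SuppressWarnings("unchecked")
        List<Double> values = (List<Double>) outputs.get("realizations");
        double mean = values.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN);
        outputs.put("mean", mean);

        // 3. Hand the enriched outputs back to the caller.
        return outputs;
    }
}

class PostProcessingDemo {
    public static void main(String[] args) {
        Map<String, Object> out = new PostProcessingProcess().run(new HashMap<>());
        System.out.println("mean = " + out.get("mean"));
    }
}
```

In a real deployment the subclass would extend GenericRProcess and use whatever run signature and data types that class actually declares.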

Can you further explain what kind of processing you need to do? Can you
not do it in R?

An alternative would be to use a process chain, i.e. do the
post-processing in a separate (Java?) process.


Hope this helps,
Daniel

[1] http://www.rforge.net/Rserve/faq.html#platform
[2]
https://github.com/52North/WPS/blob/dev/52n-wps-r/src/main/java/org/n52/wps/server/r/GenericRProcess.java


--
Daniel Nüst
52°North Initiative for Geospatial Open Source Software GmbH
Martin-Luther-King-Weg 24
48155 Münster, Germany
E-Mail: [hidden email]
Fon: +49-(0)-251–396371-36
Fax: +49-(0)-251–396371-11

http://52north.org/
Twitter: @FiveTwoN

General Managers: Dr. Albert Remke, Dr. Andreas Wytzisk
Local Court Muenster HRB 10849

Re: Request to register for Geostatistics and Geoprocessing Twikis & using WPSR for multiple concurrent connections to RServe ...

cuthillt
Hi Daniel,

Thanks for getting back to me.  My organization has a national soils database which includes uncertainty in all its attributes.  The goal is to allow research scientists to request suites of data (Monte Carlo simulations) which are representative of various soils that they may be interested in.  They use this data in complex models that they are developing in R.

Effectively, we are developing a web service that supplies data which can be loaded back into R as an .Rdata object. Since the Monte Carlo simulations are computationally heavy, we want each web service request to translate into multiple simultaneous connections to RServe to get a performance boost. We also want to manage the interaction in an algorithm, because the URL request parameters identifying soils must be looked up in a database to get their associated uncertainty characteristics, which are then passed through to the RServe connections. The algorithm must then gather the results from the connections and merge them. The algorithm may also need to do some post-processing, which is only possible once all the realizations have been gathered together.
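
A minimal sketch of the fan-out, gather, and merge steps described above, using a plain Java thread pool. Everything here is assumed for illustration: the simulate method is a stand-in for one RServe session (a real worker would open its own connection and run the R code there), the mean and sd parameters stand for the uncertainty characteristics looked up in the database, and the database lookup itself is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MonteCarloFanOut {

    // Stand-in for the work done in one RServe session (hypothetical):
    // draws n realizations from a normal distribution.
    static double[] simulate(double mean, double sd, int n, long seed) {
        Random rng = new Random(seed);
        double[] draws = new double[n];
        for (int i = 0; i < n; i++) {
            draws[i] = mean + sd * rng.nextGaussian();
        }
        return draws;
    }

    // Splits `total` realizations over `workers` parallel tasks,
    // waits for all of them, and merges the parts into one array.
    static double[] run(double mean, double sd, int total, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<double[]>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                // Spread the remainder over the first tasks so sizes sum to `total`.
                int n = total / workers + (w < total % workers ? 1 : 0);
                long seed = 1000L + w; // distinct seed per task
                Callable<double[]> task = () -> simulate(mean, sd, n, seed);
                futures.add(pool.submit(task));
            }

            // Gather in submission order so the merged result is reproducible.
            double[] merged = new double[total];
            int offset = 0;
            for (Future<double[]> future : futures) {
                double[] part = future.get(); // blocks until the task is done
                System.arraycopy(part, 0, merged, offset, part.length);
                offset += part.length;
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] all = run(5.0, 0.5, 1000, 4);
        System.out.println("merged realizations: " + all.length);
    }
}
```

Any post-processing that needs the complete set of realizations would go after the merge loop, once every task has returned.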

Over time the request for the simulations may be made spatially (e.g. for all the soils within a given region).  Also the results may be packaged in different formats; UncertML may be one.

If the 52 degrees North infrastructure supported multiple connections to RServe per WPS request, there could be a definite advantage to users in terms of performance.  This is one of the advantages of using RServe, and it means that the user doesn't have to wrap their code in a package like 'snow' to gain parallelism.  In the R code annotation, the user could specify the number of parallel connections required, and the infrastructure could gather the results in a list which could then be reformatted to XML or some other form in a ParallelRAlgorithm class.
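
To make the annotation idea concrete, here is a small sketch of how a server component might read a requested connection count from an R script. The annotation "# wps.parallel: connections = N;" is purely hypothetical: it is not part of WPS4R, and its name and syntax are assumptions chosen only to resemble a comment-style script annotation.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParallelAnnotation {

    // Matches a hypothetical comment line such as
    //   # wps.parallel: connections = 4;
    // anywhere in the script (one annotation per line, up to 3 digits).
    private static final Pattern PARALLEL = Pattern.compile(
            "^\\s*#\\s*wps\\.parallel\\s*:\\s*connections\\s*=\\s*(\\d{1,3})\\s*;?\\s*$",
            Pattern.MULTILINE);

    // Returns the requested number of connections, or empty if the
    // script carries no such annotation.
    static OptionalInt connections(String script) {
        Matcher m = PARALLEL.matcher(script);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String script = String.join("\n",
                "# wps.parallel: connections = 4;",
                "draws <- rnorm(1000)");
        // Fall back to a single connection when nothing is requested.
        System.out.println("connections: " + connections(script).orElse(1));
    }
}
```

A ParallelRAlgorithm-style class could use the parsed value to size its pool of connections and fall back to one connection when the annotation is absent.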

I will have to talk to my superiors about whether there are the resources to follow this path now, or whether to build a more tailored servlet now and consider a more general approach using the 52 Degrees North infrastructure later.

Thanks again for your help,
Tom


Re: Request to register for Geostatistics and Geoprocessing Twikis & using WPSR for multiple concurrent connections to RServe ...

Daniel
Hi Tom,

On 17/08/2015 at 23:20, cuthillt wrote:
> Thanks for getting back to me.  My organization has a national soils
> database which includes uncertainty in all its attributes.  The goal is to
> allow research scientists to request suites of data (Monte Carlo
> simulations) which are representative of various soils that they may be
> interested in.  They use this data in complex models that they are
> developing in R.

Thanks for the context!

> Effectively we are developing a web service which supplies the data which
> can be loaded as an .Rdata object back in R.  Since the Monte Carlo
> simulations are very computationally heavy we want each web service request
> to translate into multiple simultaneous connections to RServe, to get a
> performance boost.  We also want to be able to manage the interaction in an
> algorithm because the URL request parameters identifying soils must be
> looked up in a database to get their associated uncertainty characteristics,
> which are then passed through to the RServe connections.  The algorithm must
> then gather the results from the connections and merge them.  It is also
> foreseeable that the algorithm does some post processing (which is only
> possible when all the realizations are gathered together).
>
> Over time the request for the simulations may be made spatially (e.g. for all
> the soils within a given region).  Also the results may be packaged in
> different formats; UncertML may be one.

Those are features currently not included in WPS4R.

> If the 52 degrees north infrastructure supported multiple connections to
> RServe per WPS request there could be a definite advantage to users in terms
> of performance.  This is one of the advantages of using RServe and it means
> that the user doesn't have to wrap their code in a package like 'snow' to gain
> parallelism.  In the R code annotation, the user could specify the number of
> parallel connections required and the infrastructure could gather the
> results in a list which could then be reformatted to XML or some other form
> in a ParallelRAlgorithm class.

Have you evaluated starting parallel processes within R?

But wait, I think you have. So you'd rather divide the task around the R
script. Given the data structure you describe (different regions) you
would do the splitting up and merging within Java, correct?

> I will have to talk to my superiors if there are the resources to follow
> this path now, or to come up with a more tailored servlet now, and to
> consider a more general approach using the 52 Degrees North infrastructure
> later.

Looking forward to hearing about your decision. Let us know if we can help
in any way.

/Daniel
