MESSAGE
DATE | 2020-08-05 |
FROM | James Smith
SUBJECT | Re: [Hangout - NYLXS] Question about deployment of math computing
Wesley,
You will have seen my posts elsewhere - we work on terabyte/petabyte-scale datasets (and these aren't a large number of large records but rather a very, very large number of small records), so both the memory use and the response times are large - compute less so in some cases, but not in others.
The services which use apache/mod_perl work reliably and return data for these. The dancer/starman ones sometimes fail or hang, either because there are no backends free to serve the requests or because the requests to those backends time out at the nginx/proxy (while the backends still continue using resources). The team running the backends fails to notice this because there is no easy-to-see reporting on these boxes.
We also have other services which return large amounts of data computed on the fly, and the response time for these can be multiple hours - but by carefully streaming the data in apache we can get the data to return. A similar option isn't available in dancer (or wasn't at the time) to handle these sorts of requests, so similar code was impossible.
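The streaming idea above can be illustrated outside Apache. The following is only a minimal sketch using Python's WSGI interface, not the mod_perl code referred to in this message (which is not shown here). The names `rows` and `app` are hypothetical, and `rows` stands in for a slow on-the-fly computation. The point is that headers go out immediately and the body is handed to the server as an iterator, so records can be sent as they are produced rather than after the whole result has been built in memory.

```python
def rows(count=3):
    """Hypothetical stand-in for a slow computation that yields records."""
    for i in range(count):
        yield f"record {i}\n"


def app(environ, start_response):
    """WSGI application that streams its body instead of buffering it."""
    # Headers are sent before any record has been computed.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # Returning a generator lets the server write each chunk as it arrives.
    return (line.encode("utf-8") for line in rows())
```

Whether chunks actually reach the client promptly still depends on the server and on any proxy in front of it, since either may buffer the response.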
In most cases starman hasn't really been the answer and apache works sufficiently well. Even where people would use nginx, we are often now using one of the alternative apache workers (mpm_event), which seems to be better/more reliable than nginx, and this means we don't have to have completely different configuration setups for some of our proxies, static servers and dynamic content servers.
The good thing about Apache is its dynamic rescaling, which isn't as easy with starman. If you have a large code base, the spin-up time for starman can be quite large, as it appears (to make it efficient) to load in every bit of code that the application needs - even if that code only serves one of those small edge cases.
So yes, use starman for simple apps if you need to, but for complex stuff I find a mod_perl setup more reliable.
James
-----Original Message-----
From: Wesley Peng
Sent: 05 August 2020 04:31
To: dcook-at-prosentient.com.au; modperl-at-perl.apache.org
Subject: Re: Question about deployment of math computing [EXT]
Hi
dcook-at-prosentient.com.au wrote:
> That's interesting. After re-reading your earlier email, I think that I misunderstood what you were saying.
>
> Since this is a mod_perl listserv, I imagine that the advice will always be to use mod_perl rather than starman?
>
> Personally, I'd say either option would be fine. In my experience, the key advantage of mod_perl or starman (say over CGI) is that you can pre-load libraries into memory at web server startup time, and that processes are persistent (although they do have limited lifetimes of course).
>
> You could use a framework like Catalyst or Mojolicious (note Dancer is another framework, but I haven't worked with it) which can support different web servers, and then try the different options to see what suits you best.
>
> One thing to note would be that usually people put a reverse proxy in front of starman like Apache or Nginx (partially for serving static assets but other reasons as well). Your stack could be less complicated if you just went the mod_perl/Apache route.
>
> That said, what OS are you planning to use? It's worth checking if mod_perl is easily available in your target OS's package repositories. I think Red Hat dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in it. Something to think about.
We use Ubuntu 16.04 and 18.04.
We do use dancer/starman in our production environment, but that service only handles lightweight API requests, for example a RESTful API for data validation.
Our math computing, however, is a heavyweight service where each request takes a long time to finish, so I wonder whether it should be deployed in dancer?
The webserver behind dancer is starman by default; starman is event driven, uses very few processes, and its processes can't scale up/down automatically.
We deploy starman with 5 processes by default. When 5 requests come in, all 5 starman processes are busy computing them, so the next request will be blocked. Is that right?
But apache mp works in the prefork way; generally it can have as many as thousands of processes if resources permit, and its process management can scale the children up/down automatically.
So my real question is this: for a CPU-consuming service, an event-driven server like starman has no advantage over a preforked server like Apache.
Am I right?
Thanks.
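The five-process scenario asked about above can be checked with a small model. This is only a sketch under stated assumptions: each worker process handles exactly one CPU-bound request at a time, waiting requests are served in arrival order, and every request needs the same service time. The function name `schedule` and the 60-second figure are hypothetical and do not come from the thread; nothing here calls starman or Apache.

```python
import heapq


def schedule(arrivals, service_time, workers):
    """Return (start, finish) times for requests on a fixed-size worker pool.

    Assumes one blocking request per worker and first-come-first-served
    queueing when every worker is busy.
    """
    free_at = [0.0] * workers  # time at which each worker next becomes idle
    heapq.heapify(free_at)
    result = []
    for arrival in sorted(arrivals):
        earliest = heapq.heappop(free_at)  # worker that frees up soonest
        start = max(arrival, earliest)     # wait if no worker is idle yet
        finish = start + service_time
        heapq.heappush(free_at, finish)
        result.append((start, finish))
    return result


if __name__ == "__main__":
    # Hypothetical load: 6 requests arrive together, each needing 60 s of CPU.
    for i, (start, finish) in enumerate(schedule([0.0] * 6, 60.0, 5), 1):
        print(f"request {i}: starts at {start:.0f}s, done at {finish:.0f}s")
```

With 5 workers the sixth request cannot start until 60 s, when a worker frees up; with 6 workers it starts immediately. In this model, a fixed pool smaller than the number of concurrent long requests makes later requests queue regardless of which server manages the pool.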
--
The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
_______________________________________________
Hangout mailing list
Hangout-at-nylxs.com
http://lists.mrbrklyn.com/mailman/listinfo/hangout