MESSAGE
DATE | 2020-12-23 |
FROM | Mithun Bhattacharya
|
SUBJECT | Re: [Hangout - NYLXS] Confused about two development utils [EXT]
|
From: Mithun Bhattacharya
Date: Wed, 23 Dec 2020 18:05:54 -0600
To: mod_perl list
Subject: Re: [Hangout - NYLXS] Confused about two development utils [EXT]
James, would you be able to share more info about your setup?

1. What exactly is your application doing that requires so much memory and CPU? Is it something like gene splicing? (No, I don't know much about it beyond Jurassic Park :D)
2. Do you feel Perl was the best choice for whatever you are doing, and if yes, why? How much of your stuff is using mod_perl, considering you mentioned not much is web related?
3. What are the challenges you are currently facing with your implementation?
On Wed, Dec 23, 2020 at 6:58 AM James Smith wrote:
> Oh but memory is a problem - but not if you have just a small cluster of machines!
>
> Our boxes are larger than that - but they all run virtual machines {only a small proportion web related} - machines/memory would rapidly become in our data centre - we run VMWARE [995 hosts] and openstack [10,000s of hosts] + a selection of large memory machines {measured in TBs of memory per machine}.
>
> We would be looking at somewhere between 0.5 PB and 1 PB of memory - not just the price of buying that amount of memory - for many machines we need the fastest memory money can buy for the workload, but we would need a lot more CPUs than we currently have as we would need a larger number of machines to have 64GB virtual machines {we would get 2 VMs per host}. We currently have approx. 1-2000 CPUs running our hardware (last time I had a figure) - it would probably need to go to approximately 5-10,000!
>
> It is not just the initial outlay but the environmental and financial cost of running that number of machines, and finding space to run them without putting the cooling costs through the roof!! That is without considering what additional constraints on storage having the extra machines may have (at the last count a year ago we had over 30 PBytes of storage on site - and a large amount of offsite backup).
>
> We would also stretch the amount of power we can get from the national grid to power it all - we currently have 3 feeds from different parts of the national grid (we are fortunately in a position where this is possible) and the dedicated link we would need to add more power would be at least 50 miles long!
>
> So - managing cores/memory is vitally important to us - moving to the cloud is an option we are looking at - but that is more than 4 times the price of our onsite set-up (with substantial discounts from AWS) and would require an upgrade of our existing link to the internet - which is currently 40Gbit of data (I think).
>
> Currently we are analysing very large amounts of data directly linked to the current major world problem - this is why the UK is currently being isolated as we have discovered and can track a new strain, in near real time - other countries have no ability to do this - we in a day can and do handle, sequence and analyse more samples than the whole of France has sequenced since February. We probably don't have more of the new variant strain than in other areas of the world - it is just that we know we have because of the amount of sequencing and analysis that we in the UK have done.
>
> From: Matthias Peng
> Sent: 23 December 2020 12:02
> To: mod_perl list
> Subject: Re: Confused about two development utils [EXT]
>
> Today memory is not a serious problem; each of our servers has 64GB memory.
>
> Forgot to add - so our FCGI servers need a lot (and I mean a lot) more memory than the mod_perl servers to serve the same level of content (just in case memory blows up with FCGI backends)
>
> -----Original Message-----
> From: James Smith
> Sent: 23 December 2020 11:34
> To: André Warnier (tomcat/perl); modperl-at-perl.apache.org
> Subject: RE: Confused about two development utils [EXT]
>
> > This costs memory, and all the more since many perl modules are not thread-safe, so if you use them in your code, at this moment the only safe way to do it is to use the Apache httpd prefork model. This means that each Apache httpd child process has its own copy of the perl interpreter, which means that the memory used by this embedded perl interpreter has to be counted n times (as many times as there are Apache httpd child processes running at any one time).
>
> This isn't quite true - if you load modules before the process forks then they can cleverly share the same parts of memory. It is useful to be able to "pre-load" core functionality which is used across all functions {this is the case in Linux anyway}. It also speeds up child process generation as the modules are already in memory and converted to byte code.
>
> One of the great advantages of mod_perl is Apache2::SizeLimit which can blow away large child processes - and then if needed create new ones. This is not the case with some of the FCGI solutions as the individual processes can grow if there is a memory leak or a request that retrieves a large amount of content (even if not served), but perl can't give the memory back. So FCGI processes only get bigger and bigger and eventually blow up memory (or hit swap first)
>
> --
> The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
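
The two mod_perl points quoted above (pre-loading modules before the fork, and using Apache2::SizeLimit to retire oversized children) could look roughly like the following httpd.conf fragment. This is only a sketch under stated assumptions: it assumes a prefork mod_perl 2 server, the pre-loaded modules (POSIX, Data::Dumper) are placeholder choices, and the thresholds are illustrative numbers rather than anything from the thread.

```apache
# httpd.conf fragment for a prefork mod_perl 2 server (illustrative only)

# Pre-load modules in the parent process, before any children are forked,
# so the children can share those memory pages copy-on-write.
# POSIX and Data::Dumper are placeholder module choices.
PerlModule POSIX
PerlModule Data::Dumper

# Load the size limiter and set thresholds. Values are in KB and are
# placeholders, not recommendations.
PerlModule Apache2::SizeLimit
<Perl>
    Apache2::SizeLimit->set_max_process_size(150_000);
    Apache2::SizeLimit->set_max_unshared_size(120_000);
</Perl>

# Check the process size after each request; an oversized child exits
# and Apache forks a replacement when one is needed.
PerlCleanupHandler Apache2::SizeLimit
```

Because the check runs in the cleanup phase, a child is only retired after it has finished serving its current request, which is the behaviour contrasted above with FCGI backends that keep growing.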
_______________________________________________ Hangout mailing list Hangout-at-nylxs.com http://lists.mrbrklyn.com/mailman/listinfo/hangout
|
|