NYLXS Mailing Lists and Archives
NYLXS Members have a lot to say and share but we don't keep many secrets. Join the Hangout Mailing List and say your piece.

DATE 2007-11-01

HANGOUT

MESSAGE
DATE 2007-11-17
FROM Ruben Safir
SUBJECT Re: [NYLXS - HANGOUT] Website Updates
On Sat, Nov 17, 2007 at 08:56:33PM -0500, Ron Guerin wrote:
> Ruben Safir wrote:
>
> > Anyway, even that code I wrote is now running 18 hours plus and
> > still parsing mail. It makes me appreciate what the boys working
> > for Wall Street go through. My regex must be chewing up too much
> > CPU power. It's using 99% of the CPU to do this and still running
> > and it has parsed a little over 30,000 messages.
> >
> > m/^From\s+[-.=\w]+\-at-[-.\w]+\.\w{2,3}\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d\d\d\d/
> >
> > is the From line regex. And it is still missing some From Headers.
> >
> > Perhaps I should reduce this to a more generalized format such as
> >
> > m/^From\s+\w.*\-at-\w.*\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d\d\d\d/
> >
> > or even
> >
> > m/^From\s+\w.*\-at-\w.*\s+\w{3}\s+\w.*\s+\d+\s+\d.*\s+\d.*/
> >
> >
> > Would that make it run faster?
>
> If you can optimize those regexes, it can only help. But also make sure
> you're not looking at anything but the *headers*, because you neither
> want to parse the message body looking for headers, nor do you want to
> treat something that looks like a From: header like a header if it's in
> the message body.
>

Well, two things. First, you can only identify the headers if you identify them
as headers. That's what the regex does, so I don't see how you can cut it out
of a search on the body. Even the binaries need to be searched until you reach the
end of the content type marker.
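
One way to reconcile the two points is to let a cheap prefix test decide which
lines the regex ever sees, and to track whether the parser is inside a header
block or a body. What follows is a minimal sketch in Python rather than the Perl
used in this thread. The names FROM_LINE and split_mbox are made up for
illustration, the pattern is a loosened variant of the From-line regex quoted
above, and it assumes a separator line only counts at the start of the file or
directly after a blank line.

```python
import re

# Loosened variant of the From-line pattern from the thread (an assumption,
# not the original). "-at-" is how this archive writes the "@" in addresses.
FROM_LINE = re.compile(
    r"^From\s+\S+-at-\S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d{4}"
)


def split_mbox(lines):
    """Yield (header_lines, body_lines) for each message in an mbox-style stream."""
    headers, body = None, None
    in_headers = False
    prev_blank = True  # the start of the file counts as "after a blank line"
    for raw in lines:
        line = raw.rstrip("\n")
        # The regex runs only when the two cheap checks before it succeed,
        # so ordinary body lines never reach it.
        if prev_blank and line.startswith("From ") and FROM_LINE.match(line):
            if headers is not None:
                yield headers, body
            headers, body, in_headers = [line], [], True
        elif headers is None:
            pass  # ignore anything before the first separator
        elif in_headers:
            if line == "":
                in_headers = False  # the first blank line ends the header block
            else:
                headers.append(line)
        else:
            body.append(line)
        prev_blank = line == ""
    if headers is not None:
        yield headers, body
```

With this shape, a body line that merely begins with "From " is kept as body
text unless it follows a blank line and also matches the full pattern.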

That's the whole problem, right? Grep through the file and find From lines.


Then the body itself needs to be captured and entered into the database.

Maybe there is a magic way around this if the header tells me how many lines of
content there are. Then I can gobble up the content without viewing
the individual lines.
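
Some mail software does record a line count in a "Lines:" header, though many
messages lack it and the value can be stale, so it cannot be trusted blindly.
The Python sketch below (read_message is a hypothetical helper, not code from
this thread) slices the body out in one step when the count checks out, and
falls back to scanning when it does not. It assumes the mailbox is already
loaded as a list of lines without trailing newlines, and it does not handle
folded (continued) header lines.

```python
def read_message(lines, start):
    """Read one message whose "From " separator is at lines[start].

    Returns (headers, body_lines, next_index).
    """
    i = start + 1
    headers = {}
    while i < len(lines) and lines[i] != "":
        name, _, value = lines[i].partition(":")
        headers[name.strip().lower()] = value.strip()
        i += 1
    i += 1  # step over the blank line that ends the headers

    # Use the advertised line count only if it parses as an integer.
    try:
        end = i + int(headers.get("lines", ""))
    except ValueError:
        end = -1

    # Trust the count only if it lands on end-of-file or the next separator.
    trusted = i <= end <= len(lines) and (
        end == len(lines) or lines[end].startswith("From ")
    )
    if not trusted:
        # Fallback: simplified scan for the next line starting with "From ".
        end = i
        while end < len(lines) and not lines[end].startswith("From "):
            end += 1
    return headers, lines[i:end], end
```

When the count is trusted, body lines are never inspected individually, so a
body line that happens to start with "From " is not mistaken for a separator.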

> So if you cut back on the data you regex against, that's probably going
> to help as much, if not more than anything else, because the headers are
> probably a relatively small percentage of your data.
>

I'm open to suggestions. Meanwhile, I just noticed that the message body is being doubled, so I
need to look at the code again in the morning when I get home from work.


Ruben
--
http://www.mrbrklyn.com - Interesting Stuff
http://www.nylxs.com - Leadership Development in Free Software

So many immigrant groups have swept through our town that Brooklyn, like Atlantis, reaches mythological proportions in the mind of the world - RI Safir 1998

http://fairuse.nylxs.com DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002

"Yeah - I write Free Software...so SUE ME"

"The tremendous problem we face is that we are becoming sharecroppers to our own cultural heritage -- we need the ability to participate in our own society."

"> I'm an engineer. I choose the best tool for the job, politics be damned.<
You must be a stupid engineer then, because politics and technology have been attached at the hip since the 1st dynasty in Ancient Egypt. I guess you missed that one."

© Copyright for the Digital Millennium

  1. 2007-11-01 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Multimedia Organizations to invite to Freedom-IT
  2. 2007-11-02 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] New Classes and Space
  3. 2007-11-02 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Audio Project Resource from Upstate NY
  4. 2007-11-02 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [Freedom-IT] Audio Project Resource from Upstate NY
  5. 2007-11-03 Elfen Magix <elfen_magix-at-yahoo.com> Subject: [NYLXS - HANGOUT] Dead Suse Mouse
  6. 2007-11-03 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Dead Suse Mouse
  7. 2007-11-03 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Dead Suse Mouse
  8. 2007-11-03 Kevin Mark <kevin.mark-at-verizon.net> Re: [NYLXS - HANGOUT] Dead Suse Mouse
  9. 2007-11-05 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] google phones
  10. 2007-11-05 Elfen Magix <elfen_magix-at-yahoo.com> Re: [NYLXS - HANGOUT] Dead Suse Mouse
  11. 2007-11-06 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Narcotic Ordering
  12. 2007-11-07 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Narcotic Ordering
  13. 2007-11-08 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Narcotic Ordering
  14. 2007-11-09 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] NYLXS Board Meeting
  15. 2007-11-09 swd <sderrick-at-optonline.net> Subject: [NYLXS - HANGOUT] Can I "fool" a browser into thinking I am not in America?
  16. 2007-11-09 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Can I "fool" a browser into thinking I am not in America?
  17. 2007-11-09 Paul Robert Marino <prmarino1-at-gmail.com> Re: [NYLXS - HANGOUT] Can I "fool" a browser into thinking I am not in America?
  18. 2007-11-09 email <ray-pub-at-rcn.com> Re: [NYLXS - HANGOUT] Can I "fool" a browser into thinking I am not
  19. 2007-11-10 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: Journalist inquiry about Linux and Google
  20. 2007-11-11 Kevin Mark <kevin.mark-at-verizon.net> Re: [NYLXS - HANGOUT] Re: Journalist inquiry about Linux and Google
  21. 2007-11-12 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] NYLXS and Freedom-it planning meeting for tomorrow
  22. 2007-11-13 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] [nyc-at-workatjelly.com: Jelly this Friday (11/16) in Manhattan!]
  23. 2007-11-13 Amy Coleman <acoleman-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] NYLXS and Freedom-it planning meeting for
  24. 2007-11-13 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] NYLXS and Freedom-it planning meeting for tomorrow
  25. 2007-11-13 Ron Guerin <ron-at-vnetworx.net> Subject: [NYLXS - HANGOUT] [Fwd: [nylug-announce] TOMORROW! WEDNESDAY: NYLUG presents James
  26. 2007-11-13 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] [Fwd: [nylug-announce] TOMORROW! WEDNESDAY: NYLUG presents James Vasile on GPL3, FOSS Legal Primer & The Interaction of Licenses & Communities]
  27. 2007-11-13 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] NYLUG MEeting Tomorrow
  28. 2007-11-14 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] [Fwd: [nylug-announce] TOMORROW! WEDNESDAY:
  29. 2007-11-14 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] [Fwd: [nylug-announce] TOMORROW! WEDNESDAY: NYLUG presents James Vasile on GPL3, FOSS Legal Primer & The Interaction of Licenses & Communities]
  30. 2007-11-14 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] NYLUG Meeting Tonight
  31. 2007-11-15 From: "Michael L. Richardson" <mlr52-at-michaellrichardson.com> Subject: [NYLXS - HANGOUT] Resluts of GNU/Linux Demo
  32. 2007-11-15 email <ray-pub-at-rcn.com> Subject: [NYLXS - HANGOUT] VPN Issue
  33. 2007-11-15 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Website Updates
  34. 2007-11-15 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Freedom-IT: GNU/Linux Multimedia Conference
  35. 2007-11-15 Kevin Mark <kevin.mark-at-verizon.net> Re: [NYLXS - HANGOUT] Website Updates
  36. 2007-11-16 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  37. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  38. 2007-11-16 Kevin Mark <kevin.mark-at-verizon.net> Re: [NYLXS - HANGOUT] Website Updates
  39. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  40. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  41. 2007-11-16 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  42. 2007-11-16 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  43. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  44. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [nylug-talk] Getting rid of old computer bits?
  45. 2007-11-17 Kevin Mark <kevin.mark-at-verizon.net> Re: [NYLXS - HANGOUT] Website Updates
  46. 2007-11-17 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  47. 2007-11-17 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Ripping DVD's
  48. 2007-11-17 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] [nbs-at-sonic.net: [vox] Freedomware Gamefest 2007]
  49. 2007-11-17 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  50. 2007-11-17 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  51. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  52. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  53. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  54. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  55. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  56. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  57. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  58. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  59. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  60. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  61. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  62. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  63. 2007-11-18 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  64. 2007-11-18 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  65. 2007-11-20 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Website Updates
  66. 2007-11-21 Elfen Magix <elfen_magix-at-yahoo.com> Subject: [NYLXS - HANGOUT] Sorry (WAS: Dead CPU)
  67. 2007-11-21 Elfen Magix <elfen_magix-at-yahoo.com> Dead CPU (WAS: Re: [NYLXS - HANGOUT] Website Updates)
  68. 2007-11-22 From: "Michael L. Richardson" <mlr52-at-michaellrichardson.com> Subject: [NYLXS - HANGOUT] [Fwd: FW: Something cool that Xerox is doing to say Thank you*marvin*]
  69. 2007-11-22 From: "Evan Inker" <eminker-at-gmail.com> Subject: [NYLXS - HANGOUT] FSM newsletter: FSM Newsletter 19 November 2007
  70. 2007-11-22 From: "Evan Inker" <eminker-at-gmail.com> Subject: [NYLXS - HANGOUT] FSM newsletter: FSM Newsletter 19 November 2007
  71. 2007-11-23 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Kernel Update and USB Mouse on SuSE 9.3
  72. 2007-11-23 Ron Guerin <ron-at-vnetworx.net> Re: [NYLXS - HANGOUT] Sorry (WAS: Dead CPU)
  73. 2007-11-27 Ruben Safir <ruben-at-mrbrklyn.com> Re: [NYLXS - HANGOUT] Website Updates
  74. 2007-11-23 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [opensuse] Kernel Update and USB Mouse on SuSE 9.3
  75. 2007-11-23 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [opensuse] Kernel Update and USB Mouse on SuSE 9.3
  76. 2007-11-23 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [opensuse] Kernel Update and USB Mouse on SuSE 9.3
  77. 2007-11-24 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [opensuse] Kernel Update and USB Mouse on SuSE 9.3
  78. 2007-11-24 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [NYLXS - HANGOUT] Re: [opensuse] Kernel Update and USB Mouse on SuSE 9.3
  79. 2007-11-16 Ruben Safir <ruben-at-mrbrklyn.com> Subject: [ruben-at-mrbrklyn.com: Re: [NYLXS - HANGOUT] Website Updates]

NYLXS are Do'ers and the first step of Doing is Joining! Join NYLXS and make a difference in your community today!