MESSAGE
DATE | 2018-02-04 |
FROM | Ruben Safir
|
SUBJECT | Subject: [Hangout - NYLXS] Fwd: Re: [opensuse] Need new source for unix
|
-------- Forwarded Message --------
Message-ID: <5A774985.1030906-at-tlinx.org>
Date: Sun, 04 Feb 2018 09:57:25 -0800
From: L A Walsh
To: Roger Price
CC: opensuse Mailing List
Subject: Re: [opensuse] Need new source for unix utils -- gnu has broken another. (fwd)

Roger Price wrote:
> On Fri, 2 Feb 2018, Linda Walsh wrote:
>
>> I've used grep to search for strings across all my mailboxes for
>> decades. Found out today, it randomly doesn't work based on whether
>> or not the file contains any text that doesn't comply with
>> POSIX. ...
>> I suppose no one else really does a quick search through all their
>> email this way any more. Though is this what you'd expect?
>
> This worried me since I use grep to search through mail archives. 42.3 includes
> grep 2.16 dated 2014-01-01. man grep for 2.16 says under the heading
> ENVIRONMENT VARIABLES:
>
>    POSIXLY_CORRECT
>       If set, grep behaves as POSIX requires; otherwise, grep behaves more like
>       other GNU programs.
>
> The latest grep, 3.1, dated 2017-07-02, contains the same statement.
>
> Do you have POSIXLY_CORRECT set? If not, I would not expect to see grep
> enforcing POSIX.
----
Bingo.
This was what I said -- (I DO NOT have POSIXLY_CORRECT set in my ENV). I pointed this out. I submitted this as a bug against grep only to have it closed because the email sent to me doesn't conform to POSIX! Specifically, see the "***"d statements by Eric Blake, below, and my response to his closing out the bug...

-------- Original Message --------
Subject: Re: bug#30326: grep not searching through a text file (thinking it binary)
Date: Fri, 02 Feb 2018 12:09:23 -0800
From: L A Walsh
CC: 30326-done-at-debbugs.gnu.org, GNU bug control
References: <5A74BC3F.1030401-at-tlinx.org> <2c00563c-9347-c596-4ade-a87bd9262ca1-at-redhat.com>

Eric Blake wrote:
>
> tag 30326 notabug
>
> On 02/02/2018 01:30 PM, L. A. Walsh wrote:
>
>> I've used grep to search through my mbox-format emails for decades, but
>> I've run into a case where it seems to be ignoring a text mailbox
>> because, I guess, it thinks it is "binary"
>
> Yes, that's correct.
>
>> If I used "-Par" it finds it.
>
> Yes, that's also correct.
>
>> It seems that grep believes the file to be binary and ignores it, though
>> "file" calls it "text".
>
> The file is conditionally text. The POSIX definition of a text file is
> one whose lines consist of valid characters in the current locale - but
> note this definition is locale-dependent! So a file that is text under
> one locale may be binary under another. When you are grepping a file
> encoded correctly for the current locale, you get the output you want;
> when you are grepping a file that contains encoding errors for the
> current locale, POSIX says behavior is undefined, so GNU grep warns you
> that the file is binary (in the current locale); and your use of -a
> tells grep to process it anyways. As 'file' reported that your file was
> using non-ISO extended-ASCII, it probably means the file was encoded for
> an 8-bit single-byte locale; and my guess is that you were running grep
> under a UTF-8 locale, and generally, UTF-8 treats 8-bit single-byte
> inputs as encoding errors. Hence the warning that your file is binary,
> under the current locale.
>
> You can also use 'LC_ALL=C grep' to force a locale where EVERY byte is a
> valid character, and thus where you will never encounter encoding errors
> (you may encounter OTHER things that make your file binary, such as
> embedded NULs, but that's a different matter).
>
>*** This behavior is documented and intentional, so I'm closing this as not
>*** a bug in the tracker. However, feel free to add further comments or
>*** questions to the thread.
>
> And perhaps we could tweak the grep diagnostics to clarify whether a
> file is binary because NUL bytes were encountered, vs. a file is binary
> because encoding errors were encountered.
>

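
The locale dependence described in the quoted reply can be illustrated with a small Python sketch. It is not how GNU grep is implemented: Python codecs stand in for locales here (an assumption), with "latin-1" playing the role of a locale in which every byte is a valid character, and the name `classify` is hypothetical. It also separates the two reasons for a "binary" verdict that the reply suggests the diagnostics could distinguish.

```python
def classify(data: bytes, encoding: str) -> str:
    """Say whether data counts as text under the given encoding.

    Illustration only: a Python codec stands in for a locale, and the
    two "binary" reasons mirror the ones named in the quoted reply.
    """
    if b"\x00" in data:
        return "binary: NUL byte"
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return "binary: encoding error"
    return "text"


# A header line written in an 8-bit single-byte encoding:
# in ISO-8859-1 the e-acute of "café" is the single byte 0xE9.
sample = "Subject: café\n".encode("latin-1")

print(classify(sample, "utf-8"))        # 0xE9 is not valid UTF-8 here
print(classify(sample, "latin-1"))      # every byte is a valid character
print(classify(b"a\x00b", "latin-1"))   # NUL is a separate reason
```

The same bytes are "text" under one encoding and "binary" under another, which is the situation the bug report ran into.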
Grep was around long before POSIX, as were most of the unix utils.
Grep was able to find text strings in mboxes without a POSIX definition telling it that it was "broken". I don't want it displaying random binary that throws my terminal into weird modes, which is why I skip binary files. To have grep searching through some mailboxes while skipping others, randomly based on what email happens to be in the box at the time, is hardly a useful utility.
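
The two wishes above (search every mailbox, but never send raw binary to the terminal) are not necessarily in conflict. As a rough sketch only, and not a replacement for grep, the following Python matches on raw bytes so that no encoding is needed, then escapes anything outside printable ASCII before display. The function name `search_bytes` and the sample data are hypothetical.

```python
def search_bytes(data: bytes, pattern: bytes):
    """Yield (line_number, safe_text) for each line containing pattern.

    Matching is done on raw bytes, so the result does not depend on
    any encoding; only the display step converts to text.
    """
    for number, line in enumerate(data.splitlines(), start=1):
        if pattern in line:
            # Undecodable bytes stay visible as \xNN instead of being
            # written to the terminal raw.
            text = line.decode("ascii", errors="backslashreplace")
            # Control characters such as ESC are escaped as well.
            safe = "".join(
                ch if ch.isprintable() else f"\\x{ord(ch):02x}"
                for ch in text
            )
            yield number, safe


# Hypothetical mailbox fragment with an 8-bit byte and an escape sequence.
mbox = b"From a\nSubject: caf\xe9 menu\nBody \x1b[31m caf\xe9\nnothing\n"

for number, text in search_bytes(mbox, b"caf"):
    print(f"{number}:{text}")
```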
I did not ask for POSIXLY_CORRECT -- if you need to have it be POSIXLY Correct, then use the existing var, but grep is now broken -- since POSIX doesn't define "text" files "out in the real world", but only for files that adhere to the POSIX standard.
People don't write emails that adhere to the POSIX standard.
Also, FWIW, grep's manpage doesn't say it is limited to POSIX-only files. Its summary says:
    grep, egrep, fgrep - print lines matching a pattern
which it does not do. It doesn't say "print lines matching a pattern only from POSIX text files".
------END "Original Message" --------
|
|