MESSAGE
DATE | 2017-02-10 |
FROM | ruben safir
|
SUBJECT | Subject: [Learn] Fwd: Alternatives to Syntax Trees
|
From learn-bounces-at-nylxs.com Fri Feb 10 16:50:29 2017 Return-Path: X-Original-To: archive-at-mrbrklyn.com Delivered-To: archive-at-mrbrklyn.com Received: from www.mrbrklyn.com (www.mrbrklyn.com [96.57.23.82]) by mrbrklyn.com (Postfix) with ESMTP id E1804161312; Fri, 10 Feb 2017 16:50:28 -0500 (EST) X-Original-To: learn-at-nylxs.com Delivered-To: learn-at-nylxs.com Received: from [10.0.0.62] (flatbush.mrbrklyn.com [10.0.0.62]) by mrbrklyn.com (Postfix) with ESMTP id 3AB56160E77; Fri, 10 Feb 2017 16:50:26 -0500 (EST) References: <17-01-002-at-comp.compilers> <17-01-003-at-comp.compilers> <17-01-004-at-comp.compilers> <17-01-005-at-comp.compilers> <17-01-006-at-comp.compilers> <17-01-007-at-comp.compilers> <17-01-008-at-comp.compilers> To: "learn-at-nylxs.com" From: ruben safir X-Forwarded-Message-Id: <17-01-002-at-comp.compilers> <17-01-003-at-comp.compilers> <17-01-004-at-comp.compilers> <17-01-005-at-comp.compilers> <17-01-006-at-comp.compilers> <17-01-007-at-comp.compilers> <17-01-008-at-comp.compilers> Message-ID: <419f330e-4d12-b9fe-a425-6a7506c1fe93-at-mrbrklyn.com> Date: Fri, 10 Feb 2017 16:50:26 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 In-Reply-To: <17-01-008-at-comp.compilers> Content-Type: multipart/mixed; boundary="------------1F19142AC4058EC48291CDEF" Subject: [Learn] Fwd: Alternatives to Syntax Trees X-BeenThere: learn-at-nylxs.com X-Mailman-Version: 2.1.17 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: learn-bounces-at-nylxs.com Sender: "Learn"
This is a multi-part message in MIME format. --------------1F19142AC4058EC48291CDEF Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit
compiler without a syntax tree!
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!168.235.88.217.MISMATCH!feeder.erje.net!2.us.feeder.erje.net!newspeer1.nac.net!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Seima Rao Newsgroups: comp.compilers Subject: Alternatives to Syntax Trees Date: Sun, 15 Jan 2017 16:52:15 -0500 (EST) Organization: Compilers Central Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-002-at-comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="97584"; mail-complaints-to="abuse-at-iecc.com" Keywords: question, comment Posted-Date: 15 Jan 2017 16:52:15 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Bytes: 1304 Xref: panix comp.compilers:33918
Hi,
Are there alternatives to syntax trees(i.e. the tree data structure) when compiling via yacc or yacc like tools ?
Sincerely, Seima Rao. [There's quadruples, doubly linked lists of data structures. They make it easier to rearrange code, but harder to do just about anything else. There's also DAGs, which are trees that can have shared subtrees, useful when you're doing common subexpressions or tail merging. -John]
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Re: Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Kaz Kylheku <221-501-9011-at-kylheku.com> Newsgroups: comp.compilers Subject: Re: Alternatives to Syntax Trees Date: Sun, 15 Jan 2017 22:31:45 +0000 (UTC) Organization: Aioe.org NNTP Server Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-003-at-comp.compilers> References: <17-01-002-at-comp.compilers> Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="32457"; mail-complaints-to="abuse-at-iecc.com" Keywords: parse, optimize Posted-Date: 16 Jan 2017 09:21:28 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33919
On 2017-01-15, Seima Rao wrote: > Hi, > > Are there alternatives to syntax trees(i.e. the tree data structure) > when compiling via yacc or yacc like tools ?
If the language you are parsing has a context-free grammar suitable for Yacc, then the parse corresponds to a tree. More precisely, a real-world parse corresponds to a DAG: a tree in which some nodes are shared. For instance if you're parsing Lisp, an input like (a a a) is converted to DAG in which all three occurences of "a" point to the same symbol object. This is because the symbols are "interned" as scanning takes place.
Compilers do not use the tree data structure in all passes. Syntax trees represent only the raw syntax. An intermediate representation may be quite different: a graph containing chunks of instruction sequences.
-- TXR Programming Lanuage: http://nongnu.org/txr Music DIY Mailing List: http://www.kylheku.com/diy ADA MP-1 Mailing List: http://www.kylheku.com/mp1
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Re: Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!168.235.88.217.MISMATCH!feeder.erje.net!2.us.feeder.erje.net!enother.net!enother.net!enother.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: George Neuner Newsgroups: comp.compilers Subject: Re: Alternatives to Syntax Trees Date: Mon, 16 Jan 2017 06:11:27 -0500 Organization: A noiseless patient Spider Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-004-at-comp.compilers> References: <17-01-002-at-comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="32746"; mail-complaints-to="abuse-at-iecc.com" Keywords: parse, optimize Posted-Date: 16 Jan 2017 09:22:14 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33920
On Sun, 15 Jan 2017 16:52:15 -0500 (EST), Seima Rao wrote:
> Are there alternatives to syntax trees(i.e. the tree data structure) > when compiling via yacc or yacc like tools ?
Parser tools don't limit the structure of the intermediate representation.
>[There's quadruples, doubly linked lists of data structures. They >make it easier to rearrange code, but harder to do just about anything >else. There's also DAGs, which are trees that can have shared subtrees, >useful when you're doing common subexpressions or tail merging. -John]
To add a bit to John's excellent response:
You can use a combined representation: e.g., a graph to represent control - function defintions, loops, conditionals, etc. - and within the graph use 3-address form to represent basic blocks (straight line code).
Some analyses, e.g., determining dominance for EBB def/kill chains, require both the basic blocks and the graph of their connectivity.
Also, "quadruples" are one form of 3-address representation. There are other forms of 3-address and you can implement them using either lists or arrays. Quadruples are the most commonly used because - as John mentioned - they are the easiest to rearrange.
George
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Re: Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!bloom-beacon.mit.edu!bloom-beacon.mit.edu!168.235.88.217.MISMATCH!feeder.erje.net!2.us.feeder.erje.net!weretis.net!feeder6.news.weretis.net!feeder.usenetexpress.com!feeder1.iad1.usenetexpress.com!216.166.98.84.MISMATCH!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: rpw3-at-rpw3.org (Rob Warnock) Newsgroups: comp.compilers Subject: Re: Alternatives to Syntax Trees Date: Mon, 16 Jan 2017 12:12:01 -0000 (UTC) Organization: Rob Warnock, Consulting Systems Architect Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-005-at-comp.compilers> References: <17-01-002-at-comp.compilers> Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="33108"; mail-complaints-to="abuse-at-iecc.com" Keywords: parse, optimize, Lisp Posted-Date: 16 Jan 2017 09:23:04 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33921
Seima Rao wrote: +--------------- | Are there alternatives to syntax trees (i.e. the tree data | structure) when compiling via yacc or yacc like tools ? | | [There's quadruples, doubly linked lists of data structures. They | make it easier to rearrange code, but harder to do just about anything | else. There's also DAGs, which are trees that can have shared subtrees, | useful when you're doing common subexpressions or tail merging. -John] +---------------
CMU Common Lisp (CMUCL) [circa 1990ff] used Implicit Continuation Representation (ICR) as its first intermediate representation (a.k.a. "IR1"). The key point w.r.t. Seima's question is that IR1 is "represented as a flow-graph rather than a syntax tree":
https://common-lisp.net/project/cmucl/doc/CMUCL-design.pdf Design of CMU Common Lisp Robert A. MacLachlan (ed) January 15, 2003 ... Chapter 3 Compiler Overview ... Two major intermediate representations are used in the compiler:
* The Implicit Continuation Representation (ICR) represents the lisp-level semantics of the source code during the initial phases. Partial evaluation and semantic analysis are done on this representation. ICR is roughly equivalent to a subset of Common Lisp, but is represented as a flow-graph rather than a syntax tree. Phases which only manipulate ICR comprise the front end. It would be possible to use a different back end such as one that directly generated code for a stack machine.
* The Virtual Machine Representation (VMR) [a.k.a. "IR2"] represents the implementation of the source code on a virtual machine. ...[trimmed]...
Each phase is briefly described here. The phases from "local call analysis" to "constraint propagation" all interact; for maximum optimization, they are generally repeated until nothing new is discovered. ...[trimmed]...
-Rob
----- Rob Warnock 627 26th Avenue San Mateo, CA 94403
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Seima Rao Newsgroups: comp.compilers Subject: Alternatives to Syntax Trees Date: Tue, 17 Jan 2017 14:16:27 +0530 Organization: Compilers Central Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-006-at-comp.compilers> References: <17-01-002-at-comp.compilers> <17-01-005-at-comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="34136"; mail-complaints-to="abuse-at-iecc.com" Keywords: analysis, question Posted-Date: 17 Jan 2017 09:12:29 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33922
Thanks for your responses!
Your responses together with acm.org/dl(of which I am currently not a member of) helped me mark down the todos.
For those of you interested, I have some questions:
Specific
Q: What happens when we balance a syntax tree ? Q: What happens when we use a B-tree (say) ?
You may or may not seek recourse to language-only domains.
(some might want to discuss structural analysis aka Muchnick) (some might want to discuss dbms)
General
Q: Yacc(and likewise tools) make use of a set of rules in helping ,make progress in computing.
Are there approaches other then AI that use rule based computing?
-- Sincerely, Seima Rao. [To the first question, I suppose it might be useful if you're trying to reorganize an expression to fit into a limited set of registers so each subtree is about the same size, but it seems pretty marginal. To the second, that's not a meaningful question. B-Trees are a way of organizing indexed data on disks, not relevant to data structures a compiler might use in memory. -John]
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Re: Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Kaz Kylheku <221-501-9011-at-kylheku.com> Newsgroups: comp.compilers Subject: Re: Alternatives to Syntax Trees Date: Tue, 17 Jan 2017 22:39:06 +0000 (UTC) Organization: Aioe.org NNTP Server Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-007-at-comp.compilers> References: <17-01-002-at-comp.compilers> <17-01-005-at-comp.compilers> <17-01-006-at-comp.compilers> Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="13120"; mail-complaints-to="abuse-at-iecc.com" Keywords: analysis Posted-Date: 17 Jan 2017 17:58:15 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33923
On 2017-01-17, Seima Rao wrote: > Thanks for your responses! > > Your responses together with acm.org/dl(of which I > am currently not a member of) helped me mark > down the todos. > > For those of you interested, I have some questions: > > Specific > > Q: What happens when we balance a syntax tree ?
Something pretty obviously silly; why don't you try a toy example?
The structure of a syntax tree is of utmost significance; it reflects the structure of the grammar rules to which it corresponds.
Balancing operations wreck structure, preserving only the inorder traversal.
If you wreck the structure, you wreck the syntax.
If this isn't clear, you will likely struggle later in the course even though these answers will get you through the current homework round.
--------------1F19142AC4058EC48291CDEF Content-Type: message/rfc822; name="Re: Alternatives to Syntax Trees.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: Alternatives to Syntax Trees.eml"
Path: reader1.panix.com!panix!news.linkpendium.com!news.linkpendium.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: George Neuner Newsgroups: comp.compilers Subject: Re: Alternatives to Syntax Trees Date: Tue, 17 Jan 2017 22:08:24 -0500 Organization: A noiseless patient Spider Sender: news-at-iecc.com Approved: comp.compilers-at-iecc.com Message-ID: <17-01-008-at-comp.compilers> References: <17-01-002-at-comp.compilers> <17-01-005-at-comp.compilers> <17-01-006-at-comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="81370"; mail-complaints-to="abuse-at-iecc.com" Keywords: analysis, comment Posted-Date: 18 Jan 2017 10:14:17 EST X-submission-address: compilers-at-iecc.com X-moderator-address: compilers-request-at-iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: panix comp.compilers:33924
On Tue, 17 Jan 2017 14:16:27 +0530, Seima Rao wrote:
> Q: What happens when we balance a syntax tree ?
Nothing, because you don't ever balance a syntax tree. Doing so would destroy the structure of the program and make it incomprehensible.
Syntax graphs (trees or DAGs) necessarily must reflect the structure of the program. Transformations made to the graph may affect local structure, but the gross (overall) structure must remain.
> Q: What happens when we use a B-tree (say) ?
A syntax graph is *not* a search tree in the conventional sense. A balanced tree would be a very *poor* choice of representation.
When a compiler uses a graph representation, typically the nodes will be heterogenous: there will be unique node types for each different language construct that needs to be represented. Code walking the graph will ignore nodes it does not understand.
> (some might want to discuss structural analysis aka Muchnick)
Structural analyses are the very reasons a syntax tree can't be balanced.
> (some might want to discuss dbms)
What for?
> Q: Yacc(and likewise tools) make use of a set of rules in helping > ,make progress in computing. > > Are there approaches other then AI that use rule based computing?
"AI" is not synonymous with "rules". Rule based decision systems - so called "expert" systems - generally are no longer considered to be "AI". Attempts to model intelligent behavior using rule systems failed miserably. Expert systems are very important and have many uses - they just aren't "AI".
Generally yacc is considered to be finite automata. The (E)BNF "rules" specify the structure of the automata and direct its behavior. I suppose that an argument /could/ be made that is a "decision" system, although I think such an argument would have to severely stretch definitions.
At some level, all forms of computing that are based on Turing's hypothesis can be considered to be "rule based". The rules may be explicit: a list of steps to follow, or implicit in the behavior of the "processor" that executes the steps.
Things which are not rule based are out of the ordinary: analog [continous signal] computing, neural networks, quantum [annealing] computing, etc. However, digital emulations/simulations of these things *are* rule based.
George [I can think of one place where one might balance a syntax tree -- an expression like a+b+c+d would parse as a+(b+(c+d)) and for some kinds of code generation you might know that + is associative so you could rewrite it to (a+b)+(c+d). But that's a very narrow and special case. In general I agree that compiler syntax trees and database B-trees have nothing in common beyond the word "trees". -John]
--------------1F19142AC4058EC48291CDEF Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline
_______________________________________________ Learn mailing list Learn-at-nylxs.com http://lists.mrbrklyn.com/mailman/listinfo/learn
--------------1F19142AC4058EC48291CDEF--
|
|