Newsgroups: php.internals
Date: Sun, 2 Jan 2011 22:31:40 +0100
From: rumi.kg@gmail.com (Rune Kaagaard)
To: "guilhermeblanco@gmail.com"
Cc: internals
Subject: Re: [PHP-DEV] Re: EBNF

Hi Guilherme

You wrote that Java spec? Cool! Also, very nice example of the PHP EBNF!
I think PHP needs a canonical one of those, and that the parser should
be rewritten to follow said EBNF. That's what I'm dreaming of, at
least :)

Cheers
Rune

On Sat, Jan 1, 2011 at 5:23 PM, guilhermeblanco@gmail.com wrote:
> As a final note, I'd like to mention that even though PHP's grammar is
> quite simple, it is light-years more complex (due to the lack of
> standardization) than that of other languages.
>
> You can compare this initial description I wrote to the Java
> Specification and draw your own conclusions:
> http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html
>
>
> Cheers,
>
> On Sat, Jan 1, 2011 at 2:20 PM, guilhermeblanco@gmail.com wrote:
>> Hi all,
>>
>> PHP grammar is far from being complex. It is possible to describe most
>> of the syntax with a simple explanation.
>> Example:
>>
>> * We can separate a program into several statements.
>> * There are a couple of items that can only be declared in specific
>> places (namespace, use), so consider them top-statements.
>> * Also, a Namespace declaration may contain multiple statements if you
>> wrap them in braces.
>> * UseStatement can only be used inside a namespace or inside the
>> global scope.
>> * Finally, we support Classes.
>>
>> Now we can describe a good portion of PHP grammar:
>>
>> /* Terminals */
>> identifier
>> char
>> string
>> integer
>> float
>> boolean
>>
>> /* Grammar Rules */
>> Literal ::= string | char | integer | float | boolean
>>
>> Qualifier ::= ("private" | "public" | "protected") ["static"]
>>
>> /* Identifiers */
>> NamespaceIdentifier ::= identifier {"\" identifier}
>> ClassIdentifier ::= identifier
>> MethodIdentifier ::= identifier
>> FullyQualifiedClassIdentifier ::= [NamespaceIdentifier] ClassIdentifier
>>
>> /* Root grammar */
>> Program ::= {TopStatement} {Statement}
>>
>> TopStatement ::= NamespaceDeclaration | UseStatement | CommentStatement
>> Statement ::= ClassDeclaration | FunctionDeclaration | ...
>>
>> /* Namespace Declaration */
>> NamespaceDeclaration ::= InlineNamespaceDeclaration | ScopeNamespaceDeclaration
>> InlineNamespaceDeclaration ::= SimpleNamespaceDeclaration ";"
>>     {UseStatement} {Statement}
>> ScopeNamespaceDeclaration ::= SimpleNamespaceDeclaration "{"
>>     {UseStatement} {Statement} "}"
>> SimpleNamespaceDeclaration ::= "namespace" NamespaceIdentifier
>>
>> /* Use Statement */
>> UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
>> SimpleUseStatement ::= SimpleNamespaceUseStatement | SimpleClassUseStatement
>> SimpleNamespaceUseStatement ::= NamespaceIdentifier ["as" NamespaceIdentifier]
>> SimpleClassUseStatement ::= FullyQualifiedClassIdentifier ["as" ClassIdentifier]
>>
>> /* Comment Declaration */
>> CommentStatement ::= InlineCommentStatement | MultilineCommentStatement
>> InlineCommentStatement ::= ("//" | "#") string
>> MultilineCommentStatement ::= SimpleMultilineCommentStatement |
>>     DocBlockStatement
>> SimpleMultilineCommentStatement ::= "/*" {"*" string} "*/"
>> DocBlockStatement ::= "/**" {"*" string} "*/"
>>
>> /* Class Declaration */
>> ClassDeclaration ::= SimpleClassDeclaration "{" {ClassMemberDeclaration} "}"
>> SimpleClassDeclaration ::= ["abstract"] "class" ClassIdentifier
>>     ["extends" FullyQualifiedClassIdentifier] ["implements"
>>     FullyQualifiedClassIdentifier {"," FullyQualifiedClassIdentifier}]
>>
>> ClassMemberDeclaration ::= ConstDeclaration | PropertyDeclaration |
>>     MethodDeclaration
>> ConstDeclaration ::= [DocBlockStatement] "const" identifier "=" Literal ";"
>> PropertyDeclaration ::= [DocBlockStatement] Qualifier Variable ["=" Literal] ";"
>> MethodDeclaration ::= [DocBlockStatement] (PrototypeMethodDeclaration
>>     | ComplexMethodDeclaration)
>>
>> PrototypeMethodDeclaration ::= "abstract" Qualifier "function"
>>     MethodIdentifier "(" {ArgumentDeclaration} ");"
>> ComplexMethodDeclaration ::= ["final"] Qualifier "function"
>>     MethodIdentifier "(" {ArgumentDeclaration} ")" "{" {Statement} "}"
>> ArgumentDeclaration ::= SimpleArgumentDeclaration {","
>>     SimpleArgumentDeclaration}
>> SimpleArgumentDeclaration ::= [TypeHint] Variable ["=" Literal]
>> TypeHint ::= ArrayTypeHint | FullyQualifiedClassIdentifier
>> ArrayTypeHint ::= "array"
>>
>>
>> Now it is easy to continue the work and add missing rules. =)
>>
>>
>>
>> Cheers,
>>
>> On Sat, Jan 1, 2011 at 12:46 PM, Rune Kaagaard wrote:
>>>> There has never been a language grammar, so there's been nothing to
>>>> refer to at all. As for why no one's made one more recently, for fun
>>>> I snagged the .l and .y files from trunk and W3C's version of EBNF
>>>> from XML. In two hours of hacking away, I managed to come up with
>>>> this sort-of beginning to a grammar, which I'm certain contains
>>>> several errors, and only hints at a syntax:
>>>
>>> I wanted to take your EBNF for a spin so I converted it to a format
>>> that the Python module "simpleparse" could read. I ironed out a couple
>>> of kinks and fixed a bug.
>>> You can see it here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/php.ebnf
>>>
>>> Then I created a pretty-printer to output the parse tree of some very
>>> simple PHP code. See it here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.py
>>>
>>> and the output is here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.output
>>>
>>>> Considering what it takes JUST to define namespaces, halt_compiler,
>>>> basic blocks, and the idea of a conditional statement... well,
>>>> suffice to say the "expr" production alone would be triple the size
>>>> of this. It doesn't help that there's no way I'm immediately aware
>>>> of to check whether a grammar like this is accurate.
>>>
>>> Thanks a lot for the example, that does not look so bad :) PHP syntax
>>> is not simple so of course the EBNF will not be either. But still, any
>>> EBNF would be a lot better than none!
>>>
>>> Testability is a real issue and makes for a nice catch-22. A
>>> hypothetical roadmap could _maybe_ look like this:
>>>
>>> 1) Create the EBNF and reference implementation while comparing it to
>>> a stable release.
>>> 2) Rewrite the Zend implementation to read from the EBNF.
>>> 3) Repeat for all current releases.
>>>
>>> It's tough to try to guess about things you don't really understand.
>>> Looks like major work though!
>>>
>>>> Nonetheless, it's a significant undertaking to deal with the
>>>> complexity of the language. There are dozens of tiny little edge
>>>> cases in PHP's parsing that require bunches of extra parser rules.
>>>> An example from above is the difference between using "statement"
>>>> and "inner-statement" for the two different forms of "if". Because
>>>> "statement" includes basic blocks and labels, the rule disallows
>>>> writing "if: { xyz; } endif;", since apparently Zend doesn't support
>>>> arbitrary basic blocks.
>>>> All those cases wreak havoc on the grammar. In its present form, it
>>>> will never reduce down to something nearly as small as Python's.
>>>
>>> Just to have a solid, complete, maintained EBNF would be a _major_
>>> leap forward!
>>>
>>> Thanks for your cool reply!
>>>
>>> Cheers
>>> Rune
>>>
>>> --
>>> PHP Internals - PHP Runtime Development Mailing List
>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>
>>
>> --
>> Guilherme Blanco
>> Mobile: +55 (16) 9215-8480
>> MSN: guilhermeblanco@hotmail.com
>> São Paulo - SP/Brazil
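
As a rough illustration of how the quoted "Use Statement" rules could be checked against real input, here is a minimal hand-written recursive-descent sketch in Python. It is not the simpleparse setup mentioned in the thread and not how the Zend parser works. The names tokenize, Parser, ParseError and parse_use_statement are made up for this example. It also assumes two simplifications: a qualified name is just identifier {"\" identifier} (the quoted FullyQualifiedClassIdentifier rule has no separator, so it collapses to the same shape), and an alias is a single identifier.

```python
import re

# Assumption: identifiers are plain ASCII names; real PHP allows more.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
KEYWORDS = {"use", "as", "namespace"}


class ParseError(Exception):
    """Raised when the input does not match the sketched rules."""


def tokenize(source):
    # An identifier, or any single non-whitespace character ("\", ",", ";").
    return re.findall(IDENT + r"|\S", source)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise ParseError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or tok in KEYWORDS or not re.fullmatch(IDENT, tok):
            raise ParseError(f"expected identifier, got {tok!r}")
        self.pos += 1
        return tok

    def namespace_identifier(self):
        # NamespaceIdentifier ::= identifier {"\" identifier}
        parts = [self.identifier()]
        while self.peek() == "\\":
            self.pos += 1
            parts.append(self.identifier())
        return "\\".join(parts)

    def simple_use_statement(self):
        # SimpleUseStatement, simplified: name ["as" identifier]
        name = self.namespace_identifier()
        alias = None
        if self.peek() == "as":
            self.pos += 1
            alias = self.identifier()
        return (name, alias)

    def use_statement(self):
        # UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
        self.expect("use")
        items = [self.simple_use_statement()]
        while self.peek() == ",":
            self.pos += 1
            items.append(self.simple_use_statement())
        self.expect(";")
        return items


def parse_use_statement(source):
    parser = Parser(tokenize(source))
    items = parser.use_statement()
    if parser.peek() is not None:
        raise ParseError(f"unexpected trailing token {parser.peek()!r}")
    return items


if __name__ == "__main__":
    print(parse_use_statement(r"use Foo\Bar as Baz, Qux;"))
```

Each method mirrors one production, so a mismatch between the EBNF and what a stable PHP release accepts would show up as a ParseError on input that PHP itself parses (or the other way round). That is one cheap way to approach the testability problem raised in the thread, although it says nothing about the much larger "expr" production.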