Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:32924
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Message-ID: <471D6DDC.4050705@chiaraquartet.net>
Date: Mon, 22 Oct 2007 22:43:24 -0500
User-Agent: Thunderbird 1.5.0.13 (X11/20070824)
MIME-Version: 1.0
To: Stanislav Malyshev
CC: 'PHP Internals'
References: <471D1600.2030603@zend.com> <471D2FDC.8010505@chiaraquartet.net> <471D394F.50007@zend.com>
In-Reply-To: <471D394F.50007@zend.com>
Content-Type: multipart/mixed;
 boundary="------------050109080701040000020105"
Subject: Re: import/use last call
From: greg@chiaraquartet.net (Gregory Beaver)

--------------050109080701040000020105
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Stanislav Malyshev wrote:
>> Hold off for a bit - I may have a simple solution that solves the
>> problem for class names, method names and functions, have to code the
>> patch tonight first to prove it works.
>
> OK, please send it as soon as you have it :)

Hi,

The attached patch is for PHP 5.3. If it is acceptable, I will port it to
PHP 6, which is not difficult, although it does involve a lot of
cut/pasting.

The patch does these things:

1) fixes an unrelated bug I found in the implementation of LSB - "static"
   is not checked for in zend_do_import()/zend_do_namespace() and the other
   places where we check for "self" and "parent"
2) fixes a rather serious error in the fix for Bug #42859 - missing parens
   in zend_do_import()
3) adds "import" and "namespace" as valid function/class names
4) allows any string for method names, just as we allow any string for
   variable names
5) fixes a bug in the logic for $class->list, where $class-> list (note the
   whitespace between -> and list) returns a T_LIST instead of T_STRING
6) allows "import ::Classname as Blah", which is currently a parse error
7) leaves class constants unchanged - reserved words still error out.

Note that the zend_compile.c fixes can all be committed directly, as they
are all bug fixes and are not related to the import/namespace/reserved
words fix.

To implement this, I added several states to the lexer so that it returns
T_STRING whenever possible: after T_NEW, T_INTERFACE, T_CLASS and
T_EXTENDS, and in the T_IMPLEMENTS list. In addition, after T_FUNCTION
outside of a class, it returns T_STRING for "import" and "namespace" but
for no other reserved words.
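As a rough illustration of the state idea described above (a toy sketch only, not code taken from the patch): the C fragment below keeps a small stack of lexer states, in the spirit of flex start conditions with "%option stack". The token and state names echo the ones used in the patch, but struct lexer, lexer_init(), push_state(), pop_state(), keyword_or_string() and next_token() are hypothetical helpers invented for this example, and the input is assumed to be already split into lowercase words.

```c
#include <string.h>

/* Toy token kinds; the names mirror Zend tokens, but this is NOT Zend code. */
enum token { T_STRING, T_FUNCTION, T_CLASS, T_IMPORT, T_NAMESPACE, T_LIST };

/* Lexer states, kept on a small stack (compare flex's "%option stack"). */
enum state {
    ST_IN_SCRIPTING,
    ST_LOOKING_FOR_CLASSNAME,
    ST_LOOKING_FOR_FUNCTION_NAME
};

/* Hypothetical lexer context: a fixed-size state stack. */
struct lexer {
    enum state stack[8];
    int top; /* index of the current state */
};

void lexer_init(struct lexer *lx)
{
    lx->stack[0] = ST_IN_SCRIPTING;
    lx->top = 0;
}

void push_state(struct lexer *lx, enum state s)
{
    if (lx->top < 7) {
        lx->stack[++lx->top] = s;
    }
}

void pop_state(struct lexer *lx)
{
    if (lx->top > 0) {
        lx->top--;
    }
}

/* The token a word gets when no context is taken into account. */
enum token keyword_or_string(const char *word)
{
    if (!strcmp(word, "function"))  return T_FUNCTION;
    if (!strcmp(word, "class"))     return T_CLASS;
    if (!strcmp(word, "import"))    return T_IMPORT;
    if (!strcmp(word, "namespace")) return T_NAMESPACE;
    if (!strcmp(word, "list"))      return T_LIST;
    return T_STRING;
}

/* Classify one word, honouring the state on top of the stack. */
enum token next_token(struct lexer *lx, const char *word)
{
    enum state st = lx->stack[lx->top];
    enum token plain = keyword_or_string(word);

    if (st == ST_LOOKING_FOR_CLASSNAME) {
        /* Right after "class": any label is treated as a name. */
        pop_state(lx);
        return T_STRING;
    }
    if (st == ST_LOOKING_FOR_FUNCTION_NAME) {
        /* Right after "function" outside a class: only "import" and
           "namespace" are relaxed; other reserved words keep their token. */
        pop_state(lx);
        if (plain == T_IMPORT || plain == T_NAMESPACE) {
            return T_STRING;
        }
        return plain;
    }

    /* Normal scripting state: keywords may switch the lexer state. */
    if (plain == T_CLASS) {
        push_state(lx, ST_LOOKING_FOR_CLASSNAME);
    } else if (plain == T_FUNCTION) {
        push_state(lx, ST_LOOKING_FOR_FUNCTION_NAME);
    }
    return plain;
}
```

In this sketch, feeding the words "class", "import" gives T_CLASS followed by T_STRING, while "function", "list" still gives T_FUNCTION followed by T_LIST, which matches the behaviour described for functions declared outside of a class.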
After T_FUNCTION inside of a class (method declaration), it returns
T_STRING for all possible strings. After :: or ->, T_STRING is always
returned.

Also, rather than take the approach LSB does with T_STATIC, I have the
lexer initialize the string value of T_IMPORT and T_NAMESPACE so we can
preserve case for autoloading needs. The parser frees the unused char *
when normal import/namespace declarations are parsed.

In the parser, I use fully_qualified_class_name instead of namespace_name
for both import syntaxes. This introduces a minor issue in that this is no
longer a parse error:

import static::oops as Classname;

However, if "Classname" is used, this will simply result in "Fatal error:
Class 'static::oops' not found in..." and shouldn't be too big of a deal.

Also in the parser, I inserted T_IMPORT and T_NAMESPACE as aliases to
T_STRING in global situations to allow for static method calls, class
constants, and fully-qualified namespace calls.

Basically this script is now possible with the patch:

and results in this output:

cellog@lot-49:~/workspace/php5$ sapi/cli/php -n testme.php
import functionint(3)
import::namespace::list

I have not profiled the patch, focusing instead on correctness for now.
The patch looks complicated because of the additional states, but is
really not that complicated, honest. :)

Greg

--------------050109080701040000020105
Content-Type: text/plain;
 name="mega_reserved_words.patch.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="mega_reserved_words.patch.txt"

?
Zend/tests/zend_function_name.phpt
Index: Zend/zend_compile.c
===================================================================
RCS file: /repository/ZendEngine2/zend_compile.c,v
retrieving revision 1.647.2.27.2.41.2.10
diff -u -r1.647.2.27.2.41.2.10 zend_compile.c
--- Zend/zend_compile.c	17 Oct 2007 10:01:21 -0000	1.647.2.27.2.41.2.10
+++ Zend/zend_compile.c	23 Oct 2007 03:15:41 -0000
@@ -2975,7 +2975,7 @@
 
 	lcname = zend_str_tolower_dup(class_name->u.constant.value.str.val, class_name->u.constant.value.str.len);
 
-	if (!(strcmp(lcname, "self") && strcmp(lcname, "parent"))) {
+	if (!(strcmp(lcname, "self") && strcmp(lcname, "parent") && strcmp(lcname, "static"))) {
 		efree(lcname);
 		zend_error(E_COMPILE_ERROR, "Cannot use '%s' as class name as it is reserved", class_name->u.constant.value.str.val);
 	}
@@ -4582,7 +4582,9 @@
 	if (((Z_STRLEN(name->u.constant) == sizeof("self")-1) &&
 	      !memcmp(lcname, "self", sizeof("self")-1)) ||
 	    ((Z_STRLEN(name->u.constant) == sizeof("parent")-1) &&
-	      !memcmp(lcname, "parent", sizeof("parent")-1))) {
+	      !memcmp(lcname, "parent", sizeof("parent")-1)) ||
+	    ((Z_STRLEN(name->u.constant) == sizeof("static")-1) &&
+	      !memcmp(lcname, "static", sizeof("static")-1))) {
 		zend_error(E_COMPILE_ERROR, "Cannot use '%s' as namespace name", Z_STRVAL(name->u.constant));
 	}
 	efree(lcname);
@@ -4596,7 +4598,7 @@
 {
 	char *lcname;
 	zval *name, *ns, tmp;
-	zend_bool warn = 0;
+	zend_bool warn = 0, shorthand = 0;
 
 	if (!CG(current_import)) {
 		CG(current_import) = emalloc(sizeof(HashTable));
@@ -4611,11 +4613,12 @@
 		char *p;
 
 		/* The form "import A::B" is eqivalent to "import A::B as B".
-		   So we extract the last part of compound name ti use as a new_name */
+		   So we extract the last part of compound name to use as a new_name */
 		name = &tmp;
 		p = zend_memrchr(Z_STRVAL_P(ns), ':', Z_STRLEN_P(ns));
 		if (p) {
 			ZVAL_STRING(name, p+1, 1);
+			shorthand = 1;
 		} else {
 			*name = *ns;
 			zval_copy_ctor(name);
@@ -4627,6 +4630,8 @@
 
 	if (((Z_STRLEN_P(name) == sizeof("self")-1) &&
 	      !memcmp(lcname, "self", sizeof("self")-1)) ||
+	    ((Z_STRLEN_P(name) == sizeof("static")-1) &&
+	      !memcmp(lcname, "static", sizeof("static")-1)) ||
 	    ((Z_STRLEN_P(name) == sizeof("parent")-1) &&
 	      !memcmp(lcname, "parent", sizeof("parent")-1))) {
 		zend_error(E_COMPILE_ERROR, "Cannot use '%s' as import name", Z_STRVAL_P(name));
@@ -4640,7 +4645,8 @@
 		ns_name[Z_STRLEN_P(CG(current_namespace))] = ':';
 		ns_name[Z_STRLEN_P(CG(current_namespace))+1] = ':';
 		memcpy(ns_name+Z_STRLEN_P(CG(current_namespace))+2, lcname, Z_STRLEN_P(name)+1);
-		if (zend_hash_exists(CG(class_table), ns_name, Z_STRLEN_P(CG(current_namespace)) + 2 + Z_STRLEN_P(name)+1)) {
+		/* if our new import name is simply the shorthand, skip this check */
+		if ((!shorthand || !memcmp(ns_name, Z_STRVAL_P(ns), Z_STRLEN_P(ns) + 1)) && zend_hash_exists(CG(class_table), ns_name, Z_STRLEN_P(CG(current_namespace)) + 2 + Z_STRLEN_P(name)+1)) {
 			zend_error(E_COMPILE_ERROR, "Import name '%s' conflicts with defined class", Z_STRVAL_P(name));
 		}
 		efree(ns_name);
Index: Zend/zend_language_parser.y
===================================================================
RCS file: /repository/ZendEngine2/zend_language_parser.y,v
retrieving revision 1.160.2.4.2.8.2.4
diff -u -r1.160.2.4.2.8.2.4 zend_language_parser.y
--- Zend/zend_language_parser.y	1 Oct 2007 10:37:13 -0000	1.160.2.4.2.8.2.4
+++ Zend/zend_language_parser.y	23 Oct 2007 03:15:41 -0000
@@ -47,7 +47,7 @@
 %}
 
 %pure_parser
-%expect 2
+%expect 3
 
 %left T_INCLUDE T_INCLUDE_ONCE T_EVAL T_REQUIRE T_REQUIRE_ONCE
 %left ','
@@ -142,7 +142,7 @@
 %token T_END_HEREDOC
 %token T_DOLLAR_OPEN_CURLY_BRACES
 %token T_CURLY_OPEN
-%token T_PAAMAYIM_NEKUDOTAYIM
+%left T_PAAMAYIM_NEKUDOTAYIM
 %token T_NAMESPACE
 %token T_IMPORT
 %token T_NS_C
@@ -160,6 +160,8 @@
 
 namespace_name:
 		T_STRING { $$ = $1; }
+	|	T_IMPORT { $$ = $1; }
+	|	T_NAMESPACE { $$ = $1; }
 	|	namespace_name T_PAAMAYIM_NEKUDOTAYIM T_STRING { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
 ;
 
@@ -168,9 +170,9 @@
 	|	function_declaration_statement { zend_do_early_binding(TSRMLS_C); }
 	|	class_declaration_statement { zend_do_early_binding(TSRMLS_C); }
 	|	T_HALT_COMPILER '(' ')' ';' { zend_do_halt_compiler_register(TSRMLS_C); YYACCEPT; }
-	|	T_NAMESPACE namespace_name ';' { zend_do_namespace(&$2 TSRMLS_CC); }
-	|	T_IMPORT namespace_name ';' { zend_do_import(&$2, NULL TSRMLS_CC); }
-	|	T_IMPORT namespace_name T_AS T_STRING ';' { zend_do_import(&$2, &$4 TSRMLS_CC); }
+	|	T_NAMESPACE namespace_name ';' { efree($1.u.constant.value.str.val);zend_do_namespace(&$2 TSRMLS_CC); }
+	|	T_IMPORT fully_qualified_class_name ';' { efree($1.u.constant.value.str.val);zend_do_import(&$2, NULL TSRMLS_CC); }
+	|	T_IMPORT fully_qualified_class_name T_AS function_name_token ';' { efree($1.u.constant.value.str.val);zend_do_import(&$2, &$4 TSRMLS_CC); }
 	|	constant_declaration ';'
 ;
 
@@ -298,6 +300,12 @@
 			'(' parameter_list ')' '{' inner_statement_list '}' { zend_do_end_function_declaration(&$1 TSRMLS_CC); }
 ;
 
+function_name_token:
+		T_STRING { $$ = $1; }
+	|	T_NAMESPACE { $$ = $1; }
+	|	T_IMPORT { $$ = $1; }
+;
+
 unticked_class_declaration_statement:
 		class_entry_type T_STRING extends_from
 			{ zend_do_begin_class_declaration(&$1, &$2, &$3 TSRMLS_CC); }
@@ -636,7 +644,7 @@
 ;
 
 function_call:
-		T_STRING '(' { $2.u.opline_num = zend_do_begin_function_call(&$1, 1 TSRMLS_CC); }
+		function_name_token '(' { $2.u.opline_num = zend_do_begin_function_call(&$1, 1 TSRMLS_CC); }
 				function_call_parameter_list
 				')' { zend_do_end_function_call(&$1, &$$, &$4, 0, $2.u.opline_num TSRMLS_CC); zend_do_extended_fcall_end(TSRMLS_C); }
 	|	T_PAAMAYIM_NEKUDOTAYIM T_STRING '(' { $3.u.opline_num = zend_do_begin_function_call(&$2, 0 TSRMLS_CC); }
@@ -661,6 +669,8 @@
 
 fully_qualified_class_name:
 		T_STRING { $$ = $1; }
+	|	T_IMPORT { $$ = $1; }
+	|	T_NAMESPACE { $$ = $1; }
 	|	T_STATIC { $$.op_type = IS_CONST; ZVAL_STRINGL(&$$.u.constant, "static", sizeof("static")-1, 1);}
 	|	T_PAAMAYIM_NEKUDOTAYIM T_STRING { zend_do_build_namespace_name(&$$, NULL, &$2 TSRMLS_CC); }
 	|	fully_qualified_class_name T_PAAMAYIM_NEKUDOTAYIM T_STRING { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
Index: Zend/zend_language_scanner.l
===================================================================
RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.131.2.11.2.13.2.2
diff -u -r1.131.2.11.2.13.2.2 zend_language_scanner.l
--- Zend/zend_language_scanner.l	7 Oct 2007 05:22:03 -0000	1.131.2.11.2.13.2.2
+++ Zend/zend_language_scanner.l	23 Oct 2007 03:15:42 -0000
@@ -45,6 +45,11 @@
 %x ST_COMMENT
 %x ST_DOC_COMMENT
 %x ST_ONE_LINE_COMMENT
+%x ST_LOOKING_FOR_CLASSNAME
+%x ST_LOOKING_FOR_EXTENDS
+%x ST_LOOKING_FOR_IMPLEMENTS
+%x ST_LOOKING_FOR_FUNCTION_NAME
+%x ST_LOOKING_FOR_METHOD_NAME
 %option stack
 
 %{
@@ -969,143 +974,172 @@
 %option noyywrap
 %%
 
-"exit" {
+"exit" {
 	return T_EXIT;
 }
 
-"die" {
+"die" {
 	return T_EXIT;
 }
 
 "function" {
+	if (CG(active_class_entry)) {
+		yy_push_state(ST_LOOKING_FOR_METHOD_NAME TSRMLS_CC);
+	} else {
+		yy_push_state(ST_LOOKING_FOR_FUNCTION_NAME TSRMLS_CC);
+	}
+	return T_FUNCTION;
+}
+
+"function" {
 	return T_FUNCTION;
 }
 
-"const" {
+"const" {
 	return T_CONST;
 }
 
-"return" {
+"return" {
 	return T_RETURN;
 }
 
-"try" {
+"try" {
 	return T_TRY;
 }
 
-"catch" {
+"catch" {
 	return T_CATCH;
 }
 
-"throw" {
+"throw" {
 	return T_THROW;
 }
 
-"if" {
+"if" {
 	return T_IF;
 }
 
-"elseif" {
+"elseif" {
 	return T_ELSEIF;
 }
 
-"endif" {
+"endif" {
 	return T_ENDIF;
 }
 
-"else" {
+"else" {
 	return T_ELSE;
 }
 
-"while" {
+"while" {
 	return T_WHILE;
 }
 
-"endwhile" {
+"endwhile" {
 	return T_ENDWHILE;
 }
 
-"do" {
+"do" {
 	return T_DO;
 }
 
-"for" {
+"for" {
 	return T_FOR;
 }
 
-"endfor" {
+"endfor" {
 	return T_ENDFOR;
 }
 
-"foreach" {
+"foreach" {
 	return T_FOREACH;
 }
 
-"endforeach" {
+"endforeach" {
 	return T_ENDFOREACH;
 }
 
-"declare" {
+"declare" {
 	return T_DECLARE;
 }
 
-"enddeclare" {
+"enddeclare" {
 	return T_ENDDECLARE;
 }
 
-"instanceof" {
+"instanceof" {
 	return T_INSTANCEOF;
 }
 
-"as" {
+"as" {
 	return T_AS;
 }
 
-"switch" {
+"switch" {
 	return T_SWITCH;
 }
 
-"endswitch" {
+"endswitch" {
 	return T_ENDSWITCH;
 }
 
-"case" {
+"case" {
 	return T_CASE;
 }
 
-"default" {
+"default" {
 	return T_DEFAULT;
 }
 
-"break" {
+"break" {
 	return T_BREAK;
 }
 
-"continue" {
+"continue" {
 	return T_CONTINUE;
 }
 
-"echo" {
+"echo" {
 	return T_ECHO;
 }
 
-"print" {
+"print" {
 	return T_PRINT;
 }
 
 "class" {
+	yy_push_state(ST_LOOKING_FOR_CLASSNAME TSRMLS_CC);
+	return T_CLASS;
+}
+
+"class" {
 	return T_CLASS;
 }
 
 "interface" {
+	yy_push_state(ST_LOOKING_FOR_CLASSNAME TSRMLS_CC);
+	return T_INTERFACE;
+}
+
+"interface" {
 	return T_INTERFACE;
 }
 
 "extends" {
+	yy_push_state(ST_LOOKING_FOR_EXTENDS TSRMLS_CC);
+	return T_EXTENDS;
+}
+
+"extends" {
 	return T_EXTENDS;
 }
 
 "implements" {
+	yy_push_state(ST_LOOKING_FOR_IMPLEMENTS TSRMLS_CC);
+	return T_IMPLEMENTS;
+}
+
+"implements" {
 	return T_IMPLEMENTS;
 }
 
@@ -1118,31 +1152,47 @@
 	return T_OBJECT_OPERATOR;
 }
 
-{LABEL} {
+'&' {
+	return '&';
+}
+
+{WHITESPACE} {
+	zendlval->value.str.val = yytext; /* no copying - intentional */
+	zendlval->value.str.len = yyleng;
+	zendlval->type = IS_STRING;
+	HANDLE_NEWLINES(yytext, yyleng);
+	return T_WHITESPACE;
+}
+
+{LABEL} {
 	yy_pop_state(TSRMLS_C);
 	zend_copy_value(zendlval, yytext, yyleng);
 	zendlval->type = IS_STRING;
 	return T_STRING;
 }
 
-{ANY_CHAR} {
+{ANY_CHAR} {
 	yyless(0);
 	yy_pop_state(TSRMLS_C);
 }
 
 "::" {
+	yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
 	return T_PAAMAYIM_NEKUDOTAYIM;
 }
 
 "new" {
+	yy_push_state(ST_LOOKING_FOR_PROPERTY TSRMLS_CC);
 	return T_NEW;
 }
 
-"clone" {
+"clone" {
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
 	return T_CLONE;
 }
 
-"var" {
+"var" {
 	return T_VAR;
 }
 
@@ -1178,79 +1228,83 @@
 	return T_UNSET_CAST;
 }
 
-"eval" {
+"eval" {
 	return T_EVAL;
 }
 
-"include" {
+"include" {
 	return T_INCLUDE;
 }
 
-"include_once" {
+"include_once" {
 	return T_INCLUDE_ONCE;
 }
 
-"require" {
+"require" {
 	return T_REQUIRE;
 }
 
-"require_once" {
+"require_once" {
 	return T_REQUIRE_ONCE;
 }
 
 "namespace" {
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
 	return T_NAMESPACE;
 }
 
 "import" {
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
 	return T_IMPORT;
 }
 
-"use" {
+"use" {
 	return T_USE;
 }
 
-"global" {
+"global" {
 	return T_GLOBAL;
 }
 
-"isset" {
+"isset" {
 	return T_ISSET;
 }
 
-"empty" {
+"empty" {
 	return T_EMPTY;
 }
 
-"__halt_compiler" {
+"__halt_compiler" {
 	return T_HALT_COMPILER;
 }
 
-"static" {
+"static" {
 	return T_STATIC;
 }
 
-"abstract" {
+"abstract" {
 	return T_ABSTRACT;
 }
 
-"final" {
+"final" {
 	return T_FINAL;
 }
 
-"private" {
+"private" {
 	return T_PRIVATE;
 }
 
-"protected" {
+"protected" {
 	return T_PROTECTED;
 }
 
-"public" {
+"public" {
 	return T_PUBLIC;
 }
 
-"unset" {
+"unset" {
 	return T_UNSET;
 }
 
@@ -1258,11 +1312,11 @@
 	return T_DOUBLE_ARROW;
 }
 
-"list" {
+"list" {
 	return T_LIST;
 }
 
-"array" {
+"array" {
 	return T_ARRAY;
 }
 
@@ -1480,7 +1534,7 @@
 	return T_DNUMBER;
 }
 
-"__CLASS__" {
+"__CLASS__" {
 	char *class_name = NULL;
 
 	if (CG(active_class_entry)) {
@@ -1496,7 +1550,7 @@
 	return T_CLASS_C;
 }
 
-"__FUNCTION__" {
+"__FUNCTION__" {
 	char *func_name = NULL;
 
 	if (CG(active_op_array)) {
@@ -1512,7 +1566,7 @@
 	return T_FUNC_C;
 }
 
-"__METHOD__" {
+"__METHOD__" {
 	char *class_name = CG(active_class_entry) ? CG(active_class_entry)->name : NULL;
 	char *func_name = CG(active_op_array)? CG(active_op_array)->function_name : NULL;
 	size_t len = 0;
@@ -1533,13 +1587,13 @@
 	return T_METHOD_C;
 }
 
-"__LINE__" {
+"__LINE__" {
 	zendlval->value.lval = CG(zend_lineno);
 	zendlval->type = IS_LONG;
 	return T_LINE;
 }
 
-"__FILE__" {
+"__FILE__" {
 	char *filename = zend_get_compiled_filename(TSRMLS_C);
 
 	if (!filename) {
@@ -1551,7 +1605,7 @@
 	return T_FILE;
 }
 
-"__NAMESPACE__" {
+"__NAMESPACE__" {
 	if (CG(current_namespace)) {
 		*zendlval = *CG(current_namespace);
 		zval_copy_ctor(zendlval);
@@ -1561,6 +1615,38 @@
 	return T_NS_C;
 }
 
+"," {
+	return ',';
+}
+
+"&" {
+	return '&';
+}
+
+"::" {
+	return T_PAAMAYIM_NEKUDOTAYIM;
+}
+
+{WHITESPACE} {
+	zendlval->value.str.val = yytext; /* no copying - intentional */
+	zendlval->value.str.len = yyleng;
+	zendlval->type = IS_STRING;
+	HANDLE_NEWLINES(yytext, yyleng);
+	return T_WHITESPACE;
+}
+
+{LABEL} {
+	yy_pop_state(TSRMLS_C);
+	zend_copy_value(zendlval, yytext, yyleng);
+	zendlval->type = IS_STRING;
+	return T_STRING;
+}
+
+{ANY_CHAR} {
+	yyless(0);
+	yy_pop_state(TSRMLS_C);
+}
+
 (([^<]|"<"[^?%s<]){1,400})|"{WHITESPACE} {
 	zendlval->value.str.val = yytext; /* no copying - intentional */
 	zendlval->value.str.len = yyleng;

--------------050109080701040000020105--