Newsgroups: php.internals
Xref: news.php.net php.internals:56028
References: <4EB23E3D.3010908@zend.com> <4EB25BF3.7040703@zend.com>
Date: Thu, 3 Nov 2011 20:26:55 +0900
To: Gustavo Lopes
Cc: "internals@lists.php.net"
Subject: Re: [PHP-DEV] Zend Multibyte support
From: yohgaki@ohgaki.net (Yasuo Ohgaki)

Hi Gustavo,

Thanks for the reply.

As long as bison does not understand multibyte chars, the parser will
not work well with them. Your reply is exactly what I expected.
Thank you for the clarification.

--
Yasuo Ohgaki
yohgaki@ohgaki.net

On Thu, Nov 3, 2011 at 8:07 PM, Gustavo Lopes wrote:
> On Thu, 03 Nov 2011 10:31:47 -0000, Yasuo Ohgaki wrote:
>
>> One last quick question.
>> Zend/tests/multibyte/multibyte_encoding_001.phpt sets
>> mbstring.internal_encoding=SJIS.
>>
>> Is PHP 5.4+ supposed to work with SJIS (or another similar encoding)
>> as internal_encoding?
>>
>
> No. What matters is that the parser generated by bison is able to recognize
> the tokens. On an ASCII (as opposed to EBCDIC) machine, this means the
> encoding must be ASCII compatible.
>
> This is the table for SJIS:
>  http://icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>
> At first glance it appears to be ASCII compatible, since \x20-\x7E
> represent U+0020-U+007E, but if you take a closer look you'll see that
> these bytes can also appear as part of larger sequences.
>
> For instance, in this script:
>
> function a漾() {}
>
> the character 漾 is represented with \xE0\x40, where \x40 represents @ in
> ASCII, so this would give an error, in the same way that this script would:
>
> function aà@() {}
>
> In fact, if I save the first script as UTF-8 and then run PHP:
>
> $ ./php -d zend.multibyte=1 -d zend.script_encoding=UTF-8 -d mbstring.internal_encoding=SJIS sjis.php
> php: Zend/zend_language_scanner.l:126: encoding_filter_script_to_internal:
> Assertion `internal_encoding &&
> zend_multibyte_check_lexer_compatibility(internal_encoding)' failed.
> Aborted
>
> it gives an assertion error.
>
> --
> Gustavo Lopes
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
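
The byte-level point in the quoted reply can be checked outside PHP. The following is only a rough sketch in Python, not anything from the PHP source: it assumes Python's built-in `shift_jis` codec is close enough to the ibm-943 table linked above for this character, and the function name `ascii_bytes_in_multibyte_chars` is made up for the illustration. It lists every byte below 0x80 that occurs inside a multibyte sequence, which is the situation that confuses a byte-oriented scanner.

```python
def ascii_bytes_in_multibyte_chars(text, encoding):
    """Return (character, byte) pairs where an ASCII-range byte (< 0x80)
    occurs inside a multibyte sequence of `text` in `encoding`.

    Hypothetical helper for illustration; it is not part of PHP or mbstring.
    """
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) < 2:
            # A single-byte character is plain ASCII (or a one-byte code);
            # it cannot hide an ASCII byte inside a longer sequence.
            continue
        for byte in encoded:
            if byte < 0x80:
                hits.append((ch, byte))
    return hits


if __name__ == "__main__":
    # "a" followed by U+6F3E, the identifier used in the quoted script.
    identifier = "a\u6f3e"
    for encoding in ("shift_jis", "utf-8"):
        raw = identifier.encode(encoding)
        found = ascii_bytes_in_multibyte_chars(identifier, encoding)
        print(encoding, raw.hex(" "), [(c, hex(b)) for c, b in found])
```

For Shift_JIS the second byte of 漾 falls in the ASCII range (the \x40 mentioned above), while UTF-8 never produces a byte below 0x80 inside a multibyte sequence, so the list stays empty there. That difference is why UTF-8 source is safe for a scanner that matches ASCII tokens byte by byte and SJIS is not.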