Newsgroups: php.internals
Date: Sun, 17 Sep 2017 11:53:07 +0100
To: internals@lists.php.net, Lester Caine
Message-ID: <7E527061-26D5-4E0C-BAF7-A6F1A940053B@gmail.com>
Subject: Re: [PHP-DEV] Progress or just 'a mess'?
From: rowan.collins@gmail.com (Rowan Collins)

On 17 September 2017 09:54:54 BST, Lester Caine wrote:
> Just what character set is PHP7 designed to work with.

Focusing on the answerable part of this, PHP actually allows a very wide variety of characters in identifiers (names of variables, classes, functions, etc).

I checked the PHP lang-spec repo expecting to find a set of Unicode classes, but it currently mentions "U+0080-U+00FF": https://github.com/php/php-langspec/blob/master/spec/09-lexical-structure.md#names

That seems wrong to me, unless I'm looking at the wrong definition - the first part of that range is control characters, and you can have variables called things like $🐘 (with an emoji as the entire name).

That would definitely be the place to document the allowed characters, though, and a rigorous definition of "case insensitive" could also be added.

I was wrong, by the way, to say that using "to case fold" rather than "to lower case" would solve the Turkish I problem - the key for that is to define a single locale whose case folding you are using, independent of runtime locale settings.

Regards,
-- 
Rowan Collins
[IMSoP]
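
P.S. One possible explanation for the "U+0080-U+00FF" wording, offered as an assumption rather than something checked against the lexer source: the PHP manual describes valid names with the pattern [a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*, which reads naturally as a range of bytes, not of code points. Under that reading, any UTF-8 multi-byte sequence qualifies, because all of its bytes are 0x80 or above. The sketch below is in Python purely to illustrate the byte-level logic; is_valid_name is a made-up helper, not anything PHP provides.

```python
import re

# Name pattern as given in the PHP manual, applied here to raw bytes.
# Assumption: the \x80-\xff range is meant as bytes, not Unicode code points.
NAME_RE = re.compile(rb'[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*')


def is_valid_name(name: str) -> bool:
    """Return True if the UTF-8 bytes of `name` match the byte-level pattern."""
    return NAME_RE.fullmatch(name.encode("utf-8")) is not None


elephant = "\U0001F418"  # the elephant emoji used as a variable name above

print(elephant.encode("utf-8").hex(" "))  # f0 9f 90 98 (every byte >= 0x80)
print(is_valid_name(elephant))            # True
print(is_valid_name("9lives"))            # False: a leading digit is not allowed
```

If that reading is right, the range in the spec is not so much wrong as stated in the wrong units: it would be clearer written as bytes 0x80-0xFF than as code points U+0080-U+00FF.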