Newsgroups: php.internals
From: aleksey.tulinov@gmail.com (Aleksey Tulinov)
To: PHP Internals
Date: Tue, 14 Oct 2014 12:04:05 +0300
Message-ID: <543CE705.7030203@gmail.com>
Subject: Unicode support

Hey,

I can't find any recent discussion of this topic on this mailing list; the closest one I found is http://grokbase.com/t/php/php-internals/143b6aevsp/unicode-strings. I have also been reading articles such as this one: http://www.infoworld.com/article/2618358/application-development/php-5-4-emerges-from-the-collapse-of-php-6-0.html

The latter refers to difficulties such as "excess memory usage" and having to "rewrite the language". I'm developing an open-source Unicode implementation library (nunicode). It doesn't use any heap memory at all, and it works on native binary strings, as PHP does. So I think it might help with at least these two problems, but I'm not sure whether my work is even applicable here.

My library is a rather pragmatic implementation: it conforms to Unicode 7.0 and ISO/IEC 14651, but it does not implement the whole Unicode specification.

I would appreciate it if someone could point me to a good read or explain the collective opinion on this topic. I'm mainly interested in the following questions:

1. Is there a need for more Unicode support in PHP?
2. What is currently missing in that regard?
3. Is this a good place to ask such questions?

Thanks.