Newsgroups: php.internals
From: ivan.enderlin@hoa-project.net ("Ivan Enderlin @ Hoa")
To: internals@lists.php.net
Date: Mon, 24 Sep 2012 16:41:30 +0200
Subject: Re: [PHP-DEV] mbstring, a proposition of additional functions
Message-ID: <5060711A.2030704@hoa-project.net>
In-Reply-To: <502020FB.7010502@hoa-project.net>

Hi,

I sent this email over the summer and nobody replied. I would like to know your opinion.

On 06/08/12 21:54, Ivan Enderlin @ Hoa wrote:
> Hello,
>
> ext/mbstring is very useful, but from my point of view, some functions
> are missing.
> I would like to propose the addition of the following functions.
>
> mb_append($str, $piece) and mb_prepend($str, $piece):
> To add a $piece to, respectively, the end and the start of $str. It
> should consider the text direction, e.g. if $str is Right-to-Left,
> mb_append() will add $piece at the end, i.e. on the left, of $str.
>
> mb_concat($str, $piece):
> If mb_append() and mb_prepend() are too hard for users to understand,
> mb_concat() would be much simpler and would act in the same way.

I would insert:

mb_replace($str, $search, $replace):
Complementary to str_replace and simpler than preg_replace or
preg_filter (and with better performance, because it does not need to
compile a regex to an automaton, etc.).

> mb_pad($length, $piece, $end):
> Pretty much like str_pad, but multibyte-aware.
> Note that, like mb_append and mb_prepend, we do not speak about left
> and right, but start and end.
>
> mb_get_direction($str):
> Return LTR or RTL. For one char, it is “easy”. For a string, it is
> more complex because we can have embedded directions (a string with
> many directions).
>
> mb_get_char($code) or mb_chr($code):
> Get the Unicode character from a code-point/decimal value representation.
>
> mb_get_code($char) or mb_ord($char):
> Get the code-point/decimal value representation of a Unicode character.
>
> Finally, it would be great if mb_substr() could consider text
> direction (or create a new dedicated function?). For example,
> mb_substr($str, 0, 1) would return the first char from the start (and
> not from the left, as it is implemented now).
>
> I think it could help developers to create nice libraries without
> requiring a lot of skills in Unicode.
> Thoughts?
>
> Best regards :-).
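
To make the intended semantics a bit more concrete, here is a minimal userland sketch of three of the functions listed above. It is only an illustration under assumptions, not the proposed C implementation: the sketch_ prefix and all parameter names are hypothetical, UTF-8 is assumed throughout, the padded string is passed as an extra first argument (the proposed mb_pad signature does not list it), and text direction is ignored entirely.

```php
<?php
// Hypothetical helpers, not part of ext/mbstring. UTF-8 is assumed.

// Code point -> character, going through UCS-4BE (4 bytes per code point).
function sketch_mb_chr(int $code): string
{
    return mb_convert_encoding(pack('N', $code), 'UTF-8', 'UCS-4BE');
}

// First character of $char -> code point.
function sketch_mb_ord(string $char): int
{
    if ($char === '') {
        throw new InvalidArgumentException('Expected a non-empty string.');
    }
    $bytes = mb_convert_encoding($char, 'UCS-4BE', 'UTF-8');
    // unpack('N') reads the first 4 bytes, i.e. the first code point.
    return unpack('N', $bytes)[1];
}

// Pad $str with $piece up to $length characters (not bytes),
// at the end (default) or at the start. Direction is NOT considered here.
function sketch_mb_pad(string $str, int $length, string $piece = ' ', bool $atEnd = true): string
{
    $missing = $length - mb_strlen($str, 'UTF-8');
    if ($missing <= 0 || $piece === '') {
        return $str; // already long enough, or nothing to pad with
    }
    $repeat  = (int) ceil($missing / mb_strlen($piece, 'UTF-8'));
    $padding = mb_substr(str_repeat($piece, $repeat), 0, $missing, 'UTF-8');

    return $atEnd ? $str . $padding : $padding . $str;
}
```

For example, sketch_mb_ord("é") gives 233 and sketch_mb_pad("été", 6, "-=") gives "été-=-", whereas str_pad would count the five bytes of "été" and add only one "-". Start/end would still have to be mapped to the text direction, which this sketch does not attempt.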

Recently, I crafted a String object that supports most of the
propositions written here. Please see
https://github.com/hoaproject/String (file String.php). The code is very
simple (which suggests, I hope, that it would not take a lot of work to
implement it in PHP). Some interesting methods:
  • append;
  • prepend;
  • pad;
  • getIterator (equivalent of str_split);
  • \ArrayAccess methods (with support of out-of-bound indexes);
  • getByteAt (with support of out-of-bound indexes);
  • getCharDirection;
  • fromCode;
  • toCode.

Hope this helps :-),
Best regards.

-- 
Ivan Enderlin
Developer of Hoa
http://hoa.42/ or http://hoa-project.net/

PhD. student at DISC/Femto-ST (Vesontio) and INRIA (Cassis)
http://disc.univ-fcomte.fr/ and http://www.inria.fr/

Member of HTML and WebApps Working Group of W3C
http://w3.org/