Newsgroups: php.internals
Xref: news.php.net php.internals:75230
From: tjerk.meesters@gmail.com (Tjerk Meesters)
To: Kris Craig
Cc: Rowan Collins, PHP internals list
Subject: Re: [PHP-DEV] Re: ucwords() vs title case
Date: Thu, 3 Jul 2014 20:39:02 +0800
References: <679D0316-74C5-4AEC-9097-5E9793937469@ajf.me> <53B1590F.5070009@gmail.com>

On Wed, Jul 2, 2014 at 1:19 AM, Tjerk Meesters wrote:

> Hi Kris,
>
>
> On Tue, Jul 1, 2014 at 7:25 AM, Kris Craig wrote:
>
>> On Mon, Jun 30, 2014 at 5:33 AM, Rowan Collins wrote:
>>
>> > Andrea Faulds wrote (on 30/06/2014):
>> >
>> >> On 30 Jun 2014, at 12:54, Tjerk Meesters wrote:
>> >>
>> >> Hi internals,
>> >>>
>> >>> I came across this old bug: https://bugs.php.net/bug.php?id=34407
>> >>>
>> >>>
>> >>>
>> >>> Personally I find that the latter is too much of a departure from what we
>> >>> currently have; a compromise could be to treat punctuation as a word
>> >>> delimiter.
>> >>>
>> >> Hmm. Why not make it follow what \b in a regex would do, looking for
>> >> “word boundaries”?
>> >>
>> >
>> > Unfortunately, the cleverer you try to be, the more edge cases you find.
>> > For instance, using \b will capitalise the 's' after an apostrophe, e.g. in
>> > "Andrea'S Suggestion".
>> >
>> > The function we have in our code base at the moment looks like this:
>> >
>> > function smart_uc_words($string)
>> > {
>> >     $string = strtolower(trim($string));
>> >     // Capitalise any word char preceded by a non-word char other than
>> >     // an apostrophe
>> >     // (pattern truncated in the archive; reconstructed from the comment above)
>> >     $string = preg_replace_callback('/(?<![\w\'])(\w)/',
>> >         function($m){ return strtoupper($m[1]); }, $string);
>> >     // Capitalise any word char which comes between an apostrophe and
>> >     // another word char
>> >     $string = preg_replace_callback('/(?<=\')(\w)(?=\w)/',
>> >         function($m){ return strtoupper($m[1]); }, $string);
>> >
>> >     return $string;
>> > }
>> >
>>
>> What about leaving the default behavior as-is but adding an optional
>> argument to specify how to determine these boundaries? So if you did
>> something like ucwords( "hello, world!", '\b' ) or ucwords( "hello,
>> world!", array( ' ', '.', ... ) ), the user could control the behavior
>> while existing ucwords( $arg ) code would behave as it does now without
>> any BC break.
>>
>
> Yeah, that seems like an option, so basically how `trim()` works too;
> treat these characters as word boundaries (default is " \t\r\n").
>
> ucwords("hello (new) world", " ()");
>
> I'll prepare a PR for this and see how far that takes us :) let me know if
> you guys have any other ideas.
>

I've created a PR here: https://github.com/php/php-src/pull/706

If there are no objections I would like to commit this into 5.4 onwards
sometime next week.

Thanks.

>
>
>
>> --Kris
>>
>
>
>
> --
> --
> Tjerk
>

-- 
--
Tjerk
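
P.S. For anyone who wants to play with the delimiter idea from this thread without building the PR, here is a rough userland sketch. It is not the code from the pull request: the name ucwords_with_delimiters() is made up for illustration, and the default delimiter string is simply the " \t\r\n" mentioned in the thread. It works byte by byte, so it assumes ASCII letters.

```php
<?php
// Hypothetical userland sketch of the proposed behaviour; this is NOT
// the implementation in the PR, and the function name is an assumption.
function ucwords_with_delimiters($string, $delimiters = " \t\r\n")
{
    // The first character of the string always starts a word.
    $capitalise_next = true;
    $length = strlen($string);

    for ($i = 0; $i < $length; $i++) {
        if (strpos($delimiters, $string[$i]) !== false) {
            // A delimiter: whatever follows it starts a new word.
            $capitalise_next = true;
        } elseif ($capitalise_next) {
            // First non-delimiter byte after a boundary.
            $string[$i] = strtoupper($string[$i]);
            $capitalise_next = false;
        }
    }

    return $string;
}

// With the default delimiters, "(" is treated as part of the word.
echo ucwords_with_delimiters("hello (new) world"), "\n";        // Hello (new) World
// With the parentheses added as delimiters, "new" is capitalised too.
echo ucwords_with_delimiters("hello (new) world", " ()"), "\n"; // Hello (New) World
```

Taking a plain string of delimiter characters keeps the call shape close to `trim()`, and calling it with one argument behaves like the whitespace-only case, so existing calls would not change.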