Newsgroups: php.internals
Message-ID: <50994110.3000307@lsces.co.uk>
Date: Tue, 06 Nov 2012 16:55:44 +0000
To: PHP internals
References: <509006DA.6030200@sugarcrm.com> <5098CEFA.2080902@lsces.co.uk>
Subject: Re: [PHP-DEV] [VOTE] [RFC] ICU UConverter implementation for ext/intl
From: lester@lsces.co.uk (Lester Caine)

Sara Golemon wrote:
> As far as what's returned by convert() in the case of codepoint
> errors, I thought that was explained in the usage sections, but I'll
> put some content into the error handling section as well (and link to
> the existing intl functions related to error surfacing,

The problem with the wiki is then cross-linking to the rest of the
documentation. After 10 minutes I was still going around in circles. Once
you know WHERE to look, finding things is easier.

> As to determining how many invalid codepoints were encountered, no.
> There's not an API to surface that at ICU's level. I could add a
> counter to the callback and always enable it, even when not
> subclassing, but that's adding complexity where it in all likelihood
> won't be used for something which can be done in userspace, so I'm
> inclined against it.

I thought that was the case, but again cross-checking gets difficult. I'm
more or less UTF-8 based now content-wise, and on the whole can just bounce
UTF-8 back, but identifying the number of 'extended' characters would be
useful to pick up strings that would become unreadable. The odd missed
character can be acceptable, but whole words are a different matter.
Perhaps what I'm really looking for is a 'readability' scale rather than a
number of errors?
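
The 'readability scale' idea, and the point that counting can be done in userspace, could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `extendedCharRatio` and `looksReadableAsAscii` and the 0.2 threshold are hypothetical, and the code measures the share of non-ASCII characters in the UTF-8 input before any conversion rather than hooking into UConverter's callbacks.

```php
<?php
// Hypothetical helper, not part of ext/intl: returns the fraction of
// characters in a UTF-8 string that fall outside 7-bit ASCII.
function extendedCharRatio(string $utf8): float
{
    // With the /u modifier "." matches one code point; /s lets it match newlines.
    $total = preg_match_all('/./us', $utf8);
    if ($total === false) {
        // preg_match_all() returns false when the subject is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    if ($total === 0) {
        return 0.0; // empty string: nothing could be lost
    }
    // Count code points above U+007F, i.e. the 'extended' characters.
    $extended = preg_match_all('/[^\x00-\x7F]/u', $utf8);
    return $extended / $total;
}

// Hypothetical threshold check; 0.2 is an arbitrary starting point
// and would need tuning against real content.
function looksReadableAsAscii(string $utf8, float $maxRatio = 0.2): bool
{
    return extendedCharRatio($utf8) <= $maxRatio;
}
```

With this sketch, a string with the odd accented letter scores low, while a string written entirely in a non-Latin script scores 1.0, which is the case where a conversion to a narrow charset would lose whole words.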

> -Sara
>
> On Tue, Nov 6, 2012 at 12:48 AM, Lester Caine wrote:
>> Sara Golemon wrote:
>>>
>>> There doesn't seem to be any further discussion. I figured if someone
>>> has more objections during the week this stays open for voting they
>>> can be addressed, and if need be the voting can be reset and resumed.
>>
>>
>> This may already be covered elsewhere, but is there a return from the class
>> that indicates easily the number of codepoint errors?
>> It would be helpful if the likes of the error handling single line linked
>> through to the relevant material? I've got back to the intl documentation,
>> but it's not clear there what is returned if a codepoint error is detected.

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk