Newsgroups: php.internals
Date: Wed, 6 Feb 2008 12:01:11 +0800
From: xuefer@gmail.com (Xuefer)
To: "Tomas Kuliavas"
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] [PATCH] Bug #43896 htmlspecialchars returns empty string on invalid unicode sequence

> iconv() will stop on conversion error and return partial string or empty
> string. It will require even more complex code than 5.2.5
> htmlspecialchars() does. With htmlspecialchars you check for empty string
> before and after the call. With iconv you check for php errors during
> iconv call.

The current PHP wrapper around iconv() is a pain in the @$$: if the input contains an invalid char, I can only get a truncated string back, or I have to rely on the glibc "//IGNORE" flag. "string".encode() in Python is much better: you can pass 'ignore', 'replace', 'xmlcharrefreplace' or 'backslashreplace' as the 2nd param of encode().

I agree that ignoring invalid chars in all cases is not good, but truncating in all cases isn't either.
There are some cases where it is acceptable, like: user posts -> server accepts it, ignoring invalid chars -> user has a chance to review his text later. Too bad that the conversion is implicit, like __set/__get, so you can't add one more optional parameter. Hope there's some nice way to get this problem solved.
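
For reference, a rough sketch of the Python error handlers mentioned above, written against current Python 3 (an assumption; the sample text, the invalid byte and the helper name `encode_ascii` are made up for illustration). The second half shows the decoding side, which is closer to the "user posts, server ignores invalid chars" case:

```python
# Illustrative only: built-in error handlers of str.encode() / bytes.decode().
text = "caf\u00e9 \u4e2d"  # sample input: two chars that ASCII cannot represent


def encode_ascii(value, handler):
    # hypothetical helper; the handler name is the 2nd param of encode()
    return value.encode("ascii", handler)


for handler in ("ignore", "replace", "xmlcharrefreplace", "backslashreplace"):
    print(handler, encode_ascii(text, handler))

# Decoding side: user-submitted bytes with one invalid UTF-8 byte (0xff).
raw = b"hello \xff world"
dropped = raw.decode("utf-8", errors="ignore")    # bad byte is skipped
marked = raw.decode("utf-8", errors="replace")    # bad byte becomes U+FFFD
print(repr(dropped), repr(marked))
```

None of the handlers truncates the string; the caller picks per call whether bad input is dropped, marked, or escaped.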