Newsgroups: php.internals
From: arnaud.lb@gmail.com ("Arnaud.lb")
To: Stanislav Malyshev
Cc: internals Mailing List
Date: Sat, 26 Jan 2008 12:47:53 +0100
Subject: Re: [PHP-DEV] [PATCH] Bug #43896 htmlspecialchars returns empty string on invalid unicode sequence
Message-ID: <200801261247.53115.arnaud.lb@gmail.com>
In-Reply-To: <479A613C.8030604@zend.com>
References: <200801241426.39756.arnaud.lb@gmail.com> <479A613C.8030604@zend.com>

> > Should really theses functions discard the whole string for a single
> > incomplete sequence ?
>
> I think since it is not possible to recover true content of the string,
> it is ok to return failure value. Cutting it in random places or
> ignoring problems doesn't seem a good idea - it might lead to all kinds
> of nasty things, such as security filtering checking one data and
> database getting entirely different data.

I don't think so. htmlspecialchars' job is to replace the characters that
the browser may interpret as HTML special characters. Its job is not to
validate a string, nor to check whether it will be passed correctly to a DB.
With my patch, htmlspecialchars does exactly that.

There are many ways for an invalid Unicode sequence to end up in user input.
In normal situations, text typed into a form element will be sent in the
correct encoding by the browser, but what about file uploads? What if the
browser itself sends invalid sequences (e.g. copy/paste from Word documents
into a form and/or WYSIWYG-enabled elements using IE)?

Bugs 43896, 43294 and 43549 also report these problems. This new
htmlspecialchars version will be a nightmare for many PHP users if it is
left as is.
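
To make the two behaviours under discussion concrete, here is a minimal sketch in Python. It is only an illustration, not PHP's implementation and not the patch itself: the function names escape_strict and escape_lenient are hypothetical, and the lenient variant is just one possible reading of "escape the special characters and leave everything else alone". It assumes UTF-8 input, where ASCII bytes such as < never occur inside a multi-byte sequence, so escaping byte by byte cannot miss a special character.

```python
# Illustration only: hypothetical helpers, not PHP's htmlspecialchars.

# The four characters escaped here (assumed set: & < > ").
_ESCAPES = {
    ord("&"): b"&amp;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
    ord('"'): b"&quot;",
}


def escape_lenient(raw: bytes) -> bytes:
    """Escape HTML special characters; pass every other byte through untouched."""
    return b"".join(_ESCAPES.get(byte, bytes([byte])) for byte in raw)


def escape_strict(raw: bytes) -> bytes:
    """Return an empty result if the input is not valid UTF-8, else escape it."""
    try:
        raw.decode("utf-8")  # validation only; the decoded text is not used
    except UnicodeDecodeError:
        return b""
    return escape_lenient(raw)


if __name__ == "__main__":
    broken = b"caf\xc3 <b>"  # 0xC3 starts a 2-byte sequence that is never completed
    print(escape_strict(broken))   # whole string is discarded
    print(escape_lenient(broken))  # special characters escaped, bad byte kept
```

With the strict variant a single incomplete sequence discards the whole string, which is the behaviour the bug reports complain about. The lenient variant still escapes every special character, so the output stays safe to embed in HTML even though the stray byte is passed through.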