From: lester@lsces.co.uk (Lester Caine)
To: internals@lists.php.net
Date: Sat, 18 Jan 2014 13:41:54 +0000
Subject: Re: [PHP-DEV] [RFC] Multibyte char handling
Message-ID: <52DA84A2.2050305@lsces.co.uk>
Newsgroups: php.internals

Yasuo Ohgaki wrote:
> addslashes() could be vulnerable via char encoding based attacks.
> It is needed to decide what counter measure we adopt.
> This is RFC for this issue.
>
> https://wiki.php.net/multibyte_char_handling
>
> Please comment.

Multibyte characters are still a contentious area, and the current compromise of supporting multibyte content while remaining essentially 'single byte' in the programming structure has been the solution adopted in a few projects. Firebird is once again debating the same point that they and PHP last discussed 10 years ago. It proved too difficult then, so PHP6 floundered and Firebird kept essentially single byte strings in the metadata.

10 years on, isn't it time to re-open the debate on making the core Unicode, since 32 bit processors are more likely to be the norm these days? Certainly, if everything internal is UTF8, then all of the encoding problems are moved to the client interface?

(P.S. The RFC needs a little work via the spell checker, and the link above is wrong.)

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk