Newsgroups: php.internals
Date: Sun, 26 Aug 2012 07:44:20 +0900
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
To: Ángel González
Cc: Rasmus Lerdorf, PHP internals
Subject: Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

Hi,

2012/8/26 Ángel González:
> Even worse, HTML5 doesn't seem to have any provision for that, as it works
> with characters. A user agent would have to protect himself from this by
> making those kind of utf-8 characters a hard error instead of trying to
> recover from it.

Right. I would like to have this behavior. However, who is going to use a
browser that raises a fatal error for bad encoding while others just render
the page as safely as possible?

Sending validly encoded output is the programmer's task. Setting and
enforcing default_charset is more secure and has been best practice since
2000. Why don't we set a default? I think UTF-8 is good for most users.

BTW, Ruby on Rails depends on Ruby's exceptions for badly encoded output.
We could do the same thing with an output buffer and mb_check_encoding(),
but the programmer should validate inputs and ensure valid encoding in the
first place. Output-time validation should only be a fail-safe, IMHO.

If there aren't any better ideas, I'm willing to write a patch for this,
i.e. set the default to default_charset=UTF-8, create system-wide
input/output/internal encoding settings, and use them as the defaults.

Any ideas?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
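
For illustration only, here is a minimal sketch of the output-buffer plus mb_check_encoding() idea mentioned above. It is not the proposed patch. The function name failsafe_utf8_output is hypothetical, and replacing invalid bytes with U+FFFD (rather than raising an error, as Rails does) is an assumption about what "fail-safe" could mean here.

```php
<?php
// Hypothetical helper, not part of PHP: a fail-safe output filter.
function failsafe_utf8_output($buffer)
{
    // Valid UTF-8 passes through untouched.
    if (mb_check_encoding($buffer, 'UTF-8')) {
        return $buffer;
    }
    // Assumption: on invalid input, substitute U+FFFD for the bad bytes
    // instead of sending them to the browser as-is.
    $previous = mb_substitute_character();
    mb_substitute_character(0xFFFD);
    $clean = mb_convert_encoding($buffer, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);
    return $clean;
}

// Declare the charset for this request, then filter everything that is output.
ini_set('default_charset', 'UTF-8');
ob_start('failsafe_utf8_output');
```

This only catches mistakes at the last moment. Validating input and passing an explicit encoding to htmlspecialchars() remain the first line of defence.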