Newsgroups: php.internals
Message-ID: <50365704.6020504@lerdorf.com>
Date: Thu, 23 Aug 2012 09:15:00 -0700
To: Andrew Faulds
CC: PHP internals
References: <5036551E.1030804@lerdorf.com> <503655C8.9070406@ajf.me>
In-Reply-To: <503655C8.9070406@ajf.me>
Subject: Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities
From: rasmus@lerdorf.com (Rasmus Lerdorf)

On 08/23/2012 09:09 AM, Andrew Faulds wrote:
> Personally, I think you should have just two encodings: page_encoding
> and internal_encoding. The former is for form input and page output
> (could be latin-1, for instance), and internal_encoding is the internal
> representation (default to utf-8 - you can deal with all of, say,
> latin-1, as well as unicode entities). Input and output, on the web at
> least, are almost always going to match.

No, we need 3. The internal/script encoding doesn't have to be the same
as the input encoding. It isn't common in the Western world, but
elsewhere people do write their scripts in their local encoding which
may very well be different from their input and/or output encodings.

-Rasmus
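
Editorial illustration (not part of the original message): the case for three
separate encodings can be sketched in a few lines. This is a minimal sketch in
Python rather than PHP, and every name in it is hypothetical. It assumes a
script whose string literals are stored as EUC-JP, form input arriving as
Shift_JIS, and a page served as UTF-8; these particular encodings are chosen
only to make the three roles visibly different.

```python
# Hypothetical encodings, chosen only to show three distinct roles.
INPUT_ENCODING = "shift_jis"   # bytes submitted by the browser
SCRIPT_ENCODING = "euc_jp"     # bytes of string literals in the script file
OUTPUT_ENCODING = "utf-8"      # bytes sent back in the response


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


def build_response(script_literal: bytes, form_value: bytes) -> bytes:
    """Bring both byte strings into the output encoding before joining them."""
    greeting = transcode(script_literal, SCRIPT_ENCODING, OUTPUT_ENCODING)
    name = transcode(form_value, INPUT_ENCODING, OUTPUT_ENCODING)
    return greeting + name


if __name__ == "__main__":
    literal = "こんにちは".encode(SCRIPT_ENCODING)  # as stored in the script
    submitted = "太郎".encode(INPUT_ENCODING)        # as received from a form
    print(build_response(literal, submitted).decode(OUTPUT_ENCODING))
```

With only two settings, one of these three byte streams would have to be
assumed to share an encoding with another, and the script literal or the form
value would be converted with the wrong source encoding.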