Newsgroups: php.internals
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
Date: Tue, 26 Apr 2016 18:06:41 +0900
To: Sara Golemon
Cc: PHP internals
Subject: Re: [PHP-DEV] [RFC] IntlCharsetDetector

Hi Sara,

On Tue, Apr 12, 2016 at 7:54 AM, Sara Golemon wrote:
> With a light push from Stas, I've decided to go ahead and put up
> IntlCharsetDetector for discussion.
> https://wiki.php.net/rfc/intl.charset-detector
>
> I'm still not personally convinced this API is trustworthy enough, but
> it's worth a formal discussion period at least.

Things may have changed, but as you've mentioned, encoding detection is
unreliable, and ICU's detection is poor compared to mbstring's, at least
for Japanese encodings.

Developers should not rely on an encoding detector; they should validate
the encoding instead. The problem is that there are cases where
developers cannot determine which encoding was used...

If we are going to have this API, it would be better to validate the
string against the detected encoding by default, with an option to
disable that validation. Developers occasionally have to deal with
broken string data.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net