From: pierre.php@gmail.com (Pierre Joye)
To: Chris Wright
Cc: PHP internals, Kévin Dunglas
Date: Fri, 19 Sep 2014 11:58:28 +0200
Subject: Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
Newsgroups: php.internals

Hi,

On Sep 19, 2014 4:03 PM, "Chris Wright" wrote:
>
> Kévin
>
> On 18 September 2014 21:26, Kévin Dunglas wrote:
> > Hello,
> >
> > I'm working on enhancing the FILTER_VALIDATE_URL filter
> > (https://github.com/php/php-src/pull/826).
> > The current implementation does not support validation of internationalized
> > domain names (i.e: http://www.académie-française.fr/).
> >
> > Support of IDN validation can be easily added using ICU's uidna_toASCII()
> > function.
> >
> > Is it acceptable to add a dependency to ICU for ext/filter?
> > Another option is to add a HAVE_ICU constant in main/php_config.h and to
> > validate IDN only if ICU is present.
> >
> > What strategy is preferred?
>
> I've done some work around this area previously, and all I will say
> is: be careful with what you do with this from a userland PoV.
>
> PHP does not natively support IDN in stream open routines or SSL
> verification routines. It will never support these things without at
> least one of:
> - a core dependency on ICU, libidn or similar
> - moving streams into an extension so a dependency can be introduced
>   there (probably not sanely possible)
> - an in-house NAMEPREP implementation (this is the hard part of IDN,
>   punycode itself is pretty trivial to implement once you have a
>   canonical set of codepoints)
>
> These things can be implemented with *a lot* of boilerplate in
> userland when you have ext/intl, but it's not pretty.
> libcurl *can* support IDN if it was built against libidn, I'm not sure
> if this is currently the case in common distributions or not. Since one
> almost never just validates a URL string, it's usually a precursor to
> attempting to open it, this could lead to some pretty hefty wtfs.
>
> In short, while I'm generally for ext/filter being able to handle IDN,
> I *do not* believe it should do it implicitly, it should require an
> explicit flag, because it will break *a lot* of code if IDN is
> suddenly treated as valid where it previously wasn't.

I am really not sure about that, especially the enabling-by-default part.
The doc is pretty clear about what this filter supports, and allowing IDN
may break a lot of code out there.

From an implementation point of view, we may not need ICU to support IDN.
Windows does not use it, and there are license-friendly decoder
implementations too.

Cheers,
Pierre
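
To illustrate the point quoted above that punycode itself is the easy part
once the code points are canonical, here is a minimal Python sketch of the
RFC 3492 encoder. It is not taken from php-src, ICU, or the Windows API. The
names punycode_encode and to_ascii_label are made up for this example, and
the sketch assumes the label has already been through NAMEPREP (it does no
case folding or normalization).

```python
# Punycode parameters fixed by RFC 3492, section 5.
BASE, TMIN, TMAX = 36, 1, 26
SKEW, DAMP = 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128


def _digit(d: int) -> str:
    """Map a digit value 0..35 to 'a'..'z' then '0'..'9'."""
    return chr(ord("a") + d) if d < 26 else chr(ord("0") + d - 26)


def _adapt(delta: int, numpoints: int, first: bool) -> int:
    """Bias adaptation function from RFC 3492, section 6.1."""
    delta = delta // DAMP if first else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def punycode_encode(label: str) -> str:
    """Encode one label (hypothetical helper; input assumed normalized)."""
    # Basic (ASCII) code points are copied through first, in order.
    basic = [c for c in label if ord(c) < INITIAL_N]
    out = basic[:]
    handled = len(basic)
    if basic:
        out.append("-")  # delimiter between basic and encoded parts

    n, delta, bias = INITIAL_N, 0, INITIAL_BIAS
    first = True
    while handled < len(label):
        # Next smallest code point that has not been encoded yet.
        m = min(ord(c) for c in label if ord(c) >= n)
        delta += (m - n) * (handled + 1)
        n = m
        for c in label:
            cp = ord(c)
            if cp < n:
                delta += 1
            elif cp == n:
                # Emit delta as a generalized variable-length integer.
                q, k = delta, BASE
                while True:
                    t = min(max(k - bias, TMIN), TMAX)
                    if q < t:
                        break
                    out.append(_digit(t + (q - t) % (BASE - t)))
                    q = (q - t) // (BASE - t)
                    k += BASE
                out.append(_digit(q))
                bias = _adapt(delta, handled + 1, first)
                first = False
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return "".join(out)


def to_ascii_label(label: str) -> str:
    """Add the ACE prefix only when the label has non-ASCII code points."""
    if any(ord(c) > 127 for c in label):
        return "xn--" + punycode_encode(label)
    return label
```

For the hostname in the original proposal,
to_ascii_label("académie-française") gives "xn--acadmie-franaise-npb1a".
The normalization step that the sketch skips is the part that would need
ICU, libidn, or an in-house implementation.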