Newsgroups: php.internals
From: datibbaw@php.net (Tjerk Anne Meesters)
Date: Fri, 28 Jun 2013 10:54:41 +0800
Subject: Re: [PHP-DEV] ENT_ALL or similar option for htmlspecialchars[_decode]?
To: Kris Craig
Cc: Yasuo Ohgaki, PHP internals list

On Thu, Jun 27, 2013 at 4:42 PM, Kris Craig wrote:

> On Thu, Jun 27, 2013 at 12:03 AM, Yasuo Ohgaki wrote:
>
> > 2013/6/27 Kris Craig
> >
> >> I just noticed that htmlspecialchars_decode doesn't convert entities
> >> like &#10; and &#13;.
> >>
> >
> > I think htmlspecialchars_decode() only decodes
> >
> > ext/standard/html_tables.h
> > static const entity_stage3_row stage3_table_be_apos_00000[] = {
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"quot", 4} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"amp", 3} } }, {0, { {"apos", 4} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> > {0, { {"lt", 2} } }, {0, { {NULL, 0} } }, {0, { {"gt", 2} } }, {0, { {NULL, 0} } },
> > };
> >
> > IIRC
> > I may be wrong.
> >
> >
> >> Is there a bitmask I'm missing or are those simply not
> >> supported right now?  If the latter, any thoughts on adding something
> >> along the lines of ENT_ALL to convert all valid entities from/to their
> >> respective characters?
> >>
> >
> > What you are looking for is html_entity_decode(), I think.
> >
> > $ php -n -r 'var_dump(html_entity_decode(" ="));'
> > string(2) "
> > ="
> >
>
> Yeah I tried html_entity_decode already, but it just returned NULL.  On the
> same input string, htmlspecialchars_decode returned the input string but
> with *some* special characters decoded; 10 and 13 ("\r\n", I think) were
> left in their encoded state.  I'm not sure why there wouldn't be an option
> to decode all html special characters.
>

The html_entity_decode() function shouldn't return NULL, and even an empty
string would sound like a bug. Could you file a report for this and provide
reproducible test code?

> --Kris

-- 
--
Tjerk