Newsgroups: php.internals
Date: Thu, 27 Jun 2013 21:20:52 -0700
From: kris.craig@gmail.com (Kris Craig)
To: Tjerk Anne Meesters
Cc: Yasuo Ohgaki, PHP internals list
Subject: Re: [PHP-DEV] ENT_ALL or similar option for htmlspecialchars[_decode]?

On Thu, Jun 27, 2013 at 7:54 PM, Tjerk Anne Meesters wrote:

> On Thu, Jun 27, 2013 at 4:42 PM, Kris Craig wrote:
>
>> On Thu, Jun 27, 2013 at 12:03 AM, Yasuo Ohgaki wrote:
>>
>> > 2013/6/27 Kris Craig
>> >
>> >> I just noticed that htmlspecialchars_decode doesn't convert entities
>> >> like and .
>> >
>> > I think htmlspecialchars_decode() only decodes
>> >
>> > ext/standard/html_tables.h
>> > static const entity_stage3_row stage3_table_be_apos_00000[] = {
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"quot", 4} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"amp", 3} } }, {0, { {"apos", 4} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
>> > {0, { {"lt", 2} } }, {0, { {NULL, 0} } }, {0, { {"gt", 2} } }, {0, { {NULL, 0} } },
>> > };
>> >
>> > IIRC
>> > I may be wrong.
>> >
>> >> Is there a bitmask I'm missing or are those simply not supported
>> >> right now? If the latter, any thoughts on adding something along
>> >> the lines of ENT_ALL to convert all valid entities from/to their
>> >> respective characters?
>> >
>> > What you are looking for is html_entity_decode(), I think.
>> >
>> > $ php -n -r 'var_dump(html_entity_decode(" ="));'
>> > string(2) "
>> > ="
>> >
>>
>> Yeah I tried html_entity_decode already, but it just returned NULL. On
>> the same input string, htmlspecialchars_decode returned the input string
>> but with *some* special characters decoded; 10 and 13 ("\r\n", I think)
>> were left in their encoded state. I'm not sure why there wouldn't be an
>> option to decode all html special characters.
>
> The html_entity_decode() function shouldn't return NULL, but even an empty
> string sounds like a bug, could you file a report for this and provide a
> reproducible test code?

Yeah I admit it could be an empty string as opposed to NULL. I wasn't using
a var_dump() so I just assumed. I'll take another look at it and get those
details.

--Kris
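
For anyone following the thread, here is a minimal sketch of the difference being discussed. The input string is hypothetical (the exact entities from the original report are not preserved above), and the flags and charset passed to html_entity_decode() are assumptions picked for illustration, not a recommendation.

```php
<?php
// Hypothetical sample input: markup-significant entities plus numeric
// character references for CR (13) and LF (10). The real input from the
// report is not known.
$input = '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;&#13;&#10;';

// htmlspecialchars_decode() only handles the characters that
// htmlspecialchars() itself encodes (&, <, >, and quotes depending on
// flags), so the numeric references for CR and LF stay encoded.
$special = htmlspecialchars_decode($input);

// html_entity_decode() resolves the numeric references as well.
// ENT_QUOTES | ENT_HTML401 and 'UTF-8' are assumptions for this sketch.
$all = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// var_dump() makes NULL, an empty string, and control characters
// distinguishable, which a plain echo does not.
var_dump($special);
var_dump($all);
```

With default settings, the first dump should still contain the literal `&#13;&#10;`, while the second should end in an actual CR/LF pair. If html_entity_decode() really does come back empty or NULL on some input, a var_dump() of both the input and the result like this would be a reasonable starting point for a bug report.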