Newsgroups: php.internals
Date: Thu, 27 Jun 2013 01:42:26 -0700
To: Yasuo Ohgaki
Cc: PHP internals list
Subject: Re: [PHP-DEV] ENT_ALL or similar option for htmlspecialchars[_decode]?
From: kris.craig@gmail.com (Kris Craig)

On Thu, Jun 27, 2013 at 12:03 AM, Yasuo Ohgaki wrote:

> 2013/6/27 Kris Craig
>
>> I just noticed that htmlspecialchars_decode doesn't convert entities like
>> and .
>
> I think htmlspecialchars_decode() only decodes
>
> ext/standard/html_tables.h
> static const entity_stage3_row stage3_table_be_apos_00000[] = {
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"quot", 4} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {"amp", 3} } }, {0, { {"apos", 4} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } }, {0, { {NULL, 0} } },
> {0, { {"lt", 2} } }, {0, { {NULL, 0} } }, {0, { {"gt", 2} } }, {0, { {NULL, 0} } },
> };
>
> IIRC
> I may be wrong.
>
>> Is there a bitmask I'm missing or are those simply not
>> supported right now? If the latter, any thoughts on adding something along
>> the lines of ENT_ALL to convert all valid entities from/to their respective
>> characters?
>
> What you are looking for is html_entity_decode(), I think.
>
> $ php -n -r 'var_dump(html_entity_decode(" ="));'
> string(2) "
> ="

Yeah I tried html_entity_decode already, but it just returned NULL. On the
same input string, htmlspecialchars_decode returned the input string but with
*some* special characters decoded; 10 and 13 ("\r\n", I think) were left in
their encoded state. I'm not sure why there wouldn't be an option to decode
all html special characters.

--Kris
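
For reference, the difference between the two functions discussed above can be sketched as follows. This is only a minimal sketch: it assumes the entities in question are the numeric references &#13; and &#10; (suggested by the "10 and 13" mentioned above), and the input string and variable names are illustrative, not taken from the thread. It does not try to reproduce the NULL return reported above.

```php
<?php
// Hypothetical input: one "special" entity (&lt;) followed by the numeric
// references for CR (13) and LF (10). Not the string used in the thread.
$input = "a &lt; b&#13;&#10;";

// htmlspecialchars_decode() only decodes &amp;, &quot;, &lt; and &gt;
// (plus &#039; / &apos; when ENT_QUOTES is set), so &#13;&#10; stay encoded.
$special = htmlspecialchars_decode($input, ENT_QUOTES);

// html_entity_decode() also decodes other named and numeric references.
// ENT_HTML401 and UTF-8 are passed explicitly so the result does not
// depend on version-specific defaults.
$all = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($special === "a < b&#13;&#10;"); // bool(true)
var_dump($all === "a < b\r\n");           // bool(true)
```

Under these assumptions, html_entity_decode() already behaves like the proposed ENT_ALL option for decoding, which is consistent with the suggestion quoted above.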