Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:104774 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 83290 invoked from network); 17 Mar 2019 16:43:00 -0000 Received: from unknown (HELO mail-lj1-f193.google.com) (209.85.208.193) by pb1.pair.com with SMTP; 17 Mar 2019 16:43:00 -0000 Received: by mail-lj1-f193.google.com with SMTP id l7so1862084ljg.6 for ; Sun, 17 Mar 2019 06:34:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=beberlei-de.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=KHtWv5VIPvvr1I3Ok0T6wLDZCqOAT4g0CUVsqdSSUf8=; b=CBqVhh2dBehdprIEy2IXXPaZtcyLhwbpV3M6S3SgFK0mhKP2u+jIraRoD2h7DE9C+6 5S/D+SC+cx4YR6kW3/x4z1No+vYhC7y2SUxkj8b5GwsVtMxEu58KpdFqb7bnU0p4na5u q5fS+cnuj5/bRBqdS4xhXUcaiIzrN3ydLyLbF+EIEn4swALnkA0Qa6fg8r2fOMuiWB9V 22JrmJSI/CBADNJ8FXQ54UX+/4QOXVvRUwt3BMMUQQOUz8tcGbXXZ44lz6VAtmJL5lAb APm4SYH/pAC1fqOfoDnncGECdxPHuHimxu+1mskys8TxUBCPH5W7YPryKLcy3Ys9dFh5 5l3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=KHtWv5VIPvvr1I3Ok0T6wLDZCqOAT4g0CUVsqdSSUf8=; b=or0reOX2u1ttep0616tw5g+f16gckixf8Pk35vAZTb6J3LuSrjPu2r3v9N4c/y/h4u ArfRU38s7TWQL5mPAJumOv3rROvwHhKH0nPP9P/nTlFi37j7vBeMB+k3ipKaUtMnp5OU SwkdEGlvFGjZ7af65RU6y67WvPTK3Zoe1Gi9RL+csli0EHuniL6sgrcJ+cQWrBdUb/iU OWJXF1C6VUQSXvK6erQWO0jkVdChIvvBKLcfy8Zvn7itvxZKLTFIMQz8jrOa9vbsR5aY Her8vyAcdk6RQFLdVUAD7n0NI60hFLSU+VVXmkF9ufeuctriKuF9jA0g7wLr8m3BaS8I I8Iw== X-Gm-Message-State: APjAAAWsufobM53Tfsb8Ci3loeqrU/jGxNEKri74V/10hCSoiGw3BKxq aZmpQvwhpBDeQNwzz0hff9dtYcaeacMobwEVzzaNHg== X-Google-Smtp-Source: APXvYqy2X496SUOaRS+zYHJgrxGKvGf7OUoDuzq+PyN2ad/++UdtanoDzDXHSQxxBtIZKUlg60bXTZaOcY2qRdwuisc= X-Received: by 2002:a2e:9c42:: with SMTP id t2mr7791596ljj.149.1552829648415; Sun, 17 Mar 2019 06:34:08 -0700 (PDT) MIME-Version: 1.0 References: <08e09ea4-20d0-277d-8919-4e3d4387699c@ctindustries.net> In-Reply-To: <08e09ea4-20d0-277d-8919-4e3d4387699c@ctindustries.net> Date: Sun, 17 Mar 2019 14:33:56 +0100 Message-ID: To: Rob Richards Cc: Pierre Joye , PHP internals Content-Type: multipart/alternative; boundary="00000000000097450a05844a530f" Subject: Re: [PHP-DEV] On fixing DOMNameSpaceNode and DOM NS API Inconsistency Problems From: kontakt@beberlei.de (Benjamin Eberlei) --00000000000097450a05844a530f Content-Type: text/plain; charset="UTF-8" On Wed, Mar 13, 2019 at 4:36 PM Rob Richards wrote: > There were specific reasons for it tho I can't recall all the details off > the top of my head. Part of it related to the underlying structures of > attributes and namespace nodes not being the same and cannot be user > interchangeable in the code, unless only inspected at the nodeptr level. > Trying to manipulate namespace nodes as attributes will cause memory > corruption so they need to be handled differently. I don't remember if this > was the only reason or there was more for the special class but while I > agree with basic DOM usage no one should ever need to be dealing directly > with namespace declarations but once you get into more complex situation, > especially that involve xpath, xslt, canonicalization and even > interoperability this starts coming into play and is quite important to > deal with namespace declarations directly. > That is a very helpful insight, thank you Rob. From the few hours i poked in the code base memory management seems to be tricky abstracing libxml behind this powerful DOM API. Also some of the examples in previous messages are way overly simplistic so > the conclusions are not 100% accurate. i.e. removeAttributeNS. I am > assuming that its just due to overly aggressive namespace reconciliation > happening and removing the declaration since its no longer in use. With a > nested document which uses the namespace in child nodes I would expect the > namespace declaration to remain, tho possible it may be moved to the first > child using it (been a while since I looked at that code). > I'll take a look again how that extra attr gets removed. However it doesn't change the fact that semantically DOMElement::removeAttributeNS(" http://www.w3.org/2000/xmlns/", "foo") doesnt work. > > My recommendation would be to tread lightly here, look at every method any > changes to namespace declaration nodes would touch and make sure they are > all handle properly otherwise you run the risk of introducing memory > corruption issues which is far worse than some inconsistency for supposedly > unused and unneeded functionality ;) > > BTW I have recently started looking at the libxml code base again and > considering getting more involved here again, assuming the toxicity level > has gone done from the past. Not sure this is the first thing I would look > at but feel free to shoot over any questions. > This sounds awesome :-) I can't comment on toxicity levels of the past as I only follow this list for about 3-4 years now, but it has gone down in those years I feel. It is still a draft but Thomas and I have started working on an RFC and code to update ext/dom to cover the latest standard release: https://wiki.php.net/rfc/dom_living_standard_api - we plan on proposing that soon, maybe you have some feedback. > Rob > > On 3/12/19 1:50 AM, Pierre Joye wrote: > > Hi, > > Only adding Rob just in case :) > > best, > > On Wed, Mar 13, 2019, 5:54 AM Benjamin Eberlei > wrote: > >> Hi everyone, >> >> While looking for things to work on in php-src my friend Thomas pointed me >> to a peculiar special case in ext/dom that leads to massive inconsistency >> problems in the API. >> >> There is an undocumented class DOMNameSpaceNode that gets returned from >> DOMElement::getAttributeNode(NS) if you select an attribute that >> represents >> a namespace (xmlns). This special case is intentionally handled in the >> code, contrary to Pythons DOM Extension which doesn't do this and contrary >> to the DOM Specification, which does not have a special DOMNameSpaceNode. >> Its all DOMAttr. >> >> Problematically DOMNameSpaceNode doesn't extend from DOMAttr, you cannot >> pass this class to DOMElement::removeAttributeNode(DOMAttr $attr): >> >> Fatal error: Uncaught TypeError: Argument 1 passed to >> DOMElement::removeAttributeNode() must be an instance of DOMAttr, instance >> of DOMNameSpaceNode given >> >> Code example here: https://3v4l.org/jkC5s >> >> In addition the DOMNameSpaceNode renames all properties compared to >> DOMAttr, clearly violating the interface documented here >> http://php.net/manual/de/domelement.getattributenode.php >> >> Two potential fixes come to mind: >> >> 1. Have DOMNameSpaceNode extend DOMAttr - maybe this was the originally >> intended behavior, becuase the properties are all named differently? >> 2. Have DOMElement::removeAttributeNode accept also a DOMNameSpaceNode >> (3.) Remove the non standard DOMNameSpaceNode class and use a DOMAttr >> instead >> >> I think approach #1 is the right one from a BC perspective and also fixing >> the DOM API to be more consistent with the standard. >> >> But I am also not opposed to #3 in the longer term, looking at Github >> DOMNamespaceNode in the open source world is only used for one apc test, >> in >> PHP Compat and when dumping it in Symfony Var Dumper. I imagine its more >> deeply used in closed source code working deeply with XML. >> >> The second problem that this inconsistency creates is >> DOMElement::removeAttributeNS($namespaceUri, $localName). When you want to >> use it to remove a namespace attribute (xmlns). By default the xmlns >> namespace has the uri http://www.w3.org/2000/xmlns/. Hence removing >> xmlns:$name namespace would expected to be: >> >> DOMElement::removeAttributeNS("http://www.w3.org/2000/xmlns/", "foo") // >> removes xmlns:foo >> >> But thats not how it works, there is special code implemented that >> requires >> you to pick the URI from xmlns:foo value: >> >> >> >> DOMElement::removeAttributeNS("urn:foo", "foo"); >> >> But if your node has an attribute in this namespace with the same name, it >> gets removed as well: >> >> >> >> Would incorrectly become this, removing 2 attributes: >> >> >> >> This is a bug that cannot be fixed in a BC way. I have no clue how to fix >> this completly broken behavior in a good way. I propose to break BC here >> and requiring "http://www.w3.org/2000/xmlns/" as namespaceURI from PHP >> 8.0 >> to delete a namespace (xmlns) attribute. >> >> Should I put both issues into an RFC or are these rather bugs that can be >> fixed without an RFC? >> >> greetings >> Benjamin >> > > --00000000000097450a05844a530f--