Newsgroups: php.internals
Xref: news.php.net php.internals:104777
References: <08e09ea4-20d0-277d-8919-4e3d4387699c@ctindustries.net>
Date: Sun, 17 Mar 2019 18:35:57 +0100
To: "C. Scott Ananian"
Cc: Rob Richards, Pierre Joye, PHP internals
Subject: Re: [PHP-DEV] On fixing DOMNameSpaceNode and DOM NS API Inconsistency Problems
From: kontakt@beberlei.de (Benjamin Eberlei)

On Sun, Mar 17, 2019 at 2:52 PM C. Scott Ananian wrote:

> On Sun, Mar 17, 2019, 9:34 AM Benjamin Eberlei wrote:
>
>> It is still a draft but Thomas and I have started working on an RFC and
>> code to update ext/dom to cover the latest standard release:
>> https://wiki.php.net/rfc/dom_living_standard_api - we plan on proposing
>> that soon, maybe you have some feedback.
>
> Updating the DOM extension would be something the Wikimedia Foundation
> would very much like to see happen. It's more complicated than just adding
> some new methods, though: there are significant spec-compliance issues
> with the current code and performance problems too. We've been porting
> code from JS to PHP which (in the JS version) used a good spec-compliant
> DOM implementation, and have been keeping a list of all the crazy bugs and
> workarounds that have been necessary.
>
> Start from the basic fact that the modern DOM requires Node#nodeName to be
> uppercase for HTML elements, and the current code uses all lowercase. It's
> hard to see how that could be addressed without breaking backward compat.
>
> Here are our notes/discussions/etc:
> https://phabricator.wikimedia.org/T215000

That is a really good resource of things we should look at :-) But it is
not fully true that this JavaScript library is DOM spec compliant. It
provides extra features that https://dom.spec.whatwg.org/ doesn't have,
such as "body", "title", "head" properties on DOMDocument, or
innerHtml/outerHtml attributes on elements.
I didn't find a spec where these were defined; the HTML spec also doesn't
mention them. The DOM spec also doesn't define HtmlElement; that comes
from the HTML spec.

A way forward without BC breaks, such as uppercase nodeName or
getAttribute returning NULL instead of an empty string as newer
implementations do, could be to let users specify which implementation
they want the DOMDocument to follow.

> https://mediawiki.org/wiki/Parsoid/PHP/Help_wanted
> (and there's more where that came from)
> --scott
>
> PS. My personal feeling at this time is that it would be better to put the
> core libxml abstractions in an extension, to allow fast xpath and perhaps
> parse/serialize, but that the actual DOM should be built as a php library
> on top of that, in order to allow rapid changes (the WHATWG is pretty
> actively making additions/changes to the web spec these days) which are
> decoupled from the PHP release cycle.

It would still require libxml to be an extension, which could only happen
in a newer PHP version, so it is not going to help anyone who cannot
require that version. I don't think the effort is worth it though,
compared to just working on the existing ext/dom.
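
To make the two BC differences named in this thread concrete (nodeName
casing and getAttribute on a missing attribute), here is a rough sketch of
what an opt-in mode could look like from userland. DomCompat and its
LEGACY/LIVING constants are hypothetical names invented for this
illustration; they are not part of ext/dom or of the RFC draft. The sketch
also assumes every element is an HTML element, which a real implementation
could not do.

```php
<?php
// Sketch only: DomCompat and its mode constants are hypothetical,
// not an existing or proposed ext/dom API. Requires PHP 8.0+.
final class DomCompat
{
    public const LEGACY = 'legacy'; // behaviour of the current ext/dom
    public const LIVING = 'living'; // behaviour described for newer implementations

    public function __construct(private string $mode = self::LEGACY) {}

    public function nodeName(DOMElement $el): string
    {
        // Simplification: treats every element as an HTML element.
        return $this->mode === self::LIVING
            ? strtoupper($el->nodeName)
            : $el->nodeName;
    }

    public function getAttribute(DOMElement $el, string $name): ?string
    {
        // Newer implementations report a missing attribute as null.
        if ($this->mode === self::LIVING && !$el->hasAttribute($name)) {
            return null;
        }
        // The current ext/dom returns "" for a missing attribute.
        return $el->getAttribute($name);
    }
}

$doc = new DOMDocument();
$doc->loadHTML('<p id="intro">Hello</p>');
$p = $doc->getElementsByTagName('p')->item(0);

$legacy = new DomCompat(DomCompat::LEGACY);
$living = new DomCompat(DomCompat::LIVING);

var_dump($legacy->nodeName($p));              // string(1) "p"
var_dump($living->nodeName($p));              // string(1) "P"
var_dump($legacy->getAttribute($p, 'class')); // string(0) ""
var_dump($living->getAttribute($p, 'class')); // NULL
```

Existing code keeps today's behaviour unless it asks for the other mode,
which is the point of making the choice explicit rather than changing the
defaults.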