Newsgroups: php.internals
From: dossche.niels@gmail.com (Niels Dossche)
To: Stephen Reay
Cc: PHP Internals
Date: Mon, 18 Sep 2023 20:00:24 +0200
Subject: Re: [PHP-DEV] [RFC] [Discussion] DOM HTML5 parsing and serialization support
In-Reply-To: <4C6FD028-4E34-4578-AF04-EDAC120E3E94@koalephant.com>
References: <29eb53a7-9aef-4e48-999e-97574f27a9df@gmail.com> <4C6FD028-4E34-4578-AF04-EDAC120E3E94@koalephant.com>

Hi Stephen

On 18/09/2023 08:46, Stephen Reay wrote:
>
>> On 17 Sep 2023, at 18:28, Niels Dossche wrote:
>>
>> Hi Alexandru
>>
>> On 9/17/23 11:59, Alexandru Pătrănescu wrote:
>>> On Sat, Sep 16, 2023, 02:17 Niels Dossche wrote:
>>>
>>>> We'll add a common abstract base class DOM\Document (name taken from the
>>>> DOM spec & JavaScript world).
>>>> DOM\Document contains the properties and abstract methods common to both
>>>> HTML and XML documents.
>>>>
>>> Hi,
>>>
>>> Yes, looks a lot better.
>>> Great work overall! And thank you for taking on this effort.
>>>
>>> I would have a small suggestion: to make the abstract class an interface.
>>> This will allow even more flexibility on how things can be built further,
>>> suggesting composition over inheritance.
>>> In user land we cannot have interfaces with properties (yet) but in php
>>> internal interfaces we have an example with interface UnitEnum that has a name
>>> property, extended by BackedEnum that adds a value property.
>>>
>>
>> Right, we discussed the use of an interface internally too.
>> Indeed as you suggest, we chose an abstract class over an interface because of the property limitation.
>> Looking at UnitEnum & BackedEnum (https://github.com/php/php-src/blob/bae30682b896b26f177f83648bd58c77ba3480a8/Zend/zend_enum.stub.php) I don't see the properties defined on the interface. In practice, all enums get the properties of course, just not via the interface.
>> So as we cannot represent the properties on the interfaces (yet), we'll stick with the abstract class for now.
>>
>>> Thank you,
>>> Alex
>>>
>>
>> Kind regards
>> Niels
>>
> Hi Niels,
>
> Can you expand on the reasoning for two of the decisions in your proposal? I'm not sure I really see the reason/benefit:
>
> 1. fromX() methods are on the individual classes, rather than the parent, which as I understand it, you're using as a poor man's interface with properties. I'd have thought that at the very least the parent should declare those methods as abstract
>

The fromEmptyDocument signature differs between HTMLDocument & XMLDocument, so that method cannot be declared on the parent.
The other two could be defined on the parent. However, I chose against that because:

1) They behave differently when called on XML vs HTML, because each returns a new instance of its own class. The other methods, by contrast, behave the same for XML & HTML. Furthermore, it would be weird to have 2 out of 3 factory methods on the parent, but not the third.

2) It makes sense to call e.g. createComment() without knowing the concrete type of the document (e.g. when supporting both XML & HTML documents), so having that method in the parent makes sense. For example:

    function addCopyright(DOM\Document $doc) {
        $doc->append($doc->createComment("(c) foo 2023"));
    }

However, you're not going to call the factory methods on a variable that can be either an XML or an HTML document, as you wouldn't know a priori what you're getting back.

>
> 2. Why "fromEmptyDocument()" rather than just allowing `new DOM\{XML,HTML}Document` via a public constructor?

While technically possible, I find it confusing to have both a normal constructor and factory methods.
When people see factory methods, they'll know - from experience - that every construction needs to go via the factories. Otherwise, people are going to call the constructor and then search for a loadHTML/loadHTMLFile method like they do with DOMDocument.

>
> The only mention of 'constructor' I see in the email or the RFC is the line below:
>
>> the properties set by DOMDocument's constructor are overridden by its load methods, which is surprising
>
> But I don't really see how making the constructor private and forcing use of a static method changes this aspect, at all - it just introduces a different way to create an instance.
>

The problem with DOMDocument's approach is the following:

    $dom = new DOMDocument(encoding: "bla");
    $dom->loadXML(...); // Oops encoding gone!

With the new classes, you load the document at construction time (or no document at all). Therefore, this confusion/API misuse is prevented by the API design.
You can see it as having 3 different constructors, none of which allows loading another document into the same instance afterwards.

>
> Otherwise, it's great to see some activity on DOM handling classes.
>
>
> Cheers
>
>
> Stephen
>

Cheers
Niels