Newsgroups: php.internals
Xref: news.php.net php.internals:120981
Message-ID: <757f4b2b-a3b4-4f19-a9b9-301d175a16c9@gmail.com>
Date: Mon, 4 Sep 2023 22:00:18 +0200
To: naitsirch@e.mail.de, internals@lists.php.net
References: <9c67a472ffc83d0ee4490b446d03cebbbc2df611@mail.de>
In-Reply-To: <9c67a472ffc83d0ee4490b446d03cebbbc2df611@mail.de>
Subject: Re: [PHP-DEV] [RFC] [Discussion] DOM HTML5 parsing and serialization support
From: dossche.niels@gmail.com (Niels Dossche)

Hey Christian

Thank you for going through my proposal.

On 04/09/2023 09:23, naitsirch@e.mail.de wrote:
> On 02-Sep-2023 21:41:50 +0200, dossche.niels@gmail.com wrote:
>> Hello internals
>>
>> I'm opening the discussion for my RFC "DOM HTML5 parsing and serialization support".
>> https://wiki.php.net/rfc/domdocument_html5_parser
>>
>> Kind regards
>> Niels
>
> Hi Niels,
>
> thank you for your proposal and your work on this.
>
>> This proposal introduces the DOM\HTML5Document class that extends the
>> DOMDocument class. The reason we introduce a new class instead of replacing
>> the methods of the existing class is to ensure full backwards compatibility.
>
> Although I do not dislike your idea of a new class for HTML5 parsing, I have one question. Why not make the choice of parser depend on the doctype declaration at the start of the HTML document?

There are three main reasons not to depend solely on the doctype:

1) Not all documents have a doctype. In that case, should we pick the old parser or the HTML5 parser?
2) People create DOM documents from scratch, without first loading any markup or file. In that case, we can't decide upfront whether the document is an HTML5 document or a legacy document. I believe that using a class, and thus making the choice explicit upfront, minimizes surprises.
3) Having a separate class allows us to use the type system: code can require an HTML5 document in its signatures and reject legacy documents.

>
> Best regards
> Christian

Kind regards
Niels
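
To illustrate points 2 and 3, here is a minimal sketch of how a separate class lets the type system reject legacy documents, even for a document built from scratch. Note the assumptions: `HTML5Document` below is a hypothetical stand-in defined locally so the sketch runs with the stock DOM extension (it is not the `DOM\HTML5Document` class proposed in the RFC), and `extractTitle()` is an invented helper.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the proposed class; it adds no behaviour and
// exists only so that the type check below has something to distinguish.
class HTML5Document extends DOMDocument
{
}

// Hypothetical helper: the parameter type states the HTML5-only requirement,
// so passing a plain DOMDocument fails with a TypeError.
function extractTitle(HTML5Document $doc): string
{
    $titles = $doc->getElementsByTagName('title');
    return $titles->length > 0 ? $titles->item(0)->textContent : '';
}

// A document created from scratch: no doctype is ever parsed, yet the
// choice of document kind is already explicit in the class.
$doc = new HTML5Document();
$doc->appendChild($doc->createElement('title', 'Hello'));

$accepted = extractTitle($doc);

// A legacy document is rejected by the signature alone.
$rejected = false;
try {
    extractTitle(new DOMDocument());
} catch (TypeError $e) {
    $rejected = true;
}
```

With doctype-based selection, both documents would share the `DOMDocument` type and the check would have to happen at run time inside the function.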