Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:121079 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 94642 invoked from network); 17 Sep 2023 17:10:12 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 17 Sep 2023 17:10:12 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 1673318005B for ; Sun, 17 Sep 2023 10:10:11 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS24940 176.9.0.0/16 X-Spam-Virus: No X-Envelope-From: Received: from chrono.xqk7.com (chrono.xqk7.com [176.9.45.72]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Sun, 17 Sep 2023 10:10:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bastelstu.be; s=mail20171119; t=1694970608; bh=hGJqX41H8+YbCuFdkO+j2YTJJ+ND9keAWD6k+pkwKyc=; h=Message-ID:Date:MIME-Version:Subject:To:References:From: In-Reply-To:Content-Type:from:to:cc:subject:message-id; b=L13YRZkv28vkjyuKSqm3coG1QXxwd3XFI6fgk0CN4mtYyN6qNt2r5L5l7YgstJa4t dNvVc1+HWOSRsWdYxbaZVrIiKSpY+Dv9n94zgAJ2cj0dF3cR+tr4cJ12OpeUv4EepS ex3E+IrLgTizhFZdGmoOXrZ2+ES06bQdLQLv960Vq5PsPmnmc94U83UXLjPqIikV8e Z1U694/Pw0nAGzx3RAyzdK/6jnALabvzl93S3/GkDExYvY2WK7zggxKnJQjJ1I+Ofg jAL6Veh2tMbR+el0aLlYF141VeV2+dXaDkIqi7Qj9BGERNnkMgPgS13Boe8+fp18f0 yyk+eDWW491ZQ== Message-ID: Date: Sun, 17 Sep 2023 19:10:07 +0200 MIME-Version: 1.0 Content-Language: en-US To: internals@lists.php.net References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [PHP-DEV] Re: [RFC] [Discussion] DOM HTML5 parsing and serialization support From: tim@bastelstu.be (=?UTF-8?Q?Tim_D=c3=bcsterhus?=) Hi On 9/16/23 01:17, Niels Dossche wrote: > […] Thank you for the fruitful discussion. This updated API is much better. When reading the updated RFC yesterday, I initially wanted to complain about the "The options argument" section still existing, but the $options argument nowhere to be seen. But I see that you now added it. One thing I now note that could also be improved, now that we make some larger improvements to the API: The $options argument still takes an integer and expects the developer to use the bitwise-or operator to combine multiple constants to set the appropriate flags. Despite this being a commonly used pattern in the standard library, I think it provides for a bad developer experience and makes it harder to extend the list of options for non-boolean options. I would never design an API like this when creating a userland library. I have two suggestions that might or might not be better: 1. Make the options explicit arguments with a default value and expect the user to leverage named arguments to override specific arguments, e.g. HtmlDocument::fromString( string $string, bool $ignoreErrors = false, // LIBXML_NOERROR bool $reducedMemoryUsage = false, // LIBXML_COMPACT bool $impliedTags = true, // LIBXML_HTML_NOIMPLIED ); This would provide for improved discoverability and the options checking would be implicit and easily available in IDEs. 2. Make the option parameter an array that takes an array of flags. This could either be an associative array (like used with unserialize, setcookie or password_hash) or a list of self-descriptive options. The latter would allow for a clean(er) migration to tagged-unions if/when they make it into PHP. Example usage: HtmlDocument::fromString( '', [ LIBXML_NOERROR, LIBXML_COMPACT, LIBXML_HTML_NOIMPLIED, ], ); With tagged-unions it could look like this in a future version, note the HtmlVersion option that takes a parameter. HtmlDocument::fromString( '', [ Dom\LoadingOption::IgnoreError, Dom\LoadingOption::ReducedMemoryUsage, Dom\LoadingOption::NoImpliedTags, Dom\LoadingOption::HtmlVersion(5), ], ); Even if not yet reaping the full benefits of the "list of options", I think it more naturally splits across multiple lines compared to 'bitwise-or'ing and it can also be more useful in stack traces for error handlers that are able to handle arrays, because there won't be some non-descript integer you have to decode manually. What do you think? Best regards Tim Düsterhus