Newsgroups: php.internals
From: georg@mariadb.com (Georg Richter)
Date: Wed, 15 Apr 2026 15:39:15 +0200
Subject: Re: [PHP-DEV] [RFC][Discussion] Add MariaDB-specific features to mysqlnd and mysqli
To: Kamil Tekiela
Cc: internals@lists.php.net

Hi Kamil,

> Regarding the progress indicator: it looks to me like it will be very
> difficult to implement properly in PHP, but I might be completely
> misunderstanding the design. If you think a PoC is possible, could you
> please prepare one and maybe then we can come back to this discussion.

----------------------------
We have callback function support in several extensions; the closest match is probably curl:
curl_setopt($resource, CURLOPT_PROGRESSFUNCTION, 'progressCallback');
Even if no progress callback function is specified, we will always read the progress information from the server to avoid possible read timeouts for long-running operations.

> This is exactly what exceptions and errors are for. The PHP user
> SHOULD NOT be concerned with what capabilities the server offers and
> which are compatible with PHP. Even if it could be potentially useful
> to PHP developers, we should not expose this information as part of
> mysqli API. Either mysqli supports the feature or it doesn't; there
> should be no maybe.

----------------------------
I disagree: relying on try/catch is a waste of resources, especially in high-load environments, when the information is already available locally. It forces the application to parse error messages or perform unnecessary network round-trips just to 'discover' server limitations.

More importantly, this approach is dangerous for data integrity. For example, if you attempt to insert 1,000 rows via execute_many() on a MyISAM table and the last row contains a feature (like an indicator variable) that the server does not support, 999 rows will be committed before the error is triggered. You cannot 'catch' your way out of a partially written batch in a non-transactional engine.

Capability checking is a standard and necessary pattern in PHP. ext/gd uses a bitmask for supported formats so developers can choose the correct logic path before processing begins. Furthermore, ext/mysqli already exposes several capability flags. While some were historically used for internal purposes or inherited from libmysql (like SSL_VERIFY_SERVER_CERT), the precedent for exposing capabilities to the user is already firmly established in the current API.
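To make the "check before you send" pattern concrete, here is a minimal sketch in plain PHP. The constant names and bit values are hypothetical and only stand in for whatever flags the RFC ends up defining; nothing here is part of mysqli today.

```php
<?php
// Hypothetical capability bits; the real names and values would come from the RFC.
const CAP_BULK_OPERATIONS   = 1 << 0;
const CAP_BULK_UNIT_RESULTS = 1 << 1;

/** Returns true when every bit in $required is set in $serverCaps. */
function hasCapability(int $serverCaps, int $required): bool
{
    return ($serverCaps & $required) === $required;
}

/** Picks an insert strategy before any row is sent to the server. */
function chooseInsertPath(int $serverCaps): string
{
    return hasCapability($serverCaps, CAP_BULK_OPERATIONS) ? 'bulk' : 'row-by-row';
}
```

The decision is made locally from a bitmask the client already holds, so no round-trip or exception is needed to find out which path is safe.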

> If mysqlnd is going to serialize it all into a large internal buffer
> then it still needs to read all the data from the iterator before
> making the query? It's still going to use the same amount of memory.
> How is it going to maintain constant memory usage?
Exactly. Because mysqlnd sits on top of PHP streams, which do not expose direct access to their internal network buffers (at least not at the time when Andrey wrote mysqlnd), mysqlnd must allocate its own memory buffer for each command before transferring it via the Stream API.

The memory usage is determined by the size of the protocol packet being sent. To maintain efficiency, mysqlnd manages this buffer dynamically: it reallocates the buffer only when necessary using an exponential growth strategy to minimize overhead.

Regardless of whether the input is an array or a stream, the buffer must be large enough to hold the serialized command. The 'constant memory' aspect refers to the fact that we aren't duplicating the entire dataset multiple times in different formats; rather, we are streaming the serialized data into a managed internal buffer that mysqlnd already relies on for network communication.
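The effect of an exponential growth strategy can be sketched in a few lines of PHP. This is an illustration only: the doubling factor and the function names are assumptions, and the real buffer management lives in mysqlnd's C code.

```php
<?php
/**
 * Returns the capacity a buffer should grow to so that $needed bytes fit,
 * doubling from $capacity. Doubling is an assumption; the text above only
 * says "exponential".
 */
function grownCapacity(int $capacity, int $needed): int
{
    if ($capacity < 1) {
        throw new InvalidArgumentException('capacity must be positive');
    }
    while ($capacity < $needed) {
        $capacity *= 2;
    }
    return $capacity;
}

/** Counts the reallocations caused by appending $chunks chunks of $chunkSize bytes. */
function countReallocations(int $initial, int $chunkSize, int $chunks): int
{
    $capacity = $initial;
    $used = 0;
    $reallocs = 0;
    for ($i = 0; $i < $chunks; $i++) {
        $used += $chunkSize;
        $next = grownCapacity($capacity, $used);
        if ($next !== $capacity) {
            $capacity = $next;
            $reallocs++;
        }
    }
    return $reallocs;
}
```

Appending 1,000 chunks of 100 bytes to a 1 KiB buffer needs only a handful of reallocations this way, instead of one per chunk.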

> A generator might be a neat trick for users to prepare data on the go,
> but it doesn't reduce the memory usage if the consumer needs to use
> all of the data in one go. As I understand, execute_many will send it
> all in one batch, so the memory footprint stays the same.

----------------------------
No, the peak memory usage is significantly lower. Consider the difference between file() and fread():
- file()/Array: You must store the entire dataset in PHP memory as zvals, then duplicate it into the mysqlnd network buffer. You are essentially double-buffering.
- Generator/Stream: PHP memory remains constant because it only holds one row at a time. mysqlnd pulls that row, serializes it directly into the network buffer, and moves on.
By using a stream, you eliminate the massive overhead of the intermediate PHP array, even if the final network packet requires a large buffer.
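The two cases can be sketched in userland PHP. The function names are made up for illustration, and json_encode() is only a placeholder for the wire format; the point is that the consumer produces the same buffer from either source, while the generator never materializes the full row array.

```php
<?php
/** Yields rows one at a time instead of building the full array first. */
function generateRows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield [$i, "name-$i"];
    }
}

/**
 * Stand-in for the driver side: pulls each row, appends its serialized form
 * to one buffer, and lets the row go out of scope.
 */
function serializeAll(iterable $rows): string
{
    $buffer = '';
    foreach ($rows as $row) {
        $buffer .= json_encode($row) . "\n";
    }
    return $buffer;
}
```

With an array source, the rows and the buffer exist side by side; with generateRows(), only the buffer and the current row do.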

> Re execute_many parameters:
>
> Despite your convincing arguments for better network utilization by
> providing the types, I still think we should not offer the possibility
> of specifying the types. I don't know what other PHP developers on
> this mailing list think about it, but for me the type feature goes
> against the nature of PHP. Making the parameter optional is very good
> choice and eases my concerns slightly, so if I am outnumbered in my
> opinion, I won't be upset.
>
> The number of mysqli users grows increasingly smaller. Out of this,
> the number of people who will use execute_many and who will need to
> optimize for TINYINT is unbelievably tiny. Any string easily
> overshadows the numerical data. Thus, this feature won't see much
> legitimate use.

I agree that 99% of users likely won't specify types. However, there will always be cases, such as limited memory or restricted CPU, where this optional parameter is essential. It reduces the footprint and eliminates the overhead of type conversions on both the client and the server.

Even if the primary use case is 'tiny,' a low-level driver like mysqlnd should provide the tools for maximum efficiency, especially when the implementation cost for the engine is minimal but the potential performance gain for the user is high.

> > Because the MariaDB bulk protocol requires type declarations in the
> > packet header before the data rows are sent, the driver cannot
> > "autodetect" binary widths from a Generator without reading the
> > entire stream into memory first. Providing the $types string acts as
> > a contract that allows for true, constant-memory streaming.
>
> Isn't that what it does anyway? You need to read all the data
> (serialize) before you make the EXECUTE command, correct? I don't
> understand why you can't prepare the type specification automatically
> while serializing the data.

That works for arrays, but not for streams or generators. Unlike an array, a stream is a one-way 'pull' mechanism: we don't know the type of the second or hundredth row until we have already consumed the rows before it.

To determine types automatically, we would have to buffer the entire stream into memory first to inspect it, which completely defeats the purpose of using a stream to save memory. Providing the types upfront allows mysqlnd to serialize the stream directly to the wire in a single pass.
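A rough userland sketch of the single-pass idea follows. The type letters are borrowed from mysqli_stmt::bind_param() ('i', 'd', 's') as an assumption about the $types string, and the byte layout is a placeholder, not the MariaDB bulk protocol. What it shows is that the type header can be emitted before the first row is read only because the types are known upfront.

```php
<?php
/**
 * Single-pass serializer sketch: yields a type header first, then one
 * serialized chunk per row. Each row can be sent and forgotten.
 */
function serializeTyped(iterable $rows, string $types): Generator
{
    yield 'T' . $types; // header goes out before any row is consumed
    foreach ($rows as $n => $row) {
        if (count($row) !== strlen($types)) {
            throw new InvalidArgumentException("Row $n has the wrong column count");
        }
        $out = '';
        foreach (array_values($row) as $i => $value) {
            $out .= match ($types[$i]) {
                'i' => pack('P', (int) $value),   // 64-bit little-endian integer
                'd' => pack('e', (float) $value), // little-endian double
                's' => pack('V', strlen((string) $value)) . (string) $value, // length-prefixed string
                default => throw new InvalidArgumentException("Unknown type {$types[$i]}"),
            };
        }
        yield $out;
    }
}
```

Without $types, the header would depend on values that have not been pulled from the generator yet, which is exactly the buffering problem described above.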



> Regarding the control parameter:
>
> Why not make it a callback? Provide the $row as an argument and let
> the user modify it inside the callback, substituting values for
> mysqli_indicator, and returning the row to be inserted. It would offer
> a lot more flexibility to the user and would make the implementation
> simpler. This way, you don't need to implement Null or None anymore.
----------------------------
A callback would actually be a major step backward for performance. Invoking a PHP user-land closure 100,000 times in a single execute_many() call introduces a massive overhead due to context switching between the C-engine and the PHP VM. This would effectively negate the performance gains we are trying to achieve.

My planned implementation already solves the flexibility problem via Generators. Since execute_many() accepts an iterable, a user can already use a generator to perform row-by-row logic. This is far more efficient than a callback.

Furthermore, a callback does not eliminate the need for mysqli_indicator::None. Even inside a function, the driver still needs a clear 'metadata signal' to know whether to pull a value from the data source or to treat it as an override.
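For illustration, row-by-row logic in a generator could look like the sketch below. The function name and the column position are assumptions; any iterable can be wrapped this way and the result passed on wherever an iterable is accepted.

```php
<?php
/** Wraps any row source and masks the phone column on the fly. */
function maskPhones(iterable $rows, int $phoneColumn = 2): Generator
{
    foreach ($rows as $row) {
        $row[$phoneColumn] = '+XX XXX-XXXXXXX';
        yield $row; // the source rows themselves are never modified
    }
}
```

Because PHP arrays are copied on write, the assignment inside the loop changes only the yielded copy of each row.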

> I was asking you to list in the RFC all the possible client errors
> that are added as part of this implementation. For example, "Row %lu
> is not an array". This should be part of the RFC, in my opinion, as we
> may want to discuss the error conditions and messages too.

----------------------------
I haven't introduced any new error codes; all data validation errors use CR_INVALID_PARAMETER_NO with UNKNOWN_SQLSTATE. I didn't see a requirement in the RFC guidelines to list every specific error message, but if it is mandatory for the process, I can of course add them to the document.


> Result sets:
>
> That's not what I meant. I was asking whether it could be implemented
> with MARIADB_CLIENT_BULK_UNIT_RESULTS instead. When users execute a
> SELECT with 2 data rows, I would like to see it return 2 mysqli_result
> objects. Same with INSERT statements, it should return a separate
> result for each insertion. If there are arguments against that, they
> should be explained in the RFC.

Expecting 1,000 separate result objects for 1,000 inserted rows would cause a massive performance collapse. Each result set would require its own network packet and redundant metadata headers, completely defeating the purpose of a bulk execution API.
Regarding MARIADB_CLIENT_BULK_UNIT_RESULTS: this is a very recent feature (introduced in MariaDB 11.5) that allows the server to return a single result package containing multiple status rows. I did not include it in the current RFC because it is not yet widely available in LTS releases.
Furthermore, for the MySQL fallback (where this capability is absent), mysqlnd would have to 'artificially' construct these result sets in memory, which adds significant overhead. The goal of execute_many is maximum throughput, which is best achieved by providing a summary of the bulk operation rather than individual results for every row.

> Anyway, the RFC should clearly explain how result fetching works with
> all 3 methods (unbuffered, stored, and get_result) and what are the
> possible gotchas.

----------------------------
The RFC notes that execute_many() can return a result set (e.g., when using a RETURNING clause or similar). The PHP documentation already clearly defines how result sets are retrieved from prepared statements: via bind_result() with store_result()/use_result(), or via get_result().

Since execute_many() follows the existing mysqli_stmt behavior and does not change how results are buffered or fetched, adding a redundant explanation of standard mysqli mechanisms would only clutter the RFC.

> > For transactional engines like InnoDB, atomicity is guaranteed. If a
> > protocol error occurs or a constraint is hit mid-batch, the server
> > handles the rollback, ensuring the database remains in a consistent
> > state.
> >
> > Transaction Safety & Atomicity: In native MariaDB bulk mode, the
> > entire batch is sent as a single unit. In the fallback emulation,
> > rows are executed one by one. For non-transactional engines, a
> > failure on row 500 would leave the first 499 rows committed. To
> > maintain consistency, we should recommend that users wrap
> > execute_many() in an explicit transaction when portability across
> > MySQL and MariaDB is required.
>
> I am confused. Aren't both of these statements stating the same? Why
> can't you wrap the fallback in an automatic transaction to make it
> work exactly the same as the native MariaDB solution?
>
> If execute_many implies an automatic transaction but only with
> transactional engines, it should be clearly stated in the RFC so that
> it can be later documented in PHP manual too.

----------------------------

I am against wrapping the fallback in an automatic transaction because a low-level driver should not modify the session's transactional state behind the scenes.

If mysqli were to automatically inject START TRANSACTION and COMMIT, it could unexpectedly commit a user's existing work or interfere with their manual transaction logic. Furthermore, an 'automatic' transaction would be a false promise on engines like MyISAM, where BEGIN and COMMIT are simply ignored, still resulting in partial inserts.

I've updated the RFC to clarify this.

> > Limited Indicator Support: Since the standard MySQL COM_STMT_EXECUTE
> > protocol does not understand MariaDB-specific indicators, the
> > fallback will only support mysqli_indicator::Null (translated to a
> > standard SQL NULL) and mysqli_indicator::None. Indicators like
> > DEFAULT and IGNORE are technically impossible to implement in the
> > fallback without complex SQL string manipulation/rewriting, which
> > would introduce unacceptable CPU overhead.
>
> And for this reason, I think that maybe we shouldn't even implement
> the control parameter at all. It sounds like a neat feature, but it
> costs performance in a function that is all about improving
> performance, and it is DB-version specific. The new execute_many
> function doesn't need the control parameter to function properly, and
> in my opinion, it would be better to keep it as simple as possible.
> But I am curious to see what others think.
Could you please explain why you expect a performance loss? I believe the opposite is the case.

Consider the following common scenario: A system collects records from various external APIs. The 'IDs' from these sources are not unique, so we need the database to generate its own AUTO_INCREMENT keys. Additionally, for privacy compliance (GDPR/CCPA), we must mask phone numbers during the import.

The 'expensive' way would be to modify the existing data source. In the worst case (for instance, if you need to log the original data after the import), you would have to create a full copy of the data first. You then have to iterate over the entire dataset to:

- Set every id to null to trigger AUTO_INCREMENT.
- Overwrite every phone number with a masked string (e.g., +XX XXX-XXXXXXX).
By specifying a control parameter, the source data remains completely untouched. The driver handles both the nullification and the masking at the C-level.

Example:

/* Raw external data: [External_ID, Name, Phone] */
$external_data = [
    [101, 'John Doe', '555-1234'],
    [102, 'Jane Doe', '555-5678'],
    /* ... */
    [100000, 'Rasmus Lerdorf', '431-1233939']
];

/* Control parameter:
   - Column 0 (ID): Force NULL to trigger AUTO_INCREMENT
   - Column 1 (Name): Use mysqli_indicator::None (Keep original data)
   - Column 2 (Phone): Scalar override for privacy masking */
$control = [
    mysqli_indicator::Null,
    mysqli_indicator::None,
    "+XX XXX-XXXXXXX"
];

$stmt->execute_many($external_data, control: $control);
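The per-row effect of the control array can be emulated in plain PHP as a sketch. The ControlIndicator enum below is a hypothetical stand-in for the proposed mysqli_indicator (SqlNull for ::Null, Keep for ::None), and applyControl() is not part of any API; the actual work is meant to happen in C inside mysqlnd.

```php
<?php
/** Hypothetical stand-in for the proposed mysqli_indicator enum (two cases only). */
enum ControlIndicator
{
    case SqlNull; // corresponds to mysqli_indicator::Null: send SQL NULL
    case Keep;    // corresponds to mysqli_indicator::None: keep the row's value
}

/** Applies a control array to one row and returns the values that would be sent. */
function applyControl(array $row, array $control): array
{
    $out = [];
    foreach (array_values($row) as $i => $value) {
        $c = $control[$i] ?? ControlIndicator::Keep;
        $out[] = match (true) {
            $c === ControlIndicator::Keep    => $value,
            $c === ControlIndicator::SqlNull => null,
            default                          => $c, // scalar override
        };
    }
    return $out;
}
```

The source row is passed by value and left untouched, which mirrors the point that no copy or rewrite of the original dataset is needed.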
/Georg
--
Georg Richter, Staff Software Engineer
Client Connectivity
MariaDB Corporation Ab