Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:130628 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by lists.php.net (Postfix) with ESMTPS id C3E861A00BC for ; Tue, 14 Apr 2026 12:22:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1776169328; bh=KjT2ERUbrJQgQrXvYiHJ14d5WKLylVJoHfONcdq9hx0=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=JCVWG5SkQDGCB2TM0Mwe5/SYjpuBm4+H6bjhOxAPYtoxTOUyaClE5uMIMVNFKddrg JQf8GV9+mzENmEXkqVZldYU4olHXn722Bw6VcUdtEf2PK+vnT1BStMvlD78Y4dHNQ9 dDd5SHeXr+MMgD2I1bJl2wo7Z3+99bzoiuxmpG9O8fkvC0dqzjKVZ64P3/ZMu1pX+N uztBFws+4jPii7PVERLoUM0QvrgYHkqLuMJbmh6vGp1G9NyZFK5wBGJKeE6ENKluhy MMHCydnXgPBaBepsLpJHQDCvtBDFdGu+bTFPNZv0QU/EiF4H2tReYBkGDV33qnveed fMpeI4gEk1G4A== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 4F581180053 for ; Tue, 14 Apr 2026 12:22:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.1 (2024-03-25) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=ARC_SIGNED,ARC_VALID,BAYES_50, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DMARC_PASS, HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=4.0.1 X-Spam-Virus: No X-Envelope-From: Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Tue, 14 Apr 2026 12:22:06 +0000 (UTC) Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-43d7badbd7dso999234f8f.2 for ; Tue, 14 Apr 2026 05:22:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1776169320; cv=none; d=google.com; s=arc-20240605; b=lpu0F65PHCZBdHv2OPCNSOBjKHklMK0kh1VDWOFmvA2kJ0qYX8pvSabBtTR4BPDtyP pMqkcf8ZQpDqEOsBnvKtBTJ9RH6d6dwY78OoawyMXxIgon+wZtjEDx3OC11hNzaXveQt KF45pYS3/KimJYGT1Qsb1Z4Lnp8LnDYzTVPtUfFKECL++iZMQehG09nAU/SyvpvahBlM mlzvXnV0T+idZGn9ygbhwDZtaNTFT7NXJewo5vNgnpH/bsMP/hIVGh4qCIXs0SirPETG R+moSHF5Je3Sysczw1H39tV9jTyXJ4dBJoevdu3Nn0P8rVgc9IX8EUC2OrXKXn/NJBOw tN3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=phUDi+bC2AJEzy1Um3uWUQ1CDcdUHWiLhoBpaNrqFng=; fh=GbLssEDpFkOs4sxVotj6CryJN2OQBdvy0mK7+XaIMqE=; b=W3/paJ8cLrVv4vO5VymNo9TmOU2aECfx5gf3JIwavj3UyGmZ/G9ImvYdh32MbplgBj QmA1OFMr3WrQ9dzQVrDKB8fgNnN9zLKkPADTIXzHxi+hqw4zPxuxkgBWnIoYofAdiJ5X xxF+62zLBzRbyFYxR1hoCRe+mAdPk6ZP8XoOFuJLX+tjxnEQ4U+5kywwY/3RqdehGMR6 pqcFJHdZ6UVYE+Vwp+kYOHvVA6YatGe8uNvOxr/ZhPmGcYn80K23sr7S35JjYNKfNCD0 9bJrZBM8NXJwKurd0z3W0PMjkB5Sgpr1WUOimbMve8iS8f7dS7XidsH8+piITcWiNtM3 Wn0g==; darn=lists.php.net ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mariadb.com; s=google; t=1776169320; x=1776774120; darn=lists.php.net; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=phUDi+bC2AJEzy1Um3uWUQ1CDcdUHWiLhoBpaNrqFng=; b=PGHBfFcXTAK4l87cANDD8QDr3ldRiFTsKnFYhLpp28efADzINUubJ1+sr7IsapZGrf Nvpr2OqTyZG/KL+jk43+XJKgmCJG1rWqi7Q8IBsgB3b1R27vKMrVdNNP54nLgcy6kjzj bFleJAYslXph5uxPnYC7GUmtvDnrdsRYjdXK0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776169320; x=1776774120; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=phUDi+bC2AJEzy1Um3uWUQ1CDcdUHWiLhoBpaNrqFng=; b=miKAj7qKf0aTWJ27ILdOehQf1erUUWRvtEiDk4Y1zhEVel5gZGctvusuGdH4/95yjm n1kkiIOQF2ybl+O1FC6QFUiOKHU6aBwrE44DLwgwCxKWpSr8AFhP9+jdDABqxHeiiTqr +stPlfd8d2VKVVrzFFHQH/ok74xQUZy2ZqaHkul72DkOu4reAnFf9QBdeArCD+Hkjn93 Kx8BHjDVOMSrLleqgCSfHU8jjbOrWqGjhDRS67umwvnW6UPU6XeN0ksQmrUgYfh3XYws 5/ubfuPpGdM8oROLB0wweJb/rOWBrXMkbnp9SpAMdd0VYDkWVj7dmXmq7FS6WnraQurv D7jQ== X-Gm-Message-State: AOJu0YxwnjH4bX9mQ81rPGiGtopfKWiuB5OJYIstpUYZ518RkeCq9QWD Mcm5KtTYsC7rrp3n64pzvEzkV3iSDtIbAB28gXfBoi0/kJ2PlOsa6sZTZimNg8Cij0i9SnEHvMu 2OGDGgjWiAebdS5CFirJXdxU0N902ibIKdZ/zneorqlv6zUjEpTRUKs8= X-Gm-Gg: AeBDiesmjazBtof/+FfSqsAULTsR2t7fAhH93EqUCeXMNGNc4+II4Op90HQgYwlEoKs 0i3hIIyRJDjmFQKivMITavw58APByjmuglGUGO6rqt/VNvFEPAxqr/H+vkPF4dciHnNiPzddND0 MfJWVWX+T+U7p+bDDRjUGaN1Uomrc66LNZvrd9I8iWCwrqFmB5O5Eoxb4vf4Tj4cF5ZkEpvuvdp UCWIOysVA/ZEk6TCSYXBP1oS9zWgZcNGvYNhyFXMZUKwoh4sINgk0Fiax0fkjIf9Ec78wNNVr47 NkjPdWEwF/gFlMwtO1c= X-Received: by 2002:a5d:5849:0:b0:43d:7261:8759 with SMTP id ffacd0b85a97d-43d726187ccmr14364729f8f.28.1776169319574; Tue, 14 Apr 2026 05:21:59 -0700 (PDT) Precedence: list list-help: list-unsubscribe: list-post: List-Id: x-ms-reactions: disallow MIME-Version: 1.0 References: In-Reply-To: Date: Tue, 14 Apr 2026 14:21:48 +0200 X-Gm-Features: AQROBzBRug7pn8MavDIt0E1qW9oQmGi8dWQAi8N8bXtOYdfyYIZ-pT2PSWFfOkk Message-ID: Subject: Re: [PHP-DEV] [RFC][Discussion] Add MariaDB-specific features to mysqlnd and mysqli To: Kamil Tekiela Cc: internals@lists.php.net Content-Type: multipart/alternative; boundary="0000000000005ac1bd064f6aa5bf" From: georg@mariadb.com (Georg Richter) --0000000000005ac1bd064f6aa5bf Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Kamil, Thank you for the detailed feedback. I=E2=80=99ve updated the RFC to reflec= t several of your points, specifically regarding the separation of concerns. To keep things clean, everything not directly visible to the user (metadata caching, extended metadata, auth plugins) will be moved to separate PRs. > Extended Metadata: > It looks like you have some formatting issues there. I don't > understand what this feature is, how to use it, nor how it will be > exposed. It needs more explanation. > > In the table, you mention that it doesn't require any user-facing > changes, but it does. The returned structure is extended with two new > fields from what I can see. Additionally, the PoC changes existing > field values too. > Extended metadata will go to a separate PR - we only need to map MYSQL_TYPE_LONGTYPE + json to MYSQL_TYPE_JSON. and MYSQL_TYPE_BLOB + vector to MYSQL_TYPE_VECTOR. > > ---------------------------- > > Prepared Statement Metadata Caching: > In theory I am all for adding transparent improvements and they don't > usually require an RFC, but in this case, I don't understand how the > caching will be done. Are you saying the user will need to cache the > metadata somewhere? Is the cache shared across requests? How will the > cache be invalidated/cleaned? > Right this will also be implemented via PR > > From the implementation, it seems like the cache is only per prepared > statement, i.e. only the first execute() call will fetch the metadata. > If that's all it is then it can be made into a simple PR without RFC. > But in the table you say "Avoids resending metadata for repeated > prepares" which confuses me. > The caching mechanism is designed to be transparent within the mysqlnd layer. Historically, MySQL and older MariaDB versions sent metadata during the PREPARE phase and again for every subsequent EXECUTE. MariaDB has optimized this: the server now only sends metadata during EXECUTE if the table structure or definition has changed. By implementing caching in mysqlnd, we can skip the redundant allocation and parsing of metadata packets. This is managed internally by the driver and does not require the user to manually clean or invalidate the cache. > > Performance Improvements: Existing PHP functions that prepare and > execute statements, such as mysql_execute_query() (introduced in PHP 8.2)= , > could be updated to use execute_direct when connected to a MariaDB server= . > This would allow one-time statements to bypass the extra round-trip and > metadata fetch, providing immediate performance gains. > > Does this refer to caching? How does caching improve the performance > of a one-time executed prepared statement? If that's not what it > refers to, then what did you mean? > Sorry I forgot to remove this before I published it. MariaDB supports direct execution by sending prepare and execute together - this doesn't seem to work with php_streams. > > ---------------------------- > > Progress Indication: > I'm quite worried about this one. PHP doesn't have native asynchronous > support yet. How do you intend for this to work when a built-in PHP > function can be suspended and possibly interrupted completely while > the MySQL is processing the request? > To clarify: this is not asynchronous in the Fiber sense. It is a synchronous "keep-alive" mechanism. During long-running operations (like ALTER TABLE, LOAD DATA, or complex INSERT...SELECT statements), the server sends "progress" packets. Currently, mysqlnd ignores these, which can lead to connection timeouts or a lack of feedback for the user. By supporting these packets, the driver stays active, preventing timeouts and allowing the application to provide real-time feedback to the user or logs, ensuring the client knows the server is still alive and working. > > ---------------------------- > > PDO Considerations: > Why is PDO out of scope? PDO is the main database extension in PHP. If > any new feature is added, then it should probably be exposed via PDO > first. What do you mean by "PDO_MySQL currently does not support array > binding or bulk execution semantics"? > PDO is currently out of scope only to ensure a manageable and focused RFC. Implementing these features in mysqlnd provides the necessary binary foundation. A future "PDO Bulk RFC" can then leverage this work across all drivers (PostgreSQL, SQLite, etc.) without needing to reinvent the protocol-specific logic for the MySQL/MariaDB ecosystem. We are building the engine first; the unified PDO interface can follow. > > ---------------------------- > > Server capabilities: > I don't think it's more robust to detect MariaDB server by the lack of > CLIENT_LONG_PASSWORD than it is by the server_version. What if MySQL > decides to remove that flag in the future? What if derived products > don't follow the same convention? > > Why is it necessary to expose this information to the PHP user via > server_capabilities and ext_server_capabilities? I can't find anywhere > that you explain what information is presented in these properties or > how to use it. > There is a fundamental difference between what a server might do (Version) and what it can do (Capabilities). A version string is a static label. Capability flags, however, reflect the server=E2=80=99s current build-time = features and runtime configuration. Exposing ext_server_capabilities allows developers and frameworks to implement reliable feature detection. Without these flags, a developer cannot safely use functions without risking a fatal error or needing to wrap every call in a try-catch block. ---------------------------- > > Memory Consumption: > Why is using generators going to use less memory? How does the new > function use generators that it's able to provide performance > improvements without reading all of the data from the generator > beforehand? This needs to be clearly explained so that it can be > explained in the PHP manual later. What is the difference between > these two: > > ``` > $stmt->executemany("sd", $generator); > $stmt->executemany("sd", iterator_to_array($generator)); > ``` > > The performance improvement is not derived from the Generator itself, but from the reduction of network round-trips. - standard foreach + execute(): 1,000 rows =3D 1,000 "Send" packets + 1,000 "Response" packets. - execute_many(): The driver pulls rows from the iterable and serializes them into a large internal network buffer. 1,000 rows =3D 1 large Send pack= et + 1 Response packet. Regarding iterator_to_array($generator): This forces PHP to allocate memory for every single row simultaneously. If you are importing 1 million rows, PHP must allocate memory for 1 million arrays before the first byte is ever sent. By using the Generator directly, we maintain O(1) memory usage, fetching and serializing only one row at a time. > ---------------------------- > > mysqli_stmt::executemany implementation: > You only mention the OO-style method in the RFC. Will the procedural > style counterpart also be implemented? > yes - OO and procedural functions use the same code. > > The name doesn't follow mysqli naming standard. It should be called > mysqli_stmt::execute_many. However, I wonder what made you land on > this name instead of mysqli_stmt::bulk_execute? > > There should be no $types parameter IMHO. The data should either > always be treated as text, or it should be bound using the PHP type, > i.e. autodetected. I believe it was a mistake that bind_param() has a > $types parameter. All it serves is to cast the data silently into the > specified type. It has been a source of many silent bugs and a lot of > confusion. Additionally, giving users more type choices in > executemany() will not only create an inconsistency with bind_param(), > but also confuse PHP users who don't know if a variable is 8, or 16, > or 32, or 64-bit. > You are absolutely right=E2=80=94the name should follow the underscore conv= ention, so it will be execute_many. Regarding the parameter order and the $types question, the proposed signature could be: public mysqli_stmt::execute_many(iterable $data, ?iterable $control =3D nul= l, ?string $types =3D null): bool public mysqli_stmt_execute_many(mysqli_stmt $stmt, iterable $data, ?iterable $control =3D null, ?string $types =3D null): bool By placing $control and $types at the end as optional parameters, we can leverage PHP 8 named parameters to resolve the conflict between ease-of-use and technical optimization. 1. The "All-Strings" Path (Implicit Autodetection) To address your concern about autodetection: providing $types is not mandatory. If the user simply calls $stmt->execute_many($data), the driver will implicitly treat all scalar values as strings. Since MariaDB and MySQL handle implicit casting efficiently, this provides the seamless "it just works" experience you requested for the majority of use cases. 2. The "Strict-Typed" Path (Memory & Bandwidth Optimization) While autodetection is convenient, specifying $types remains critical when resource optimization is a factor: - Binary Efficiency: By specifying types like i (LONG) or 1 (TINYINT), we significantly reduce the network payload compared to sending everything as strings (up to an 80% reduction for certain numeric columns). - O(1) Memory Safety for Streams: Because the MariaDB bulk protocol requires type declarations in the packet header before the data rows are sent, the driver cannot "autodetect" binary widths from a Generator without reading the entire stream into memory first. Providing the $types string acts as a contract that allows for true, constant-memory streaming. 3. The "Control" Parameter and Indicators I agree that the Indicator Enum should be moved to mysqli (mysqli_indicator). The $control array serves three critical architectural purposes: - Non-Destructive Transformation: It allows developers to apply logic (like DEFAULT or NULL overrides) without modifying the original $data array or stream. This is vital if the data source is a read-only Generator or is being reused elsewhere in the application. - Security (Sensitive Data Overwrite): The $control array acts as a secure template. A developer can pass a raw data array but use the control array to force a status column to "active" or a last_modified_by column to a specific ID, ensuring sensitive fields are overwritten regardless of the input data's content. -Flexibility: It supports both a Global Mode (one row that acts as a mask for the entire batch) and a Streaming Mode (an iterable that advances in parallel with the data). This design ensures that execute_many remains simple for beginners while providing the granular control required by high-performance implementations= . > Please be explicit what the failure conditions may be. > Like for all other functions you have to retrieve error messages via mysqli_stmt::error/errno. > How does get_result() work with executemany()? For example, if I > execute SELECT ?,? with [[1,2], [3,4]] will I get that same structure > in the result set? Is the sequence guaranteed? Is there a way to > distinguish which row came from which input data row? What if one > SELECT produces two rows and the other zero? Will it still be a single > mysqli_result? > As mentioned in the beginning of the RFC, execute_many is used for UPSERT (insert, replace,delete and update) operations. However it might contain a result set, e.g. by having a RETURNING clause. > > How does it work with store_result() and also with unbuffered result > sets? I assume it's all treated the same as a normal execute(), > correct? > Yep > > How does this affect insert_id? MariaDB documentation says that only > "the first auto-generated ID" is returned, which is contrary to what I > expect: the last. Does the value propagate to PHP? If so, why? > This behavior matches the standard MySQL C API and protocol where mysql_insert_id() returns the first ID of a multi-row insert, not the last: MySQL [test]> truncate table t1; Query OK, 0 rows affected (0.075 sec) MySQL [test]> insert into t1 (name) values ('foo'),('bar'),('PHP'); Query OK, 3 rows affected (0.017 sec) Records: 3 Duplicates: 0 Warnings: 0 MySQL [test]> select last_insert_id()\G; *************************** 1. row *************************** last_insert_id(): 1 > > From what I can see, it is possible to request that MariaDB reply with > individual result sets. Wouldn't that be better? How would that impact > the performance? > If you require the IDs of every row in a bulk insert, the MariaDB RETURNING clause should be used: INSERT INTO t1 (name) VALUES (?) RETURNING id > How about atomicity? If I make 3 INSERTS and the middle one fails, > will the other two execute? > For transactional engines like InnoDB, atomicity is guaranteed. If a protocol error occurs or a constraint is hit mid-batch, the server handles the rollback, ensuring the database remains in a consistent state. > > Indicator enum should contain only two values: IGNORE and DEFAULT. > Null is already a PHP data type and does not require an indicator. > None serves no purpose (what would even happen if someone used it). > Adding an Indicator enum will create an inconsistency with the > existing feature set which do not support it. > An enum should be defined by the extension in which it will be used, > i.e. mysqli. The mysqlnd extension shuld not expose any user-facing > APIs. > See my comments about new execute_many parameters (control). Even None is used to indicate to use the original value from $data block. NULL and mysqli_indicator::Null are indeed the same - however it is more consistant with other drivers (e.g. ODBC and OCI) to have a Null indicator. > > When connected to the MySQL server, how will executemany() behave? I > assume there will be an emulated support for it in mysqlnd, but how > will you ensure consistency between these two implementations? Making > the function no-op with MySQL should be out of the question. > When connected to a server that does not support the native bulk protocol (such as standard MySQL), execute_many() will transition to a fallback emulation layer. This ensures that the function remains a reliable, portable API for developers regardless of the backend. However, there are inherent limitations to this emulation: - Limited Indicator Support: Since the standard MySQL COM_STMT_EXECUTE protocol does not understand MariaDB-specific indicators, the fallback will only support mysqli_indicator::Null (translated to a standard SQL NULL) and mysqli_indicator::None. Indicators like DEFAULT and IGNORE are technically impossible to implement in the fallback without complex SQL string manipulation/rewriting, which would introduce unacceptable CPU overhead. - Transaction Safety & Atomicity: In native MariaDB bulk mode, the entire batch is sent as a single unit. In the fallback emulation, rows are executed one by one. For non-transactional engines, a failure on row 500 would leave the first 499 rows committed. To maintain consistency, we should recommend that users wrap execute_many() in an explicit transaction when portability across MySQL and MariaDB is required. - Performance Delta: While the fallback provides a consistent API, it does not offer the network round-trip savings of the native protocol. Each row in the fallback requires a separate request/response cycle, effectively acting as a "C-speed" foreach loop. /Georg --=20 Georg Richter, Staff Software Engineer Client Connectivity MariaDB Corporation Ab --0000000000005ac1bd064f6aa5bf Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Kamil,

Thank you for the detail= ed feedback. I=E2=80=99ve updated the RFC to reflect several of your points= , specifically regarding the separation of concerns. To keep things clean, = everything not directly visible to the user (metadata caching, extended met= adata, auth plugins) will be moved to separate PRs.
=C2=A0
Extended Metadata:
It looks like you have some formatting issues there. I don't
understand what this feature is, how to use it, nor how it will be
exposed. It needs more explanation.

In the table, you mention that it doesn't require any user-facing
changes, but it does. The returned structure is extended with two new
fields from what I can see. Additionally, the PoC changes existing
field values too.

Extended metadata wil= l go to a separate PR - we only need to map MYSQL_TYPE_LONGTYPE=C2=A0+ json= to MYSQL_TYPE_JSON. and MYSQL_TYPE_BLOB=C2=A0+ vector to MYSQL_TYPE_VECTOR= .=C2=A0
=C2=A0

----------------------------

Prepared Statement Metadata Caching:
In theory I am all for adding transparent improvements and they don't usually require an RFC, but in this case, I don't understand how the caching will be done. Are you saying the user will need to cache the
metadata somewhere? Is the cache shared across requests? How will the
cache be invalidated/cleaned?
Right this will also be = implemented via PR

From the implementation, it seems like the cache is only per prepared
statement, i.e. only the first execute() call will fetch the metadata.
If that's all it is then it can be made into a simple PR without RFC. But in the table you say "Avoids resending metadata for repeated
prepares" which confuses me.
=C2=A0
The= caching mechanism is designed to be transparent within the mysqlnd layer. = Historically, MySQL and older MariaDB versions sent metadata during the PRE= PARE phase and again for every subsequent EXECUTE.

MariaDB has optim= ized this: the server now only sends metadata during EXECUTE if the table s= tructure or definition has changed. By implementing caching in mysqlnd, we = can skip the redundant allocation and parsing of metadata packets. This is = managed internally by the driver and does not require the user to manually = clean or invalidate the cache.


> Performance Improvements: Existing PHP functions that prepare and exec= ute statements, such as mysql_execute_query() (introduced in PHP 8.2), coul= d be updated to use execute_direct when connected to a MariaDB server. This= would allow one-time statements to bypass the extra round-trip and metadat= a fetch, providing immediate performance gains.

Does this refer to caching? How does caching improve the performance
of a one-time executed prepared statement? If that's not what it
refers to, then what did you mean?

Sorr= y I forgot to remove this before I published it. MariaDB supports direct ex= ecution by sending prepare and execute together - this doesn't seem to = work with php_streams.=C2=A0

----------------------------

Progress Indication:
I'm quite worried about this one. PHP doesn't have native asynchron= ous
support yet. How do you intend for this to work when a built-in PHP
function can be suspended and possibly interrupted completely while
the MySQL is processing the request?

To= clarify: this is not asynchronous in the Fiber sense. It is a synchronous = "keep-alive" mechanism. During long-running operations (like ALTE= R TABLE, LOAD DATA, or complex INSERT...SELECT statements), the server send= s "progress" packets.

Currently, mysqlnd ignores these, wh= ich can lead to connection timeouts or a lack of feedback for the user. By = supporting these packets, the driver stays active, preventing timeouts and = allowing the application to provide real-time feedback to the user or logs,= ensuring the client knows the server is still alive and working.
=C2=A0

----------------------------

PDO Considerations:
Why is PDO out of scope? PDO is the main database extension in PHP. If
any new feature is added, then it should probably be exposed via PDO
first. What do you mean by "PDO_MySQL currently does not support array=
binding or bulk execution semantics"?

<= div>PDO is currently out of scope only to ensure a manageable and focused R= FC. Implementing these features in mysqlnd provides the necessary binary fo= undation. A future "PDO Bulk RFC" can then leverage this work acr= oss all drivers (PostgreSQL, SQLite, etc.) without needing to reinvent the = protocol-specific logic for the MySQL/MariaDB ecosystem. We are building th= e engine first; the unified PDO interface can follow.=C2=A0

----------------------------

Server capabilities:
I don't think it's more robust to detect MariaDB server by the lack= of
CLIENT_LONG_PASSWORD than it is by the server_version. What if MySQL
decides to remove that flag in the future? What if derived products
don't follow the same convention?

Why is it necessary to expose this information to the PHP user via
server_capabilities and ext_server_capabilities? I can't find anywhere<= br> that you explain what information is presented in these properties or
how to use it.

There is a fundamental d= ifference between what a server might do (Version) and what it can do (Capa= bilities). A version string is a static label. Capability flags, however, r= eflect the server=E2=80=99s current build-time features and runtime configu= ration.

Exposing ext_server_capabilities allows developers and frame= works to implement reliable feature detection. Without these flags, a devel= oper cannot safely use functions without risking a fatal error or needing t= o wrap every call in a try-catch block.

=C2=A0-----------= -----------------

Memory Consumption:
Why is using generators going to use less memory? How does the new
function use generators that it's able to provide performance
improvements without reading all of the data from the generator
beforehand? This needs to be clearly explained so that it can be
explained in the PHP manual later. What is the difference between
these two:

```
$stmt->executemany("sd", $generator);
$stmt->executemany("sd", iterator_to_array($generator));
```

The performance improvement is not derived from the G= enerator itself, but from the reduction of network round-trips.

- st= andard foreach + execute(): 1,000 rows =3D 1,000 "Send" packets += 1,000 "Response" packets.
- execute_many(): The driver pulls = rows from the iterable and serializes them into a large internal network bu= ffer. 1,000 rows =3D 1 large Send packet + 1 Response packet.

Regard= ing iterator_to_array($generator): This forces PHP to allocate memory for e= very single row simultaneously. If you are importing 1 million rows, PHP mu= st allocate memory for 1 million arrays before the first byte is ever sent.= By using the Generator directly, we maintain O(1) memory usage, fetching a= nd serializing only one row at a time.

=C2=A0
----------------------------

mysqli_stmt::executemany implementation:
You only mention the OO-style method in the RFC. Will the procedural
style counterpart also be implemented?

= yes - OO and procedural functions use the same code.=C2=A0

The name doesn't follow mysqli naming standard. It should be called
mysqli_stmt::execute_many. However, I wonder what made you land on
this name instead of mysqli_stmt::bulk_execute?

There should be no $types parameter IMHO. The data should either
always be treated as text, or it should be bound using the PHP type,
i.e. autodetected. I believe it was a mistake that bind_param() has a
$types parameter. All it serves is to cast the data silently into the
specified type. It has been a source of many silent bugs and a lot of
confusion. Additionally, giving users more type choices in
executemany() will not only create an inconsistency with bind_param(),
but also confuse PHP users who don't know if a variable is 8, or 16, or 32, or 64-bit.

You are absolutely ri= ght=E2=80=94the name should follow the underscore convention, so it will be= execute_many. Regarding the parameter order and the $types question, the p= roposed signature could be:

public mysqli_stmt::execute_many(iterabl= e $data, ?iterable $control =3D null, ?string $types =3D null): bool
pub= lic mysqli_stmt_execute_many(mysqli_stmt $stmt, iterable $data, ?iterable $= control =3D null, ?string $types =3D null): bool

By placing $control= and $types at the end as optional parameters, we can leverage PHP 8 named = parameters to resolve the conflict between ease-of-use and technical optimi= zation.

1. The "All-Strings" Path (Implicit Autodetection)=

To address your concern about autodetection: providing $types is no= t mandatory. If the user simply calls $stmt->execute_many($data), the dr= iver will implicitly treat all scalar values as strings. Since MariaDB and = MySQL handle implicit casting efficiently, this provides the seamless "= ;it just works" experience you requested for the majority of use cases= .

2. The "Strict-Typed" Path (Memory & Bandwidth Optim= ization)

While autodetection is convenient, specifying $types remain= s critical when resource optimization is a factor:

- Binary Efficien= cy: By specifying types like i (LONG) or 1 (TINYINT), we significantly redu= ce the network payload compared to sending everything as strings (up to an = 80% reduction for certain numeric columns).
- O(1) Memory Safety for Str= eams: Because the MariaDB bulk protocol requires type declarations in the p= acket header before the data rows are sent, the driver cannot "autodet= ect" binary widths from a Generator without reading the entire stream = into memory first. Providing the $types string acts as a contract that allo= ws for true, constant-memory streaming.

3. The "Control" P= arameter and Indicators

I agree that the Indicator Enum should be mo= ved to mysqli (mysqli_indicator). The $control array serves three critical = architectural purposes:

- Non-Destructive Transformation: It allows = developers to apply logic (like DEFAULT or NULL overrides) without modifyin= g the original $data array or stream. This is vital if the data source is a= read-only Generator or is being reused elsewhere in the application.
- = Security (Sensitive Data Overwrite): The $control array acts as a secure te= mplate. A developer can pass a raw data array but use the control array to = force a status column to "active" or a last_modified_by column to= a specific ID, ensuring sensitive fields are overwritten regardless of the= input data's content.
-Flexibility: It supports both a Global Mode = (one row that acts as a mask for the entire batch) and a Streaming Mode (an= iterable that advances in parallel with the data).

This design ensu= res that execute_many remains simple for beginners while providing the gran= ular control required by high-performance implementations.


Please be explicit what the failure conditions may be.

Like for all other functions you have to retrieve error mes= sages via mysqli_stmt::error/errno.
=C2=A0
How does get_result() work with executemany()? For example, if I
execute SELECT ?,? with [[1,2], [3,4]] will I get that same structure
in the result set? Is the sequence guaranteed? Is there a way to
distinguish which row came from which input data row? What if one
SELECT produces two rows and the other zero? Will it still be a single
mysqli_result?

As mentioned in the begi= nning of the RFC, execute_many is used for UPSERT (insert, replace,delete a= nd update) operations. However it might contain a result set, e.g. by havin= g a RETURNING clause.
=C2=A0

How does it work with store_result() and also with unbuffered result
sets? I assume it's all treated the same as a normal execute(),
correct?

Yep=C2=A0

How does this affect insert_id? MariaDB documentation says that only
"the first auto-generated ID" is returned, which is contrary to w= hat I
expect: the last. Does the value propagate to PHP? If so, why?

This behavior matches the standard MySQL C API and = protocol where mysql_insert_id() returns the first ID of a multi-row insert= , not the last:

MySQL [test]> trunca= te table t1;
Query OK, 0 rows affected (0.075 sec)

MySQL [test]&g= t; insert into t1 (name) values ('foo'),('bar'),('PHP&#= 39;);
Query OK, 3 rows affected (0.017 sec)
Records: 3 =C2=A0Duplicat= es: 0 =C2=A0Warnings: 0

MySQL [test]> select last_insert_id()\G;<= br>*************************** 1. row ***************************
last_i= nsert_id(): 1
=C2=A0

From what I can see, it is possible to request that MariaDB reply with
individual result sets. Wouldn't that be better? How would that impact<= br> the performance?

If you require the IDs= of every row in a bulk insert, the MariaDB RETURNING clause should be used= :

INSERT INTO t1 (name) VALUES (?) RETURNING id


How about atomicity? If I make 3 INSERTS and the middle one fails,
will the other two execute?

For transac= tional engines like InnoDB, atomicity is guaranteed. If a protocol error oc= curs or a constraint is hit mid-batch, the server handles the rollback, ens= uring the database remains in a consistent state.
=C2=A0

Indicator enum should contain only two values: IGNORE and DEFAULT.
Null is already a PHP data type and does not require an indicator.
None serves no purpose (what would even happen if someone used it).
Adding an Indicator enum will create an inconsistency with the
existing feature set which do not support it.
An enum should be defined by the extension in which it will be used,
i.e. mysqli. The mysqlnd extension shuld not expose any user-facing
APIs.

See my comments about new execute= _many parameters (control). Even None is used to indicate to use the origin= al value from $data block.
NULL and mysqli_indicator::Null are in= deed the same - however it is more consistant with other drivers (e.g. ODBC= and OCI) to have a Null indicator.
=C2=A0

When connected to the MySQL server, how will executemany() behave? I
assume there will be an emulated support for it in mysqlnd, but how
will you ensure consistency between these two implementations? Making
the function no-op with MySQL should be out of the question.

When connected to a server that does not support the = native bulk protocol (such as standard MySQL), execute_many() will transiti= on to a fallback emulation layer. This ensures that the function remains a = reliable, portable API for developers regardless of the backend. However, t= here are inherent limitations to this emulation:

- Limited Indicator= Support: Since the standard MySQL COM_STMT_EXECUTE protocol does not under= stand MariaDB-specific indicators, the fallback will only support mysqli_in= dicator::Null (translated to a standard SQL NULL) and mysqli_indicator::Non= e. Indicators like DEFAULT and IGNORE are technically impossible to impleme= nt in the fallback without complex SQL string manipulation/rewriting, which= would introduce unacceptable CPU overhead.

- Transaction Safety &am= p; Atomicity: In native MariaDB bulk mode, the entire batch is sent as a si= ngle unit. In the fallback emulation, rows are executed one by one. For non= -transactional engines, a failure on row 500 would leave the first 499 rows= committed. To maintain consistency, we should recommend that users wrap ex= ecute_many() in an explicit transaction when portability across MySQL and M= ariaDB is required.

- Performance Delta: While the fallback provides= a consistent API, it does not offer the network round-trip savings of the = native protocol. Each row in the fallback requires a separate request/respo= nse cycle, effectively acting as a "C-speed" foreach loop.
<= div>=C2=A0
/Georg

--
Georg Richter, Staff Software Engineer
Client Connectivity
MariaDB Corporation Ab
--0000000000005ac1bd064f6aa5bf--