Newsgroups: php.internals
List-Id: internals.lists.php.net
References: <51122d7e-f218-4243-bbb8-dd59de2165b4@app.fastmail.com> <5817daaf-813c-430e-a12f-32908d1f7ec7@gmail.com>
In-Reply-To: <5817daaf-813c-430e-a12f-32908d1f7ec7@gmail.com>
Date: Wed, 6 Mar
 2024 09:22:28 +0900
Subject: Re: [PHP-DEV] [Discussion] grapheme cluster for str_split function
To: internals@lists.php.net
From: youkidearitai@gmail.com (youkidearitai)

Hi, Larry
Hi, Niels

On Wed, Mar 6, 2024 at 6:47, Niels Dossche wrote:
>
> Hi Larry
> Hi Yuya
>
> So first of all, I meant the error handling in cases like these: https://github.com/php/php-src/pull/13580/files#diff-b8fe038d9d7539694593978bea5605f38dde4bcb6a016865130590e45e23202eR852-R860
> The implementation still returns NULL here, so the signature is still incorrect. Either it should return false to match the other functions, or throw something and not return a value.
>
> On 05/03/2024 18:40, Larry Garfield wrote:
> > On Tue, Mar 5, 2024, at 7:25 AM, youkidearitai wrote:
> >> On Tue, Mar 5, 2024 at 5:52, Niels Dossche wrote:
> >>>
> >>> Hi Yuya
> >>>
> >>> This sounds useful.
> >>>
> >>> I do have a question about the function signature:
> >>> function grapheme_str_split(string $string, int $length = 1): array {}
> >>>
> >>> This always returns an array.
> >>> However, looking at your PR it seems you return NULL on failure, but the return type in the signature isn't nullable.
> >>> Also, from a quick look, it seems other functions return false instead of null on failure. So perhaps the return type should be array|false.
> >>>
> >>> What do you think? :)
> >>>
> >>> Kind regards
> >>> Niels
> >>>
> >>> On 03/03/2024 00:21, youkidearitai wrote:
> >>>> Hi, Internals
> >>>>
> >>>> I noticed PHP does not have a grapheme cluster version of the str_split function.
> >>>> Until now, you had to use \X with the PCRE functions.
> >>>>
> >>>> Therefore, I am trying to create a `grapheme_str_split` function.
> >>>> https://github.com/php/php-src/pull/13580
> >>>> Using ICU, it can split a string into an array per grapheme cluster, including emoji and variation selectors.
> >>>>
> >>>> If it's fine, I'll create an RFC.
> >>>>
> >>>> Regards
> >>>> Yuya
> >>>>
> >>
> >> Hi, Niels
> >>
> >> Thank you for your comment.
> >> Indeed, returning false makes sense.
> >>
> >> Therefore, I changed it to return false for invalid UTF-8 strings.
> >>
> >> Regards
> >> Yuya
> >
> > Many legacy functions return false on error, but that is widely regarded as bad design. Please do not continue bad design.
>
> I agree that returning false on error isn't ideal for exceptional cases, that's what exceptions are for.
> Looking at the other grapheme functions makes me wonder though how consistent this would be, especially w.r.t. intl_get_error_*() and intl_error_name().
>
> >
> > Right now, the best "standard" error handling mechanism available is exceptions. false (or null) can very easily lead to incorrectly using that value as though it were valid, when it's not, which will sometimes cause a fatal error and sometimes cause a security leak.
> >
> > If the input value cannot be logically processed, that's an exception. (Or Error, perhaps.)
> >
> > --Larry Garfield
>
> Kind regards
> Niels

Thank you so much for the advice.
Indeed, the current grapheme* functions seem inconsistent.

Therefore, one option is to throw an exception in the cases that currently return null.
Shall we do so just for the grapheme_str_split function?

Regards
Yuya

-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
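
For illustration, the throw-instead-of-null behavior asked about above could be sketched in userland with PCRE's \X, which is the workaround mentioned at the start of the thread. This is only a rough sketch under assumptions: the name userland_grapheme_str_split and the exception classes are made up for the example and are not what the PR or a future RFC defines, and PCRE's cluster rules may not match ICU's in every case.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland stand-in for the proposed grapheme_str_split().
 * Splits $string into chunks of $length grapheme clusters using PCRE's \X.
 * The function name and exception types are assumptions for illustration.
 */
function userland_grapheme_str_split(string $string, int $length = 1): array
{
    if ($length < 1) {
        throw new ValueError('$length must be greater than 0');
    }

    // With the /u modifier, preg_match_all() returns false on malformed UTF-8.
    $count = preg_match_all('/\X/u', $string, $matches);
    if ($count === false) {
        // Throw rather than return false/null, so the failure cannot be
        // mistaken for a valid (empty) result by the caller.
        throw new InvalidArgumentException(
            'Input could not be processed: ' . preg_last_error_msg()
        );
    }

    // Group consecutive clusters into chunks of $length and join each chunk.
    return array_map(
        static fn (array $chunk): string => implode('', $chunk),
        array_chunk($matches[0], $length)
    );
}

// "e" followed by U+0301 (combining acute accent) stays together as one cluster.
$parts = userland_grapheme_str_split("ae\u{301}b");
var_dump(count($parts)); // int(3)
```

With this shape the declared return type can stay a plain array, and a caller that forgets to handle the error gets an uncaught exception instead of silently iterating over false or null.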