Newsgroups: php.internals
From: internals@lists.php.net ("Kentaro Takeda via internals")
Reply-To: Kentaro Takeda <takeda@youmind.jp>
Date: Thu, 30 Nov 2023 09:26:36 +0900
Message-ID: <CAON3zHCC8Fx+U+fHTz1B3CkC==b9m-Gdxf96BVi3hPNG1uOY_w@mail.gmail.com>
In-Reply-To: <CAJmy8YG91ki8M-n7id=DUDUsG+1XBvtpSNdf6sJx3Y1B1Rytsg@mail.gmail.com>
To: Hans Henrik Bergan <divinity76@gmail.com>
Cc: Claude Pache <claude.pache@gmail.com>, Kamil Tekiela <tekiela246@gmail.com>, Dusk <dusk@woofle.net>, PHP internals <internals@lists.php.net>
Subject: Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?

> The migration path is to convert the legacy-encoding PHP files to UTF-8.

Please take a look at the following code. It is part of code that I actually maintain on the latest version of PHP.

```php
<?php
pg_connect(/* omission */);
// The database server expects clients to perform queries in SJIS.
// Depending on the settings, it may not be necessary to specify it explicitly.
pg_set_client_encoding('SJIS');
$res = pg_query('select * from 表');
```

Unfortunately, this code breaks if I simply convert it to UTF-8, because the table name in the query would then be sent as UTF-8 bytes over a connection whose client encoding is SJIS.

It is true that, according to the "Usage statistics of character encodings for websites" published by W3Techs, encodings other than UTF-8 are rarely used. However, this is only **within the range that can be observed from the outside as a website**. As the code above shows, PHP covers a much wider area.

Besides external connections, SimpleXML and DOMDocument, for example, also handle character encodings internally, so they can break in the same way as the example above.

As Yuya says, the conversion itself is difficult, and even when the files can be converted, that may not be enough. As a PHP user from a culture that uses multi-byte characters, I ask you to please be aware of this.

Kentaro Takeda
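
The breakage described above can be made concrete with a minimal sketch. It is not taken from the maintained code: it assumes the mbstring extension is available, it makes no database connection, and `hexBytes()` is a hypothetical helper introduced only for this illustration. It compares the bytes of the table name from the query as they would appear in a UTF-8 source file and in an SJIS one.

```php
<?php
// Illustration only: requires the mbstring extension; no database is contacted.

// Hypothetical helper: render a byte string as space-separated hex pairs.
function hexBytes(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

// The table name from the query, written as explicit UTF-8 bytes (U+8868)
// so that the encoding of this file itself does not affect the result.
$utf8Name = "\xE8\xA1\xA8";

// The same name as it would be stored in an SJIS-encoded source file.
$sjisName = mb_convert_encoding($utf8Name, 'SJIS', 'UTF-8');

echo 'UTF-8: ', hexBytes($utf8Name), PHP_EOL; // UTF-8: e8 a1 a8
echo 'SJIS:  ', hexBytes($sjisName), PHP_EOL; // SJIS:  95 5c

// With the client encoding set to SJIS, the server interprets incoming bytes
// as SJIS. After a plain file conversion the literal carries the UTF-8 bytes,
// which are not the SJIS spelling of the table name.
var_dump($utf8Name === $sjisName); // bool(false)
```

The two byte sequences differ in both length and content, so converting the file alone, without also changing what the connection is told to expect, changes the identifier that the server sees.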