Newsgroups: php.internals
Xref: news.php.net php.internals:121831
Date: Tue, 28 Nov 2023 21:47:59 +0100
From: divinity76@gmail.com (Hans Henrik Bergan)
To: Claude Pache
Cc: Kamil Tekiela, Dusk, PHP internals
Subject: Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?

> What is the migration path for legacy code that use those directives?

The migration path is to convert the legacy-encoding PHP files to UTF-8.
Luckily this can be largely automated. Here is my attempt:
https://github.com/divinity76/php2utf8/blob/main/src/php2utf8.php
That code definitely needs some proof-reading and additions, though. I don't know if the approach used is even a good one; it was just the first I could think of. Feel free to write one from scratch.

> Can you share a little more details about how this works?

I hope someone else can do that, but it allows PHP to parse and execute scripts that are not written in UTF-8, as well as scripts that start with a BOM (byte order mark).

> add that what's special about UTF-8 isn't that it's "fixed-endian".

One of several good things about UTF-8 is that it is fixed-endian: UTF-8 doesn't need a BOM to specify endianness, unlike UTF-16 and UTF-32, which can be either big-endian or little-endian and where a BOM helps identify which byte order is in use.

> If the solution is as easy as just converting the encoding of the source file, then why did we even need to have this setting at all? Why did PHP parser support encodings that demanded the introduction of

I've read your question but don't have an answer to it; hopefully someone else knows.

On Tue, 28 Nov 2023 at 21:09, Claude Pache wrote:
>
> > On 28 Nov 2023 at 20:56, Kamil Tekiela wrote:
> >
> >> Convert your PHP source files to UTF-8.
> >
> > If the solution is as easy as just converting the encoding of the
> > source file, then why did we even need to have this setting at all?
> > Why did PHP parser support encodings that demanded the introduction of
> > this declare?
>
> It is not necessary as simple: because your code base may contain literal strings, and changing the encoding of the source file will effectively change the contents of the strings.
>
> -- Claude
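
To make Claude's point about literal strings and the BOM/endianness remarks concrete, here is a minimal sketch. It is written in Python purely for illustration and is not how php2utf8 works; the helper names (`detect_bom`, `transcode_source`) and the one-line sample script are assumptions made up for this example. It converts source bytes to UTF-8 and shows that the bytes inside a string literal change along the way.

```python
import codecs

# Known BOMs and the encodings they imply. UTF-32 is listed before UTF-16
# because the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(raw: bytes):
    """Return (encoding, bom_length) if raw starts with a known BOM, else None."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return encoding, len(bom)
    return None


def transcode_source(raw: bytes, fallback_encoding: str) -> bytes:
    """Re-encode source bytes as UTF-8 without a BOM.

    A BOM, when present, decides the source encoding; otherwise the caller
    must supply it (hypothetical parameter: fallback_encoding).
    """
    detected = detect_bom(raw)
    if detected is not None:
        encoding, bom_length = detected
        text = raw[bom_length:].decode(encoding)
    else:
        text = raw.decode(fallback_encoding)
    return text.encode("utf-8")


# Hypothetical one-line script saved as ISO-8859-1.
legacy_source = "<?php echo strlen('é');".encode("iso-8859-1")
utf8_source = transcode_source(legacy_source, "iso-8859-1")

# Same character, different bytes inside the string literal.
legacy_literal = "é".encode("iso-8859-1")
utf8_literal = "é".encode("utf-8")
print(legacy_literal.hex())  # e9
print(utf8_literal.hex())    # c3a9

# UTF-16 has two byte orders; UTF-8 has only one.
print("é".encode("utf-16-le").hex())  # e900
print("é".encode("utf-16-be").hex())  # 00e9
```

In PHP terms, `strlen()` counts bytes, so the literal in the sample script would report a length of 1 before conversion and 2 after it. That is the kind of behaviour change an automated converter cannot hide, and why converted code bases still need review wherever literal strings are compared, hashed, measured or written out byte for byte.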