Newsgroups: php.internals
Message-ID: <9B.45.29773.D7C16305@pb1.pair.com>
To: internals@lists.php.net
Date: Thu, 23 Aug 2012 14:04:41 +0200
References: <503388D3.6060403@hoa-project.net>
Subject: Re: UTF-8 files and include
From: salsi@icosaedro.it (Umberto Salsi)

ivan.enderlin@hoa-project.net ("Ivan Enderlin @ Hoa") wrote:

> Hello,
>
> Some of my users & contributors have met an issue with files containing
> UTF-8 on certain Windows configurations (but they actually did not found
> the difference). Any idea why?
> The issue does not appear on Linux, BSD or Mac OS system, only for
> certain Windows.
>
> What do we need to check? --enable-zend-multibyte, some php.ini magic
> parameters, some ENV variables?
>
> Best regards.
>
> --
> Ivan Enderlin
> Developer of Hoa
> http://hoa.42/ or http://hoa-project.net/
>
> PhD.
> student at DISC/Femto-ST (Vesontio) and INRIA (Cassis)
> http://disc.univ-fcomte.fr/ and http://www.inria.fr/
>
> Member of HTML and WebApps Working Group of W3C
> http://w3.org/
>

If you are experiencing problems with file paths, where a file is found
on one system but not on another (that is, source files containing
non-ASCII chars in either their name or their path), the issue may
depend on differences in the locale configuration.

Under Unix/Linux check the environment variable LC_CTYPE, whose value
might be something like "language_country.UTF-8", where "UTF-8" is the
encoding of file names and paths. File names and file paths in every
include*() and require*() must then be UTF-8 encoded as well.

Windows uses the UTF-16 encoding for file names. Programs unaware of
this encoding (as the PHP interpreter still is) MUST use the current
Windows code page as set in the "Language and Regional Settings" of the
control panel. Typically LC_CTYPE evaluates to something like
"language_country.1252" on systems configured for western countries,
where 1252 is a Windows code page very similar, but not equal, to
ISO-8859-1; so file names and paths passed to require*() and include*()
must be encoded accordingly. Under Windows, UTF-8 IS NOT a valid code
page.

More details about this issue are in my comment on bug 47096:

https://bugs.php.net/bug.php?id=47096

Regards,
 ___
/_|_\  Umberto Salsi
\/_\/  www.icosaedro.it
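
P.S. To make the byte-level difference concrete, here is a minimal
sketch in Python (chosen only because its codec names are explicit; it
is not part of PHP). The file name "café.php" is a hypothetical example.
The sketch shows that the same name becomes different byte strings under
UTF-8 and under code page 1252, and that 1252 and ISO-8859-1 are not
interchangeable.

```python
# Hypothetical include file name with one non-ASCII character (e with acute).
name = "caf\u00e9.php"

# The same name as raw bytes in the two encodings discussed above.
utf8_bytes = name.encode("utf-8")      # bytes held by a UTF-8 encoded source file
cp1252_bytes = name.encode("cp1252")   # bytes expected under code page 1252

# UTF-8 bytes interpreted as code page 1252 spell a different file name.
misread = utf8_bytes.decode("cp1252")

# Code page 1252 and ISO-8859-1 differ: the euro sign exists only in 1252.
euro_cp1252 = "\u20ac".encode("cp1252")
try:
    "\u20ac".encode("iso-8859-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False

# ascii() keeps the output printable on any console encoding.
print(utf8_bytes, cp1252_bytes, ascii(misread), euro_cp1252, euro_in_latin1)
```

In a PHP script stored as UTF-8, the analogous step would presumably be
to convert the path before passing it to include, for instance with
iconv('UTF-8', 'CP1252', $path). This assumes the target system really
uses code page 1252.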