Newsgroups: php.internals
Message-ID: <52FF465E.4040400@lsces.co.uk>
Date: Sat, 15 Feb 2014 10:50:06 +0000
To: Yasuo Ohgaki
CC: PHP internals
References: <52FF3BB7.8030408@lsces.co.uk>
Subject: Re: [PHP-DEV] utf-8 filenames in phar files.
From: lester@lsces.co.uk (Lester Caine)

My previous post did not appear on the list ;)

Yasuo Ohgaki wrote:
> A lot of the current confusion does seem to be based around the Windows
> Wide-API as documented in 'The Problem' section of that document. It would
> seem that my 'naive' view of simply using UTF-8 strings is thwarted by these
> problems?--
>
> Unicode is like one name with several encoding. We cannot get away from
> conversions, normalization especially.

That is why, personally, I'm just looking at UTF-8. That is enough of a
minefield on its own, but since a large swath of what we are working with
now is UTF-8 only, it does seem to be the right base going forward?

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk