Newsgroups: php.internals
From: php@fleshgrinder.com (Fleshgrinder)
To: Anatol Belski, Rasmus Schultz
Cc: PHP internals
Date: Sat, 1 Apr 2017 14:42:41 +0200
Message-ID: <187eb0be-90b9-f7cd-b8bd-888915429796@fleshgrinder.com>
Subject: Re: [PHP-DEV] Directory separators on Windows

On 4/1/2017 2:01 PM, Anatol Belski wrote:
> 1. optionally - yes, otherwise it should do platform default
> 2. no, this kind of operation is a pure parsing, no I/O related checks needed
> 3. irrelevant, but can be defined
>
> Other points yet I'd care about
> - result should be correct for target platform disregarding actual platform, fe target Linux path Windows, or Windows path on Mac, etc.
> - validation, particularly for reserved words and chars, also other platform aspects
> - encodings have to be respected, or UTF-8 only, to define
> - probably should be compatible with PHP stream wrapper namespaces
>
>
> Thanks
>
> Anatol
>

1. How do you envision that? Take the path `/a/b/../c` where only `/a` exists right now: it is unresolvable. Assuming that `../` points to `/a` is wrong if `b/` is a symbolic link that points to `/x/y`.

2. Here I agree, casing cannot be decided without hitting the filesystem. Some filesystems are case-sensitive, some are case-insensitive, and others are configurable.

3. Does not matter for Windows itself; it is case-insensitive.

(I continue the numbering for the points you raised.)

4. How would we go about normalizing a Windows path to POSIX? `C:\a` is not necessarily the same as `/a`, or should it produce `C:/a`?

5. ?

6. I vote for UTF-8 only. We already have locale-dependent filesystem functions, which also makes them kind of weird to use, especially in libraries. Another very important aspect of this point is Unicode normalization forms. Filesystems generally store names as is, which means that we can create two files with the same name, at least by the looks of it, which are actually different ones. Think of `ä`, which can be either the single code point U+00E4 or `a` followed by the combining diaeresis U+0308. It is generally most advisable to stick to NFC, because that is also how users usually produce those characters.

7. ? Just forward, I'd say.

8. Collapse multiple separators (e.g. `a//b` ~> `a/b`).

9. Resolve self-references, unless they are leading (e.g. `a/./b` ~> `a/b` but `./a/b` stays `./a/b`).

10. Trim separators from the end (e.g. `a/` ~> `a`).

-- 
Richard "Fleshgrinder" Fussenegger
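
To make the symbolic link concern in point 1 concrete, here is a minimal sketch. It is written in Python purely for illustration and assumes a POSIX system where the process may create symlinks. The helper name `compare_resolutions` and the directory layout are made up for this example, mirroring the `/a/b/../c` case inside a temporary directory.

```python
import os
import tempfile


def compare_resolutions():
    """Build base/a/b -> base/x/y and resolve base/a/b/../c in two ways."""
    # Resolve the temp dir itself first, so that a symlinked temp location
    # does not interfere with the comparison below.
    base = os.path.realpath(tempfile.mkdtemp())
    os.makedirs(os.path.join(base, "x", "y"))
    os.mkdir(os.path.join(base, "a"))
    # `b` is a symbolic link that points to x/y, as in the mail.
    os.symlink(os.path.join(base, "x", "y"), os.path.join(base, "a", "b"))

    path = os.path.join(base, "a", "b", "..", "c")
    lexical = os.path.normpath(path)    # pure parsing: `b/..` cancels out
    resolved = os.path.realpath(path)   # follows the link before applying `..`
    return base, lexical, resolved
```

The lexical result ends in `a/c` while the filesystem-aware result ends in `x/c`, which is why collapsing `..` without any I/O can silently point at a different location.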
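
The normalization form issue from point 6 can be shown in a few lines. This is a sketch in Python for illustration only; `to_nfc` is a hypothetical helper name, while `unicodedata.normalize` is the standard library call doing the actual work.

```python
import unicodedata

precomposed = "\u00e4"   # `ä` as a single code point
decomposed = "a\u0308"   # `a` followed by a combining diaeresis


def to_nfc(name: str) -> str:
    """Return the NFC form of a file name (hypothetical helper)."""
    return unicodedata.normalize("NFC", name)


# The two spellings render alike but are different strings and different bytes,
# so a filesystem that stores names as is treats them as two files.
same_before = precomposed == decomposed
same_after = to_nfc(precomposed) == to_nfc(decomposed)
```

Here `same_before` is false and `same_after` is true, so normalizing to NFC before comparing or storing names removes the ambiguity.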
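
Points 8 to 10 are purely lexical and could be sketched as follows. This is an illustration in Python, not a proposed implementation. The function name `clean_path` is hypothetical, only a single POSIX-style separator is handled, and a leading run of separators is collapsed to one, which is a simplification. `..` segments are deliberately left alone because of point 1.

```python
def clean_path(path: str, sep: str = "/") -> str:
    """Apply rules 8-10 without touching the filesystem."""
    if path == "":
        return ""
    is_absolute = path.startswith(sep)

    # Rule 8: splitting on the separator and dropping empty parts
    # collapses runs such as `a//b`.
    # Rule 10: the same step discards the empty part left by a trailing separator.
    parts = [part for part in path.split(sep) if part != ""]

    # Rule 9: drop `.` segments, but keep a leading one on a relative path.
    kept = []
    for index, part in enumerate(parts):
        if part == "." and (is_absolute or index > 0):
            continue
        kept.append(part)

    result = sep.join(kept)
    # Restore the root separator that the split removed.
    return sep + result if is_absolute else result
```

With this sketch, `clean_path("a//b")` and `clean_path("a/./b")` both give `a/b`, `clean_path("./a/b")` stays `./a/b`, and `clean_path("a/")` gives `a`.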