Newsgroups: php.internals
Xref: news.php.net php.internals:98720
Date: Sat, 1 Apr 2017 18:03:10 +0200
Subject: Re: [PHP-DEV] Directory separators on Windows
From: rasmus@mindplay.dk (Rasmus Schultz)
To: PHP internals
Cc: Anatol Belski
In-Reply-To: <187eb0be-90b9-f7cd-b8bd-888915429796@fleshgrinder.com>

10 thumbs up ;-)

But this really demonstrates how badly we need this function - I bet any
number of those points may or may not be covered by any number of
implementations in the wild.

It would be so nice to have this done "right", once and for all.

On Sat, Apr 1, 2017 at 2:42 PM, Fleshgrinder wrote:

> On 4/1/2017 2:01 PM, Anatol Belski wrote:
> > 1. optionally - yes, otherwise it should do platform default
> > 2. no, this kind of operation is pure parsing, no I/O related checks
> >    needed
> > 3. irrelevant, but can be defined
> >
> > Other points I'd care about
> > - result should be correct for the target platform regardless of the
> >   actual platform, e.g. a Linux path on Windows, or a Windows path on
> >   Mac, etc.
> > - validation, particularly for reserved words and chars, also other
> >   platform aspects
> > - encodings have to be respected, or UTF-8 only, to define
> > - probably should be compatible with PHP stream wrapper namespaces
> >
> > Thanks
> >
> > Anatol
> >
>
> 1. How do you envision that? If the path is `/a/b/../c` where only `/a`
> exists right now? It's unresolvable: assuming that `../` points to `/a`
> is wrong if `b/` is a symbolic link that points to `/x/y`.
>
> 2. Here I agree, casing cannot be decided without hitting the
> filesystem. Some are case-sensitive, some insensitive, and others
> configurable.
>
> 3. Does not matter for Windows itself, it is case-insensitive.
>
> (I continue the numbering for the points you raised.)
>
> 4. How would we go about normalizing a Windows path to POSIX? `C:\a` is
> not necessarily the same as `/a`, or should it produce `C:/a`?
>
> 5. 👍
>
> 6. I vote for UTF-8 only. We already have locale dependent filesystem
> functions, which also makes them kind of weird to use, especially in
> libraries. Another very important aspect to take care of on this point
> is normalization forms. Filesystems generally store stuff as is, which
> means that we can create two files with the same name, at least by the
> looks of it, which are actually different ones. Think of `ä` (U+00E4),
> which can also be `ä` (`a` followed by U+0308). It is generally most
> advisable to stick to NFC, because that is also how users usually
> produce those chars.
>
> 7. 👍 just forward I'd say.
>
> 8. Collapse multiple separators (e.g. `a//b` ~> `a/b`).
>
> 9. Resolve self-references, unless they are leading (e.g. `a/./b` ~>
> `a/b` but `./a/b` stays `./a/b`).
>
> 10. Trim separators from the end (e.g. `a/` ~> `a`).
>
> --
> Richard "Fleshgrinder" Fussenegger
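
Points 8-10 in the quoted list, together with the NFC suggestion in point 6, are purely lexical, so they can be sketched without touching the filesystem. What follows is only a minimal illustration, written in Python for brevity; it is not a proposed PHP API. The name `normalize_path` is hypothetical, only `/` is treated as a separator, and drive letters, UNC prefixes, and stream wrapper schemes such as `file://` are deliberately not handled. `..` segments are left alone, for the symlink reason given in point 1.

```python
import unicodedata


def normalize_path(path: str) -> str:
    """Lexically normalize a '/'-separated path (hypothetical helper).

    No filesystem access is performed, so '..' is never resolved.
    """
    # Point 6: compose to NFC so 'a' + U+0308 and U+00E4 become identical.
    path = unicodedata.normalize("NFC", path)

    is_absolute = path.startswith("/")
    kept = []
    for part in path.split("/"):
        if part == "":
            # Points 8 and 10: empty segments come from repeated or
            # trailing separators, so dropping them collapses/trims.
            continue
        if part == "." and (kept or is_absolute):
            # Point 9: drop self-references, except a leading one
            # on a relative path ('./a/b' stays './a/b').
            continue
        kept.append(part)

    return ("/" if is_absolute else "") + "/".join(kept)


if __name__ == "__main__":
    for example in ["a//b", "a/./b", "./a/b", "a/", "/a/b/../c"]:
        print(example, "~>", normalize_path(example))
```

Even this small sketch has to make choices the thread has not settled, for instance whether a leading `//` should be preserved, which is one more argument for defining the behaviour once in a shared function.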