Newsgroups: php.internals
Date: Mon, 7 Jan 2013 20:14:25 +0100
To: Nicolai Scheer
Cc: PHP internals
Content-Type:
text/plain; charset=ISO-8859-1
Subject: Re: [PHP-DEV] File-Paths exceeding MAX_PATH on Windows
From: pierre.php@gmail.com (Pierre Joye)

hi,

On Mon, Jan 7, 2013 at 6:30 PM, Nicolai Scheer wrote:
> Out of the urgent need to access files with a path longer than MAX_PATH on
> Windows, I started some research.
> At first I thought it might be a good idea to write my own stream wrapper
> extension (e.g. file_long://.....) .
>
> Before I started, I tried to find out why those paths don't work in the
> current php code.
>
> According to [1] it is possible to use long paths if the path is prefixed
> correctly, e.g.
>
> \\?\
>
> for a local file path, and
>
> \\?\UNC\
>
> for a UNC path.
>
>
> I checked that fopen() and even open() in fact do work in C code with such
> long paths when using the prefix.
>
> So I bumped up MAXPATHLEN in php.h and tsrm_config_common.h to 32786 and
> recompiled a fresh php 5.3.20.
>
> Surprisingly, a php script using a long path (including the prefix) did
> throw an error.
>
> Tracing that error leads to
>
> plain_wrapper.c:914 expand_filepath ->expand_filepath_ex ->virtual_file_ex
>
> These are the lines that produce the error (tsrm_virtual_cwd.c:1255):
>
> #ifdef TSRM_WIN32
>     if (memchr(resolved_path, '*', path_length) ||
>         memchr(resolved_path, '?', path_length)) {
>         return 1;
>     }
> #endif
>
> Since there's a '?' in the string from the long path prefix,
> virtual_file_ex fails at this point.
> I did not quite understand the rationale behind this check.
> Of course, both checked characters are invalid for a regular file path.
> There seems to be some checking in tsrm_realpath_r()
> for paths like
>
> \\?\Volume{62d1c3f8-83b9-11de-b108-806e6f6e6963}\foo
>
> If I remove those memchr lines, everything magically works, e.g. fopen(),
> file_get_contents(), file_put_contents(), unlink(), rmdir(), mkdir(), etc.
>
> The only thing to do from userspace is to define the path as
>
> $path = "\\\\?\\x:\\long_stuff.......\\.....\\......\\file.txt";
>
> There are a few macros that get irritated (e.g. IS_ABSOLUTE_PATH,
> IS_UNC_PATH) by the double double backslash in the path...

Yes, we do not allow kernel paths, on purpose; see my comment below.

> My questions here:
>
> 1. What is the rationale behind the memchr checks for ? and *? Just
> filtering invalid paths?

When a path gets resolved (symbolic links, junctions and the like), there are
many variations that need to be dealt with.

> 2. Does allowing the "\\?\" prefix to bubble through the stream wrapper
> layer (which effectively makes it usable) break anything?
> 3. If not, is it possible to include this in php 5.3. or php 5.4?

No, not even 5.5 imo, or ever :)

> It would be indeed nice if the "\\?\" prefix was not needed in userspace
> and php would do the work. But just for now I really would like to see php
> support for long paths on windows at all. To my mind the changes needed for
> the prefix workaround are minimally invasive. Correct me if I'm
> wrong :)

I would never expose that prefix to userland; the consequences and how we
have to manage it are way too complicated for a userland scripting language
(even in C apps it is not recommended).

A better solution, which I am working on for previous php versions (incl. 5.5,
as I won't make it in time), is an extension which would override the existing
functions. The next major version (6) will support unicode filenames, which
will solve the horrible 255-char limitation.

Cheers,
--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
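
For readers following the thread, here is a minimal C sketch of the mechanics discussed above. It is not PHP's code and not a proposed patch. has_wildcard() mirrors the quoted memchr check; add_long_path_prefix() and has_wildcard_after_prefix() are hypothetical helpers invented for illustration. The sketch assumes the input is an absolute, backslash-separated path and does not handle the \\.\ device prefix. It only shows why a prefixed path trips the check and what question 2 would amount to; the reply above argues against exposing the prefix in PHP at all.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the quoted check: returns 1 if the path contains '*' or '?'. */
int has_wildcard(const char *path, size_t len)
{
    return memchr(path, '*', len) != NULL || memchr(path, '?', len) != NULL;
}

/*
 * Hypothetical helper (not a PHP or Windows API): copies `path` into `out`
 * with the long-path prefix added.
 *   C:\dir\file      ->  \\?\C:\dir\file
 *   \\server\share   ->  \\?\UNC\server\share
 * Assumes `path` is absolute and uses backslashes only.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int add_long_path_prefix(const char *path, char *out, size_t out_size)
{
    int n;

    assert(path != NULL && out != NULL);

    if (strncmp(path, "\\\\?\\", 4) == 0) {
        /* Already prefixed: copy unchanged. */
        n = snprintf(out, out_size, "%s", path);
    } else if (strncmp(path, "\\\\", 2) == 0) {
        /* UNC path: drop the leading "\\" and use the UNC form. */
        n = snprintf(out, out_size, "\\\\?\\UNC\\%s", path + 2);
    } else {
        /* Local path such as C:\dir\file. */
        n = snprintf(out, out_size, "\\\\?\\%s", path);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/*
 * Hypothetical variant of the check that ignores a leading "\\?\" and
 * only scans the remainder of the path for wildcards.
 */
int has_wildcard_after_prefix(const char *path, size_t len)
{
    size_t skip = (len >= 4 && memcmp(path, "\\\\?\\", 4) == 0) ? 4 : 0;

    return has_wildcard(path + skip, len - skip);
}
```

Passing C:\dir\file.txt through add_long_path_prefix() yields a string that has_wildcard() rejects because of the '?' in the prefix, which is the failure described in the original message; has_wildcard_after_prefix() accepts the same string.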