Newsgroups: php.internals
From: tjerk.meesters@gmail.com (Tjerk Meesters)
Date: Thu, 20 Nov 2014 04:06:15 +0800
Subject: Re: fgetcsv incompatible with fputcsv
To: Christoph Becker
Cc: PHP Internals
Message-ID: <8E49EAC8-3681-4E36-BDAD-163D292193DD@gmail.com>
In-Reply-To: <546CC4C8.7010001@gmx.de>
References: <82139FDD-8D8B-43D9-B811-ECC1FFE6E8A6@gmail.com> <546CC4C8.7010001@gmx.de>

> On 20 Nov 2014, at 00:26, Christoph Becker wrote:
>
> Tjerk Meesters wrote:
>
>> Hi list,
>>
>> As I was fiddling with CSV data reading and writing I noticed that
>> fgetcsv() is inherently incompatible with fputcsv() when it comes to
>> the enclosure escape character that’s used.
>>
>> Example: http://3v4l.org/LHEZj
>>
>> The above example code demonstrates how, by default, fputcsv() encodes
>> a backslash as is but fgetcsv() will treat that same backslash as the
>> enclosure escape character as well as the enclosure character itself;
>> this is rather surprising behaviour and imho unnecessarily complicated.
>>
>> I would suggest changing the behaviour in such a way that:
>> a) the default enclosure escape character for fgetcsv() is a double
>> quote.
>> b) fgetcsv() only treats the escape character as … an escape character.
>>
>> Due to the kind of change BC can’t be maintained, so I’d propose this
>> change for PHP 7.
>>
>> If anyone has violent objections to the above, or thinks that an RFC
>> should be drawn up, do let me know … otherwise I’ll commit the change
>> into master by next week or so.
>
> Are you aware of ? It seems this
> very inconsistency has been reported a few years ago, but has been
> tagged as "Wont fix" back then.

Actually that bug report seems to suggest that fputcsv() uses backslash
to encode enclosure characters, but AFAICT it doesn’t.

And then there are bug reports like
https://bugs.php.net/bug.php?id=67566 which were fixed but really just
made the situation worse =(

>
> also seems to deal with this
> inconsistency, and had been tagged as "Not a bug".
>
> So maybe an RFC is appropriate?

Yeah, I didn’t realise the can of worms until I opened it; I’ll round up
all the bug reports and run them against whatever RFC I can get my hands
on.

PS: Favourite quote from the semi-authoritative spec of Perl_CSV:
http://rath.ca/Misc/Perl_CSV/CSV-2.0.html#csv:

> Given that the essence of CSV files is simplicity, I have decided to
> reject all escape and escaped characters with the exception of quotation
> marks appearing within quotation marks …

Good times :)

>
> --
> Christoph M. Becker
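
For anyone following along without the 3v4l link, here is a minimal sketch of the kind of round trip that goes wrong. It is not the code behind that link. round_trip() is a hypothetical helper written for this illustration, the sample row is made up, and the second call assumes PHP 7.4 or later, where an empty string as the escape argument is understood to switch the escape mechanism off.

```php
<?php
// Illustrative sketch only; round_trip() is a hypothetical helper, not part of PHP.

/**
 * Write one row with fputcsv(), then read everything back with fgetcsv(),
 * passing the same escape character to both functions.
 */
function round_trip(array $row, string $escape): array
{
    $fh = fopen('php://memory', 'w+');
    fputcsv($fh, $row, ',', '"', $escape);
    rewind($fh);

    $rows = [];
    // Length 0 means "no line length limit".
    while (($read = fgetcsv($fh, 0, ',', '"', $escape)) !== false) {
        $rows[] = $read;
    }
    fclose($fh);

    return $rows;
}

// Made-up sample row: the first field ends in a single backslash.
$row = ['foo\\', 'bar'];

// Backslash as the escape character (the long-standing default):
// does the row survive the round trip unchanged?
var_dump(round_trip($row, '\\') === [$row]);

// Assumption: PHP 7.4+, where '' disables the escape mechanism.
var_dump(round_trip($row, '') === [$row]);
```

With the behaviour described in this thread, the first comparison should come out false: fputcsv() writes the trailing backslash as is, and fgetcsv() then reads it as escaping the closing enclosure. The second comparison is there only to show that the mismatch goes away once neither side gives the backslash a special meaning.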