Newsgroups: php.internals
From: andreas@dqxtech.net (Andreas Hennings)
To: "Christoph M. Becker"
Cc: PHP internals
Date: Fri, 22 Sep 2017 02:55:26 +0200
Subject: Re: [PHP-DEV] fputcsv() and $escape character
In-Reply-To: <7cf5adb8-0738-259e-6d1e-f966722fdae2@gmx.de>

On Thu, Sep 21, 2017 at 1:43 PM, Christoph M. Becker wrote:
> I don't think the current behavior is a bug, but rather the escape
> character is an extension to the CSV "standard" (RFC 7111).

Are you sure you mean RFC 7111? I was just parroting the number in my previous email, but now I have looked it up and can only find this:
https://tools.ietf.org/html/rfc7111
This talks about URI fragments with CSV, not CSV itself.
RFC 4180 seems to be closer to what we are looking for:
https://tools.ietf.org/html/rfc4180#section-2

fputcsv() and fgetcsv() already have a number of extensions to this format, which I think are not harmful and that we should keep:
- option to choose a different delimiter
- option to choose a different enclosure
- option to have a different number of cells per row
- fgetcsv() has some tolerance for broken CSV, that we should continue to support

In the Stack Overflow discussion, someone argues that line breaks are not part of the standard / not portable:
https://stackoverflow.com/questions/44427926/data-gets-garbled-when-writing-to-csv-with-fputcsv-fgetcsv/46342634#comment75882780_44427926

However, the RFC 4180 that I found clearly mentions them:
> 6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.

So, all we would change is the escape behavior. In RFC 4180, it says:
> If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
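
To make the quoted RFC 4180 rule concrete, here is a minimal sketch in Python. It is not PHP and not what fputcsv() does today. The helper names rfc4180_field() and rfc4180_row() are hypothetical and exist only for this illustration. The only "escaping" is doubling the enclosure, so a backslash is treated as ordinary data.

```python
def rfc4180_field(value: str, delimiter: str = ",", enclosure: str = '"') -> str:
    """Encode a single field the way RFC 4180 describes (hypothetical helper)."""
    # A field needs enclosing if it contains the delimiter, the enclosure
    # itself, or a line break.
    needs_enclosure = any(ch in value for ch in (delimiter, enclosure, "\r", "\n"))
    if not needs_enclosure:
        return value
    # The enclosure is escaped by doubling it. There is no separate escape
    # character, so a backslash passes through unchanged.
    return enclosure + value.replace(enclosure, enclosure * 2) + enclosure


def rfc4180_row(fields, delimiter: str = ",", enclosure: str = '"') -> str:
    """Join encoded fields into one record terminated by CRLF (hypothetical helper)."""
    encoded = (rfc4180_field(f, delimiter, enclosure) for f in fields)
    return delimiter.join(encoded) + "\r\n"


if __name__ == "__main__":
    # A quote inside a field, a backslash followed by a quote, and a line break.
    print(repr(rfc4180_row(['say "hi"', 'a\\"b', "line1\nline2"])))
```

With this rule, a reader that only understands doubled enclosures can parse the output without knowing about any escape character, which is the portability argument for changing only the escape behavior.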