Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:115387
Return-Path: <danack@basereality.com>
Delivered-To: mailing list internals@lists.php.net
Received: (qmail 33626 invoked from network); 11 Jul 2021 16:46:34 -0000
Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5)
  by pb1.pair.com with SMTP; 11 Jul 2021 16:46:34 -0000
Received: from php-smtp4.php.net (localhost [127.0.0.1])
	by php-smtp4.php.net (Postfix) with ESMTP id 75C611804C4
	for <internals@lists.php.net>; Sun, 11 Jul 2021 10:09:37 -0700 (PDT)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE
	autolearn=no autolearn_force=no version=3.4.2
X-Spam-Virus: No
X-Envelope-From: <danack@basereality.com>
Received: from mail-ua1-f54.google.com (mail-ua1-f54.google.com [209.85.222.54])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(No client certificate requested)
	by php-smtp4.php.net (Postfix) with ESMTPS
	for <internals@lists.php.net>; Sun, 11 Jul 2021 10:09:37 -0700 (PDT)
Received: by mail-ua1-f54.google.com with SMTP id n61so6087946uan.2
        for <internals@lists.php.net>; Sun, 11 Jul 2021 10:09:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=basereality-com.20150623.gappssmtp.com; s=20150623;
        h=mime-version:references:in-reply-to:from:date:message-id:subject:to
         :cc;
        bh=bAh/GNuCdKfp/3ou8/+PLEGrDLLGj1iXwJCJHmMX4/E=;
        b=SvauNz8p0qfc6MXF8zlonJw5wPRTd21A7K5YZhMFcz5Xpkei7yuX883kINIkouH1w6
         F9t0EZfpev+Od4/XGICUcK0oAiCzgAAnqGX3QRjNp9PxtN/LxSOFCeioFVyzEinzTv6G
         /wqOSlht6CqQxlrh/lXaH7iSC9Q5GYwbyqPrims9sujj2pw45vfmzLETEJTZsot4HJWD
         GPjXqLVhPbJMSpbmxKvkpCEZYxmvQQimnFNulOYBOS/vpSQvTiu/hEyqCAwPGhJfZgSc
         HNhNSiKx0nag0Gk1wF6Z6sJg1Cxx/Nc0fPK1OXaZj0e/EC+S78xdCEdMmZkW8fKyAS1p
         JVVg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to:cc;
        bh=bAh/GNuCdKfp/3ou8/+PLEGrDLLGj1iXwJCJHmMX4/E=;
        b=tNV6niaSYeHhhGUEpQiynKlg+UZvwfeWRa+vsKaO7btLXIu+6C87ABW5CyUIZcErcd
         TVWjDYmC5Zyhn6lW/WZZxCJ+sI9K4sgP2KbKBYfr8PR6WtJVvRPkT8KhH03DpsEP7XjO
         SjLkJjCzTCcwJ2iZMdO17mipH8V04Mun/624SJmsYBcdcKUegiKU8Pp3uXnIaaeJ5eRt
         sCEdLMiWqndgaSM9VIHr/gKF/8Lb5SSxr/u8p0BvpbfZEle+yrbV5QgVzjxGcayUvOBz
         6acofwJu4kL9M5dHzQiF5462CZiXllLJfMaeWv2TMZpIW+unrpw48k/kYv9hjEJYYNxa
         e+hw==
X-Gm-Message-State: AOAM5322N0CcSJi0fOlx+DYiWReHYApbvfWtJb8OAgMPag7VaYVC9NkU
	Di9p1WNYA8MxcN0eryGFArD16Ka5RYPZuEVoAa25YA==
X-Google-Smtp-Source: ABdhPJxQf/ZAP7QAAdp8c1k6wtQ0IyGpucMxhqLxaBxARPvfZGCFct/4J6kYKgxT594BpTNoytXw63uXLW+aK/Nm3s8=
X-Received: by 2002:a9f:326c:: with SMTP id y41mr35142070uad.52.1626023375509;
 Sun, 11 Jul 2021 10:09:35 -0700 (PDT)
MIME-Version: 1.0
References: <CAFv4g+HRHrvCOTgpbVfCXj06De1kL+1+RwnGgq6tcyLNAtLfQA@mail.gmail.com>
 <CAFPFaMJs7qvko=tfBsXuwCqMV3VqXL4TjO2aKWWPjrAicZzM9w@mail.gmail.com> <CAFv4g+G85cfjykjKRbOwWg+KXctZ1u93Onu5cLh1bNrtQ0=DPg@mail.gmail.com>
In-Reply-To: <CAFv4g+G85cfjykjKRbOwWg+KXctZ1u93Onu5cLh1bNrtQ0=DPg@mail.gmail.com>
Date: Sun, 11 Jul 2021 18:09:24 +0100
Message-ID: <CA+kxMuQkBevK5k-Khhpo3tFfHXeSrM+-GCOWRoZoKJ8Y=_mV4Q@mail.gmail.com>
To: Craig Francis <craig@craigfrancis.co.uk>
Cc: "G. P. B." <george.banyard@gmail.com>, PHP internals <internals@lists.php.net>
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [PHP-DEV] [RFC] [VOTE] is_literal
From: Danack@basereality.com (Dan Ackroyd)

On Tue, 6 Jul 2021 at 10:32, Craig Francis <craig@craigfrancis.co.uk> wrote:
> which will result in too many invalid errors, requiring them to make
> substantial/unnecessary changes

My understanding is that the majority of PHP developers use one of the
many frameworks available. All of those provide libraries that people
use rather than accessing the DB directly.

At the other end of the scale, you have companies with large-ish
engineering teams (e.g. 30+) where access to the DB is done through an
in-house library.

The only people who don't already go through an access layer of code
to reach the DB are people who don't depend on frameworks, and
maintain their own applications with a small (or singular) engineering
team.

This type of person may be over-represented in internals, but are
probably relatively few in number compared to the majority of PHP
users.

> requiring them to make substantial/unnecessary changes

My very strong suspicion is that if/when people start using this, the
majority of changes that would be required would be actual security
problems that ought to be fixed.

> (i.e. replacing every string concat with
> `literal_concat()`, or using a special Query Builder for their
> SQL/HTML/CLI/etc).

Most people should be using a library for building HTML and CLI
escaping, because remembering how to do the escaping properly is
really difficult. I certainly never remember the difference between
the correct escaping for CSS and JS.

But the changes required for supporting DB queries just don't seem
that big to me. I've put a code example of an implementation below.

I'm pretty sure that the changes needed to fix the actual security
problems that exist in their code would be bigger, and also that
tracking down a single leaked non-literal would take longer than
migrating code to use the new feature.

As I said in my reply to Rowan, making it easy to track down issues
where they occur, minimises the cost of using this feature over the
years it would be used.

cheers
Dan
Ack


# Before

$query = "select * from preferences user_id = " . $_GET['user_id'];

db_exec($query);

# After

$query[] = "select * from preferences user_id = ";
$query[] = $_GET['user_id'];

db_exec($query);

And the function db_exec would be changed to something like:

function db_exec(string|array $query_parts)
{
    if (is_string($query_parts) && !is_literal($query_parts)) {
        throw new \Exception("Cannot use non literal string as query.
Please pass the parts in as an array");
    } else {
        foreach ($query_parts as $query_part) {
            if (is_string($query_part) && !is_literal($query_part) {
                throw new \Exception("non-literal string found [$query_part]");
            }
            else if (is_int($query_part)) {
                // todo - decide if you want to allow this or not.
                // I personally wouldn't.
            }
            else {
                // todo - support other types
            }
        }

        $query = implode("", $query_parts);
    }

    // rest of db_exec here...
}