Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:106345
MIME-Version: 1.0
References: <CAF+90c_Ctjja=8fR-uBWL1wDxZge771yMujMhaR_R2seKXciYw@mail.gmail.com>
 <CAF+90c8JhV6-aQrsvW8u_xGB-wHmyObEU1Chd64J+x3tkKogQQ@mail.gmail.com>
 <9ADC8994-9D3C-4810-A2DB-6FB81D513098@gmail.com> <CABdc3WqxvucnCk7rd-Y2YuNn_dCvh5atNagwCqo8gp9NjMt_aQ@mail.gmail.com>
 <CANS-=pcp=-t4-cEHK8vbuNAaMBLSjmoL94h-TXKtHm0v8Ptkqg@mail.gmail.com>
 <CALKiJKorD5xayU1QYtd_XMEz7Zkx+etWs3k3DGZS3O9zPnk7mQ@mail.gmail.com> <CABdc3WrgSwFtoxzOpUa=_09N6vPt-CU_qFhCmZM9GVJMMQ7Gow@mail.gmail.com>
In-Reply-To: <CABdc3WrgSwFtoxzOpUa=_09N6vPt-CU_qFhCmZM9GVJMMQ7Gow@mail.gmail.com>
Date: Tue, 30 Jul 2019 16:08:07 +0200
Message-ID: <CAF+90c82zYeUkn3QRiPDFzf4B3cdF188TFGOMMXjwMAJ1UW6DQ@mail.gmail.com>
To: Michał Brzuchalski <michal@brzuchalski.com>
Cc: Rowan Collins <rowan.collins@gmail.com>, PHP internals <internals@lists.php.net>
Content-Type: multipart/alternative; boundary="000000000000aefca7058ee68a0b"
Subject: Re: [PHP-DEV] Re: [RFC] Namespace-scoped declares, again
From: nikita.ppv@gmail.com (Nikita Popov)

--000000000000aefca7058ee68a0b
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Thanks everyone for your responses!

I think the discussion revolves around two primary concerns, so let me
address them in turn.

The first is the general approach of using declares as a language-evolution
mechanism. The concern here is that each additional declare fragments the
language and increases the number of possible combinations of options.

What I ultimately want to achieve here is a way to evolve the language and
fix long-standing issues without breaking backwards compatibility or
causing ecosystem fragmentation. The only way we currently have to address
(nowadays) undesirable behavior is through deprecation and subsequent
removal. As people like to regularly remind me, this has a high cost on the
ecosystem, because millions of codebases that were running without a glitch
need to be updated, which not only takes a lot of effort, but also delays
adoption of new PHP versions for everyone. As such, the "deprecation and
removal" approach has to work over long time-frames and is only really
applicable to rather "minor" issues in the first place.

If we want to evolve the language without breaking backwards compatibility,
we need to provide a way for the ecosystem to migrate gradually: A library
should be able to opt in to breaking changes while remaining usable by
downstream consumers. Conversely, an application should be able to opt in
to breaking changes while still being able to use an older library.

To achieve this, I believe it is unavoidable to have *some* kind of
mechanism to affect language-behavior on a per-library/project level. Of
course, the devil tends to be in the details...

What this RFC originally proposed is a fine-grained approach, where
individual changes are controlled by separate declare directives. However,
this is not the only possibility: As has been recently popularized with
Rust editions (https://doc.rust-lang.org/edition-guide/editions/index.html),
a coarse-grained approach where multiple changes have to be enabled
together as part of an "edition" is also possible.

The advantage of the coarse-grained "edition" approach is that it avoids a
combinatorial explosion of options: It's all or nothing, and it is easier
to keep in mind that a project uses "PHP 2020" rather than some specific
combination of declares.

The advantage of a fine-grained approach is that it also allows a
fine-grained migration. As a statically-typed language, Rust can provide
fairly reliable tooling to perform edition migrations. While such tooling
also exists for PHP (e.g. Rector), it does not have the same level of
reliability, especially for codebases that do not make pervasive use of
type annotations. Fine-grained declares allow a code-base to be updated one
step at a time.

It is possible to combine both approaches by providing both fine-grained
control and an overall "edition" that enables a larger set of language
declares. The end goal should be to move to the next edition, but
individual declares may be used during the migration, or to opt-out a
section of code. This is probably my preferred approach.
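
To make the combination concrete, here is a minimal sketch of how an
edition and individual declares could resolve to one effective set. It is
an illustration under assumptions, not part of the proposal: the edition
name "2020", the declare "no_dynamic_properties" and the function
resolve_declares() are all hypothetical (only strict_types is a real
directive today).

```php
<?php
// Hypothetical edition table: an edition is just a named bundle of declares.
// Only strict_types exists in PHP today; everything else is made up.
const EDITIONS = [
    '2020' => ['strict_types' => 1, 'no_dynamic_properties' => 1],
];

/**
 * Resolve the effective declares for a piece of code: start from the
 * edition's bundle (if any), then apply fine-grained overrides on top.
 */
function resolve_declares(?string $edition, array $overrides = []): array
{
    if ($edition !== null && !array_key_exists($edition, EDITIONS)) {
        throw new InvalidArgumentException("Unknown edition: $edition");
    }
    $base = $edition === null ? [] : EDITIONS[$edition];
    // array_replace() lets later arrays win, so an explicit declare
    // overrides whatever the edition would have enabled.
    return array_replace($base, $overrides);
}

// A section of code that is on the edition but opts out of one change:
$effective = resolve_declares('2020', ['no_dynamic_properties' => 0]);
```

Passing null as the edition with a few overrides models a codebase that is
still mid-migration and enabling changes one at a time.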

I should probably also highlight that this is somewhat different from the
existing strict_types directive: strict_types was only in part a mechanism
to control BC breakage (with regard to internal functions); in large part
it exists because we couldn't agree on which semantics are preferable.

This is not what I'm going for here. I don't want declares to become a way
to resolve disagreements by just providing both options. Instead a declare
represents a change that we *want* to make and that codebases *should* make
eventually, but that is opt-in to maintain backwards compatibility and
library interoperability.

---

The second concern is around the technical details of opting in to
BC-breaking language changes on the library level. Here is an overview of
some proposals that have been made:

1. Keep declares per-file. This is clearly incompatible with any
fine-grained (or optionally fine-grained) approach, because declares have
to be replicated across hundreds of files. I think this is a bad choice
also for a coarse-grained approach (or even for the existing strict_types
directive), because in all cases I've seen people want to enable the option
for the whole library, not individual files.

Replicating declares per-file is error-prone (I regularly forget to add
strict_types declarations to newly created files) and complicates the
mental model of the programmer. While ostensibly per-file declares make
things explicit, I think the reality is that nobody actually double-checks
declares in each file they open and will instead assume that the project
default holds.

2. Support per-namespace declares. This is what I originally proposed. This
is based on the premise that a library will usually correspond to a
namespace. This approach has been extensively discussed in this thread -- I
think the main issue is that the premise just doesn't reliably hold up in
practice, e.g. because multiple packages publish under the same namespace.

3. Support per-directory declares, which is the direction I was planning to
explore next. This is based on the premise that all library files are part
of some top-level directory, which I think is a fairly safe premise (note
that the "directory" could also be a phar file).

The actual intended use (similar to the namespace-based variant) is that
people will specify their declares in the composer.json file, and composer
then includes a call to declare_directory() or similar as part of the
autoloader. Projects not using composer have the choice of issuing an
explicit call.
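
As a rough userland model of the per-directory idea (a sketch only, since
declare_directory() does not exist and its real semantics are undecided),
the class below maps directory prefixes to declares and looks up the
declares that would apply to a given file. The class and method names are
hypothetical, and the rule that the most specific directory wins is an
assumption of this sketch.

```php
<?php
// Userland stand-in for the proposed mechanism. DirectoryDeclares and its
// methods are hypothetical names; nothing like this exists in PHP itself.
final class DirectoryDeclares
{
    /** @var array<string, array<string, int>> directory prefix => declares */
    private $byDirectory = [];

    public function declareDirectory(string $dir, array $declares): void
    {
        // Normalize to a trailing slash so "/lib/foo" does not also
        // match files under "/lib/foobar".
        $this->byDirectory[rtrim($dir, '/') . '/'] = $declares;
    }

    public function declaresFor(string $file): array
    {
        $best = '';
        foreach (array_keys($this->byDirectory) as $dir) {
            // Assumption: the longest (most specific) matching prefix wins.
            if (strncmp($file, $dir, strlen($dir)) === 0
                && strlen($dir) > strlen($best)) {
                $best = $dir;
            }
        }
        return $best === '' ? [] : $this->byDirectory[$best];
    }
}

// Roughly what a generated autoloader might register up front:
$registry = new DirectoryDeclares();
$registry->declareDirectory('/app/vendor/acme/lib', ['strict_types' => 1]);
$applied = $registry->declaresFor('/app/vendor/acme/lib/src/Foo.php');
```

The sketch sidesteps the ordering question: in a real implementation the
registration would have to happen before any file from that directory is
compiled.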

4. Specify declares in a special file, similar to how INI directives are
declared. The suggestion here has been that PHP could scan the path of an
included file upwards to find a declares.json (or similar).

The main advantage I see here (over a declare_directory() function) is that
there are no loading order issues. declare_directory() needs to be called
before any files from that directory have been included (which is part of
why an integration into the composer autoloader is useful), while for a
separate and implicitly processed file this falls out naturally.

Apart from that, I'm not a big fan of this proposal, mostly because of the
implicit loading it entails. I also don't think that having one more
configuration file for this buys us anything over declaring things in
composer.json.
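
For illustration, the upward scan could look roughly like the sketch below.
This is a userland approximation under assumptions: the file name
declares.json comes from the suggestion above, while find_declares_file()
and its injectable $exists callback are hypothetical and exist only so the
lookup can be tried without touching the real filesystem.

```php
<?php
/**
 * Walk upwards from the directory containing $file and return the path of
 * the nearest declares.json, or null if none is found before the root.
 */
function find_declares_file(string $file, ?callable $exists = null): ?string
{
    $exists = $exists ?? 'is_file';
    $dir = dirname($file);
    while (true) {
        $candidate = rtrim($dir, '/') . '/declares.json';
        if ($exists($candidate)) {
            return $candidate;
        }
        $parent = dirname($dir);
        if ($parent === $dir) {
            return null; // dirname() no longer changes: we are at the top
        }
        $dir = $parent;
    }
}

// Simulated filesystem with a single declares.json at the package root.
$known = ['/srv/app/vendor/acme/declares.json'];
$exists = function (string $path) use ($known): bool {
    return in_array($path, $known, true);
};
$found = find_declares_file('/srv/app/vendor/acme/src/Util/Foo.php', $exists);
```

Every included file triggers such a walk (or a cache lookup), which is the
implicit loading referred to above.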

5. Introduce a first-class module/package concept and support per-package
declares. This is arguably the closest fit for what is needed, but also the
most complex solution. This is a fairly big problem space and I personally
do not want to pursue this outside a certain narrow scope.

In particular I have serious doubts about retrofitting (at this point in
time) an invasive module system that involves explicit export and import
management, along the lines of what Michal is describing. (Though I will be
happily surprised if someone comes forward with a proposal to do this in a
non-invasive way.)

What I think might be worth pursuing though, is a much weaker package
notion that essentially grants some language-integration to the existing
notion of composer packages. Instead of having a declare_directory() we'd
have declare_package(), which is bound to a certain path and can be used to
specify declares, but also used for other purposes, such as package-private
visibility.

If I may make another Rust analogy, this would be more like Rust crates
than Rust modules: it operates at a more coarse-grained level and is fairly
tightly integrated with the package manager (but of course still usable
without it).

Regards,
Nikita


On Tue, Jul 30, 2019 at 12:14 PM Michał Brzuchalski <michal@brzuchalski.com>
wrote:

> Hi Rowan,
>
> On Tue, 30 Jul 2019 at 10:48, Rowan Collins <rowan.collins@gmail.com>
> wrote:
>
> > I think there's some confusion here, because establishing the concept of a
> > package as separate from a namespace is exactly what I proposed.
> >
> > Here's a previous message (technically in the same thread, but from 18
> > months ago) where I also mentioned class visibility:
> > https://externals.io/message/101323#101390
> >
>
> Was thinking about similar, a package with own identity and a way to
> declare autoload and other stuff.
> Was even thinking it could use a double colon which I've proposed way back
> in the same thread and
> with a delimiter in name all related symbols could be stored in package
> individual symbol tables,
> it won't collide with namespaced and global ones and would be easier to
> detect if tried to use an internal symbol
> in another context like other package or in global code.
> It could introduce a few more keywords like:
> * "package" - for declaring package name and declares,
> * "export" - for explicit declare of publicly available symbols which then
> could be detectable etc. for visibility features,
> * "expect" - for explicit declare of required dependencies
>
> Last two are for future features and the first one could be enough for
> shaping how it could look like.
> For eg. some of my thoughts
> https://gist.github.com/brzuchal/c45010f0dd20642b470eeee8b9c56c5f
>
> I know it's out of the main topic but IMO we should start another one and
> I'm pretty sure I've mentioned that earlier.
> If we wanna shape package for PHP then the separate discussion could be a
> good idea.
>
> --
> regards / pozdrawiam,
> --
> Michał Brzuchalski
> about.me/brzuchal
> brzuchalski.com
>

--000000000000aefca7058ee68a0b--