Hi everyone,
This is the tiniest of issues, but it's bugged me for a long time and
makes the HTML produced by PHP code less readable than it ought to be.
Specifically, PHP ignores a newline immediately following a ?> tag. The
reason for this is, from what I recall, to prevent issues where
whitespace at the end of a PHP file is echoed before headers can be
sent. On UNIX in particular, all text files (should) end in a newline,
so this is a reasonable and necessary feature.
However, for ?> tags anywhere that aren't right at the end of the file,
this is just a nuisance that makes for messy output. For example, HTML
output that should look like:
<table>
    <tr>
        <td>foo</td>
        <td>bar</td>
    </tr>
</table>
May instead end up looking something like:
<table>
    <tr>
        <td>foo</td>
        <td>bar</td>
    </tr></table>
Of course, HTML doesn't matter so much, it'll render the same to the
end-user. However, for outputting e.g. plain text, newlines can be
significant, and so you have to insert an ugly and surprising extra
newline following a tag.
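For anyone who wants to see the plain-text case for themselves, here is a minimal sketch. The render_template() helper is hypothetical (it is not part of PHP); it just evaluates a template string with eval() and captures the output with output buffering:

```php
<?php
// Hypothetical helper, not part of PHP: render a template string and
// return what it would have echoed.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template); // '?>' makes eval() start in inline-text mode
    return ob_get_clean();
}

// Two lines of plain text, each ending in a closing tag plus a newline.
$template = "Name: <?= 'foo' ?>\nValue: <?= 'bar' ?>\n";
$output = render_template($template);

// Both newlines directly follow a closing tag, so both are swallowed.
var_dump($output); // string(19) "Name: fooValue: bar"
```

The two lines run together, which is the surprise described above.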
Would anyone object to me changing how PHP handles this so that only the
final ?> tag consumes its following newline, and only at the end of the
file?
Thanks!
Andrea Faulds
https://ajf.me/
Would anyone object to me changing how PHP handles this so that only the final ?> tag consumes its following newline, and only at the end of the file?
I object. It's a change in ancient behavior that has the potential to break existing code for superficial reasons.
We'd never design it that way today, but that die is long cast.
-1
-Sara
Hi Andrea,
On 07/09/2017 at 03:45, Andrea Faulds wrote:
Would anyone object to me changing how PHP handles this so that only
the final ?> tag consumes its following newline, and only at the end
of the file?
+1 to create a PHP8 branch and change the behavior there, not in PHP7.
Once again, some may think it's too early but, IMO, we should create
such a branch and encourage RFCs and changes targeting next major
version to be announced, discussed, implemented, and tested as soon as
possible. This is the only way to introduce BC breaks while minimizing
their impact. We saw this when talking about PHP7 features: when
proposed too late, changes introducing BC breaks generally must be
rejected, whatever their value.
Regards
François
On Thu, Sep 7, 2017 at 12:11 PM, François Laupretre francois@tekwire.net
wrote:
+1 to create a PHP8 branch and change the behavior there. not in PHP7.
Once again, some may think it's too early but, IMO, we should create such
a branch and encourage RFCs and changes targeting next major version to be
announced, discussed, implemented, and tested as soon as possible. This is
the only way to introduce BC breaks while minimizing their impact. We saw
this when talking about PHP7 features: when proposed too late, changes
introducing BC breaks generally must be rejected, whatever their value.
Regards
François
New branches cause a lot of additional overhead for core developers.
Changes have to be merged across all actively supported branches, commonly
with NEWS file adjustments. Depending on where we are in the release cycle
right now, we already have 3-4 active branches -- we don't need to add to
that.
I think it's fine to start targeting PHP 8 now with RFCs, but
implementation work should be done outside of php-src. It is more cost
effective for one person to rebase their code two years down the line than
it is for everybody to do extra work every time they commit something.
(Alternatively we would have to change our development model so that
branches are not synchronized at all times.)
Nikita
Would anyone object to me changing how PHP handles this so that only the
final ?> tag consumes its following newline, and only at the end of the
file?
It also goes the other way. Whether you want to drop the newline after ?>
depends (roughly) on whether the code is control flow (drop) or trailing
output (don't drop). If the newline is not dropped anymore it doesn't mean
that the output will look nice, it's just going to be broken in a different
way.
Nikita
Hi Nikita,
Nikita Popov wrote:
It also goes the other way. Whether you want to drop the newline after ?>
depends (roughly) on whether the code is control flow (drop) or trailing
output (don't drop). If the newline is not dropped anymore it doesn't mean
that the output will look nice, it's just going to be broken in a different
way.
I understand that it should be dropped for “control flow” code (maybe
not the best term, I misunderstood what you meant at first). That's why
I suggest ignoring the following newline only for the ?> at the end of
the file, because I can't think of another place where you would have a
?> and not intend output immediately after it.
So I'm not sure I understand your objection, from that standpoint. Did I
miss something?
Regards.
Andrea Faulds
https://ajf.me/
So I'm not sure I understand your objection, from that standpoint. Did I
miss something?
I'm referring to code like
<ul>
<?php foreach ($data as $value): ?>
    <li><?= $value ?></li>
<?php endforeach; ?>
</ul>
Currently this would produce the output
<ul>
    <li>Foo</li>
    <li>Bar</li>
</ul>
Without the trailing newline elision it would produce
<ul>

    <li>Foo</li>

    <li>Bar</li>

</ul>
I always assumed that this is the reason why we do this in the first place.
Nikita
Hi,
Nikita Popov wrote:
I'm referring to code like
<ul>
<?php foreach ($data as $value): ?>
    <li><?= $value ?></li>
<?php endforeach; ?>
</ul>
Currently this would produce the output
<ul>
    <li>Foo</li>
    <li>Bar</li>
</ul>
Without the trailing newline elision it would produce
<ul>

    <li>Foo</li>

    <li>Bar</li>

</ul>
I always assumed that this is the reason why we do this in the first place.
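Ah. See, it's actually that kind of code that is my problem. A practical
example would be:
<table>
    <?php foreach($rows as $row): ?>
    <tr>
        <?php foreach ($row as $column): ?>
        <td><?=htmlspecialchars($column)?></td>
        <?php endforeach; ?>
    </tr>
    <?php endforeach; ?>
</table>
which currently produces:
<table>
        <tr>
                <td>foo</td>
                <td>bar</td>
            </tr>
        <tr>
                <td>baz</td>
                <td>qux</td>
            </tr>
    </table>
The doubled-up indentation from missing newlines makes it into a mess.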
And this is even worse in practice when you have more nested control
flow. Extra newlines would be fine here, but missing newlines aren't.
Thanks.
--
Andrea Faulds
https://ajf.me/
Ah. See, it's actually that kind of code that is my problem. A practical
example would be:
<table>
    <?php foreach($rows as $row): ?>
    <tr>
        <?php foreach ($row as $column): ?>
        <td><?=htmlspecialchars($column)?></td>
        <?php endforeach; ?>
    </tr>
    <?php endforeach; ?>
</table>
I start the "control flow lines" always on column 0 (similar to C
preprocessor instructions), which gives the desired output and is quite
readable:
<table>
<?php foreach($rows as $row): ?>
    <tr>
<?php foreach ($row as $column): ?>
        <td><?=htmlspecialchars($column)?></td>
<?php endforeach; ?>
    </tr>
<?php endforeach; ?>
</table>
--
Christoph M. Becker
Hi,
Christoph M. Becker wrote:
I start the "control flow lines" always on column 0 (similar to C
preprocessor instructions), which gives the desired output and is quite
readable:
<table>
<?php foreach($rows as $row): ?>
    <tr>
<?php foreach ($row as $column): ?>
        <td><?=htmlspecialchars($column)?></td>
<?php endforeach; ?>
    </tr>
<?php endforeach; ?>
</table>
This seems like a reasonable workaround, thank you for the idea. It
reminds me of what PHP's source code does with preprocessor instructions:
#ifndef FOO
# define FOO
#endif
I might do this in future code.
That said, I still think the ?> newline behaviour should be looked at,
since this kind of workaround isn't universally applicable (and in any
case isn't to everyone's tastes). In particular, if you want to generate
plain text and need to insert a newline, having PHP throw them away and
requiring you to add extra ones to compensate makes for uglier source
code which is harder to reason about.
Thanks!
Andrea Faulds
https://ajf.me/
This seems like a reasonable workaround, thank you for the idea. It
reminds me of what PHP's source code does with preprocessor instructions:
#ifndef FOO
# define FOO
#endif
Hence the name PHP. :)
That said, I still think the ?> newline behaviour should be looked at,
since this kind of workaround isn't universally applicable (and in any
case isn't to everyone's tastes). In particular, if you want to generate
plain text and need to insert a newline, having PHP throw them away and
requiring you to add extra ones to compensate makes for uglier source
code which is harder to reason about.
If you don't mind a trailing space (I don't like them, but well), you
can write:
<?='foo'?>
bar
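A quick way to check the trailing-space trick is sketched below. It assumes a hypothetical render_template() helper (not part of PHP) that evaluates a template string and captures its output:

```php
<?php
// Hypothetical helper, not part of PHP: render a template string and
// return what it would have echoed.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template); // '?>' makes eval() start in inline-text mode
    return ob_get_clean();
}

// Newline directly after the closing tag: it is swallowed.
$without_space = render_template("<?='foo'?>\nbar");

// A space between the closing tag and the newline: both are kept.
$with_space = render_template("<?='foo'?> \nbar");

var_dump($without_space); // string(6) "foobar"
var_dump($with_space);    // string(8) "foo \nbar" (space, then a real newline)
```

The newline is only dropped when it is the very first character after the closing tag, so any character in between is enough to keep it.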
And of course, there are template engines which could be used as well.
Frankly, I don't see any need for action here. :)
--
Christoph M. Becker
I always assumed that this is the reason why we do this in the first place.
I think the main reason was that old versions of IE go into quirks mode
if the doctype is not on the first line of the output, e.g.:
<?php header('Content-Type: text/html'); ?>
<!DOCTYPE html
Would anyone object to me changing how PHP handles this so that only the
final ?> tag consumes its following newline, and only at the end of the
file?
Captain Obvious here. It has long been the policy of many large PHP
projects to not close the last PHP tag for this reason. This change
wouldn't affect them. It risks affecting projects without this policy, and
those tend to be older and often private.
Hi,
Michael Morris wrote:
Would anyone object to me changing how PHP handles this so that only the
final ?> tag consumes its following newline, and only at the end of the
file?
Captain Obvious here. It has long been the policy of many large PHP
projects to not close the last PHP tag for this reason. This change
wouldn't affect them. It risks affecting projects without this policy, and
those tend to be older and often private.
The idea here though is not to affect code where the entire file is a
<?php ?> block. If newlines are still consumed, but only for ?> at the
end of the file, those files should still behave the same.
What I want to change is how it behaves in other circumstances, i.e.
templating.
Thanks.
Andrea Faulds
What I want to change is how it behaves in other circumstances, i.e.
templating.
Thanks.
I get that, but I can think of one example where this innocent change might
BC break something. You cite this change being for templating - this
implies the php files with this feature are being loaded by another php
file with require() or include(). Suppose someone creates a template
wrapper with this circumstance in mind. Instead of doing the obvious, omit
the final ?> tag in the template, they write code in the template wrapper
to snip the last endline character from the included file. Depending on how
their code is written your change could now become a breaking change: for
example they just lop off the last character of the template's return
without checking to see if it is indeed a newline character.
Suppose someone creates a template wrapper with this circumstance in
mind. Instead of doing the obvious, omit the final ?> tag in the
template, they write code in the template wrapper to snip the last
endline character from the included file. Depending on how their code is
written your change could now become a breaking change: for example they
just lop off the last character of the template's return without
checking to see if it is indeed a newline character.
I think you have the change the wrong way round (unless I do). The current behaviour is:
- PHP blocks at end of file -> suppress following newline
- PHP blocks elsewhere in file -> suppress following newline
The proposed behaviour is:
- PHP blocks at end of file -> suppress following newline (no change)
- PHP blocks elsewhere in file -> treat following newline literally
So in your scenario, there would be no newline to trim, before or after the proposed change.
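To make the difference concrete, here is a rough sketch that emulates the proposed behaviour on top of today's PHP. Both emulate_proposed() and render_template() are hypothetical helpers written for this illustration, not part of PHP or of any patch. The sketch relies on token_get_all() reporting the swallowed newline as part of the T_CLOSE_TAG token text, and it only treats a closing tag as "end of file" when it is the very last token:

```php
<?php
// Hypothetical helper: rewrite a template so that rendering it under the
// current rules gives what the proposed rules would give.
function emulate_proposed(string $source): string
{
    $tokens = token_get_all($source);
    $last = count($tokens) - 1;
    $result = '';
    foreach ($tokens as $i => $token) {
        $text = is_array($token) ? $token[1] : $token;
        $result .= $text;
        // A close tag's token text includes the newline it swallows.
        $isCloseTag = is_array($token) && $token[0] === T_CLOSE_TAG;
        if ($isCloseTag && $i !== $last && preg_match('/\R\z/', $text, $m)) {
            $result .= $m[0]; // repeat the newline so that one survives
        }
    }
    return $result;
}

// Hypothetical helper: render a template string and capture its output.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template);
    return ob_get_clean();
}

$template = implode("\n", [
    '<ul>',
    '<?php foreach ([1, 2] as $v): ?>',
    '<li><?= $v ?></li>',
    '<?php endforeach; ?>',
    '</ul>',
    '',
]);

$current  = render_template($template);
$proposed = render_template(emulate_proposed($template));

var_dump($current);  // "<ul>\n<li>1</li>\n<li>2</li>\n</ul>\n"
var_dump($proposed); // "<ul>\n\n<li>1</li>\n\n<li>2</li>\n\n</ul>\n"
```

A closing tag that is the last thing in the file is left alone in both cases, which is the "no change" row above.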
Regards,
--
Rowan Collins
[IMSoP]
Would anyone object to me changing how PHP handles this so that only the
final ?> tag consumes its following newline, and only at the end of the
file?
I've noticed that over the years. When I care, I'll either press enter
an extra time or, more frequently, switch over to using pure echo
statements for precise output control. I don't think of this as a
particularly significant issue.*
Alternatively, for the HTML case, it is possible to stream an output
buffer and manipulate newlines through the TagFilterStream class:
https://github.com/cubiclesoft/ultimate-web-scraper
That particular class can process HTML at a rate of up to 1MB/sec even
when using callbacks via its very efficient stream-based state engine.
The extra overhead is minimal for prettifying HTML output.
* I'd personally rather see a suitable fix for Bug #73535 at this point.
It's been an open issue with a CVE assigned for almost 10 months. It
would be nice to see it triaged properly (e.g. the suggested fix
applied) so that I can finally close that browser tab. If you have the
spare time for newline output adjustments, I'd love to see that extra
energy sunk into fixing existing security vulnerabilities, especially
those with CVEs and suggested solutions. Just sayin'. But you guys do
whatever you want to do.
--
Thomas Hruska
CubicleSoft President