Let me first explain the title. Japanese is an
interesting language where context is vital to the meaning of a word.
Sayonara usually means a simple "good bye", but in a different context it
can mean "we'll probably never meet again".
To understand this mail you'll have to know that I was just another user of
PHP, a user that was probably too eager. I wanted to get more involved with
the development of PHP, as I do believe in the whole philosophy of open source.
In the end my attempts ended in frustration but, nevertheless, I
learned a lot in just a few months. I don't want this mail to be one where I
display all my frustration; instead I want to leave here all my
findings, the things I researched, the few things I actually managed to
code, and mostly the ideas that someone else might find useful.
---- To those who may want to get involved in the PHP internals ----
For those on the general list who may ever try to venture into the internals
of PHP, remember that you have to back your point of view with a patch. So
sit down, remember the old days in college using the C compiler, and code
like a cowboy before trying to promote anything on internals. It's the
status quo of the PHP development community, as I learned too late.
---- Namespaces: function imports ----
Here is the patch to add function imports to 5.3. To be consistent, constant
imports have to be added too:
http://martinalterisio.com.ar/php5.3/use.function.v2b.patch
If you don't know what imports are, they are just a way to refer to a longer
name with a shorter name. For example:
<?php
class MyRowset extends Zend::Db::Table::Rowset::Abstract {
...
or with imports:
<?php
use Zend::Db::Table::Rowset::Abstract as Rowset;
class MyRowset extends Rowset {
...
The use statement currently supports aliasing of class names only.
Functions and constants have to be referred to by their full names, although
these too can be namespaced.
---- Import statement is broken: why, and how it can be fixed ----
While doing the previous patch I realized that the use statement is broken.
It should generate an error when you try to override an existing name. But
the use statement is handled at compile time, when it's unknown whether a
name will be overridden or not. What happens is that the error may or may not
be triggered, depending on the conditions and order of compilation. If you
have an opcode cache, this error may not appear until the cache is invalidated.
Following a suggestion by Dmitry (I really don't know whether he knew about
this issue with use, but his idea solved it anyway), I made this
patch:
http://martinalterisio.com.ar/php5.3/file.scope.use.patch
With this the use statement is checked only against the names used in the
current file (or namespace if using multiple namespaces per file). Since the
imports only affect the current file, this is more sensible, and the issue
mentioned before disappears.
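The file-scoped check can be pictured with a small model. This is only a sketch, written in Python as a neutral modelling language because the import syntax under discussion is not final. The function name check_imports and its data layout are made up for illustration and are not taken from the patch.

```python
def check_imports(names_declared_in_file, imports):
    """Return the aliases that clash with a name already known in this file.

    names_declared_in_file: names declared in the current file (hypothetical input).
    imports: list of (alias, full_name) pairs, in source order.
    Only this file's names are consulted, so the result cannot depend on
    which other files happened to be compiled or cached earlier.
    """
    known = set(names_declared_in_file)
    clashes = []
    for alias, _full_name in imports:
        if alias in known:
            clashes.append(alias)
        known.add(alias)
    return clashes
```

Because the only inputs are the current file's own declarations and imports, the same file always yields the same answer, with or without an opcode cache.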
---- Name clash and ambiguity issue introduced by namespaces ----
There's another pending issue with namespaces: a name clash that
currently goes undetected and makes static methods partially unavailable.
This is because using :: as the namespace separator creates
ambiguity. foo::bar() can refer to the static method bar in class foo, or to
the function bar in the namespace foo. This is an issue for PHP library
developers: someone can inject a namespaced function which overrides your
static method.
One possible solution I tried was to prevent the name clash altogether,
but I found this approach inappropriate for two reasons: the performance
impact is too big, and it is not consistent with how other name clashes are
handled in PHP (classes and functions may have the same name).
Another approach, which I believe is the correct one but never got the
chance to implement in a patch, is to change the order of name resolution:
search first for the static method and then for the namespaced function, and
if the user wants to refer to the function he can import it. This way
both remain accessible, although the user has to resolve the ambiguity. This
also reduces the impact of adding namespaces on legacy code, which currently
affects every static method call (because the namespaced function
is searched first).
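The difference between the two lookup orders can be sketched with a toy model. It is written in Python purely as a modelling language. The two dictionaries and both function names are hypothetical and say nothing about how the engine actually stores symbols.

```python
# Toy symbol tables for the ambiguous call "foo::bar()" (hypothetical data).
static_methods = {("foo", "bar"): "static method bar of class foo"}
namespaced_functions = {("foo", "bar"): "function bar in namespace foo"}


def resolve_function_first(qualifier, name):
    """Order described above as the current one: function, then static method."""
    key = (qualifier, name)
    if key in namespaced_functions:
        return namespaced_functions[key]
    return static_methods.get(key)


def resolve_method_first(qualifier, name, function_imports=None):
    """Proposed order: imported function, then static method, then function."""
    function_imports = function_imports or {}
    # An unqualified call to an imported name goes straight to the function.
    if qualifier is None and name in function_imports:
        return namespaced_functions[function_imports[name]]
    key = (qualifier, name)
    if key in static_methods:
        return static_methods[key]
    return namespaced_functions.get(key)
```

With the first order the injected function hides the static method. With the second the method wins, and the function stays reachable through an explicit import, for example resolve_method_first(None, "bar", {"bar": ("foo", "bar")}).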
---- Reducing impact on performance introduced by namespaces ----
I found out that, although the philosophy behind the namespaces
implementation is to do as much as possible at compile time, much is still
pushed to the executor. Most of that could be resolved at compile time. Much
can be optimized by changing the name resolution rules. If these become more
explicit, the compiler can determine which name is actually being referred to.
As of now, code can be optimized by using imports and explicit names, which
are treated as an alternative notation. In other words, the normal use of
namespaces is not optimal.
There's still one name resolution that seems bound to fall to
the executor: the ambiguity mentioned earlier between static methods and
namespaced functions. This could be resolved by the user if the use statement
also allowed the type of import to be stated explicitly: use class X; use
namespace X; use function X; use const X;
---- Fix name resolution rules for better coding practices ----
Also, as of now, I'm more than confident when I say that the current name
resolution rules will bring many headaches to users. For starters, you'll
have to make a habit of prefixing :: to all internal function calls (such as
::count, ::strlen, etc). This is safer when creating PHP libraries,
since another user could inject a namespaced function that overrides those
functions. This is because a function call without that prefix will first try
a function in the same namespace and then the internal one. For this
same reason, using the :: prefix will also be faster (since it's resolved at
compile time). And if you want to refer to an element of the current
namespace, it is better to use namespace::
If you don't know about the name resolution rules, check what's written in
the manual:
http://php.net/language.namespaces.rules
What I wanted to implement, but will never get the chance to, is name
resolution rules that are explicit and not context-aware:
foo(); // is always global foo (except if foo is an alias)
new A(); // is always global class A (except if A is an alias)
A::B(); // try static method of global class then namespaced function
        // (except if A is an alias)
namespace::foo(); // is always foo() in current namespace
new namespace::A(); // is always class A in current namespace
::foo(); // is always global foo (aliases ignored)
new ::A(); // is always global class A (aliases ignored)
I think this will improve readability, maintainability and debugging,
because of its explicitness.
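A rough model of these rules shows why they can be applied at compile time: the result depends only on the written name, the current namespace and the alias table, never on what happens to be defined at run time. The sketch below is in Python as a modelling language, and the function name resolve and its arguments are made up for illustration. It does not cover the run-time choice between static method and namespaced function for A::B().

```python
def resolve(name, current_namespace, aliases):
    """Map a name as written in source to the name it denotes.

    name: e.g. "foo", "::foo", "namespace::foo" or "A::B".
    current_namespace: e.g. "My::Lib".
    aliases: mapping of alias -> full name, built from use statements.
    """
    if name.startswith("::"):
        # Leading "::" always means the global name; aliases are ignored.
        return name[2:]
    if name.startswith("namespace::"):
        # "namespace::" always means the current namespace.
        return current_namespace + "::" + name[len("namespace::"):]
    head, separator, rest = name.partition("::")
    if head in aliases:
        # The first segment is an alias: substitute the full name.
        return aliases[head] + separator + rest
    # Anything else is taken as a global name.
    return name
```

For instance, with the alias Rowset for Zend::Db::Table::Rowset::Abstract, resolve("Rowset", "My::Lib", aliases) gives the full class name, while resolve("::Rowset", "My::Lib", aliases) gives the global Rowset.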
---- Autoloading issue with namespaces ----
There is also an issue with autoloading and internal names under the current
name resolution rules. Autoload has to be the last thing tried; therefore,
even if there's a namespaced name that overrides an internal name, it won't
be seen if its loading is subject to autoloading. That's another
reason to change the name resolution rules. With the rules I explained
earlier this issue with autoload goes away.
---- Possible enhancement for autoloading with namespaces ----
Regarding autoloading, I think there's an enhancement that can be
achieved with the implementation of namespaces. Consider the possibility of
a namespaced __autoload(). Autoloading in PHP has one important issue: as
the system grows, and the number of external libraries grows, the complexity
of autoloading increases. Using the SPL autoloader, each library adds its own
autoloader. If you have many libraries, autoloading can cost too much. If a
namespaced __autoload() is implemented, this can reduce the impact by
distributing the autoloading behavior, i.e., first use the namespace autoload,
then try the global autoload. A package should know better where its classes
are.
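The dispatch idea can be modelled in a few lines. The sketch below is in Python as a modelling language. The name make_autoloader and the convention that a loader returns True when it has loaded the class are assumptions made for this example, not an existing PHP API.

```python
def make_autoloader(namespace_loaders, global_loaders):
    """Build an autoload function that asks the namespace's own loader first.

    namespace_loaders: mapping of namespace -> loader (hypothetical registry).
    global_loaders: list of loaders tried in order, like a global autoload chain.
    Each loader takes the qualified name and returns True if it loaded it.
    """
    def autoload(qualified_name):
        # "Zend::Db::Table" -> namespace "Zend::Db"; no "::" gives "".
        namespace, _, _ = qualified_name.rpartition("::")
        loader = namespace_loaders.get(namespace)
        if loader is not None and loader(qualified_name):
            return True
        # Fall back to the global chain, stopping at the first success.
        return any(global_loader(qualified_name) for global_loader in global_loaders)

    return autoload
```

A class in a namespace with its own loader is found with a single call, however many libraries have registered global loaders. Only names without a namespace loader walk the whole chain.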
---- Constrained scope for imports is impractical ----
When trying to refactor code to use namespaces, as a test, I also found that
having the use statement limited to the outer scope is impractical. One
necessary addition, which is not very complicated, is to have an extra scope
for use statements, such as imports in a function scope. It's only a matter
of keeping an extra table for the function scope at compile time.
---- Namespaces keyword issue: it can be solved without taking a keyword ----
There's still the issue of the keyword taken by the namespaces
implementation. It doesn't matter if it's "package" or "namespace". Both are
names widely used in PHP code (use Google Code Search if you don't believe
me). I know they have tried to remove the need for the keyword, but I still
think there's a way. Consider the following:
<?php
class Foo::Bar {
use bla::bla;
}
?>
Instead of:
<?php
namespace Foo;
use bla::bla;
class Bar {
}
?>
In the first there's no need for a namespace declaration; it's declared with
the class name. The same can be used for functions and constants:
<?php
function Foo::test() {
use bla::bla;
}
const Foo::CONSTANT = 101;
?>
This approach restricts namespaces to class, function or constant scope.
If you want to execute code in a namespace you'll have to be in one of these
scopes. But I think it's a price worth paying for the sake of all those
libraries that will break because they use the fatal keyword (think of all
the XML-related libraries that use "namespace").
Also, using namespace:: or package:: doesn't need to take a keyword (think
of self:: and parent::; they aren't keywords, just special names that can't
be used for naming classes).
---- Namespaces as nested classes? ----
Reading about how the previous implementation of namespaces went down the
drain, one recurring thought among some users and developers caught my
attention. Maybe namespaces and nested classes should be one and the same
thing in PHP. Considering that many are using classes as namespaces for
functions, this is not such an illogical approach to the problem. I have not
given much thought to the technical feasibility of this approach, but one
thing that would probably be needed is the ability to forward declare members.
Without this, all definitions must be clustered in one place.
Example:
<?php
final class A { // should be final to have nested classes?
public class B; // forward declaration
}
?>
other file:
<?php
class A::B {
}
?>
I can't say much about this approach. It's just one wild idea.
---- Type hints: improvements could help drastically improve performance ----
I thought a lot about type hints. Right now they are only seen as syntactic
sugar for system designers, and something that reduces performance. Actually
quite the opposite can be achieved, but not with the current implementation
of type hinting. The guys behind Flash 9 obtained a 10x improvement in
performance thanks to type hinting. Doing the same with PHP is
quite sensible, since one of the bottlenecks for performance is the zval.
Knowing beforehand that a variable is a native type, just-in-time
native compilation can be done to drastically improve performance.
For that to happen first type hinting must be improved. Here are some
thoughts I shared with another user some time ago:
http://martinalterisio.com.ar/php5.3/php-typehints.txt
---- Taints ----
Last but not least, I thought about taints. Since PHP6 will remove safe mode
and magic quotes, as far as I know, if nothing else is there to prevent
users from being users, well, PHP6 might be considered too insecure. Taints
should be the solution to this, but approaches copied from other languages
do not seem feasible in PHP. Variable-level taints are not the way to go: not
much can be added to the zval without suffering the consequences, and a simple
model of tainted/not tainted is not safe enough, as there are many taints to
be considered (XSS, SQL injection, HTML injection, to say the least).
I think one possible approach to consider is scope taints. Instead of
tracking taints at the variable level, do it at the scope level, i.e., attach
taints to functions, classes and the global scope (methods use their class
scope). Taints should be an arbitrarily sized list of elements, to which the
user can also add taints of his own (we don't know where security holes might
appear in the near future, so let's leave that door open).
When function or class code refers to another scope (function call, method
call, member access, global access) a pollution occurs. In a pollution the
involved scopes become infected with taints from both. The pollution
operation needs a new opcode that can handle a reference to a scope either
statically or by an object reference. For each function/class the user has
to be able to mark the taints that infect it, the taints it can
handle/resist, and the taints it rejects. A function/class ignores
pollution by taints that it can handle/resist. If a function/class is polluted
by a taint that it rejects, an error occurs. Internal functions should also
define how they are affected by taints, and some default taints should be
specified for known security issues.
The problem with this approach is that it is not an automagical solution. It
requires the user to be conscious of the security issues. If he does nothing
about it an error occurs, but he can mark the scope as one that handles the
taint and still do nothing about it.
There are two alternatives for keeping track of taints:
- keep a list of taints that pollute the scope
- keep a list of taints that DO NOT pollute the scope
The second alternative is harder to understand. It assumes that any scope
cannot be trusted by nature. Instead of adding threats, you remove threats.
I think this approach is more secure.
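To make the pollution rule more concrete, here is a minimal model of the first alternative (each scope keeps the list of taints that pollute it). It is a sketch in Python as a modelling language. The class Scope, the function pollute and the error type TaintError are names invented for this example, and the taint labels are arbitrary strings.

```python
class TaintError(Exception):
    """Raised when a scope is polluted by a taint it rejects."""


class Scope:
    def __init__(self, name, taints=(), handles=(), rejects=()):
        self.name = name
        self.taints = set(taints)    # taints currently polluting this scope
        self.handles = set(handles)  # taints this scope can handle/resist
        self.rejects = set(rejects)  # taints this scope refuses

    def absorb(self, incoming):
        # Taints the scope handles are ignored; rejected ones are an error.
        effective = set(incoming) - self.handles
        rejected = effective & self.rejects
        if rejected:
            raise TaintError(f"{self.name} rejects {sorted(rejected)}")
        self.taints |= effective


def pollute(a, b):
    """Scope a refers to scope b: each is infected by the other's taints."""
    a_taints, b_taints = set(a.taints), set(b.taints)  # snapshot before mixing
    a.absorb(b_taints)
    b.absorb(a_taints)
```

As an example, take a request scope tainted with "xss" and "sql", an escaping function that handles "xss", and a query function that rejects "sql". After the escaping function touches the request it carries only "sql", and letting the query function touch it raises TaintError. The second alternative would invert the bookkeeping: every scope starts with all taints and removes the ones it has dealt with.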
---- The end ----
Well that wraps it all, I think. That's as much as I can download from my
brain which is related to PHP. Do whatever you want with all this, even the
spam folder is fine.
Anyway, it's been fun, and I learned a lot.
My thanks to everyone that ever gave a hand.
A former PHP user says to you all:
Sayonara PHP
P.S.: Please be understanding if I don't answer replies to this email.
Hi,
Martin Alterisio wrote:
Please let me do a little explanation of the title first. Japanese is an
interesting language where context is vital to the meaning of a word.
Sayonara usually means a simple "good bye", but within a different context
can mean "we'll probably never meet again".
To understand this mail you'll have to know that I was just another user of
PHP, a user that was probably too eager. I wanted to get more involved with
the development of PHP as I do believe in all the philosophy of open-source.
In the end I found my attempts ended in frustration, but, nevertheless, I
learned a lot in just a few months. I don't want this mail to be one where I
get to display all my frustration, instead I want to leave here all my
findings, the things I researched, the few things I managed to actually
code, and mostly the ideas that someone else might find useful.
Snip of interesting technical stuff.
---- The end ----
Well that wraps it all, I think. That's as much as I can download from my
brain which is related to PHP. Do whatever you want with all this, even the
spam folder is fine.
Anyway, it's been fun, and I learned a lot.
My thanks to everyone that ever gave a hand.
A former PHP user says to you all:
Sayonara PHP
While I'll admit I've not fully read your mail, due to its relatively
in-depth and technical nature and my not really being up to speed
on the internals of PHP, it did strike me when skimming the mail
that you've not really covered your personal standpoint now.
You state some interesting technical points about how namespaces and such
will/could work in 5.3 (something which I would personally welcome with
open arms, especially as I've coded around the autoloading issue with
other techniques involving regexps of class names and other such
slightly nasty things, although acceptable if you use good prefixes on
all your class/interface names).
But you also say you're leaving PHP (if not for good, at least for now)
and you don't really say why, other than referring to the hard initial
entry to the internals community.
If you would be so kind, I think it would be interesting to say why you
have decided to move away from using PHP (and what you are now intending
to use!). I think it would help the PHP community grow stronger with
this kind of information as much as the technical information you've
already given.
P.S.: Please be understanding if I don't answer replies to this email.
Wishing you all the best.
I appreciate this may not get an answer :)
Col
Well, my fellow countryman... one of my classmates at college had a saying:
"el hombre es un animal de costumbres" (translate it "you gringos" :) - no
offense). And I guess it's most of the time like that... we learn something,
we are never willing to unlearn it. But the truth is, there are at least as
many habits and learnt behaviors as people are there walking in the streets.
So, sometimes, we should be a bit more tolerant to "foreign habits" (unless
we are Micro$oft.. but even so...).
If my intuition is right you must come from the Java/C++ world (my bet is
java 80/20). Maybe you have evaluated the hassles of implementing namespaces
into PHP... and you have concluded it's not possible. Or maybe, that it will
be a "buggy implementation" in the end; like PHP 4 OOP (which doesn't look
like OOP at all). Maybe some old-seasoned gurus in the internals community
have set you apart, or have treated your opinions with contempt (this is
just my assumption, like most of this email's contents). So, you are now
assuming that you won't need PHP, and that it will 'die("alone")' like some
poem of your authorship stated in one of its verses. Yes, after all
developers find out the hassles of namespaces and type hinting in PHP, they
will give up... won't they? (just reading your mind... forgive my arrogance
and continue).
You know... I think I'm about your age (judging by the picture of you at
phpclasses.org, if that's your picture). Maybe a bit younger, or a bit
older... but just a bit. And the thing is, I heard about two years ago or
so, a big buzz around a "PHP replacement". It was something about trains
(that's the farthest understanding I reached on it... "something about
trains"). I think it was called, railroad, or railway, or diamond on a
train... mmmm... nope, now I remember, it's "ruby on rails" (if you have a
sarcasm detector, use it now). Last time I checked, it was still alive...
arguably in a much more evolved fashion, and some (may I say "few"?) hosting
companies support it now. I don't know much about current statistics, but
I'm tempted to say that:
- There are many more books on PHP than on RoR
- There are many more PHP hosting offers than RoR's counterpart (even if we
reduce the stats to PHP 5 - just a guess)
- There are many many more websites built on top of PHP than the RoR's
counterpart
- There are many many more extensions, APIs and frameworks for PHP than for
RoR (actually, RoR IS a framework itself)
- There are many many more PHP developers than RoR developers
In the shared market niche, PHP has beaten java, coldfusion, asp, and perl,
which already existed. PHP has survived .Net rumbling, despite the Vb, C#,
J# or C++ flavors and the awesome Vi$ual $tudio IDE. And despite all the
predictions and prophecies about PHP's doom... it is still here, and will be
here and in the top 5 for at least 10 years. By the time PHP is replaced by
RoR or anything else, I will probably be selling RoR T-Shirts, or be
retired, or be dead (maybe of lung cancer, or cirrhosis, or just because
no-one can live past 120's)...
The point is... "sayonara PHP" means "sayonara most new clients" right now,
sayonara "sustained trend", sayonara "all php-based solutions" and sayonara
"most of the web development world". The facts show that googling for "PHP"
throws about 8,830,000,000 results (this is a bit biased, but the point
still holds)... try to Google for anything else and get similar numbers.
So... why not just say "sayonara PHP internals" if that's the scenario in
which you've had trouble? (meaning bad luck, not lack of skill). I've seen
some of your code, and though I don't personally use it, I find it
interesting. And I don't use it because I see no need for it; RIGHT NOW,
most of my company's projects are either small or already work with a
framework of some kind. I think I know why not "just sayonara PHP
internals"... because "I know how this or that should be, I want to turn it
into the way I think it should be... and as I can't do that, because someone
or something prevents me from doing so... I feel frustrated and I just give
up". Again... just (wrongly?) reading your mind.
Now, it's like you couldn't live without namespaces, could you? We've all
been living without namespaces so far... and we've all lived with PHP 4 OOP
so far. And it was even harder for me... I come from the Delphi world and
had previously done a bit of C++ too. So just imagine that when I
switched from Delphi to C++ I was expecting a (SINGLE!) VCL... and a form to
place the UI elements... oh, and a "use MyUnit" kind of magical syntax
(instead of #include <whatever> or "using namespace std"). But I had to
live with that... and finally I got used to the good and bad of C++
somehow. Then I started taking my first steps into PHP and the web, and
immediately thought "similar syntax, supports OOP... will be easy"... but I had
to drop and/or rewrite all my first-attempt code because it just happened
that the client for which we were working had a hosting of its own with
PHP 4.2 and MySQL 3.23 (there are still some of those archaic hosting
packages out there... I don't need to tell you). So I just had to get used
to this new OOP of PHP 4 (which is almost no OOP at all).
All in all, my fellow countryman, I guess that unless you have a huge
positive bank account balance, and you drive a BMW (I don't like them
anyway...), you'd be better off tolerating PHP despite this little namespace
issue if you want to stay in business. Unless, of course, you have an
incoming contract to develop a core system for an NSA mainframe. And if
that's the case, please tell them I prepare the most awesome mate
(http://en.wikipedia.org/wiki/Mate_(beverage)) in the world, and that you
cannot work without it.
I know that, even if you wish to leave PHP forever, you'll come back... all
the roads will lead you to it. So you'd better make a smart decision now
than have no other choice in the future (... ok, that was kind of The
Godfather's script, lol).
Enjoy your holidays,
Rob
Andrés Robinet | Lead Developer | BESTPLACE CORPORATION
5100 Bayview Drive 206, Royal Lauderdale Landings, Fort Lauderdale, FL 33308
| TEL 954-607-4207 | FAX 954-337-2695
Email: info@bestplace.net | MSN Chat: best@bestplace.net | SKYPE:
bestplace | Web: http://www.bestplace.biz | Web: http://www.seo-diy.com
From: Martin Alterisio [mailto:malterisio777@gmail.com]
Sent: Wednesday, December 26, 2007 2:04 AM
To: PHP Developers Mailing List; PHP General
Subject: [PHP] Sayonara PHPPlease let me do a little explanation of the title first. Japanese is
an
interesting language where context is vital to the meaning of a word.
Sayonara usually means a simple "good bye", but within a different
context
can mean "we'll probably never meet again".To understand this mail you'll have to know that I was just another
user of
PHP, an user that was probably too eager. I wanted to get more involved
with
the development of PHP as I do believe in all the philosophy of open-
source.
In the end I found my attempts ended in frustration, but, nevertheless,
I
learned a lot in just a few months. I don't want this mail to be one
where I
get to display all my frustration, instead I want to leave here all my
findings, the things I researched, the few things I managed to actually
code, and mostly the ideas that someone else might find useful.---- To those who may want to involve in the php internals ----
For those in the generals list that may ever try to venture in the
internals
of PHP, remember that you have to back your point of view with a patch.
So,
sit down, remember the old days in college using the c compiler, and
code
like a cowboy before trying to promote anything in the internals. It's
the
status quo of the PHP development community, as I did learn too late.---- Namespaces: function imports ----
Here is the patch to add function imports to 5.3. To be consistent
constants
imports have to be added too:http://martinalterisio.com.ar/php5.3/use.function.v2b.patch
If you don't know what imports are, they are just a way to refer to a
longer
name with a shorter name. For example:<?php
class MyRowset extends Zend::Db::Table::Rowset::Abstract {
...or with imports:
<?php
use Zend::Db::Table::Rowset::Abstract as Rowset;
class MyRowset extends Rowset {
...The use statement behavior currently supports only class names
aliasing.
Functions and constants have to referred with full name, although these
too
can be namespaced.---- Import statement is broken, why, how can be fixed ----
While doing the previous patch I realized that the use statement is
broken.
It should generate and error when you try to override an existing name.
But
the use statement is handled at compile, where it's unknown if a name
will
be overridden or not. What happens is that the error might be triggered
depending on the conditions and order of compilation. If you have an
opcode
cache, this error may not appear until the cache is invalidated.On a suggestion by Dmitry, which I really don't know if he knew about
this
issue with use or not, but, anyway, his idea solved this issue, I made
this
patch:http://martinalterisio.com.ar/php5.3/file.scope.use.patch
With this the use statement is checked only against the names used in
the
current file (or namespace if using multiple namespaces per file).
Since the
imports only affect the current file, this is more sensible, and the
issue
mentioned before disappears.---- Name clash and ambiguity issue introduced by namespaces ----
There's another pending issue with namespaces, there's a name clash
that
currently goes undetected, and makes static methods partially
unavailable.
This is due to the fact that using :: as namespace separator generates
ambiguity. foo::bar() can refer to the static method bar in class foo,
or to
the function bar in the namespace foo. This is an issue to php library
developers. Someone can inject a namespaced function which overrides
your
static method.One possible solution I approached was to prevent the name clash
altogether,
but I found this approach inappropriate for 2 reasons: the performance
impact is too big; is not consistent with how other name clashes are
handled
in php (classes and functions may have the same name).Another approach, which I believe is the correct one but never got the
chance to implement in a patch, is to change the order of name
resolution,
search first the static method and then the namespaced function, and if
the
user wants to refer to the function he can import the function. This
way
both remain accessible although the user has to solve the ambiguity.
Also
this reduces the impact of adding namespaces on legacy code, since
there's
an impact to all static method calls (because first the namespaced
function
is searched).---- Reducing impact on performance introduced by namespaces ----
I found out that although the philosophy behind the namespaces
implementation is to do as much as possible in compile time, but much
is
pushed to the executor. Those could be solved on compile time. Much can
be
optimized changing the name resolution rules. If these become more
explicit,
the compiler can discern which is the actual name that's referred to.
As of
now, it can be optimized using imports and explicit names, which are
used as
alternative notation. In other words, the normal use of namespaces is
not
optimal.There's still one name resolution that seems inevitable that it will
fall to
the executor: the ambiguity mentioned earlier between static methods
and
namespaced functions. This could be solved by the user if the use
statement
allows to also explicitly indicate the type of import: use class X; use
namespace X; use function X; use const X;---- Fix name resolution rules for better coding practices ----
Also, as of now, I'm more than confident when I say that the current
name
resolution rules will bring much headaches to users. For starters
you'll
have to make a habit of prefixing :: to all internal function calls
(such as
::count, ::strlen, etc). This way will be safer for creating php
libraries,
since another user could inject a namespaced function that overrides
those
functions. This is because the function call without that prefix will
try
first a function in the same namespace then the internal. Also, for
this
same reason, using the :: prefix will be faster (since it's solved at
compile time). And if you want to refer to an element of the current
namespace, is better to use namespace::If you don't know about the name resolution rules, check what's written
in
the manual:http://php.net/language.namespaces.rules
What I wanted to implement but will never get the chance is name
resolution
rules that aren't context aware and explicit:foo(); // is always global foo (except if foo is an alias)
new A(); // is always global class A (except if A is an alias)
A::B(); // try static method of global class then namespaced function
(except if A is an alias)
namespace::foo(); // is always foo() in current namespace
new namespace::A(); // is always class A in current namespace
::foo(); // is always global foo (aliases ignored)
new ::A(); // is always global class A (aliases ignored)I think this will improve readability, maintainability and debugging,
because of its explicitness.---- Autoloading issue with namespaces ----
There is also an issue with autoloading and internal names with the
name
resolution rules. The autoload has to be the last thing tried,
therefore
even if there's a namespaced name that overrides an internal name, it
won't
be seen if its loading its subject to autoloading. That's also another
reason to change the name resolution rules. With the rules I explained
earlier there won't be this issue with the autoload.---- Possible enhancement for autoloading with namespaces ----
Regarding the autoloading, I think there's an enhancement that can be
achieved with the implementation of namespaces. Consider the
possibility of
a namespaced __autoload(). Autoloading in PHP has one important issue:
as
the system grows, and external libraries grow, the complexity of the
autoloading increases. Using the spl autoloader, each library adds its
autoloading. If you have many libraries, autoload can cost too much. If
a
namespaced __autoload() is implemented, this can reduce the impact by
distributing the autoloading behavior, ie, first use the namespace
autoload,
then try the global autoload. A package should know better where its
classes
are.---- Constrained scope for imports is unpractical ----
When trying to refactor code to use namespaces, as a test, I also found
that
having the use statement limited to outer scope is unpractical. One
necessary addition, which is not very complicated, is to have an extra
scope
for use statements, such as imports in a function scope. It's only a
matter
of keeping an extra table for the function scope in compile time.---- Namespaces keyword issue, it can be solved without taking a
keywordThere's still the issue of the keyword taken by the namespaces
implementation. It doesn't matter if it's "package" or "namespace".
Both are
keywords widely used in php (use google code search if you don't
believe
me). I know they have tried to remove the need for the keyword, but I
still
think there's a way. Consider the following:<?php
class Foo::Bar {
use bla::bla;
}
?>Instead of:
<?php
namespace Foo;
use bla::bla;
class Bar {
}
?>
In the first there's no need for a namespace declaration; it's declared with
the class name. The same can be used for functions and consts:
<?php
function Foo::test() {
use bla::bla;
}
const Foo::CONSTANT = 101;
?>
This approach restricts namespaces to classes, functions or constants scope.
If you want to execute code in a namespace you'll have to be in one of these
scopes. But I think it's a restriction one would accept in favor of all those
libraries that will break because they use the fatal keyword (think of all
the XML related libraries that use "namespace").
Also, using namespace:: or package:: doesn't need to take a keyword (think
of self:: and parent::, they aren't keywords, just special names that can't
be used for naming classes).
---- Namespaces as nested classes? ----
Reading about how the previous implementation of namespaces went down the
drain, one recurring thought among some users and developers caught my
attention. Maybe namespaces and nested classes should be one and the same
thing in php. Considering that many are using classes as namespaces for
functions, this is not such an illogical approach to the problem. I have not
much considered the technical feasibility of this approach, but one thing that
would probably be needed is the ability to forward declare members. Without
this, all definitions must be clustered.
Example:
<?php
final class A { // should be final to have nested classes?
public class B; // forward declaration
}
?>
Other file:
<?php
class A::B {
}
?>
I can't say much about this approach. It's just one wild idea.
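
For readers who have not seen the practice mentioned in this section, using a class as a namespace for functions, here is a minimal sketch in plain PHP. It does not implement nested classes or forward declarations; it only shows the existing workaround. The class name StringUtil and its methods are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical utility class used purely as a container for functions.
// "StringUtil" and its method names are illustrative, not from any library.
final class StringUtil
{
    // Private constructor: the class only groups functions and is never instantiated.
    private function __construct()
    {
    }

    // Returns true when $haystack begins with $needle.
    public static function startsWith($haystack, $needle)
    {
        return strncmp($haystack, $needle, strlen($needle)) === 0;
    }

    // Collapses runs of whitespace into single spaces and trims both ends.
    public static function squeeze($text)
    {
        return trim(preg_replace('/\s+/', ' ', $text));
    }
}

// The class name plays the role a namespace would otherwise play.
var_dump(StringUtil::startsWith('namespace::foo', 'namespace'));
echo StringUtil::squeeze("  a   b \n c "), "\n";
```

The class prefix keeps the function names out of the global function table, which is the same goal namespaces pursue.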
---- Type hints, improvements could help drastically improve performance ----
I thought much about type hints. Right now they are only seen as syntactic
sugar for system designers, and something that reduces performance. Actually
quite the opposite can be achieved, but not with the current implementation
of type hinting. The guys behind Flash 9 obtained a 10x improvement in
performance thanks to type hinting. Actually doing the same with PHP is
quite sensible, since one of the bottlenecks for performance is the zval.
Knowing beforehand that the variable is a native type, a just-in-time
native compile can be done to drastically improve performance.
For that to happen first type hinting must be improved. Here are some
thoughts I shared with another user some time ago:
http://martinalterisio.com.ar/php5.3/php-typehints.txt
---- Taints ----
Last but not least, I thought about taints. Since PHP6 will remove safe mode
and magic quotes, as far as I know, if nothing else is there to prevent
users from being users, well, PHP6 might be considered too insecure. Taints
should be the solution to this, but approaches copied from other languages
do not seem feasible in PHP. Variable level taints are not the way to go: not
much can be added to zval without suffering the consequences, and a simple
model of tainted/not tainted is not safe enough, as there are many taints to
be considered (XSS, SQL injection, HTML injection to say the least).
I think one possible approach to consider is scope taints. Instead of
tracking taints on variable level, do it on scope level, i.e., attach taints
to functions, classes, global scope. Taints should be an arbitrarily sized
list of elements, where the user can also add taints of his own (we don't
know where security holes might appear in the near future, so let's leave
that door open). Taint tracking is to be attached to classes, functions or
global scope (methods use class scope).
When function or class code refers to another scope (function call,
method call, member access, global access) a pollution occurs. In a pollution
the involved scopes become infected with taints from both. The pollution
operation needs a new opcode that can handle a reference to a scope either
statically or by an object reference. For each function/class the user has
to be able to mark the taints that infect them, which taints they can
handle/resist, and which taints they reject. A function/class ignores
pollution by taints that it can handle/resist. If a function/class is polluted
by a taint that it rejects, an error occurs. Internal functions should also
define how they are affected by taints, and some default taints should be
specified for known security issues.
The problem with this approach is that it is not an automagical solution.
It requires the user to be conscious of the security issues. If he does
nothing about it an error occurs, but he can mark the scope as one that
handles the taint and still do nothing about it.
There are two alternatives for how to keep track of taints:
- keep a list of taints that pollute the scope
- keep a list of taints that DO NOT pollute the scope
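
The scope-taint model described above, using the first alternative (a list of taints that pollute the scope), can be sketched in userland PHP roughly as follows. This is only a simulation under stated assumptions: the real proposal would live in the engine with a dedicated opcode, and every name here (TaintScope, pollute, absorb, the taint labels) is hypothetical.

```php
<?php
// Hypothetical userland model of scope taints. Nothing here is an existing
// PHP API; the engine-level opcode from the proposal is simulated by pollute().
final class TaintScope
{
    public $name;
    private $taints = array();   // taints currently polluting this scope
    private $handles = array();  // taints this scope can handle/resist
    private $rejects = array();  // taints this scope rejects

    public function __construct($name, array $infectedBy = array(),
                                array $handles = array(), array $rejects = array())
    {
        $this->name = $name;
        $this->taints = array_fill_keys($infectedBy, true);
        $this->handles = array_fill_keys($handles, true);
        $this->rejects = array_fill_keys($rejects, true);
    }

    // Sorted list of taints currently attached to this scope.
    public function taints()
    {
        $names = array_keys($this->taints);
        sort($names);
        return $names;
    }

    // Absorb taints from another scope: handled taints are ignored,
    // rejected taints raise an error, everything else sticks.
    private function absorb(array $incoming, $from)
    {
        foreach ($incoming as $taint) {
            if (isset($this->handles[$taint])) {
                continue;
            }
            if (isset($this->rejects[$taint])) {
                throw new RuntimeException(
                    sprintf('%s rejects taint "%s" coming from %s', $this->name, $taint, $from)
                );
            }
            $this->taints[$taint] = true;
        }
    }

    // Pollution: each scope is exposed to the taints the other one carried
    // before the interaction (snapshots are taken first).
    public static function pollute(TaintScope $a, TaintScope $b)
    {
        $fromA = array_keys($a->taints);
        $fromB = array_keys($b->taints);
        $a->absorb($fromB, $b->name);
        $b->absorb($fromA, $a->name);
    }
}

$request = new TaintScope('read_request', array('xss', 'sql_injection'));
$escape  = new TaintScope('escape_html', array(), array('xss'));
$query   = new TaintScope('run_query', array(), array(), array('sql_injection'));

TaintScope::pollute($escape, $request);
echo implode(', ', $escape->taints()), "\n"; // sql_injection (xss is handled)

try {
    TaintScope::pollute($query, $escape);    // run_query rejects sql_injection
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Note that the sketch shows the weakness mentioned above as well: escape_html only declares that it handles xss, and nothing verifies that it really does.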
The second alternative is harder to understand. It assumes that any scope
cannot be trusted by nature. Instead of adding threats, you remove threats.
I think this approach is more secure.
---- The end ----
Well that wraps it all, I think. That's as much as I can download from my
brain which is related to PHP. Do whatever you want with all this, even the
spam folder is fine.
Anyway, it's been fun, and I learned a lot.
My thanks to everyone that ever gave a hand.
A former PHP user says to you all:
Sayonara PHP
P.S.: Please be understanding if I don't answer replies to this email.
Hello Martin,
I am just yet another advanced PHP user looking at the current state of
development of PHP to know what is coming and will have to be faced. I'm
not used to writing here: just reading and seeing how things evolve. A
while ago, I unsubscribed because it was again a brawl and I didn't take
time to read on. But since I wanted to know what was about to change in
5.3 and what becomes of 6.0, I subscribed again. So, for me, even if you
leave us, your mail is very interesting and I now better understand
what's at play with namespaces.
For once, I can use a Japanese word without mistake, and say:
Sayonara Martin,
Jil.
The use statement behavior currently supports only class names aliasing.
Functions and constants have to referred with full name, although these too
can be namespaced.
Right, this was a conscious decision - since there are many more
functions than classes, importing functions into global space is a much
higher risk (and utility functions in OO libraries are less frequently
used externally than classes).
resolution rules will bring much headaches to users. For starters you'll
have to make a habit of prefixing :: to all internal function calls (such as
::count, ::strlen, etc). This way will be safer for creating php libraries,
You can do it, but mandating it is not a good idea, which was discussed
here.
---- Constrained scope for imports is unpractical ----
When trying to refactor code to use namespaces, as a test, I also found that
having the use statement limited to outer scope is unpractical. One
necessary addition, which is not very complicated, is to have an extra scope
for use statements, such as imports in a function scope. It's only a matter
of keeping an extra table for the function scope in compile time.
It would make PHP code more complicated, so I'm not sure it is a good
addition.
---- Namespaces as nested classes? ----
Reading about how the previous implementation of namespaces went down the
drain, one recurring though in some users and developers caught my
attention. Maybe namespaces and nested classes should be one and the same
thing in php. Considering that many are using classes as namespaces for
The issue of nested classes was discussed numerous times and it never
proved to be a good idea.
---- Type hints, improvements could help drastically improve performance
I thought much about type hints. Right now they are only seen as syntactic
sugar for system designers, and something that reduces performance. Actually
quite the opposite can be achieved, but not with the current implementation
of type hinting. The guys behind flash 9 obtained a 10x improvement in
performance thanks to type hinting. Actually doing the same with PHP is
quite sensible, since one of the bottlenecks for performance is the zval.
It would be nice to hear what you meant by that.
Knowing before hand that the variable is a native type, a just in time
native compile can be done to drastically improve performance.
I know of no efforts for JIT compiling PHP, so I guess if we'd want to
get something there, we'd better start with having any JIT at all and
not with modifying the language for a possible imaginary JIT compiler.
---- Taints ----
Last but not least, I thought about taints. Since PHP6 will remove safe mode
and magic quotes, as far as I know, if nothing else is there to prevent
users from being users, well PHP6 might be considered too insecure. Taints
I must note that the primary reason for the removal of "safe mode" and
"magic quotes" was the fact that they do not improve security. Thus,
PHP 6 is not less secure than any versions with these capabilities.
The primary fault of "magic quotes" was that it tried to solve the input
filtering problem without taking into account how variables are used;
a better input filtering extension takes care of that. The primary fault of
"safe mode" was that it tried to bring into PHP things that can not be
properly done in PHP, since the PHP engine can not control extensions.
much can be added to zval without suffering the consequences, and a simple
model of tainted/not tainted is not safe enough, as there are many taints to
be considered (XSS, SQL injection, HTML injection to say the least).
That's what the current proposal is doing. I can not have an opinion on how
well it does it since I haven't finished reviewing it yet.
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com