We really need to settle on whether we want unicode.semantics to be
changeable at runtime or not. During early development it was
ZEND_INI_PERDIR, meaning that it could be changed in .htaccess and
VirtualHost blocks. However, the infrastructure to support this
flexibility was deemed too complicated at the Paris PDM. Basically, we
would need to maintain two sets of symbol tables and convert between
them on the fly, as well as two copies of each class entry. The latter
was especially problematic: instead of just mentioning the class entry
pointer, you had to access it like U_CLASS_ENTRY(ce). So it was decided
that the unicode.semantics switch would be ZEND_INI_SYSTEM only, and
that is how development has proceeded since then. However, concerns
have come up that keeping it this way will make PHP 6 adoption
infeasible for the majority of hosting companies and users, since they
would have to run two copies of Apache to support both modes.
We can go back to the PERDIR version, but that requires a lot of work
and not just in the engine, but also in a lot of extensions. I will let
Dmitry provide the technical details, but we need to decide which way
to go:
- ZEND_INI_SYSTEM and make people run two copies of Apache if they
want both modes. This is architecturally more simple and more robust, I
believe.
- ZEND_INI_PERDIR and let people switch modes as described above. This
is a lot of work and will probably result in quite a few edge cases
where we used to rely on stability of one mode (such as APC or
serialization, for example).
-Andrei
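For readers less familiar with INI access levels, the difference between the two options can be sketched as a bitmask check. This is a toy model only, not engine code: the toy_ini_entry type, the toy_ini_alter() function, and the shortened INI_* flag names are hypothetical stand-ins, and the real engine has more stages and rules than shown here.

```c
#include <stdbool.h>

/* Hypothetical context flags, loosely modelled on the ZEND_INI_* levels. */
#define INI_USER   (1 << 0)  /* changed from a running script */
#define INI_PERDIR (1 << 1)  /* changed from .htaccess or a VirtualHost block */
#define INI_SYSTEM (1 << 2)  /* changed from php.ini at server startup */

/* Simplified stand-in for one configuration entry. */
typedef struct {
    const char *name;
    int modifiable;     /* bitmask of contexts allowed to change the value */
    const char *value;
} toy_ini_entry;

/* Applies new_value only if the requesting context is permitted. */
static bool toy_ini_alter(toy_ini_entry *entry, const char *new_value, int context)
{
    if (!(entry->modifiable & context)) {
        return false;   /* e.g. a .htaccess change to a SYSTEM-only entry */
    }
    entry->value = new_value;
    return true;
}
```

In this model, an entry declared with INI_SYSTEM alone rejects a change requested with INI_PERDIR, while an entry declared with INI_SYSTEM | INI_PERDIR accepts it. The second case is the one that forces the engine to cope with both modes inside a single server process.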
I agree that #1 is more robust, and altogether simpler.
However, I think that this will heavily affect the PHP 6 adoption
rate... especially for virtual hosters. If, for example,
register_globals had no PERDIR/per-vhost capability, I'd still have
applications that I couldn't run (... mostly from cvs.php.net).
Therefore, from an end-user/sysadmin standpoint, I'd have to go with #2.
S
I think it would be best for all of us to go the more robust route
and make unicode.semantics only changeable in php.ini.
Regards,
Michael
I concur.
regards,
Derick
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org
Me too, if it counts any. I think the people that really need Unicode
support (aka most of the world) will have it enabled as soon as it's
possible to do so, and those that don't (USA and some of Europe) won't care
about it enough to adopt until they get a Chinese client or so. At which
point they'll have to change their ideas anyway...
- Steph
I actually think that I'm currently experiencing a problem related to
this issue. The new output control code supports aliases, i.e. one
can start internal handlers with the ob_start() function.
Those aliases are maintained in a HashTable filled at MINIT, but
UG(unicode) is always 0 at MINIT, so ob_start("ob_gzhandler") won't
work if unicode.semantics is enabled; only ob_start(b"ob_gzhandler")
works.
Regards,
Michael
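The mismatch Michael describes can be illustrated with a small stand-alone sketch. It assumes, for illustration only, that aliases are registered as byte strings at startup and that a lookup in unicode mode arrives as 16-bit code units. The toy_alias table and its functions are hypothetical and are not the actual output-layer code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alias record: raw key bytes plus a handler identifier. */
typedef struct {
    const void *key;
    size_t key_len;     /* length in bytes, not characters */
    int handler_id;
} toy_alias;

#define TOY_MAX_ALIASES 8
static toy_alias toy_aliases[TOY_MAX_ALIASES];
static size_t toy_alias_count = 0;

/* Called once at startup; the caller must keep key alive afterwards. */
static int toy_alias_register(const void *key, size_t key_len, int handler_id)
{
    if (toy_alias_count >= TOY_MAX_ALIASES) {
        return -1;
    }
    toy_aliases[toy_alias_count].key = key;
    toy_aliases[toy_alias_count].key_len = key_len;
    toy_aliases[toy_alias_count].handler_id = handler_id;
    toy_alias_count++;
    return 0;
}

/* Returns the handler id, or -1 when no key matches byte for byte. */
static int toy_alias_find(const void *key, size_t key_len)
{
    for (size_t i = 0; i < toy_alias_count; i++) {
        if (toy_aliases[i].key_len == key_len &&
            memcmp(toy_aliases[i].key, key, key_len) == 0) {
            return toy_aliases[i].handler_id;
        }
    }
    return -1;
}
```

If "ob_gzhandler" is registered as 12 bytes and later looked up as an array of twelve uint16_t code units (24 bytes), the comparison fails even though the text is the same. A lookup with the original byte string still succeeds, which matches the b"ob_gzhandler" behaviour reported above.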
I think making it INI_PERDIR would mean a lot of headache and
overcomplicated code in the engine and in the extensions that modify its
behaviour (like APC).
AFAIK a lot of shared hosters provide PHP4 as an Apache module and PHP5 as
CGI atm. The same is possible with PHP5 and PHP6, in which case the
INI_SYSTEM limitation does not really matter, since under CGI it is only
per request.
--
Wbr,
Antony Dovgal
Antony Dovgal wrote:
I think making it INI_PERDIR would mean a lot of headache and
overcomplicated code in the engine and the extensions that modify its
behaviour (like APC).
I don't think so. It certainly needs an upgrade, but that is easy to
implement: simply put unicode.semantics into the entry key, and
hash/search by the key.
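The suggestion to put unicode.semantics into the entry key could look roughly like the sketch below. This is not APC code. The toy_cache table, its functions, and the use of an FNV-1a hash with linear probing are all assumptions chosen to keep the example short; the point is only that the mode flag takes part in both hashing and comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical cache slot: one compiled entry per (path, mode) pair. */
typedef struct {
    const char *path;
    int unicode_mode;   /* 0 or 1 */
    int compiled_id;    /* stand-in for the cached compiled script */
    int used;
} toy_cache_slot;

#define TOY_CACHE_SLOTS 16
static toy_cache_slot toy_cache[TOY_CACHE_SLOTS];

/* 32-bit FNV-1a over the path bytes, then over the mode flag. */
static uint32_t toy_cache_hash(const char *path, int unicode_mode)
{
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)path; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    h ^= (uint32_t)(unicode_mode ? 1 : 0);
    h *= 16777619u;
    return h;
}

/* Inserts or overwrites; the caller must keep path alive. Returns -1 if full. */
static int toy_cache_store(const char *path, int unicode_mode, int compiled_id)
{
    int mode = unicode_mode ? 1 : 0;
    uint32_t start = toy_cache_hash(path, mode) % TOY_CACHE_SLOTS;
    for (uint32_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        toy_cache_slot *s = &toy_cache[(start + i) % TOY_CACHE_SLOTS];
        if (!s->used || (s->unicode_mode == mode && strcmp(s->path, path) == 0)) {
            s->path = path;
            s->unicode_mode = mode;
            s->compiled_id = compiled_id;
            s->used = 1;
            return 0;
        }
    }
    return -1;
}

/* Returns the cached id for this exact (path, mode) pair, or -1. */
static int toy_cache_find(const char *path, int unicode_mode)
{
    int mode = unicode_mode ? 1 : 0;
    uint32_t start = toy_cache_hash(path, mode) % TOY_CACHE_SLOTS;
    for (uint32_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        const toy_cache_slot *s = &toy_cache[(start + i) % TOY_CACHE_SLOTS];
        if (!s->used) {
            return -1;
        }
        if (s->unicode_mode == mode && strcmp(s->path, path) == 0) {
            return s->compiled_id;
        }
    }
    return -1;
}
```

With this layout the same script path can be cached twice, once per mode, and each request only sees the copy compiled for its own setting. Whether that covers the other edge cases mentioned in the thread, such as serialization, is a separate question.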
From a technical perspective it makes sense to keep it a php.ini-only
setting or, as Sara insists, STARTUP phase only. However, from a user
(hosting company) perspective it adds a fair degree of complexity
to their setup, which would probably mean one php6 instance will need
to run as CGI or FCGI. That will without a doubt affect adoption
rates and/or whether unicode.semantics is enabled by default on most
installs.
Personally, I think we'd be better off with a slower adoption rate
but a more robust PHP, without the added engine/language complexity
that per-dir unicode.semantics would bring.
Ilia Alshanetsky
My personal opinion, as humble as it may be, is that it's pure bullshit
to even give the chance of disabling it. WHY in hell's name would you
want to give hosters the choice? I can see some hosts disabling it to
"give an easy transition" while others enable it to "give the new
features a chance". If Unicode support is supposed to be such a big
part of the whole PHP6 release, then why do you give the option of
disabling it? You're breaking away part of the MAIN reason why people
would want to upgrade in the first place.
Just imagine what a mess it would be if you had given the choice of
"disabling" the OOP support in PHP5. Be very, very, very glad you didn't
do that, and as such I'd suggest not doing something equally drastic in
PHP6.
Anyway, just a user's point of view here.
Maciej Sokolewicz
Well, with unicode semantics enabled, many PHP applications that have
not been designed with PHP6+unicode in mind are likely to break. On
the other hand, when semantics are off, those applications may work
just fine. The other reason could be that unicode-enabled PHP will be
noticeably slower than PHP without it, so hosters, to conserve
system resources, may enable it only for people who actually need the
functionality.
Ilia Alshanetsky
I agree on this. From my reading of some of the issues around unicode, you
are far better off simply saying PHP6 is unicode only. A lot of scripts that
use register_globals and any number of deprecated features are simply not
going to work with PHP6 anyway.
There is another side to this. Developers who are NOT unicode aware will be
wondering what all the fuss is about. They are probably the same developers
who wonder why anyone would want discrete setters/getters in PHP's OOP. I
think it is these people who will have the biggest headache.
Many ISPs install CPanel or Plesk or whatever and never configure things
anyway.
Actively promoting ISPs capable of taking PHP6 and running with it would be
a nice way to show/prove the value of PHP6.
What would be far more useful would be a very clear set of documentation
regarding the steps needed to take to move a PHP based application forward
into the Unicode'd world.
Regards,
Richard Quadling.
Richard Quadling
Zend Certified Engineer :
http://zend.com/zce.php?c=ZEND002498&r=213474731
"Standing on the shoulders of some very clever giants!"
Richard Quadling wrote:
I agree on this. From my reading of some the issues around unicode you are
far better off simply saying PHP6 is unicode only. A lot of scripts that
You didn't address the point of people who don't need Unicode but need
the fastest possible version (because slower running time = more servers
needed = more money spent).
Now whether you say "screw them, they should stick to PHP5" is up to the
core team, that's for sure :-)
- Chris
Hello,
- ZEND_INI_SYSTEM and make people run two copies of Apache if they
want both modes. This is architecturally more simple and more robust, I
believe.
I believe that too. It is not only more robust but will also ease your/our life.
--Pierre
As Andrei knows, I believe that not allowing this to be tuned on a
per-virtual-host basis is going to make life very hard for our users. A huge
part of our users are hosting providers, or companies running multiple
applications on the same machine. Probably a majority do not own dedicated
boxes. This kind of limitation is going to not only slow down PHP 6
adoption, but I think it may also significantly impair PHP as a
hosting-friendly solution, and therefore we could actually see a loss in
overall PHP market share.
I suggest that we first make the theoretical decision that we prefer to
support this on a per-request basis if it's feasible. By feasible I mean
with some, but minimal, pain. If it becomes a disaster we should
re-evaluate. I'll spend the next week trying to see what the issues are and
whether we can resolve them in an acceptable way.
Andi
-----Original Message-----
From: Andrei Zmievski [mailto:andrei@gravitonic.com]
Sent: Wednesday, September 06, 2006 9:40 AM
To: PHP Internals
Cc: Dmitry Stogov
Subject: [PHP-DEV] RFC: unicode.semantics: runtime or not?
Hello Andrei,
we already have a freaking complex API to deal with, and on the other hand
we have FastCGI support. What we should imo do is try to reduce the
complexity of our API rather than increase it to an unmanageable extreme,
and instead promote usage of FastCGI. My 2c.
best regards
marcus
I tend to agree with this. But I guess not everyone does..
-Andrei
Hello Andrei,
we already have a freaking complex API to deal with, and on the other hand
we have FastCGI support. What we should imo do is try to reduce the
complexity of our API rather than increase it to an unmanageable extreme,
and instead promote usage of FastCGI. My 2c.
I sat on this for a few days and the egg that hatched seems to agree with
the consensus....
Keep it INI_SYSTEM. Yes it'll be a little more hassle for hosting
providers, but that's the reality of the beast. The fact that unicode
semantics can be turned off is our concession to the gods of the upgrade
path and it provides plenty of maintenance woes already. Complicating this
with PERDIR logic is only going to make PHP6 fat, slow, and difficult to
maintain/extend.
-Sara