OK. Your patch will work now, but I would like to
think about a more elegant solution.
The problem is that I am busy with other work.
Could you please wait a week and then commit it if
I haven't come back to it (by next Tuesday)?

Argh. Can we please accelerate this somehow?
This patch is necessary for the HTTP request
decoding work in PHP 6 and we really should
get it done sooner rather than later.

Okay, rewind and reset time.
Dmitry, here's a quick summary of what's being done, how, and why.
Initial Problem: PHP6 needs better HTTP input encoding detection,
preferably with minimal wasted effort in conversion and limited vectors
for conversion-failure-based attacks.
Proposed Solution: Wait until the first time a given input argument is
requested before actually converting it. This allows scripts to perform
their own (potentially more relevant) determination of what the correct
input encoding is.
Proposed Implementation for this solution: Make JIT runtime-based
and fine-grained enough to signal not just the autoglobal being fetched,
but which specific dimension/property within that auto global is being
requested. Use this runtime, per-dimension JIT to decode input arguments as
they are requested.
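
As a rough illustration of the "decode on first request" idea, here is a small Python sketch. It is an analogy only, not PHP or Zend Engine code: the class name LazyInput, its fetch() method and the encoding attribute are all hypothetical names made up for this example.

```python
class LazyInput:
    """Toy stand-in for a request-input table that decodes values on first access."""

    def __init__(self, raw, default_encoding="utf-8"):
        self._raw = dict(raw)             # undecoded bytes, as received
        self._decoded = {}                # cache of values decoded so far
        self.encoding = default_encoding  # a script may change this before a fetch
        self.decode_calls = 0             # counts actual conversions, for illustration

    def fetch(self, key):
        # Decode only the requested key, and only the first time it is requested.
        if key not in self._decoded:
            self.decode_calls += 1
            self._decoded[key] = self._raw[key].decode(self.encoding)
        return self._decoded[key]


# "name" is Latin-1 and would fail to decode as UTF-8; "unused" is malformed.
inp = LazyInput({"name": b"Zo\xeb", "unused": b"\xff\xfe"})

# The script decides on the encoding before its first fetch.
inp.encoding = "latin-1"
first = inp.fetch("name")
second = inp.fetch("name")   # served from the cache, no second conversion

print(first == second, inp.decode_calls)
```

Two points from the summary show up here: a value that is never requested ("unused") is never converted, so its malformed bytes never reach the converter, and the script gets a chance to pick the encoding before anything is decoded.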
Rejected Implementation: Use object/array-access overloading to JIT the
values instead. While this solution is the simplest and can be done
with relatively few LOCs, it breaks assumptions about the GPC auto
globals (is_array() fails, is_object() succeeds, and assignments of the
autoglobals become "reference-like"*). In short, this solution
introduces BC issues.
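
The BC concern can be mimicked in Python, with the caveat that this is only an analogy: Python dicts do not have PHP's copy-on-assignment array semantics, so the helper php_style_assign() below fakes them. OverloadedInput and php_style_assign are hypothetical names for this sketch, not anything from the patch.

```python
class OverloadedInput:
    """Hypothetical array-like object that decodes a value when it is read."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = dict(raw)
        self._encoding = encoding

    def __getitem__(self, key):
        value = self._raw[key]
        # Decode lazily on read; values written by the script are returned as-is.
        return value.decode(self._encoding) if isinstance(value, bytes) else value

    def __setitem__(self, key, value):
        self._raw[key] = value


def php_style_assign(value):
    """Mimic PHP assignment: arrays (dicts here) are copied, objects share one handle."""
    return dict(value) if isinstance(value, dict) else value


plain = {"id": "1"}
wrapped = OverloadedInput({"id": b"1"})

# The type check that existing scripts rely on no longer holds for the wrapper.
print(isinstance(plain, dict), isinstance(wrapped, dict))

plain_copy = php_style_assign(plain)
plain_copy["id"] = "2"        # the original array is untouched

wrapped_alias = php_style_assign(wrapped)
wrapped_alias["id"] = "2"     # the original object sees the write

print(plain["id"], wrapped["id"])
```

The first print corresponds to the is_array()/is_object() flip, and the second to assignments of the autoglobals becoming "reference-like": a script that copies the array and modifies the copy would suddenly be modifying the original.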
Next Problem: How to actually make runtime-JIT with dim/prop level
granularity?
Proposed Solution: Catch fetches during FETCH_DIM/FETCH_OBJ execution
handlers.
Next Problem: auto_globals aren't processed as CVs, meaning that during
FETCH_DIM, there's no way to tell if op1 came from an auto global or not
(since the fetch happened earlier).
Solution (implemented last week): Remove the restriction on CVing auto
globals by adding a fetch_type field to the auto global structure.
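
To make the "fetch happened earlier" problem concrete, here is a toy Python model. It is deliberately not Zend Engine code and does not reproduce real opcodes or the fetch_type field; run_two_step, run_cv_dim and AUTO_GLOBALS are hypothetical names, and the only assumption carried over from the summary is that a separate fetch hands the dimension handler an anonymous temporary, while a CV operand still identifies the variable.

```python
AUTO_GLOBALS = {"_GET", "_POST", "_COOKIE"}


def run_two_step(symbols, name, key, on_autoglobal_dim):
    """Non-CV path: a generic fetch runs first, then the dimension fetch."""
    # Step 1: the variable is resolved into a temporary.
    temp = symbols[name]
    # Step 2: the dimension fetch only sees the temporary. The variable's
    # identity is gone, so it has no basis for calling on_autoglobal_dim.
    return temp[key]


def run_cv_dim(symbols, name, key, on_autoglobal_dim):
    """CV path: the dimension fetch receives the variable itself."""
    # The handler can check where op1 came from and signal the exact key.
    if name in AUTO_GLOBALS:
        on_autoglobal_dim(name, key)
    return symbols[name][key]


symbols = {"_GET": {"id": "42"}, "local": {"id": "7"}}
signalled = []


def record(name, key):
    signalled.append((name, key))


run_two_step(symbols, "_GET", "id", record)
print(signalled)    # nothing recorded: the origin was lost

run_cv_dim(symbols, "_GET", "id", record)
run_cv_dim(symbols, "local", "id", record)
print(signalled)    # only the auto global fetch was signalled
```

In this model the per-key hook can only fire on the second path, which is why the summary treats CVing the auto globals as a prerequisite for dimension-level JIT.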
Next Problem: The silence operator (@) forces non-CV even in situations
where a CV is appropriate, since the associated fetch_dim/obj op would
not fall outside of silence scoping.
Proposed Solution (patch from prior email): modify the variable parsing
routines slightly to rewrite simple fetch ops to CV'd fetch_dim/obj ops
when appropriate.
I don't mean to apply pressure (a week doesn't affect my timetable
at all); I can even move forward with the next (and last) ZE-related patch
(FETCH_DIM/FETCH_OBJ handling) separately from this one. I'm just trying
to balance Andrei's timetable on one side with a desire not to
overwhelm you and Andi with ZE patches on the other. Hopefully this
summary helps everyone get on the same page.
-Sara

* Sidenote: I refuse to call object behavior "reference by default";
I've had too many people notice that it's not actually true and expect
me to explain why in 2 minutes without the aid of a whiteboard.

Hi Sara,
Thank you for the detailed description; now I see the goal that was hidden from me.
I am a little bit afraid of the FETCH_DIM/FETCH_OBJ patch, because it will probably
slow down every array operation,
and I still don't understand what is wrong with silence.
Can you show the whole patch (with FETCH_DIM)? (It may be a draft.)
Thanks. Dmitry.
-----Original Message-----
From: Sara Golemon [mailto:pollita@php.net]
Sent: Tuesday, January 23, 2007 10:02 PM
To: internals@lists.php.net; Andrei Zmievski; Andi Gutmans;
Dmitry Stogov
Subject: Autoglobal CVs without silence -- Summary

Hi Sara,
I don't feel too great about this patch. It kind of feels like twisting the language implementation around for some very specific
problem which probably shouldn't be fixed at this level. Andrei says performance of Unicode isn't great so it shouldn't matter too
much, but I think a) it's not only about performance but also about maintainability. The code in PHP 6 is already much more complex
than that of PHP 5 and has become much harder to maintain. Such additional changes will make it worse. b) There will still be plenty
of people who use PHP 6 in PHP 5 mode.
I still haven't quite understood why we can't just do the detection on the whole auto-global when it's being fetched (or even during
request startup). As Andrei said, it's slow anyway, so for people who need Unicode that should be an affordable hit.
Maybe I don't understand the problem in enough detail and it might make sense for me to talk to Andrei via voice directly, but I
really don't think we should be overeager in committing this patch. It won't be good for the engine long term (not that I don't
appreciate your efforts and hard work on this patch).
Andi
Hi Andi,
Please read (once or again) my initial proposal; it is the basis of
Sara's proposals. Request startup is not a solution because
users must have the ability to define the input encoding before the
first fetch. That's why we decided to move the JIT trigger to runtime
instead of compile time.
--Pierre