Guys, could we take a look at making the ref to temp var fix a bit
narrower? Currently we try to catch it at call-time. This means that
something like:
current(explode(' ','a b'))
as per bug #34468 doesn't work. Now, I think there is a secondary bug
here. I see no reason for current()
to take a by-ref, so this
particular one could be easily fixed. But there are many other cases
where a function legitimately takes a by-ref and doesn't necessarily
write to it or the write is a secondary action not required for the code
to work. Could we not catch this on the write instead of on the call?
The memory problem happens on the write. Or perhaps better, an E_NOTICE
or E_STRICT
on the call and an E_FATAL on the write. The current
E_FATAL on the call seems out of whack.
Gallery, for example, broke in a rather subtle way in their
gallery_remote2.php script which meant the various client-side tools
like iphototogallery and others got a cryptic "no album at URL" error
message. I had to break out ethereal to track it down to a couple of
functions where read-only function args were marked as by-ref. So they
didn't actually have a memory corruption problem yet the E_FATAL killed
them.
SquirrelMail has code like this all over the place:
$value = strtolower(array_shift(split('/\w/',trim($value))));
Here array_shift()
does of course change the arg, so that is a potential
problem. And yes, that's a dumb way to do this, but people write code
like this. In some of these array manipulation calls, which seems to
account for a number of the BC problems we are having, we could check
for a non-ref and behave slightly differently. In the case of
array_shift()
we could return the first arg and throw a notice. Same
would go for reset()
, end()
, next()
, prev()
and probably a few others.
-Rasmus
At 10:57 12/09/2005, Rasmus Lerdorf wrote:
Guys, could we take a look at making the ref to temp var fix a bit
narrower? Currently we try to catch it at call-time. This means that
something like:current(explode(' ','a b'))
as per bug #34468 doesn't work. Now, I think there is a secondary bug
here. I see no reason forcurrent()
to take a by-ref, so this
particular one could be easily fixed.
Yep, we need to fix that one.
But there are many other cases
where a function legitimately takes a by-ref and doesn't necessarily
write to it or the write is a secondary action not required for the code
to work. Could we not catch this on the write instead of on the call?
The problem is that there's no way to tell that element apart at that
time. It's too late. As soon as we treat a read-only zval as if it's
read/write (take a ** instead of a *), it's too late, since we can't really
detect later on where it came from.
The memory problem happens on the write. Or perhaps better, an
E_NOTICE
orE_STRICT
on the call and an E_FATAL on the write. The current
E_FATAL on the call seems out of whack.
I don't really agree that it's out of whack, since you are passing a piece
of data by reference, which is an undefined behavior. I agree that it
would have been nice if we could allow for this and only complain if the
data is written to in the function (in the PHP spirit of 'just work!'), but
I don't see how that would be possible.
Gallery, for example, broke in a rather subtle way in their
gallery_remote2.php script which meant the various client-side tools
like iphototogallery and others got a cryptic "no album at URL" error
message. I had to break out ethereal to track it down to a couple of
functions where read-only function args were marked as by-ref. So they
didn't actually have a memory corruption problem yet the E_FATAL killed
them.
You'd have to agree that it is a bug on Gallery's part, though, right? If
they're read only, they shouldn't have been marked as by-ref (yes, we
screwed up by only introducing this error now after it worked for years,
but it's still problematic to let it go on working and create possible
corruption).
SquirrelMail has code like this all over the place:
$value = strtolower(array_shift(split('/\w/',trim($value))));
Here
array_shift()
does of course change the arg, so that is a potential
problem. And yes, that's a dumb way to do this, but people write code
like this. In some of these array manipulation calls, which seems to
account for a number of the BC problems we are having, we could check
for a non-ref and behave slightly differently. In the case of
array_shift() we could return the first arg and throw a notice. Same
would go forreset()
,end()
,next()
,prev()
and probably a few others.
We could probably provide a way for internal functions to denote that
they're handling by-ref 'wannabe' arguments, so that we won't generate a
fatal error for them. We'd still have to do it for userland functions (and
any other internal function). I'm not sure if it's worth it..?
Zeev
Hi Rasmus,
You start a new thread for my question, maybe you will get more
answers or at least better ones.
If it is a reference or not, I do not care, it should not "act"
differently. Understand that a quiet code in 5.0.4 must be quiet in
5.0.5.
If we add notices in 5.1.0 and then make them fatal in 5.2.0, I'm
fine. People will have time to migrate and "fix" their code.
Regards,
--Pierre
Just got back from Sri Lanka and am catching up on email. But in
general I don't agree with your statement that things that are quiet in
5.0.4 must be quiet in 5.0.5. Quietly corrupting memory doesn't really
work. Sometimes we need to break things slightly to fix things. My
message was more about if there was a better way to fix it. I looked at
the code and couldn't tell if it was somehow possible to catch this on
the write instead of on the call. It sounds like that isn't possible.
The second approach to try to regain some backwards compatibility is to
make some of our internal functions aware of the fact that they are
being passed a ref to a temp var and if so pretend it wasn't passed by
reference.
-Rasmus
Pierre Joye wrote:
Hi Rasmus,
You start a new thread for my question, maybe you will get more
answers or at least better ones.If it is a reference or not, I do not care, it should not "act"
differently. Understand that a quiet code in 5.0.4 must be quiet in
5.0.5.If we add notices in 5.1.0 and then make them fatal in 5.2.0, I'm
fine. People will have time to migrate and "fix" their code.Regards,
--Pierre
Just got back from Sri Lanka and am catching up on email. But in
general I don't agree with your statement that things that are quiet in
5.0.4 must be quiet in 5.0.5. Quietly corrupting memory doesn't really
work. Sometimes we need to break things slightly to fix things.
I agree about the fix, not about the fatal error in 5.0.5. See my
answer in my post.
My message was more about if there was a better way to fix it. I looked at
the code and couldn't tell if it was somehow possible to catch this on
the write instead of on the call. It sounds like that isn't possible.
The second approach to try to regain some backwards compatibility is to
make some of our internal functions aware of the fact that they are
being passed a ref to a temp var and if so pretend it wasn't passed by
reference.
As it is too late now anyway, regain BC makes little sense. Unless we
restore it all related BC breaks and fix the internal temp vars
management (which is equal to a rewrite of this part). From a user
point of view, it will only end to the same confusion as in arguments
order, or other little annoyances ;)
Regards,
--Pierre
Pierre Joye wrote:
As it is too late now anyway, regain BC makes little sense.
I don't agree. This is one of the main things stopping people from
migrating to PHP5 right now. If we can remove a bit of the migration
pain with some of clever temp var handling, I think it is worthwhile.
-Rasmus
Rasmus Lerdorf wrote:
SquirrelMail has code like this all over the place:
$value = strtolower(array_shift(split('/\w/',trim($value))));
Here
array_shift()
does of course change the arg, so that is a potential
problem. And yes, that's a dumb way to do this, but people write code
like this. In some of these array manipulation calls, which seems to
account for a number of the BC problems we are having, we could check
for a non-ref and behave slightly differently. In the case of
array_shift()
we could return the first arg and throw a notice. Same
would go forreset()
,end()
,next()
,prev()
and probably a few others.
I noticed the same problem with array_pop()
. I'm happy with the new
fatal error for user-space functions. But nobody - at least non-devs -
knows that functions like array_pop()
may not be used the way you
described here anymore. It's not documented somewhere and it's hard to
understand why. Many PHP programmers put more than one function in a row
(I know it's bad style), so there a lot of such code out there.
Would be great if that behaviour could be fixed.
What I don't understand - upgrading from 5.0.4 to 5.0.5 broke by far
more applications on my servers, than upgrading from 4.x to 5.x! And
there was no information in the News/ChangeLog.
For an admin it looked like a small, non-BC-braking security update
which should be installed as soon as possible, without any risk to break
scripts working with PHP 5.0.4.
And in 5.0.5 it's fatal error, not only notice.
best regards
Andreas
Hello Rasmus,
my solution was (and i proposed a patch for it here) that we have a way
to allow function signatures that pass variables as const just like c++
allows. This i did because first it is faster and second it applies to
most pass by ref signatures. Or in other words we could get rid of most
pass by ref sigs. And just to note, it was declined because it would
increase complexity a tight bit.
best regards
marcus
Monday, September 12, 2005, 9:57:24 AM, you wrote:
Guys, could we take a look at making the ref to temp var fix a bit
narrower? Currently we try to catch it at call-time. This means that
something like:
current(explode(' ','a b'))
as per bug #34468 doesn't work. Now, I think there is a secondary bug
here. I see no reason forcurrent()
to take a by-ref, so this
particular one could be easily fixed. But there are many other cases
where a function legitimately takes a by-ref and doesn't necessarily
write to it or the write is a secondary action not required for the code
to work. Could we not catch this on the write instead of on the call?
The memory problem happens on the write. Or perhaps better, anE_NOTICE
orE_STRICT
on the call and an E_FATAL on the write. The current
E_FATAL on the call seems out of whack.
Gallery, for example, broke in a rather subtle way in their
gallery_remote2.php script which meant the various client-side tools
like iphototogallery and others got a cryptic "no album at URL" error
message. I had to break out ethereal to track it down to a couple of
functions where read-only function args were marked as by-ref. So they
didn't actually have a memory corruption problem yet the E_FATAL killed
them.
SquirrelMail has code like this all over the place:
$value = strtolower(array_shift(split('/\w/',trim($value))));
Here
array_shift()
does of course change the arg, so that is a potential
problem. And yes, that's a dumb way to do this, but people write code
like this. In some of these array manipulation calls, which seems to
account for a number of the BC problems we are having, we could check
for a non-ref and behave slightly differently. In the case of
array_shift()
we could return the first arg and throw a notice. Same
would go forreset()
,end()
,next()
,prev()
and probably a few others.
-Rasmus
Best regards,
Marcus