[PATCH] __attribute_malloc__ for malloc-like functions

22 years ago by Ard Biesheuvel — view source — reply

unread

Hello group,

After having spent some time digging into the origin of the
strict-aliasing warnings when compiling with GCC 3.3 (brought about by
Stefan Roehrich's post a couple of days ago), I came up with a couple of
things that might be of interest in the development of PHP.

First of all, GCC provides an function qualifier attribute_malloc,
which tells the compiler that the function in question is a malloc-like
function, and will therefore never return values that might be aliased
by other pointers in the scope of the function that called it. This is
quite a boost for the effectiveness of the optimizations, because - the
return type of malloc() being void* - the compiler has to take into
account that every pointer in scope might be an alias of the resulting
value. attribute_malloc fixes that. (Disabled for non-GCC compilers)

The other thing is, that a lot of warnings are being triggered by
invocations of zend_hash_find(). This function stores its result in the
location pointed to by a void** argument, and returns an int specifying
whether the key was found or not. As the comment in the source states,
this was a conscious decision, because hashes can also be used to store
null pointers.

After looking through some of the code, I found out that in a lot of
cases, the pointers are expected not to be zero, and are being
dereferenced in the code that immediately follows it [without checking
for validity first].

Wouldn't it be more straight-forward to introduce an analogue for
zend_hash_find() (eg. 'void *zend_hash_get()') which returns the stored
pointer [or NULL on failure]. It could be used for applications where
null pointers aren't allowed, and fix current code, where the implicit
assumption is made the SUCCESS means the the pointers is non-null.

Last of all, I would like to know if we care about the warnings.
Personally I don't really mind, as GCC's capabilities of accurately
identifying violations of strict-aliasing rules are limited anyway.

Ard

22 years ago by Zeev Suraski — view source — reply

unread

At 23:43 27/08/2003, Ard Biesheuvel wrote:

The other thing is, that a lot of warnings are being triggered by
invocations of zend_hash_find(). This function stores its result in the
location pointed to by a void** argument, and returns an int specifying
whether the key was found or not. As the comment in the source states,
this was a conscious decision, because hashes can also be used to store
null pointers.

After looking through some of the code, I found out that in a lot of
cases, the pointers are expected not to be zero, and are being
dereferenced in the code that immediately follows it [without checking for
validity first].

Wouldn't it be more straight-forward to introduce an analogue for
zend_hash_find() (eg. 'void *zend_hash_get()') which returns the stored
pointer [or NULL on failure]. It could be used for applications where null
pointers aren't allowed, and fix current code, where the implicit
assumption is made the SUCCESS means the the pointers is non-null.

Last of all, I would like to know if we care about the warnings.
Personally I don't really mind, as GCC's capabilities of accurately
identifying violations of strict-aliasing rules are limited anyway.

I'd just make gcc shut up about it. Using void ** is the generic way to do
what we want to do and I see no reason to introduce another way that works
only in a subset of the cases, and will require tons of code changes. I
have to admit I still haven't been able to figure out what gcc thinks is
wrong with casting a zval *** pointer to a void **, considering that any
pointer can be cast to a void *, and that pointers always have the same
size (i.e., any pointer is castable to void *).

Not sure about attribute_malloc - it sounds reasonable but I have no
experience with it. Were you able to measure any performance difference?

Zeev

22 years ago by Sascha Schumann — view source — reply

unread

account that every pointer in scope might be an alias of the resulting
value. attribute_malloc fixes that. (Disabled for non-GCC compilers)

What about erealloc?  It can return the same pointer as
passed to it.

After looking through some of the code, I found out that in a lot of
cases, the pointers are expected not to be zero, and are being
dereferenced in the code that immediately follows it [without checking
for validity first].

Segfaulting early is a good thing.  Whenever a crash occurs a
defect has been found and needs to be fixed.  If the `NULL` would
just be silently ignored, the real bug (where `NULL` is
erroneously inserted) could slip by and make it into a
release.

Wouldn't it be more straight-forward to introduce an analogue for
zend_hash_find() (eg. 'void *zend_hash_get()') which returns the stored
pointer [or NULL on failure]. It could be used for applications where
null pointers aren't allowed, and fix current code, where the implicit
assumption is made the SUCCESS means the the pointers is non-null.

Then you would also need a _set function which rejects `NULL` 
values.  Considering that the existing APIs have worked quite
well in the past, I don't see a need for adding such a class
of APIs.

Last of all, I would like to know if we care about the warnings.
Personally I don't really mind, as GCC's capabilities of accurately
identifying violations of strict-aliasing rules are limited anyway.

I think we should address them.  Many warnings provide
information about potential defects, but the flood of aliasing
warnings deters developers from considering the relevant
ones.

- Sascha

22 years ago by Ard Biesheuvel — view source — reply

unread

account that every pointer in scope might be an alias of the resulting
value. attribute_malloc fixes that. (Disabled for non-GCC compilers)
What about erealloc?  It can return the same pointer as
passed to it.

GNU C realloc() does have the attribute((malloc)) itself. This seems
reasonable, because, from a strict-aliasing point of view, the result of
the function is either a new 'fresh' pointer, or equal to the first
argument passed to realloc(), to which the strict-aliasing rules apply
already.

Ard