Hello everyone,
I think there are a few improvements that we could make to some fundamental algorithms.
I'd like to point out one particular for now, to see if there is any interest in these kind of low level code optimizations from the community.
Please consider the following code snippet extracted from
static zend_always_inline const void *zend_memrchr(const void *s, int c, size_t n)
{
const unsigned char *e;
if (n <= 0) {
return NULL;
}
for (e = (const unsigned char *)s + n - 1; e >= (const unsigned char *)s; e--) {
if (*e == (const unsigned char)c) {
return (const void *)e;
}
}
return NULL;
}
The issue with this is that on each loop we have to do two comparison tests, one against the length boundary and one to check for the actual occurrence of the search character.
We can avoid checking the length boundary if we use a sentinel by setting the last possible element to the character we are searching for, whereby we guarantee that we will find it and no longer need the test inside the loop. After the search we reset that character and then recheck the last iteration step to see if we have aborted the loop because of reaching the end (false positive) or because of an actual positive.
A patch could look like this:
static zend_always_inline const void *zend_memrchr(const void *s, int c, size_t n)
{
const unsigned char *b;
const unsigned char *e;
const unsigned char t;
if (n > 0) {
/* Initialize boundary pointers */
b = (const unsigned char *)s;
e = b + n - 1;
/* Set sentinel */
t = *b;
*b = (const unsigned char)c;
/* Search */
while (*e != (const unsigned char)c) {
--e;
}
/* Restore sentinel */
*b = t;
/* Recheck */
if (e != b || t == (const unsigned char)c) {
return (const void *)e;
}
}
return NULL;
}
For the above example we could go one step further and test an entire longword at a time with alignment on longword boundaries, just like some modern C library implementations do. Which begs the question: Is there a reason we are not using standard memrchr here?
This kind of approach might be applicable to other string handling functions as well. Should I look into this further?
Cheers,
Benjamin Coutu
--
Bejamin Coutu
ben.coutu@zeyos.com
ZeyOS, Inc.
http://www.zeyos.com
Hello everyone,
I think there are a few improvements that we could make to some fundamental algorithms.
I'd like to point out one particular for now, to see if there is any interest in these kind of low level code optimizations from the community.Please consider the following code snippet extracted from
static zend_always_inline const void *zend_memrchr(const void *s, int c, size_t n)
{
const unsigned char *e;
if (n <= 0) {
return NULL;
}for (e = (const unsigned char *)s + n - 1; e >= (const unsigned char *)s; e--) {
if (*e == (const unsigned char)c) {
return (const void *)e;
}
}
return NULL;
}The issue with this is that on each loop we have to do two comparison tests, one against the length boundary and one to check for the actual occurrence of the search character.
We can avoid checking the length boundary if we use a sentinel by setting the last possible element to the character we are searching for, whereby we guarantee that we will find it and no longer need the test inside the loop. After the search we reset that character and then recheck the last iteration step to see if we have aborted the loop because of reaching the end (false positive) or because of an actual positive.
A patch could look like this:
static zend_always_inline const void *zend_memrchr(const void *s, int c, size_t n)
{
const unsigned char *b;
const unsigned char *e;
const unsigned char t;if (n > 0) {
/* Initialize boundary pointers */
b = (const unsigned char *)s;
e = b + n - 1;/* Set sentinel */ t = *b; *b = (const unsigned char)c; /* Search */ while (*e != (const unsigned char)c) { --e; } /* Restore sentinel */ *b = t; /* Recheck */ if (e != b || t == (const unsigned char)c) { return (const void *)e; }
}
return NULL;
}For the above example we could go one step further and test an entire longword at a time with alignment on longword boundaries, just like some modern C library implementations do. Which begs the question: Is there a reason we are not using standard memrchr here?
This kind of approach might be applicable to other string handling functions as well. Should I look into this further?
Cheers,
Benjamin Coutu
--
Bejamin Coutu
ben.coutu@zeyos.comZeyOS, Inc.
http://www.zeyos.com--
An obvious issue with this is that not all strings are writable. The
signature of the function also specifies it is a pointer to a const so
you shouldn't be able to do this anyway.