Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).
I'm going to look into the similar optimization for CVs, but it's going to
be a bit more difficult.
Thanks. Dmitry.
Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).I'm going to look into the similar optimization for CVs, but it's going to
be a bit more difficult.
Looks good to me. I'll commit this change to APC when you commit it:
Index: apc_zend.c
--- apc_zend.c (revision 328577)
+++ apc_zend.c (working copy)
@@ -48,7 +48,11 @@
static opcode_handler_t *apc_original_opcode_handlers;
static opcode_handler_t apc_opcode_handlers[APC_OPCODE_HANDLER_COUNT];
+#if PHP_MAJOR_VERSION
>= 6 || PHP_MAJOR_VERSION
== 5 &&
PHP_MINOR_VERSION
>= 5
+#define APC_EX_T(offset) (EX_TMP_VAR(execute_data,
offset))
+#else
#define APC_EX_T(offset) ((temp_variable
)((char)execute_data->Ts + offset))
+#endif
And there are a couple of extensions that are going to need similar
changes because of this. pecl/optimizer, pecl/inclued, pecl/xhprof,
pecl/operator and xdebug from a quick check.
-Rasmus
Hi Rasmus,
I'm glad the patch is not a big problem for APC.
BTW: I decided not to commit the patch as is, and implement the similar
optimization for CVs first.
Otherwise I may need to change the way TMP_VAR accessed twice.
I'll probably send the patch for review later today.
Thanks. Dmitry.
Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).I'm going to look into the similar optimization for CVs, but it's going
to
be a bit more difficult.Looks good to me. I'll commit this change to APC when you commit it:
Index: apc_zend.c
--- apc_zend.c (revision 328577)
+++ apc_zend.c (working copy)
@@ -48,7 +48,11 @@
static opcode_handler_t *apc_original_opcode_handlers;
static opcode_handler_t apc_opcode_handlers[APC_OPCODE_HANDLER_COUNT];+#if
PHP_MAJOR_VERSION
>= 6 ||PHP_MAJOR_VERSION
== 5 &&
PHP_MINOR_VERSION
>= 5
+#define APC_EX_T(offset) (EX_TMP_VAR(execute_data,
offset))
+#else
#define APC_EX_T(offset) ((temp_variable
)((char)execute_data->Ts + offset))
+#endifAnd there are a couple of extensions that are going to need similar
changes because of this. pecl/optimizer, pecl/inclued, pecl/xhprof,
pecl/operator and xdebug from a quick check.-Rasmus
The new proposed patch: http://pastebin.com/pj5fQTfN
Now both execute_data->Ts and execute_data->CVs are removed and
corresponding temporary and compiled variables accessed using
"execute_data" as the base pointer. Temporary variables allocate directly
before the "execute_data" in reverse order and compiled variables right
after. So they can be accessed without any additional computations. The
patch reduces the number of executed instructions and number of memory
reads (about 8% less).
I'm going to commit the patch on Tuesday.
Thanks. Dmitry.
Hi Rasmus,
I'm glad the patch is not a big problem for APC.
BTW: I decided not to commit the patch as is, and implement the similar
optimization for CVs first.
Otherwise I may need to change the way TMP_VAR accessed twice.
I'll probably send the patch for review later today.Thanks. Dmitry.
On Fri, Nov 30, 2012 at 11:48 PM, Rasmus Lerdorf rasmus@lerdorf.comwrote:
Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).I'm going to look into the similar optimization for CVs, but it's going
to
be a bit more difficult.Looks good to me. I'll commit this change to APC when you commit it:
Index: apc_zend.c
--- apc_zend.c (revision 328577)
+++ apc_zend.c (working copy)
@@ -48,7 +48,11 @@
static opcode_handler_t *apc_original_opcode_handlers;
static opcode_handler_t apc_opcode_handlers[APC_OPCODE_HANDLER_COUNT];+#if
PHP_MAJOR_VERSION
>= 6 ||PHP_MAJOR_VERSION
== 5 &&
PHP_MINOR_VERSION
>= 5
+#define APC_EX_T(offset) (EX_TMP_VAR(execute_data,
offset))
+#else
#define APC_EX_T(offset) ((temp_variable
)((char)execute_data->Ts + offset))
+#endifAnd there are a couple of extensions that are going to need similar
changes because of this. pecl/optimizer, pecl/inclued, pecl/xhprof,
pecl/operator and xdebug from a quick check.-Rasmus
Hi Dmitry.
is this improvemnt (intruction reduce and memory read reduce) preserved under
"strong" compiler optimization (-O2, -O3) ?
is it something that the compiler can not automatically optimize ?
anyhow, the patch seems impressive...
The new proposed patch: http://pastebin.com/pj5fQTfN
Now both execute_data->Ts and execute_data->CVs are removed and
corresponding temporary and compiled variables accessed using
"execute_data" as the base pointer. Temporary variables allocate directly
before the "execute_data" in reverse order and compiled variables right
after. So they can be accessed without any additional computations. The
patch reduces the number of executed instructions and number of memory
reads (about 8% less).I'm going to commit the patch on Tuesday.
Thanks. Dmitry.
Hi Rasmus,
I'm glad the patch is not a big problem for APC.
BTW: I decided not to commit the patch as is, and implement the similar
optimization for CVs first.
Otherwise I may need to change the way TMP_VAR accessed twice.
I'll probably send the patch for review later today.Thanks. Dmitry.
On Fri, Nov 30, 2012 at 11:48 PM, Rasmus Lerdorf
rasmus@lerdorf.comwrote:Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).I'm going to look into the similar optimization for CVs, but it's
goingto
be a bit more difficult.
Looks good to me. I'll commit this change to APC when you commit it:
Index: apc_zend.c
--- apc_zend.c (revision 328577)
+++ apc_zend.c (working copy)
@@ -48,7 +48,11 @@static opcode_handler_t *apc_original_opcode_handlers;
static opcode_handler_t apc_opcode_handlers[APC_OPCODE_HANDLER_COUNT];+#if
PHP_MAJOR_VERSION
>= 6 ||PHP_MAJOR_VERSION
== 5 &&
PHP_MINOR_VERSION
>= 5
+#define APC_EX_T(offset) (*EX_TMP_VAR(execute_data,
offset))
+#else#define APC_EX_T(offset) (*(temp_variable
)((char)execute_data->Ts + offset))
+#endifAnd there are a couple of extensions that are going to need similar
changes because of this. pecl/optimizer, pecl/inclued, pecl/xhprof,
pecl/operator and xdebug from a quick check.-Rasmus
Hi Yoram,
Yeas, I checked memory reads on a release PHP built (with -O2 optimization
level).
The idea is very simple. Before the patch each access to TMP was
implemented as:
execute_data->Ts + offset
GCC with -O2 kept execute_data in register but it needed to read
execute_data->Ts anyway and then add the offset
Now we just add the offset to execute_data directly.
Also, the patch I published today is not compatible with 64-bit systems.
I already fixed it, and the fix is minimal.
Thanks. Dmitry.
Hi Dmitry.
is this improvemnt (intruction reduce and memory read reduce) preserved
under
"strong" compiler optimization (-O2, -O3) ?
is it something that the compiler can not automatically optimize ?
anyhow, the patch seems impressive...The new proposed patch: http://pastebin.com/pj5fQTfN
Now both execute_data->Ts and execute_data->CVs are removed and
corresponding temporary and compiled variables accessed using
"execute_data" as the base pointer. Temporary variables allocate directly
before the "execute_data" in reverse order and compiled variables right
after. So they can be accessed without any additional computations. The
patch reduces the number of executed instructions and number of memory
reads (about 8% less).I'm going to commit the patch on Tuesday.
Thanks. Dmitry.
Hi Rasmus,
I'm glad the patch is not a big problem for APC.
BTW: I decided not to commit the patch as is, and implement the similar
optimization for CVs first.
Otherwise I may need to change the way TMP_VAR accessed twice.
I'll probably send the patch for review later today.Thanks. Dmitry.
On Fri, Nov 30, 2012 at 11:48 PM, Rasmus Lerdorf
rasmus@lerdorf.comwrote:Hi,
The NEWS and UPGRADING explains the details.
The patch is big, but actually quite simple.
I'm going to commit it on Monday or Tuesday (if no objections).I'm going to look into the similar optimization for CVs, but it's
goingto
be a bit more difficult.
Looks good to me. I'll commit this change to APC when you commit it:
Index: apc_zend.c
--- apc_zend.c (revision 328577)
+++ apc_zend.c (working copy)
@@ -48,7 +48,11 @@static opcode_handler_t *apc_original_opcode_handlers;
static opcode_handler_t
apc_opcode_handlers[APC_OPCODE_HANDLER_COUNT];+#if
PHP_MAJOR_VERSION
>= 6 ||PHP_MAJOR_VERSION
== 5 &&
PHP_MINOR_VERSION
>= 5
+#define APC_EX_T(offset)
(*EX_TMP_VAR(execute_data,
offset))
+#else#define APC_EX_T(offset) (*(temp_variable
)((char)execute_data->Ts + offset))
+#endifAnd there are a couple of extensions that are going to need similar
changes because of this. pecl/optimizer, pecl/inclued, pecl/xhprof,
pecl/operator and xdebug from a quick check.-Rasmus
Hi Yoram,
Yeas, I checked memory reads on a release PHP built (with -O2 optimization
level).The idea is very simple. Before the patch each access to TMP was
implemented as:execute_data->Ts + offset
GCC with -O2 kept execute_data in register but it needed to read
execute_data->Ts anyway and then add the offsetNow we just add the offset to execute_data directly.
Also, the patch I published today is not compatible with 64-bit systems.
I already fixed it, and the fix is minimal.Thanks. Dmitry.
Great work Dmitry, as usual !
Julien
The new proposed patch: http://pastebin.com/pj5fQTfN
Now both execute_data->Ts and execute_data->CVs are removed and
corresponding temporary and compiled variables accessed using
"execute_data" as the base pointer. Temporary variables allocate directly
before the "execute_data" in reverse order and compiled variables right
after. So they can be accessed without any additional computations. The
patch reduces the number of executed instructions and number of memory
reads (about 8% less).
Did you also test how much these 8% less memory reads improve performance?
Would be interesting to know :)
Nikita
Unfortunately, I didn't see visible performance gain on x86. :(
It's near the measurement mistake.
Probably it's because reading from L1 data cache is very cheap.
Thanks. Dmitry.
The new proposed patch: http://pastebin.com/pj5fQTfN
Now both execute_data->Ts and execute_data->CVs are removed and
corresponding temporary and compiled variables accessed using
"execute_data" as the base pointer. Temporary variables allocate directly
before the "execute_data" in reverse order and compiled variables right
after. So they can be accessed without any additional computations. The
patch reduces the number of executed instructions and number of memory
reads (about 8% less).Did you also test how much these 8% less memory reads improve performance?
Would be interesting to know :)Nikita