Hi folks. It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.
I've prepared a patch to allow the timelib code to use the system
timezone database directly; would something like this be acceptable, any
thoughts? This passes the ext/date tests, and reduces the PHP binary
size by about 300K to boot.
Notes:
-
I've not implemented timelib_timezone_builtin_identifiers_list() here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the directory. -
there's no general way that I can find to obtain the database
version, so I just invented a string here.
joe
Index: ext/date/lib/parse_tz.c
RCS file: /repository/php-src/ext/date/lib/parse_tz.c,v
retrieving revision 1.35
diff -u -r1.35 parse_tz.c
--- ext/date/lib/parse_tz.c 31 Dec 2007 07:12:08 -0000 1.35
+++ ext/date/lib/parse_tz.c 9 Jan 2008 14:04:28 -0000
@@ -20,6 +20,14 @@
#include "timelib.h"
+#ifdef HAVE_SYSTEM_TZDATA
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <unistd.h>
+#endif
#include <stdio.h>
#ifdef HAVE_STRING_H
@@ -27,7 +35,10 @@
#else
#include <strings.h>
#endif
+#ifndef HAVE_SYSTEM_TZDATA
#include "timezonedb.h"
+#endif
#if (defined(APPLE) || defined(APPLE_CC)) && (defined(BIG_ENDIAN) || defined(LITTLE_ENDIAN))
if defined(LITTLE_ENDIAN)
@@ -202,6 +213,86 @@
}
}
+#ifdef HAVE_SYSTEM_TZDATA
+#ifdef HAVE_SYSTEM_TZDATA_PREFIX
+#define ZONEINFO_PREFIX HAVE_SYSTEM_TZDATA_PREFIX
+#else
+#define ZONEINFO_PREFIX "/usr/share/zoneinfo"
+#endif
+static const timelib_tzdb timezonedb_system = { "0.system", 0, NULL, NULL
};
+/* Return the mmap()ed tzfile if found, else NULL. On success, the
-
- length of the mapped data is placed in *length. */
+static char *map_tzfile(const char *timezone, size_t *length)
+{
- length of the mapped data is placed in *length. */
- char fname[PATH_MAX];
- struct stat st;
- char *p;
- int fd;
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- fd = open(fname, O_RDONLY);
- if (fd == -1) {
-
return NULL;
- } else if (fstat(fd, &st) != 0 || st.st_size < 21) {
-
close(fd);
-
return NULL;
- }
- *length = st.st_size;
- p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
- close(fd);
- return p != MAP_FAILED ? p : NULL;
+}
+const timelib_tzdb *timelib_builtin_db(void)
+{
- return &timezonedb_system;
+}
+const timelib_tzdb_index_entry *timelib_timezone_builtin_identifiers_list(int *count)
+{
- *count = 0;
- return NULL;
+}
+int timelib_timezone_id_is_valid(char *timezone, const timelib_tzdb *tzdb)
+{
- char fname[PATH_MAX];
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- return access(fname, R_OK) == 0 ? 1 : 0;
+}
+timelib_tzinfo *timelib_parse_tzfile(char *timezone, const timelib_tzdb *tzdb)
+{
- char *tzf, *orig;
- timelib_tzinfo *tmp;
- size_t len;
- orig = map_tzfile(timezone, &len);
- if (orig == NULL) {
-
return NULL;
- }
- tmp = timelib_tzinfo_ctor(timezone);
- tzf = orig + 20;
- read_header(&tzf, tmp);
- read_transistions(&tzf, tmp);
- read_types(&tzf, tmp);
- munmap(orig, len);
- return tmp;
+}
+#else
static int seek_to_tz_position(const unsigned char **tzf, char *timezone, const timelib_tzdb *tzdb)
{
int left = 0, right = tzdb->index_size - 1;
@@ -258,6 +349,7 @@
return tmp;
}
+#endif
static ttinfo* fetch_timezone_offset(timelib_tzinfo *tz, timelib_sll ts, timelib_sll *transition_time)
{
Index: ext/date/lib/timelib.m4
RCS file: /repository/php-src/ext/date/lib/timelib.m4,v
retrieving revision 1.4
diff -u -r1.4 timelib.m4
--- ext/date/lib/timelib.m4 3 Jul 2005 23:30:52 -0000 1.4
+++ ext/date/lib/timelib.m4 9 Jan 2008 14:04:28 -0000
@@ -78,3 +78,17 @@
dnl Check for strtoll, atoll
AC_CHECK_FUNCS(strtoll atoll strftime)
+PHP_ARG_WITH(system-tzdata, for use of system timezone data,
+[ --with-system-tzdata[=DIR] to specify use of system timezone data],
+no, no)
+if test "$PHP_SYSTEM_TZDATA" != "no"; then
- AC_DEFINE(HAVE_SYSTEM_TZDATA, 1, [Define if system timezone data is used])
- if test "$PHP_SYSTEM_TZDATA" != "yes"; then
-
AC_DEFINE_UNQUOTED(HAVE_SYSTEM_TZDATA_PREFIX, "$PHP_SYSTEM_TZDATA",
-
[Define for location of system timezone data])
- fi
+fi
Definitely a great idea, since most linux distributions already bundle this
information and update it frequently.
BTW, your implementation seems to require some security checks for timezone
names like "../../../etc/passwd".
Nuno
----- Original Message -----
From: "Joe Orton" jorton@redhat.com
To: derick@php.net; internals@lists.php.net
Sent: Wednesday, January 09, 2008 2:15 PM
Subject: [PHP-DEV] [PATCH] date/timelib: use system timezone database
Hi folks. It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.I've prepared a patch to allow the timelib code to use the system
timezone database directly; would something like this be acceptable, any
thoughts? This passes the ext/date tests, and reduces the PHP binary
size by about 300K to boot.Notes:
I've not implemented timelib_timezone_builtin_identifiers_list() here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the directory.there's no general way that I can find to obtain the database
version, so I just invented a string here.joe
Index: ext/date/lib/parse_tz.c
RCS file: /repository/php-src/ext/date/lib/parse_tz.c,v
retrieving revision 1.35
diff -u -r1.35 parse_tz.c
--- ext/date/lib/parse_tz.c 31 Dec 2007 07:12:08 -0000 1.35
+++ ext/date/lib/parse_tz.c 9 Jan 2008 14:04:28 -0000
@@ -20,6 +20,14 @@#include "timelib.h"
+#ifdef HAVE_SYSTEM_TZDATA
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <unistd.h>
+#endif
#include <stdio.h>
#ifdef HAVE_STRING_H
@@ -27,7 +35,10 @@
#else
#include <strings.h>
#endif
+#ifndef HAVE_SYSTEM_TZDATA
#include "timezonedb.h"
+#endif#if (defined(APPLE) || defined(APPLE_CC)) &&
(defined(BIG_ENDIAN) || defined(LITTLE_ENDIAN))if defined(LITTLE_ENDIAN)
@@ -202,6 +213,86 @@
}
}+#ifdef HAVE_SYSTEM_TZDATA
+#ifdef HAVE_SYSTEM_TZDATA_PREFIX
+#define ZONEINFO_PREFIX HAVE_SYSTEM_TZDATA_PREFIX
+#else
+#define ZONEINFO_PREFIX "/usr/share/zoneinfo"
+#endif
+static const timelib_tzdb timezonedb_system = { "0.system", 0, NULL,
NULL
};
+/* Return the mmap()ed tzfile if found, else NULL. On success, the
- length of the mapped data is placed in *length. */
+static char *map_tzfile(const char *timezone, size_t *length)
+{- char fname[PATH_MAX];
- struct stat st;
- char *p;
- int fd;
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- fd = open(fname, O_RDONLY);
- if (fd == -1) {
- return NULL;
- } else if (fstat(fd, &st) != 0 || st.st_size < 21) {
- close(fd);
- return NULL;
- }
- *length = st.st_size;
- p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
- close(fd);
- return p != MAP_FAILED ? p : NULL;
+}+const timelib_tzdb *timelib_builtin_db(void)
+{
- return &timezonedb_system;
+}+const timelib_tzdb_index_entry
*timelib_timezone_builtin_identifiers_list(int *count)
+{
- *count = 0;
- return NULL;
+}+int timelib_timezone_id_is_valid(char *timezone, const timelib_tzdb
*tzdb)
+{
- char fname[PATH_MAX];
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- return access(fname, R_OK) == 0 ? 1 : 0;
+}+timelib_tzinfo *timelib_parse_tzfile(char *timezone, const timelib_tzdb
*tzdb)
+{
- char *tzf, *orig;
- timelib_tzinfo *tmp;
- size_t len;
- orig = map_tzfile(timezone, &len);
- if (orig == NULL) {
- return NULL;
- }
- tmp = timelib_tzinfo_ctor(timezone);
- tzf = orig + 20;
- read_header(&tzf, tmp);
- read_transistions(&tzf, tmp);
- read_types(&tzf, tmp);
- munmap(orig, len);
- return tmp;
+}
+#elsestatic int seek_to_tz_position(const unsigned char **tzf, char *timezone,
const timelib_tzdb *tzdb)
{
int left = 0, right = tzdb->index_size - 1;
@@ -258,6 +349,7 @@return tmp;
}
+#endifstatic ttinfo* fetch_timezone_offset(timelib_tzinfo *tz, timelib_sll ts,
timelib_sll *transition_time)
{
Index: ext/date/lib/timelib.m4RCS file: /repository/php-src/ext/date/lib/timelib.m4,v
retrieving revision 1.4
diff -u -r1.4 timelib.m4
--- ext/date/lib/timelib.m4 3 Jul 2005 23:30:52 -0000 1.4
+++ ext/date/lib/timelib.m4 9 Jan 2008 14:04:28 -0000
@@ -78,3 +78,17 @@dnl Check for strtoll, atoll
AC_CHECK_FUNCS(strtoll atoll strftime)
+PHP_ARG_WITH(system-tzdata, for use of system timezone data,
+[ --with-system-tzdata[=DIR] to specify use of system timezone
data],
+no, no)
+if test "$PHP_SYSTEM_TZDATA" != "no"; then
- AC_DEFINE(HAVE_SYSTEM_TZDATA, 1, [Define if system timezone data is
used])- if test "$PHP_SYSTEM_TZDATA" != "yes"; then
AC_DEFINE_UNQUOTED(HAVE_SYSTEM_TZDATA_PREFIX, "$PHP_SYSTEM_TZDATA",
[Define for location of system timezone data])
- fi
+fi
2008/1/9, Joe Orton jorton@redhat.com:
Hi folks. It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.
+1000 :-D
2008/1/9, Joe Orton jorton@redhat.com:
Hi folks. It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.
It is also a bit of a headache when a server does not have a complete
timezone database for an application to utilize. PHP did use the system
timezone database for a long time. But, applications like Phorum would
have to end up making their own (and not nearly as robust) anyway
because servers would not have all the timezones on the system. The new
way of PHP solves this for application developers.
-1000
--
Brian Moon
Senior Developer
When you care enough to spend the very least.
http://dealnews.com/
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.
-1.
Hi folks. It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.I've prepared a patch to allow the timelib code to use the system
timezone database directly; would something like this be acceptable,
any
thoughts? This passes the ext/date tests, and reduces the PHP binary
size by about 300K to boot.Notes:
I've not implemented timelib_timezone_builtin_identifiers_list()
here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the
directory.there's no general way that I can find to obtain the database
version, so I just invented a string here.joe
Index: ext/date/lib/parse_tz.c
RCS file: /repository/php-src/ext/date/lib/parse_tz.c,v
retrieving revision 1.35
diff -u -r1.35 parse_tz.c
--- ext/date/lib/parse_tz.c 31 Dec 2007 07:12:08 -0000 1.35
+++ ext/date/lib/parse_tz.c 9 Jan 2008 14:04:28 -0000
@@ -20,6 +20,14 @@#include "timelib.h"
+#ifdef HAVE_SYSTEM_TZDATA
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <unistd.h>
+#endif
#include <stdio.h>
#ifdef HAVE_STRING_H
@@ -27,7 +35,10 @@
#else
#include <strings.h>
#endif
+#ifndef HAVE_SYSTEM_TZDATA
#include "timezonedb.h"
+#endif#if (defined(APPLE) || defined(APPLE_CC)) &&
(defined(BIG_ENDIAN) || defined(LITTLE_ENDIAN))if defined(LITTLE_ENDIAN)
@@ -202,6 +213,86 @@
}
}+#ifdef HAVE_SYSTEM_TZDATA
+#ifdef HAVE_SYSTEM_TZDATA_PREFIX
+#define ZONEINFO_PREFIX HAVE_SYSTEM_TZDATA_PREFIX
+#else
+#define ZONEINFO_PREFIX "/usr/share/zoneinfo"
+#endif
+static const timelib_tzdb timezonedb_system = { "0.system", 0,
NULL,NULL
};
+/* Return the mmap()ed tzfile if found, else NULL. On success, the
- length of the mapped data is placed in *length. */
+static char *map_tzfile(const char *timezone, size_t *length)
+{- char fname[PATH_MAX];
- struct stat st;
- char *p;
- int fd;
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- fd = open(fname, O_RDONLY);
- if (fd == -1) {
return NULL;
- } else if (fstat(fd, &st) != 0 || st.st_size < 21) {
close(fd);
return NULL;
- }
- *length = st.st_size;
- p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
- close(fd);
- return p != MAP_FAILED ? p : NULL;
+}+const timelib_tzdb *timelib_builtin_db(void)
+{
- return &timezonedb_system;
+}+const timelib_tzdb_index_entry
*timelib_timezone_builtin_identifiers_list(int *count)
+{
- *count = 0;
- return NULL;
+}+int timelib_timezone_id_is_valid(char *timezone, const timelib_tzdb
*tzdb)
+{
- char fname[PATH_MAX];
- snprintf(fname, sizeof fname, ZONEINFO_PREFIX "/%s", timezone);
- return access(fname, R_OK) == 0 ? 1 : 0;
+}+timelib_tzinfo *timelib_parse_tzfile(char *timezone, const
timelib_tzdb *tzdb)
+{
- char *tzf, *orig;
- timelib_tzinfo *tmp;
- size_t len;
- orig = map_tzfile(timezone, &len);
- if (orig == NULL) {
return NULL;
- }
- tmp = timelib_tzinfo_ctor(timezone);
- tzf = orig + 20;
- read_header(&tzf, tmp);
- read_transistions(&tzf, tmp);
- read_types(&tzf, tmp);
- munmap(orig, len);
- return tmp;
+}
+#elsestatic int seek_to_tz_position(const unsigned char **tzf, char
*timezone, const timelib_tzdb *tzdb)
{
int left = 0, right = tzdb->index_size - 1;
@@ -258,6 +349,7 @@return tmp;
}
+#endifstatic ttinfo* fetch_timezone_offset(timelib_tzinfo *tz, timelib_sll
ts, timelib_sll *transition_time)
{
Index: ext/date/lib/timelib.m4RCS file: /repository/php-src/ext/date/lib/timelib.m4,v
retrieving revision 1.4
diff -u -r1.4 timelib.m4
--- ext/date/lib/timelib.m4 3 Jul 2005 23:30:52 -0000 1.4
+++ ext/date/lib/timelib.m4 9 Jan 2008 14:04:28 -0000
@@ -78,3 +78,17 @@dnl Check for strtoll, atoll
AC_CHECK_FUNCS(strtoll atoll strftime)
+PHP_ARG_WITH(system-tzdata, for use of system timezone data,
+[ --with-system-tzdata[=DIR] to specify use of system
timezone data],
+no, no)
+if test "$PHP_SYSTEM_TZDATA" != "no"; then
- AC_DEFINE(HAVE_SYSTEM_TZDATA, 1, [Define if system timezone data
is used])- if test "$PHP_SYSTEM_TZDATA" != "yes"; then
AC_DEFINE_UNQUOTED(HAVE_SYSTEM_TZDATA_PREFIX,
"$PHP_SYSTEM_TZDATA",
[Define for location of system timezone
data])
- fi
+fi--
Ilia Alshanetsky
2008/1/9, Ilia Alshanetsky ilia@prohost.org:
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.
please consider this patch as an alternative for us !!
in anycase, you will probably find this patch included in distros
anyway, because this particular "feature" causes a lot of unnecesary
manteniance pain, distributions just update the system timezonedb in
regular basis.
2008/1/9, Ilia Alshanetsky ilia@prohost.org:
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.please consider this patch as an alternative for us !!
Why do you need this? There is the PECL timezonedb extension which is
updated just as frequently.
in anycase, you will probably find this patch included in distros
anyway, because this particular "feature" causes a lot of unnecesary
manteniance pain, distributions just update the system timezonedb in
regular basis.
Yes, and that means bye-bye to support on it. If everybody would patch
PHP then how can we every support things?
regards,
Derick
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org
2008/1/9, Derick Rethans derick@php.net:
Why do you need this?
It is simple, even releasing an update for a particular extension,
trigger the whole manteniance and QA in distributions, and it
unneccesary work , when the system tz is used you have QA and mantain
just one component, the "timezone" package.
jfyi: in a complete distribution, there are only 3 components that has
this highly questionable behaviuor, Evolution, PHP and JAVA all the
other hundred of components do the right thing or at least offer an
option just like the Joe's patch.
my 2 CLP.
Hi Ilia, Joe,
On Wednesday 09 January 2008 08:50:59 pm Ilia Alshanetsky wrote:
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.-1.
Personally I think this patch would be great, and I will recommend it to the
other debian php maintainers for review. It would certainly be a lot better
than our current solution, where the timezone db has been ripped out of php
and put into a seperate package which is managed outside of the stable
release cycle (in the "volatile" release area).
I understand that it may make the job of developing/releasing/supporting
easier for you to bundle the timezone db (just like libgd, et c), but please
consider the position of those who are responsible for maintaining not just
your software but thousands of other projects' packages... wouldn't it be
possible to at least make this some kind of compile-time option for those of
us who do like the idea?
sean finney
(debian php package maintainer)
On Wednesday 09 January 2008 08:50:59 pm Ilia Alshanetsky wrote:
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.Personally I think this patch would be great, and I will recommend it to the
other debian php maintainers for review.
Let me just remind you:
This patch BREAKS functionality.
It would certainly be a lot better
than our current solution, where the timezone db has been ripped out of php
and put into a seperate package which is managed outside of the stable
release cycle (in the "volatile" release area).
That is a silly approach, when we (the PHP project) already provide a
way to update this without having to make hacks. Have a look at the PECL
timezonedb extension.
I understand that it may make the job of developing/releasing/supporting
easier for you to bundle the timezone db (just like libgd, et c), but please
consider the position of those who are responsible for maintaining not just
your software but thousands of other projects' packages... wouldn't it be
possible to at least make this some kind of compile-time option for those of
us who do like the idea?
It has very little to do with maintenance, actually, it is more work for
us to maintain that extension. But apparently you didn't read my other
mail or simply chose to ignore it.
regards,
Derick
--
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org
hi derick,
On Thursday 10 January 2008 09:50:14 am Derick Rethans wrote:
Personally I think this patch would be great, and I will recommend it to
the other debian php maintainers for review.Let me just remind you:
This patch BREAKS functionality.
which of course should be avoided if at all possible.
It would certainly be a lot better
than our current solution, where the timezone db has been ripped out of
php and put into a seperate package which is managed outside of the
stable release cycle (in the "volatile" release area).That is a silly approach, when we (the PHP project) already provide a
way to update this without having to make hacks. Have a look at the PECL
timezonedb extension.
before off-handedly dismissing others' needs and ways of working please
consider that your needs and ways of working may not always be the best for
all situations. as i mentioned in my previous mail (as well as rasmus in a
later mail) this type of bundling doesn't scale in an environment where there
are thousands of other software packages to maintain, and makes life very
difficult in a "stable release" environment.
jftr, i'm in no way saying what php does with bundling timezone info (or other
software) is wrong, in fact i think that it's perfectly fine. but there are
side effects which are problematic for distributions such as debian and
redhat and i don't see what's so wrong with trying to address the needs of
such users (modulo serious regressions/breakage of course).
also jftr i was incorrect in stating that the debian php tz db was ripped out
of php, it is in fact the pecl module which will be maintained in the
volatile repository.
I understand that it may make the job of developing/releasing/supportng
easier for you to bundle the timezone db (just like libgd, et c), but
please consider the position of those who are responsible for maintaining
not just your software but thousands of other projects' packages...
wouldn't it be possible to at least make this some kind of compile-time
option for those of us who do like the idea?It has very little to do with maintenance, actually, it is more work for
us to maintain that extension. But apparently you didn't read my other
mail or simply chose to ignore it.
actually i did, but i took most of it as orthogonal to the point at hand.
that is, it seemed to me most of your rationale against joe's suggestion was
with "implementation details" which could hopefully be fixed, or with "not
seeing things from our side", which i also hope can be fixed :)
sean
Hi Sean,
Hi Ilia, Joe,
On Wednesday 09 January 2008 08:50:59 pm Ilia Alshanetsky wrote:
The reason we do not use the system database is because it is
inconsistent and likely is updated less frequently then the one
included with PHP.-1.
Personally I think this patch would be great, and I will recommend it to the
other debian php maintainers for review. It would certainly be a lot better
than our current solution, where the timezone db has been ripped out of php
and put into a seperate package which is managed outside of the stable
release cycle (in the "volatile" release area).I understand that it may make the job of developing/releasing/supporting
easier for you to bundle the timezone db (just like libgd, et c), but please
consider the position of those who are responsible for maintaining not just
your software but thousands of other projects' packages... wouldn't it be
possible to at least make this some kind of compile-time option for those of
us who do like the idea?
Besides the issues listed by Derick, there is two problems with these
choices (distro making design changes to upstream software). First it
creates an uncontrollable gray area where you have to be lucky if your
software works out of the box on a given distribution. PHP (and GD in
particular) is a pain on Debian (btw, is the pollitics about libgd in
debian finally gone? :( ). Please don't add the timezone to the list
of troubles.
The 2nd problem are the ISPs and other sys admins. They keep outdated
or customized systems on their systems. If distros start to use the
system timezone, I'm sure we will see very old databases on many
servers in a couple of years. That's an upcoming bomb. It is not
directly the distributions fault, but this problem is particullary
true for Debian users (IPS using debian). Debian being conservative,
their users are even more conservative and reluctant to update their
systems. The worst is that they even follow blindly your choices.
Cheers,
hey pierre,
(removed some folks from the cc)
On Thursday 10 January 2008 11:22:52 am Pierre wrote:
Besides the issues listed by Derick, there is two problems with these
choices (distro making design changes to upstream software). First it
that's sort of the point of open-source software, though yes ideally the
changes (a) don't break or confuse things and (b) make their way "back
upstream", and (c) are acceptable by the author. in this case there's some
concern about (a) and (c), but it seems joe has followed up with another
patch and i'd be interested to see if that changes anything.
creates an uncontrollable gray area where you have to be lucky if your
software works out of the box on a given distribution. PHP (and GD in
i think that's taking things a bit into hyperbole...
particular) is a pain on Debian (btw, is the pollitics about libgd in
debian finally gone? :( ). Please don't add the timezone to the list
of troubles.
like you i gave up dealing with the debian libgd maintainer, but at least it
looks like he's keeping up to date with the latest releases now. would be
nice if he made an update to stable too....
The 2nd problem are the ISPs and other sys admins. They keep outdated
or customized systems on their systems. If distros start to use the
system timezone, I'm sure we will see very old databases on many
servers in a couple of years. That's an upcoming bomb. It is not
directly the distributions fault, but this problem is particullary
true for Debian users (IPS using debian). Debian being conservative,
their users are even more conservative and reluctant to update their
systems. The worst is that they even follow blindly your choices.
i think joe adequately addressed these points in a later mail, so i won't kick
a dead horse :)
sean
2008/1/10, Pierre pierre.php@gmail.com:
Besides the issues listed by Derick, there is two problems with these
choices (distro making design changes to upstream software).
Please don't add the timezone to the list
of troubles.
The vast mayority of PHP users obtain it via their OS vendor, isnt
that enough reason to make their life easier just providing an
alternative ?
We sometimes have to do those "design changes" because the original
behaviuor is simple not suitable.
If distros start to use the
system timezone
Distros use system timezonedb for %99 of the other components(some of
them far more critical than PHP) , why PHP has to be special ?
I'm sure we will see very old databases on many
servers in a couple of years.
yes but PHP nor the distro makers are to be blamed for that, because
distros do update
the system tzdb regulary otherwise you have literally gazillions of
components that will return out-of-date information.
Hi,
On Thu, 10 Jan 2008 21:02:17 -0300, in php.internals
judas.iscariote@gmail.com ("Cristian Rodriguez") wrote:
If distros start to use the
system timezoneDistros use system timezonedb for %99 of the other components(some of
them far more critical than PHP) , why PHP has to be special ?
Because distros only have to concern about themselves. PHP has to
concern about cross-system-compatibility.
Besides the specific exec functions it might be a good idea to make
PHP as OS and distribution independent as possible. It only takes a
few dependencies to wreck a whole "independent" language.
I wouldn't like to see php projects have to create different packages
with individual code for different types of os or distribution.
--
- Peter Brodersen
2008/1/10, Pierre pierre.php@gmail.com:
If distros start to use the
system timezoneDistros use system timezonedb for %99 of the other components(some of
them far more critical than PHP) , why PHP has to be special ?
Because if PHP application developers want to deal with timezones
correctly, they need to be able to rely on certain functionality and
data availability. PHP can not simply use the default OS system calls as
they do not expose enough information, therefore we do have to read the
database files directly. At the moment we basically concatenate all the
database files and give it an index. Parsing the database files could be
optimized more if we have control over this format (which we have at the
moment). It is quite possible that in the future we do munge the
database much more than just a concattenation of all files to increase
performance.
regards,
Derick
It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.
There is a PECL extension to provide those updates:
http://pecl.php.net/package/timezonedb
It is a drop-in replacement to update the rules.
I've prepared a patch to allow the timelib code to use the system
timezone database directly; would something like this be acceptable, any
thoughts?
Quite a few actually.
Notes:
- I've not implemented timelib_timezone_builtin_identifiers_list() here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the directory.
It's not used (anymore). However, there is a PHP function
timezone_identifiers_list that uses the index information to enumerate
all entries. Your patch prevents this function from working as both
struct elements index_size and index are set to 0/null.
- there's no general way that I can find to obtain the database
version, so I just invented a string here.
Not having the version creates another problem, as the PECL timezonedb
extension will then always override the builtin database. The
extension uses the version number to see if the extension provides a
later rules set. With your patch, this extra check will not work
correctly anymore if PHP's built-in version is newer (which would happen
if PHP got upgraded).
Some other issues:
- This patchs allows accessing any file on the filesystem (because the
path is not sanitized). - Opening a file is less quick than having the data in memory.
- The system's format of tzfiles does not necessarily have to be the
same as the one that PHP reads. On my (Debian) system the tz provided
data is already 64bit read. Once I update the extension to use this
extra data, reading the system tz files won't work anymore. - It would be possible to use a "wrong set" of tzdata, resulting in bugs
like http://bugs.php.net/bug.php?id=35958
The reason to bundle the database is to have a operating system
independent source of tzdata, which is something that this patch
destroys. I am because of this, and all the other issues, quite against
adding this patch - especially because there is already a solution for
this.
regards,
Derick
--
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org
Thanks for all the feedback.
It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.There is a PECL extension to provide those updates:
http://pecl.php.net/package/timezonedb
It is a drop-in replacement to update the rules.
My aim is to eliminate the additional copies of the timezone database,
rather than ship and maintain yet more. Just as I don't want to
statically link all system libraries into PHP, nor have to build special
copies of any of them for PHP.
I'm not sure I find the logic of the "but the system-provided data will
become stale" arguments convincing; systems which are left unmaintained
by the administrators will have old versions of software on; that's a
given. I can't see why adding more packages (PECL timezonedb) which
those admins have to keep up-to-date will make any difference to that.
The point of the patch is to make it easier for admins and packagers
to keep systems up to date; fewer package updates to download, test,
deploy, etc. If every package on your system followed the same
philosophy of duplicating the timezone database, the system would be
unmaintainable.
- I've not implemented timelib_timezone_builtin_identifiers_list() here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the directory.It's not used (anymore). However, there is a PHP function
timezone_identifiers_list that uses the index information to enumerate
all entries. Your patch prevents this function from working as both
struct elements index_size and index are set to 0/null.
OK, fixed in attached patch.
- there's no general way that I can find to obtain the database
version, so I just invented a string here.Not having the version creates another problem, as the PECL timezonedb
extension will then always override the builtin database. The
extension uses the version number to see if the extension provides a
later rules set. With your patch, this extra check will not work
correctly anymore if PHP's built-in version is newer (which would happen
if PHP got upgraded).
This seems fine, even desirable. Anybody who configures PHP
--with-system-tzdata and installs the PECL timezonedb always gets the
data provided by the timezonedb; the risk of stale timezone data is
little different to the current risk.
- This patchs allows accessing any file on the filesystem (because the
path is not sanitized).
Fixed in attached patch.
- Opening a file is less quick than having the data in memory.
Sure, this is a trade-off; it's configurable exactly so that users get
to make the choice.
- The system's format of tzfiles does not necessarily have to be the
same as the one that PHP reads. On my (Debian) system the tz provided
data is already 64bit read. Once I update the extension to use this
extra data, reading the system tz files won't work anymore.
I'm not sure exactly what the concern is here; the 64-bit TZif2 format
is backwards-compatible with the TZif format and the existing PHP code
reads it without problems. (I've been testing against it)
- It would be possible to use a "wrong set" of tzdata, resulting in bugs
like http://bugs.php.net/bug.php?id=35958
It's always going to be possible for people to shoot themselves in the
feet. Someone trying to "fix" the PHP-supplied database could similarly
screw it up. Again, eliminating extra copies of the timezone database
is intended to make it easier for people to keep up-to-date.
Regards,
joe
I'm not sure I find the logic of the "but the system-provided data
will
become stale" arguments convincing; systems which are left
unmaintained
by the administrators will have old versions of software on; that's a
given. I can't see why adding more packages (PECL timezonedb) which
those admins have to keep up-to-date will make any difference to that.
I think the point is not so much that the system provided one will
become stale, but more that it impossible to write portable
applications when relying on the system provided one.
regards,
Lukas
Lukas Kahwe Smith wrote:
I'm not sure I find the logic of the "but the system-provided data will
become stale" arguments convincing; systems which are left unmaintained
by the administrators will have old versions of software on; that's a
given. I can't see why adding more packages (PECL timezonedb) which
those admins have to keep up-to-date will make any difference to that.I think the point is not so much that the system provided one will
become stale, but more that it impossible to write portable applications
when relying on the system provided one.
But, if every one of the thousands of packages out there included their
own timezone database and there was a separate mechanism for updating
each one, pushing a new timezone db to a server would be impossible.
Making this an option for the distros is something we pretty much have
to do. And yes, I know it makes life harder for people writing portable
apps, but that's just the way it goes sometimes.
-Rasmus
2008/1/10, Rasmus Lerdorf rasmus@lerdorf.com:
Making this an option for the distros is something we pretty much have
to do.
Yes, or otherwise distros will simple implement their own options. ;-)
And yes, I know it makes life harder for people writing portable
apps, but that's just the way it goes sometimes.
As I said before, distros update the timezonedb everytime a change
occurs, so it is really no problem.
<snip>It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.There is a PECL extension to provide those updates:
http://pecl.php.net/package/timezonedb
It is a drop-in replacement to update the rules.
I'm not sure I find the logic of the "but the system-provided data will
become stale" arguments convincing; systems which are left unmaintained
by the administrators will have old versions of software on; that's a
given. I can't see why adding more packages (PECL timezonedb) which
those admins have to keep up-to-date will make any difference to that.
But for the admins who do update it doesn't really matter whether
there is one or two packages as both should show up when he requests a
"list of packages for which updates are available".
- I've not implemented timelib_timezone_builtin_identifiers_list() here
since it doesn't seem to be used, is it there for third-party
extensions? It could be implemented by iterating through the directory.It's not used (anymore). However, there is a PHP function
timezone_identifiers_list that uses the index information to enumerate
all entries. Your patch prevents this function from working as both
struct elements index_size and index are set to 0/null.OK, fixed in attached patch.
This is a fix that scans the whole directory - which is not really fast.
I find this an unacceptable action to cripple to PHP.
- there's no general way that I can find to obtain the database
version, so I just invented a string here.Not having the version creates another problem, as the PECL timezonedb
extension will then always override the builtin database. The
extension uses the version number to see if the extension provides a
later rules set. With your patch, this extra check will not work
correctly anymore if PHP's built-in version is newer (which would happen
if PHP got upgraded).This seems fine, even desirable. Anybody who configures PHP
--with-system-tzdata and installs the PECL timezonedb always gets the
data provided by the timezonedb; the risk of stale timezone data is
little different to the current risk.
However, how does the user know he can actually get a newer ruleset if
he does not have any way to see which database is currently active? Not
being able to see which version is currently active is yet another
degeneration of PHP's timezone functionality.
- This patchs allows accessing any file on the filesystem (because the
path is not sanitized).Fixed in attached patch.
I take your word for that, but I did not test it.
- Opening a file is less quick than having the data in memory.
Sure, this is a trade-off; it's configurable exactly so that users get
to make the choice.
No, users don't get this choice because it is not them who install PHP -
it's the admins. By degrading performance by forcing filesystem reads
you're not helping PHP users at all. It's quite similar to forcing only
the 1970-2038 range for timestamps to make it compatible with Windows in
older RHEL releases.
- The system's format of tzfiles does not necessarily have to be the
same as the one that PHP reads. On my (Debian) system the tz provided
data is already 64bit read. Once I update the extension to use this
extra data, reading the system tz files won't work anymore.I'm not sure exactly what the concern is here; the 64-bit TZif2 format
is backwards-compatible with the TZif format and the existing PHP code
reads it without problems. (I've been testing against it)
Ok, good - i wasn't totally sure here. However, like I mentioned in my
other mail, the TZif format is not the best way for PHP to access the
rulesets, and there is the possibility that while building the data the
format might change to make it easier to access rulesets for PHP. In
that case you can't simply switch the parsing to the system provided
database because the format will be different. This is at the moment
hypotetical though.
regards,
Derick
--
Derick Rethans
http://derickrethans.nl | http://ezcomponents.org | http://xdebug.org
It's not used (anymore). However, there is a PHP function
timezone_identifiers_list that uses the index information to enumerate
all entries. Your patch prevents this function from working as both
struct elements index_size and index are set to 0/null.OK, fixed in attached patch.
This is a fix that scans the whole directory - which is not really fast.
I find this an unacceptable action to cripple to PHP.
The performance cost is once per process invocation. To quantify it, I
timed a HEAD sapi/cli/php build with a fairly minimal statically-linked
module set, running a script which does:
<? date_default_timezone_set('Antarctica/South_Pole'); ?>
...does that seem like a reasonable benchmark? The average run times
reported by time(1) were (elapsed real time):
default: 0m0.018s
system tz: 0m0.021s
so a ~0.003s hit seems OK to me. I would hazard a guess that this
performance hit is little different to that of building a few extensions
as DSOs, or to using any other extension which has high init overhead
(openssl or snmp spring to mind). Would it help if this filesystem
trawl was hooked into the date extension's MINIT function?
(For perspective; the /usr/bin/php from the Fedora 5.2.5 package with a
much larger set of extensions built as DSOs takes an average 0m0.068s to
run the same script on the same system)
Not having the version creates another problem, as the PECL timezonedb
extension will then always override the builtin database. The
extension uses the version number to see if the extension provides a
later rules set. With your patch, this extra check will not work
correctly anymore if PHP's built-in version is newer (which would happen
if PHP got upgraded).This seems fine, even desirable. Anybody who configures PHP
--with-system-tzdata and installs the PECL timezonedb always gets the
data provided by the timezonedb; the risk of stale timezone data is
little different to the current risk.However, how does the user know he can actually get a newer ruleset if
he does not have any way to see which database is currently active? Not
being able to see which version is currently active is yet another
degeneration of PHP's timezone functionality.
I'm not sure I understand the question. For users of PECL timezonedb,
the situation is no different to the status quo. Otherwise; it's kind
of the whole point is that they don't have to care -- just as they don't
have to care for all the other apps on the system; they just have to
update the system tzdata package.
Ok, good - i wasn't totally sure here. However, like I mentioned in my
other mail, the TZif format is not the best way for PHP to access the
rulesets, and there is the possibility that while building the data the
format might change to make it easier to access rulesets for PHP. In
that case you can't simply switch the parsing to the system provided
database because the format will be different. This is at the moment
hypotetical though.
That doesn't sound like it would be an intractable problem even so;
if&when such a change happened, the current read_* functions could be
retained to parse the system's TZif-format files; the raw data is there
either way, regardless of what transformation is needed.
(BTW I noticed that the timezonedb*_builtin arrays could be marked
static in timezonedb.h; it doesn't look like they need to be exposed?)
Regards,
joe
hi everyone,
sorry for digging up this dead horse...
On Thursday 10 January 2008 01:05:35 pm Joe Orton wrote:
It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since this
needs to be updated frequently.There is a PECL extension to provide those updates:
http://pecl.php.net/package/timezonedb
It is a drop-in replacement to update the rules.My aim is to eliminate the additional copies of the timezone database,
rather than ship and maintain yet more. Just as I don't want to
statically link all system libraries into PHP, nor have to build special
copies of any of them for PHP.
joe: did you ever get further on this? it seems this thread died off and
nothing came of it, as far as I can tell.
sean
sorry for digging up this dead horse...
On Thursday 10 January 2008 01:05:35 pm Joe Orton wrote:
It's a bit of a maintenance headache for distributions when
packages include their own copy of the timezone database, since
this
needs to be updated frequently.There is a PECL extension to provide those updates:
http://pecl.php.net/package/timezonedb
It is a drop-in replacement to update the rules.My aim is to eliminate the additional copies of the timezone
database,
rather than ship and maintain yet more. Just as I don't want to
statically link all system libraries into PHP, nor have to build
special
copies of any of them for PHP.joe: did you ever get further on this? it seems this thread died off
and
nothing came of it, as far as I can tell.
Sorry for beating on the dead horse that has been dragged up, and late
to boot...
Just recently, somebody wondered why PHP had the CORRECT time when
their web-server didn't...
As I understand the situation, if you can get ALL the sysadmins of the
world to update their [bleep] timezonedb frequently, PHP can drop the
internal timezonedb.
:-)
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/from/lynch
Yeah, I get a buck. So?
2008/3/27, Richard Lynch ceo@l-i-e.com:
As I understand the situation, if you can get ALL the sysadmins of the
world to update their [bleep] timezonedb frequently, PHP can drop the
internal timezonedb.
OS vendors release timezone updates frecuently, there is no need for
such bundled tz database, just duplicates work.
--
"If debugging is the process of removing bugs, then programming must
be the process of putting them in." – Edsger Dijkstra
hi,
On Thursday 27 March 2008 10:26:57 pm Richard Lynch wrote:
Just recently, somebody wondered why PHP had the CORRECT time when
their web-server didn't...
and vice versa. see for example:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471104
if there were an option for php to use the system-bundled timezone database,
then this would be a moot point.
but instead, at this point to solve the problem someone will need to go
digging through cvs, produce a changeset, possibly massage the patch to apply
to an older release, ensure the software builds correctly, test the new
packages locally, create a new official release for the stable distribution,
wait for the users to get the update...
this is a process that has been done once already for the system timezone
database packages, why do it again for php? if the admin is using the
packaged versions of php (and that is the point of this discussion), it won't
magically fix itself... so something has to be upgraded. and either way
they're usually running "$packagemanager upgrade", taking whatever is new,
whether it is just the timezonedb or the php packages too. it's merely a
question of "get the updates for free?" vs "get the updates when the system
and php has been updated?"
As I understand the situation, if you can get ALL the sysadmins of the
world to update their [bleep] timezonedb frequently, PHP can drop the
internal timezonedb.
usually that's a matter of "$packagemanager upgrade" but i digress.
of course if you're in a hosted environment where you've built your own php
but don't have root to update the system timezone db, then it can be very
useful to have the bundled timezonedb too.
but ultimately, what may be best for you may be least optimal for someone
else... so why not provide an option to work both ways? i'm not even asking
that this be made the default, in case you're worried about the clueless
user/sysadmin factor.
sean