Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:8191 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 13229 invoked by uid 1010); 26 Feb 2004 21:59:14 -0000 Delivered-To: ezmlm-scan-internals@lists.php.net Delivered-To: ezmlm-internals@lists.php.net Received: (qmail 13205 invoked by uid 1007); 26 Feb 2004 21:59:14 -0000 Message-ID: <20040226215914.13204.qmail@pb1.pair.com> To: internals@lists.php.net Reply-To: jay@php.net Mail-Copies-To: jay@php.net Date: Thu, 26 Feb 2004 16:59:11 -0500 Lines: 53 User-Agent: KNode/0.7.6 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7Bit X-Posted-By: 216.94.11.234 Subject: Fix for #27291 (get_browser() problems) From: jay@php.net (Jay Smith) Hey all, I think I have a fix for the problems in get_browser() as per the bug report. (#27291: get_browser matches browscap.ini patterns incorrectly) Gary Keith, who handles the browscap.ini file we suggest in the get_browser() docs, uses IIS/ASP's browscap.dll for his benchmarking and I believe I gotten get_browser() to produce the same results as the DLL. The original bug reporter, Ryan Schmidt, pointed me in the direction of this documentation from MSFT: http://www.microsoft.com/windows2000/en/server/iis/htm/asp/comp1g11.htm So I basically just implemented what they say there. (The relevant section starts with "Note The BrowserType object first attempts to match...") The basic fix is basically... - use ^ and $ to anchor the regexes - do a case-insensitive search (lowercasing both the patterns and the user agent being checked, using REG_ICASE is painfully slow) - use an exact match where possible - if more than one match is found, use the match which replaces the fewest number of characters versus the original browser name patterns. (Ignoring any ?/* wildcard characters in the pattern for character counts.) Based on the list of 22954 user agents I got from Gary, this seems to take care of them all and should produce the same results as browscap.dll. The patch against HEAD can be found at http://216.94.11.234/browscap.c.diff It's kind of long so I'd like to get some feedback. It can probably be optimized a bit, as my C is getting a little rusty, but it seems to work. There is a bit more overhead versus the previous get_browser() implementation, but based on the little benchmarking I did I don't think it's overly serious. This doesn't address the ini parser issues in #27372; don't know what to do about that yet. I'd like to commit this to HEAD and also to PHP_4_3 if possible. (The patch for 4_3 is identical except for that call to zend_is_autoglobal().) I don't have access to a Windows box at the moment and have only tested on linux, but I think everything should check out. So, any yays or nays on the patch? J