Newsgroups: php.internals
From: michal.dziemianko@gmail.com (Michal Dziemianko)
To: internals@lists.php.net
Date: Wed, 9 Jul 2008 18:33:55 +0100
Subject: Re: [PHP-DEV] Algorithm Optimizations - string search
Message-ID: <8485ADFB-6F15-4489-BD13-205EB24863E2@gmail.com>

Hello,

I was in bed again for a few
days :/ I hope I have fully recovered this time.

Anyway, today's patch is for the auxiliary php_charmask function used by
trim, str_word_count and addcslashes. It is basically just a bunch of
if-statements, but it can be sped up by reordering them. The original
executes 4+3 tests for a single char and 4 for a range. The optimized
version needs only 1 test for a single char, plus 3 more for a range.
Maybe not a big improvement, but it is much faster for single chars and
about the same speed for ranges.

Here are some evaluation results (the results for MOD1):
http://212.85.117.53/gsoc/index.php?option=com_content&view=article&id=62:phpmask-possible-reimplementation&catid=36:patches&Itemid=56

The diff is below.

Cheers
Michal

Index: ext/standard/string.c
===================================================================
RCS file: /repository/php-src/ext/standard/string.c,v
retrieving revision 1.445.2.14.2.69.2.29
diff -u -d -r1.445.2.14.2.69.2.29 string.c
--- ext/standard/string.c	3 Jul 2008 14:00:20 -0000	1.445.2.14.2.69.2.29
+++ ext/standard/string.c	8 Jul 2008 20:27:50 -0000
@@ -663,36 +663,38 @@
 	memset(mask, 0, 256);
 	for (end = input+len; input < end; input++) {
 		c=*input;
-		if ((input+3 < end) && input[1] == '.' && input[2] == '.'
-				&& input[3] >= c) {
-			memset(mask+c, 1, input[3] - c + 1);
-			input+=3;
-		} else if ((input+1 < end) && input[0] == '.' && input[1] == '.') {
-			/* Error, try to be as helpful as possible:
-			   (a range ending/starting with '.' won't be captured here) */
-			if (end-len >= input) { /* there was no 'left' char */
-				php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, no character to the left of '..'");
-				result = FAILURE;
-				continue;
-			}
-			if (input+2 >= end) { /* there is no 'right' char */
-				php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, no character to the right of '..'");
+		if (input[1] != '.') {
+			mask[c] = 1;
+		} else {
+			if ((input+3 < end) && input[2] == '.' && input[3] >= c) {
+				memset(mask+c, 1, input[3] - c + 1);
+				input+=3;
+			} else if ((input+1 < end) && input[0] == '.' && input[1] == '.') {
+				if (end-len >= input) { /* there was no 'left' char */
+					php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, no character to the left of '..'");
+					result = FAILURE;
+					continue;
+				}
+				if (input+2 >= end) { /* there is no 'right' char */
+					php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, no character to the right of '..'");
+					result = FAILURE;
+					continue;
+				}
+				if (input[-1] > input[2]) { /* wrong order */
+					php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, '..'-range needs to be incrementing");
+					result = FAILURE;
+					continue;
+				}
+				/* FIXME: better error (a..b..c is the only left possibility?) */
+				php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range");
 				result = FAILURE;
 				continue;
+			} else {
+				mask[c] = 1;
 			}
-			if (input[-1] > input[2]) { /* wrong order */
-				php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range, '..'-range needs to be incrementing");
-				result = FAILURE;
-				continue;
-			}
-			/* FIXME: better error (a..b..c is the only left possibility?) */
-			php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid '..'-range");
-			result = FAILURE;
-			continue;
-		} else {
-			mask[c]=1;
 		}
-	}
+	}
+
 	return result;
 }
 /* }}} */
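
For anyone who wants to try the reordering outside of php-src, here is a
minimal stand-alone sketch of the same idea. It is not the code from
string.c: the name build_charmask, the 0/-1 return values and the missing
warning messages are all assumptions made for illustration. It also adds a
bounds check before reading input[1], which the patch above does not have,
so the fast path costs two comparisons here rather than one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone mask builder (not the PHP source).
 * Sets mask[b] = 1 for every byte b named in `input`, where "a..z"
 * denotes an inclusive range. Returns 0 on success and -1 if a stray
 * ".." was seen; the remaining bytes are still processed. */
static int build_charmask(const unsigned char *input, size_t len,
                          unsigned char mask[256])
{
    const unsigned char *end = input + len;
    int result = 0;

    assert(input != NULL && mask != NULL);
    memset(mask, 0, 256);

    for (; input < end; input++) {
        unsigned char c = *input;

        /* Fast path: if the next byte is not '.', this cannot start a
         * range, so the single char is recorded right away.
         * The bounds check is an addition of this sketch. */
        if (input + 1 >= end || input[1] != '.') {
            mask[c] = 1;
            continue;
        }

        /* Slow path: input[1] is '.', so this may be "c..d". */
        if (input + 3 < end && input[2] == '.' && input[3] >= c) {
            memset(mask + c, 1, (size_t)(input[3] - c) + 1);
            input += 3;
        } else if (c == '.') {
            /* ".." without a valid range around it: flag it, move on. */
            result = -1;
        } else {
            mask[c] = 1;
        }
    }
    return result;
}
```

Calling build_charmask((const unsigned char *)"a..e", 4, mask) marks 'a'
through 'e' and returns 0, while a bare ".." returns -1. The point of the
layout is the same as in the patch: the common single-char case leaves the
loop body after the first test instead of falling through the range and
error checks.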