Newsgroups: php.internals
Xref: news.php.net php.internals:104373
Date: Tue, 12 Feb 2019 22:12:58 +0100
To: Dan Ackroyd, Rowan Collins
Cc: PHP internals
Subject: Re: [PHP-DEV] reasonability of change the mbfl library
From: legale.legale@gmail.com (Legale Legage)

Hello, internals.

As Rowan Collins suggested, I've replaced the lookup table with simple
macros:

#define UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE ((code_unit & 0xFC00) == 0xD800)
#define UTF16_BE_CODE_UNIT_IS_HIGH_SURROGATE ((code_unit & 0x00FC) == 0x00D8)

I ran the benchmarks again. Here are the results:

String foobar was repeated 1000000 times. Result string size is 11.4mb
mb_str_split(): string was splitted by 50 into 120000 chunks 1 in 0.400670 s
mb_str_split_utf16(): string was splitted by 50 into 120000 chunks 1 in 0.038947 s

I have satisfied my research interest. The question is: is there any
practical value in this? I am interested in your opinion.

php benchmark code:

wrote:

> On Sun, 10 Feb 2019 at 12:29, Legale Legage
> wrote:
> >
> > https://github.com/php/php-src/pull/3715/commits/d868059626290b7ba773b957045e08c3efb1d603#diff-22d593ced03b2cb94450d9f9990865c8R38
> >
> > To do, or not to do: that is the question.
> > What do you think?
>
> Opening separate pull requests for separate changes is good as it
> allows them to be discussed separately. That change is bundled with
> the mb_str_split() changes, so it's quite hard to see what is
> optimisation and what is part of the approved RFC.
>
> Although memory is cheap, the change appears to increase the static
> allocation of memory by 128KB for something that >95% of PHP
> programmers will never use, which is not a good idea.
>
> > show a more than 2 times speed increase.
>
> Lies, damn lies and statistics.
>
> If it takes the time to parse a megabyte string from 0.000002 to
> 0.000001, no one cares.
> If it takes the time to parse a megabyte string from 2 seconds to 1
> second, wow that's great!
>
> i.e. Saying a two times speed increase without context doesn't give
> people enough information to evaluate it.
>
> But this would be easier to discuss as a separate PR.
>
> cheers
> Dan
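
P.S. For reference, the high-surrogate test that the macros in this message rely on can be sketched in C as below. This is a minimal illustration and not the code from the pull request: the parameterised macro names and the helper utf16_count_code_points() are hypothetical, and the input is assumed to be well-formed UTF-16. Note that in C `==` binds tighter than `&`, so the masked value has to be parenthesised before the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High (lead) surrogates occupy U+D800..U+DBFF, i.e. the top six bits of
 * the code unit are 110110. The mask is parenthesised on purpose, because
 * `==` has higher precedence than `&` in C. */
#define UTF16_IS_HIGH_SURROGATE(u) ((((uint16_t)(u)) & 0xFC00u) == 0xD800u)

/* The same test for a code unit whose two bytes are still swapped relative
 * to the host byte order (hypothetical name, for illustration only). */
#define UTF16_SWAPPED_IS_HIGH_SURROGATE(u) ((((uint16_t)(u)) & 0x00FCu) == 0x00D8u)

/* Count code points in a buffer of host-order UTF-16 code units.
 * Assumption: every high surrogate is followed by a low surrogate. */
static size_t utf16_count_code_points(const uint16_t *units, size_t n)
{
    size_t count = 0;
    assert(units != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (UTF16_IS_HIGH_SURROGATE(units[i]) && i + 1 < n) {
            i++; /* skip the low surrogate that completes the pair */
        }
        count++;
    }
    return count;
}
```

A splitter built on this idea only needs the two-instruction mask-and-compare per code unit, which is why no lookup table is required.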