From: aleksey.tulinov@gmail.com (Aleksey Tulinov)
To: Johannes Schlüter
CC: internals@lists.php.net
Date: Wed, 15 Oct 2014 00:58:00 +0300
Subject: Re: [PHP-DEV] Unicode support

On 14/10/14 23:48, Johannes Schlüter wrote:
> On Tue, 2014-10-14 at 23:18 +0300, Aleksey Tulinov wrote:
>> Very good point. I'll give another example: is there a substring "s" in
>> string "Maße"? If it's case-sensitive search, then there is no such
>> substring, but if it's case-insensitive search, then "ß" folds into "ss"
>> and substring "s" appears.
>
> In Unicode 5.1 there is "ẞ" U+1E9E LATIN CAPITAL LETTER SHARP S.
>
> (The point of this post mostly is to show that there is another
> dimension making this even more complicated, again - different Unicode
> versions)
>

It's still in Unicode 7.0. According to the Unicode Character Database
(using the full case mappings), the uppercase of "ß" is "SS", the
lowercase of "ẞ" is "ß", and both case-fold to "ss". Thus
upper(lower("ẞ")) should produce "SS".

There is another dimension indeed.
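
For anyone who wants to check these mappings quickly, here is a minimal
sketch. It uses Python's built-in str methods only as a stand-in, since
they apply the full Unicode case mappings; it is not PHP and says nothing
about how a PHP implementation would or should behave. The variable names
are made up for illustration, and the results depend on the Unicode
version bundled with the interpreter.

```python
import unicodedata

word = "Maße"
small_sharp_s = "\u00DF"    # ß LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1E9E"  # ẞ LATIN CAPITAL LETTER SHARP S

# Case-sensitive search: "ß" is a single code point, so there is no "s".
case_sensitive_hit = "s" in word

# Case-insensitive search via full case folding: "ß" folds to "ss".
case_insensitive_hit = "s".casefold() in word.casefold()

# Round trip from the message: lower("ẞ") is "ß", and upper("ß") is "SS".
round_trip = capital_sharp_s.lower().upper()

print("Unicode data version:", unicodedata.unidata_version)
print("case-sensitive:", case_sensitive_hit)
print("case-insensitive:", case_insensitive_hit)
print("upper(lower(U+1E9E)):", repr(round_trip))
print("casefold of both:", small_sharp_s.casefold(), capital_sharp_s.casefold())
```

The point of printing the Unicode data version is the one Johannes
raised: the same calls can give different answers on a runtime built
against a different Unicode version.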