Newsgroups: php.internals
Xref: news.php.net php.internals:104334
Date: Sun, 10 Feb 2019 22:59:12 +0100
To: Rowan Collins
Cc: PHP internals
Subject: Re: [PHP-DEV] reasonability of change the mbfl library
From: legale.legale@gmail.com (Legale Legage)

Good idea, thanks. It should be a bit slower than a lookup table, but
faster than the current code.

On Sun, Feb 10, 2019, 21:02 Rowan Collins wrote:

> On 10/02/2019 12:29, Legale Legage wrote:
> > This approach can be used for the UTF-16 encoding, but the table size
> > would be 65536 bytes against 256 bytes for the UTF-8 table.
>
> Rather than two 65 kilobyte lookup tables with most entries identical,
> would it be reasonable to use a bit mask to check for the range we care
> about?
>
> I may have this slightly wrong, but something like:
>
> #define UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE(code_unit) (((code_unit) & 0xFC00) == 0xD800)
> #define UTF16_BE_CODE_UNIT_IS_HIGH_SURROGATE(code_unit) (((code_unit) & 0x00FC) == 0x00D8)
>
> m = UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE(*(uint16_t *)p) ? 4 : 2;
>
> Regards,
>
> --
> Rowan Collins
> [IMSoP]
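
For reference, here is a minimal sketch of the bit-mask check discussed in this thread. It is not the mbfl implementation. All helper names (`utf16le_load`, `utf16be_load`, `utf16le_char_len`, `utf16be_char_len`, `utf16le_strlen`) are hypothetical. The sketch assumes the input is an unsigned byte buffer, and it builds each code unit from individual bytes, so one mask works for both byte orders and nothing depends on host endianness or pointer alignment. It does not validate that a high surrogate is followed by a low surrogate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High (lead) surrogates are U+D800..U+DBFF: the top six bits are 110110.
 * Parentheses matter here: in C, == binds tighter than &. */
#define UTF16_IS_HIGH_SURROGATE(cu) ((((uint16_t)(cu)) & 0xFC00u) == 0xD800u)

/* Hypothetical helpers: assemble one 16-bit code unit from two bytes. */
static uint16_t utf16le_load(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint16_t utf16be_load(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Byte length of the character starting at p: 4 if it begins with a
 * high surrogate, otherwise 2. The caller must ensure two readable bytes. */
static size_t utf16le_char_len(const unsigned char *p)
{
    assert(p != NULL);
    return UTF16_IS_HIGH_SURROGATE(utf16le_load(p)) ? 4 : 2;
}

static size_t utf16be_char_len(const unsigned char *p)
{
    assert(p != NULL);
    return UTF16_IS_HIGH_SURROGATE(utf16be_load(p)) ? 4 : 2;
}

/* Count characters in a UTF-16LE buffer of nbytes bytes.
 * A trailing odd byte is ignored; a truncated surrogate pair at the end
 * still counts as one character. */
static size_t utf16le_strlen(const unsigned char *s, size_t nbytes)
{
    size_t count = 0;
    size_t i = 0;

    while (i + 1 < nbytes) {
        i += utf16le_char_len(s + i);
        count++;
    }
    return count;
}
```

Calling `utf16le_strlen(buf, len)` on a UTF-16LE byte buffer walks it one character at a time using only the mask test, with no lookup table. A big-endian counterpart would be the same loop built on `utf16be_char_len`.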