From: rasmus@lerdorf.com (Rasmus Lerdorf)
Date: Sat, 21 Jul 2018 08:30:15 -0400
Subject: Re: [PHP-DEV] bugs.php.net downtime
To: "Hoffman, Zachary Robert"
Cc: "mapopa@gmail.com", "me@kelunik.com", "internals@lists.php.net"
Newsgroups: php.internals

For future reference, here is what I did to fix the encoding problem:

MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
+---------------------------------------------------------------------------------+
| sdesc                                                                           |
+---------------------------------------------------------------------------------+
| Ð˜Ð¼Ñ Ð¿ÐµÑ€ÐµÐ¼ÐµÐ½Ð½Ð¾Ð¹ Ð¼Ð¾Ð¶ÐµÑ‚ Ñ Ð¾Ð´ÐµÑ€Ð¶Ð°Ñ‚ÑŒ ÑƒÐ¿Ñ€Ð°Ð²Ð»Ñ ÑŽÑ‰Ð¸Ðµ |
+---------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [phpbugsdb]> alter table bugdb drop index email;
Query OK, 76298 rows affected (0.85 sec)
Records: 76298  Duplicates: 0  Warnings: 0

MariaDB [phpbugsdb]> alter table bugdb modify sdesc varbinary(80) NOT NULL DEFAULT '', modify ldesc binary NOT NULL, modify email varbinary(40) NOT NULL DEFAULT '';
Query OK, 76298 rows affected, 65535 warnings (0.65 sec)
Records: 76298  Duplicates: 0  Warnings: 76091

MariaDB [phpbugsdb]> alter table bugdb modify sdesc varchar(80) CHARACTER SET utf8mb4 NOT NULL DEFAULT '', modify ldesc text CHARACTER SET utf8mb4 NOT NULL, modify email varchar(40) CHARACTER SET utf8mb4 NOT NULL DEFAULT '';
Query OK, 76298 rows affected, 127 warnings (0.57 sec)
Records: 76298  Duplicates: 0  Warnings: 127

MariaDB [phpbugsdb]> alter table bugdb add FULLTEXT INDEX `email` (`email`,`sdesc`,`ldesc`);
Query OK, 76298 rows affected (1.56 sec)
Records: 76298  Duplicates: 0  Warnings: 0

MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
+--------------------------------------------+
| sdesc                                      |
+--------------------------------------------+
| Имя переменной может содержать управляющие |
+--------------------------------------------+
1 row in set (0.00 sec)

The trick was to convert the columns to binary first.
When I went straight from latin1 to utf8, I got the utf8 equivalent of the latin1 characters. Telling MariaDB first that the data was actually binary made it convert from binary to utf8 instead, which appears to have worked. There were some warnings, which I assume come from invalid utf8 byte sequences somewhere.

-Rasmus
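
The difference between the two conversion paths can be sketched outside the database. The Python below is a hypothetical illustration only, not something that was run against phpbugsdb. `decode_mysql_latin1` and `is_valid_utf8` are invented helper names, and the first one assumes that MariaDB's latin1 behaves like cp1252 with the few undefined bytes passed through as control characters. The sample string is the start of the sdesc value from the queries above.

```python
def decode_mysql_latin1(data: bytes) -> str:
    """Decode bytes the way a latin1 column is assumed to read them.

    Assumption: latin1 here means cp1252, with the bytes cp1252 leaves
    undefined mapped to the matching ISO-8859-1 control characters.
    """
    chars = []
    for value in data:
        byte = bytes([value])
        try:
            chars.append(byte.decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(byte.decode("latin-1"))
    return "".join(chars)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


original = "Имя переменной"
stored = original.encode("utf-8")  # UTF-8 bytes sitting in a latin1 column

# latin1 -> utf8 directly: every stored byte is read as one latin1 character
# and that character is re-encoded, so the text ends up double-encoded.
direct = decode_mysql_latin1(stored).encode("utf-8")

# latin1 -> binary -> utf8: the binary step drops the charset label without
# touching the bytes, so the final step only relabels them as UTF-8.
via_binary = stored.decode("utf-8")

print(direct.decode("utf-8"))      # mojibake, as in the first select
print(via_binary)                  # readable text, as in the second select
print(is_valid_utf8(b"caf\xe9"))   # genuine latin1 bytes are not valid UTF-8
```

Rows that really did contain latin1 text fail the last check, which is one plausible source of the warnings on the second ALTER. When repeating this on another table, the intermediate binary type has to be wide enough to hold every value (for a text column, a blob type of matching size), otherwise data is truncated at that step.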