Newsgroups: php.internals
From: indeyets@gmail.com ("Alexey Zakhlestin")
Date: Thu, 3 May 2007 12:30:05 +0400
To: "Rangel Reale"
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Trying to understand PHP6's unicode support
In-Reply-To: <006c01c78d26$beda26f0$0301a8c0@rangeldc>

PHP 6 always does its "internal work" in Unicode (UTF-16, as far as I know).

But, from what I remember, there should be a way to specify the encoding in
which you expect data to arrive (in this example, from the mysql_* functions).

I think someone who knows more will be able to give you more details.

On 5/3/07, Rangel Reale wrote:
> Hello!
>
> I am trying to understand how PHP6's handling of Unicode works. I think I am
> missing something.
>
> My config is:
>
> ;;;;;;;;;;;;;;;;;;;;
> ; Unicode settings ;
> ;;;;;;;;;;;;;;;;;;;;
>
> unicode.semantics = on
> unicode.runtime_encoding = iso-8859-1
> unicode.script_encoding = iso-8859-1
> unicode.output_encoding = utf-8
> unicode.from_error_mode = U_INVALID_SUBSTITUTE
> unicode.from_error_subst_char = 3f
> unicode.fallback_encoding = iso-8859-1
>
> I use a MySQL database with iso-8859-1 (Portuguese, latin1) text that
> contains accented characters.
>
> Because unicode.runtime_encoding = iso-8859-1, I thought that all internal
> operations were done in this encoding, and that data would be converted to
> UTF-8 only on output (unicode.output_encoding = utf-8).
> So, as I understood it, I would run a MySQL query in latin1, the data would
> arrive in my variables as iso-8859-1, I would use them, and only when I
> echoed them would they become UTF-8, through some iso-8859-1-to-UTF-8
> conversion function.
>
> But when I query any record that has accented characters, I get this
> warning (using mysql_fetch_assoc):
>
> ----------
> Could not convert binary string to Unicode string (converter UTF-8 failed on
> bytes (0xE7) at offset 9)
> ----------
>
> for all accented characters in all fields.
>
> The strange thing to me is that mysql_fetch_assoc gives this error even
> before I access the field values, which does not match my understanding
> described above.
>
> If I change the set names query to:
>
> mysql_query('set names utf8', $this->mysql_link);
>
> then it works, but I would like to understand how this works, so that I can
> write my program the right way from the start.
>
> Did I misunderstand something?
>
> Thanks,
> Rangel

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/
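
For what it is worth, the warning quoted above is what one would expect whenever latin1 bytes are handed to a UTF-8 converter. Below is a minimal sketch of that situation, written in Python purely to show the byte-level behaviour; it does not use any PHP 6 or MySQL API. The sample value and the helper names (`decode_field`, `utf8_failure_offset`) are hypothetical, and the word was chosen only so that the byte 0xE7 lands at offset 9, as in the warning. It is not taken from the original data.

```python
# Hypothetical sample value: the Portuguese word "configuração" as latin1
# (iso-8859-1) bytes, the form a latin1 connection would deliver it in.
LATIN1_BYTES = b"configura\xe7\xe3o"


def decode_field(raw: bytes, assumed_encoding: str) -> str:
    """Decode raw column bytes using the encoding the caller assumes."""
    return raw.decode(assumed_encoding)


def utf8_failure_offset(raw: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


if __name__ == "__main__":
    # The latin1 "ç" is the single byte 0xE7, which sits at offset 9 here.
    print(hex(LATIN1_BYTES[9]))

    # Treating the latin1 bytes as UTF-8 fails at that same offset.
    print(utf8_failure_offset(LATIN1_BYTES))

    # Decoding with the correct source encoding works, and the text can then
    # be re-encoded as UTF-8 for output.
    text = decode_field(LATIN1_BYTES, "iso-8859-1")
    print(text.encode("utf-8"))
```

If the same thing is happening in the mysql_* functions, it would also explain why 'set names utf8' makes the warning disappear: the server then sends UTF-8 bytes, so a UTF-8 converter has nothing to reject.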