Newsgroups: php.internals
From: r.korving@xit.nl ("Ron Korving")
To: internals@lists.php.net
Date: Wed, 10 Aug 2005 12:45:27 +0200
Subject: Re: PHP Unicode support design document

This looks very promising, and I'm impressed by the work you guys have done (big thumbs up). I have a few issues/questions after reading your document:

"Therefore, command such as 'print' and 'echo' automatically convert their arguments to the specified encoding. No automatic output encoding is performed for anything else."

What about the other functions that write directly to stdout, such as readfile() and passthru()?

"The conversion failure behavior can be customized"...

Maybe it would be a nice feature to have a U_INVALID_EXCEPTION mode, so that users can actually catch the error and deal with it. Just an idea. Of course, it's not usual for the PHP core and extensions to throw exceptions, but perhaps this could change with PHP6.
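
To make the exception idea a bit more concrete, here is a rough userland sketch of the behaviour I have in mind. It is not taken from the design document: the class InvalidEncodingException and the function to_output_encoding() are names I made up, and I am using the existing mbstring functions only as a stand-in for the proposed conversion layer. It also only checks that the input is valid in the source encoding; it does not detect characters that cannot be represented in the target encoding.

```php
<?php
// Hypothetical names: neither this class nor this function exists in PHP
// or in the Unicode design document. They only illustrate the idea.
class InvalidEncodingException extends Exception {}

function to_output_encoding($bytes, $from, $to)
{
    // mb_check_encoding() returns false if $bytes is not valid in $from.
    if (!mb_check_encoding($bytes, $from)) {
        throw new InvalidEncodingException("Input is not valid $from");
    }
    // mb_convert_encoding() takes the target encoding first, then the source.
    return mb_convert_encoding($bytes, $to, $from);
}

try {
    // "\xC3\x28" is an invalid UTF-8 sequence: a lead byte followed by
    // a byte that is not a continuation byte.
    echo to_output_encoding("\xC3\x28", 'UTF-8', 'ISO-8859-1');
} catch (InvalidEncodingException $e) {
    // The caller decides how to recover instead of the engine choosing
    // a fixed stop/skip/substitute policy.
    echo 'Conversion failed: ', $e->getMessage(), "\n";
}
```

The point is only that the error becomes something the calling code can catch and handle locally, next to the other conversion failure modes.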
"In order to create binary string literals, a new syntax is necessary: prefixing a string literal with letter 'b' creates a binary string." The b-prefix for binary strings is great, but how does that work with a function like file_get_contents() or fread() ? One can't do: $data = bfile_get_contents("somefile.bin"); And even if one could (somehow), wouldn't file_get_contents() already unicode-encode all data it reads? How does such a function know if the user is expecting binary or textual data or does the encoding simply happen after the string is returned? In that case it's up to the user to use the b-prefix, but then there's the syntax problem I mentioned. Keep up the good work, Ron