From: et@php.net (Stefan Walk)
To: internals@lists.php.net, RQuadling@googlemail.com
Date: Wed, 5 Mar 2008 13:06:15 +0100
Subject: Re: [PHP-DEV] PHP5.3.0-dev and Unicode BOM.

On Wednesday 05 March 2008 11:21:14 Richard Quadling wrote:
> But, having just inherited a load of code which are UTF-8 encoded and
> all have a BOM marker, all the BOM markers are being sent to the
> browser. This screws up sessions as output is sent before headers
> (obviously).
>
> The files are english/non english mix, so I'm assuming the BOM needs
> to be present to indicate the order of the encoding (I'm new to the
> Unicode stuff, so this is my first real work with anything other than
> pure english PHP code). Because of this assumption, I think I can't
> remove the BOM marker.

Since the code units of UTF-8 are only one byte long, there is no byte
order to indicate, so the BOM is unnecessary. It's just added by some
editors that use it as a marker meaning "this file is UTF-8". So you
can safely remove it.

Additionally, I don't think PHP has ignored the BOM before.

Regards,
Stefan
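
P.S. Removing the BOM from a whole tree of inherited files can be scripted. The following is only a minimal sketch, written in Python as a one-off tool rather than in PHP. The function names `strip_bom` and `strip_bom_in_tree` and the default `*.php` pattern are my own assumptions for illustration, not anything PHP ships. It relies on the UTF-8 BOM being the three bytes EF BB BF at the very start of the file.

```python
import codecs
from pathlib import Path

# The UTF-8 BOM is the byte sequence EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_bom_in_tree(root: str, pattern: str = "*.php") -> list:
    """Rewrite every file under root matching pattern that starts with a BOM.

    Hypothetical helper: files are rewritten in place, and the paths of
    the files that were changed are returned as strings.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        cleaned = strip_bom(data)
        if cleaned != data:
            # Only touch files that actually carried a BOM.
            path.write_bytes(cleaned)
            changed.append(str(path))
    return changed
```

Calling `strip_bom_in_tree(".")` from the project root would rewrite the affected files in place, so it is worth running on a copy or a clean checkout first. The file contents are handled as raw bytes, so nothing after the BOM is re-encoded.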