By now, the Unicode merge into the public tree has taken place. How do
you get started?
- Take a deep breath.

- Download and build ICU 3.4.
  Location: http://www-306.ibm.com/software/globalization/icu/downloads.jsp
  Extract and cd into icu/source. Execute configure (replacing
  /usr/local with your prefix):

    ./configure --prefix=/usr/local --disable-threads --enable-extras \
        --enable-icuio --enable-layout

  Then run make and make install.

- Update to PHP CVS HEAD (cvs upd -dPA) or better, do a clean
  check-out.

- Run ./buildconf.

- Run ./configure and use --with-icu-dir=<dir> if you put ICU in a
  non-standard location.

- Hopefully it configures without a problem.

- Cross your fingers (better, cross your toes too) and run 'make'.

- Once the smoke dissipates (and if you're on a PowerBook, once it cools
  down from nuclear to just melting hot), you can continue.
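The steps above can be collected into a pair of shell functions. This is only a sketch: the ICU_PREFIX variable, the function names, and the idea of calling each function from the top of the corresponding source tree are assumptions made for illustration, not part of the original instructions. The configure flags are the ones listed above.

```shell
# Install prefix for ICU; /usr/local is only a default, override as needed.
ICU_PREFIX="${ICU_PREFIX:-/usr/local}"

# Build and install ICU 3.4. Run from the directory that contains the
# extracted icu/ tree (hypothetical helper, not part of ICU or PHP).
# The subshell keeps the cd from leaking into the caller's shell.
build_icu() {
    (
        cd icu/source &&
        ./configure --prefix="$ICU_PREFIX" --disable-threads --enable-extras \
            --enable-icuio --enable-layout &&
        make &&
        make install
    )
}

# Build PHP against that ICU. Run from the top of an up-to-date
# PHP CVS HEAD checkout (hypothetical helper). --with-icu-dir is
# always passed here, which is only strictly needed for a
# non-standard ICU location.
build_php() {
    ./buildconf &&
    ./configure --with-icu-dir="$ICU_PREFIX" &&
    make
}
```

Call build_icu from the directory holding the ICU sources, then build_php from the PHP checkout. Each function stops at the first failing step.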
Since you have not turned the unicode_semantics switch on, you should be
able to run all the old scripts. If you'd like to dabble in Unicode
land, the suggested way to configure for it is something like this:
unicode_semantics = on
unicode.runtime_encoding = iso-8859-1   ; or your favorite one
unicode.script_encoding = utf-8
unicode.output_encoding = utf-8
unicode.from_error_mode = U_INVALID_SUBSTITUTE
unicode.from_error_subst_char = 3f
Now you can get your hands dirty.
If you'd like to see how certain functions have been upgraded to
accommodate Unicode and binary types, check out substr(), explode(),
trim(), str_repeat(), and strlen().
The ICU API reference is at http://icu.sourceforge.net/
There is a new 'make' target called 'utest' - it is supposed to turn
the unicode_semantics switch on and run all the tests. I don't think it
will quite work, due to changes in the streams and some other things,
so that might be the first thing to get fixed.
A writeup about new APIs and function upgrade guidelines will be coming
up, but not today, because you need time to digest this and I need time
to recover. :-)
Have fun,
-Andrei
Hi Andrei Zmievski, you wrote:
> - Download and build ICU 3.4.
Quick question: can one use the provided binary packages for Win32?
Thanks,
Michael - <mike(@)php.net>
I wrote:
> Quick question: can one use the provided binary packages for Win32?
Not unless one has MSVC-7.1 -- it seems...
--
Michael - <mike(@)php.net>
Here:
http://ftp.emini.dk/pub/php/win32/icu-3.4-MSVC-6.0.zip
Edin
> Download and build ICU 3.4.
> Location: http://www-306.ibm.com/software/globalization/icu/downloads.jsp
For Debian users, doing "apt-get install libicu34-dev" should pull it in
(on debian unstable).
Derick
On Thu, 11 Aug 2005 16:36:56 -0700
Andrei Zmievski <andrei@gravitonic.com> wrote:
> By now, Unicode merge into the public tree has taken place. How do you
> get started?
> Take a deep breath.
> Download and build ICU 3.4.
SuSE users can use fresh 3.4 rpms from the link below.
You'll most likely need all icu* & libicu* rpms.
http://tony2001.phpclub.net/files/rpms/suse/9.3/
--
Wbr,
Antony Dovgal
Just a note to tell you that the build is broken when configured with
'--disable-zend-memory-manager'.
That's because you forgot to add eumalloc() and friends to the else part
of USE_ZEND_MALLOC (zend_alloc.h).
Nuno
Should be fixed now.
-Andrei
It doesn't output as many errors as before, but it still fails:
"undefined reference to 'zend_ustrdup'"
Nuno
Attached is a patch to fix the build with the Zend MM disabled (just a
little typo).
Nuno
The "charset=" parameter is missing from the Content-Type header when
unicode.output_encoding is set to something other than utf-8.
How about is_binary() and is_unicode() (given that there is already
is_string()), and is_buffer() for string/binary/unicode?
But please fix extract() first of all, as it injects variables into the
active symbol table.