Hi all!
We have started a project to make it easier to support international
markets using PHP. A number of internationalization functions from IBM
ICU will be made available in PHP as an extension.
This project targets both PHP 5 and PHP 6. The goal is to support the
most useful i18n services on both, while ensuring that any code running
in PHP 5 using these functions would work the same in PHP 6. The PHP 6
implementation may provide additional functionality.
Demand for internationalization services is high and the need is
immediate, so we decided to support them in PHP 5 and provide one common
solution that will also work going forward. There will be no PHP 4
support in the project.
The base for the extension is the ICU library
(http://www.icu-project.org/) already used by PHP 6, and the intent is
to follow the ICU model, so that people having experience working
with ICU in either C/C++ or Java could easily use the PHP API.
The extension is composed of mostly independent functionality modules,
each of which would implement one of the functionalities below.
The APIs support both procedural and object-oriented notation
(internally referring to the same APIs). In PHP 5, the extension assumes
all incoming and outgoing strings are in UTF-8 encoding.
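To make the two notations concrete, here is a minimal sketch using number formatting. The names used (NumberFormatter, numfmt_create, numfmt_format) are those of the intl extension as it later shipped with PHP; the announcement itself does not name them, so treat them as assumptions rather than the API described here. The locale and value are arbitrary.

```php
<?php
// Assumption: the intl extension is loaded. Names follow the later intl API,
// not anything confirmed by this announcement.
$value = 1234567.891;

// Object-oriented notation.
$formatter = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
$oo = $formatter->format($value);

// Procedural notation, referring to the same underlying API.
$handle = numfmt_create('de_DE', NumberFormatter::DECIMAL);
$proc = numfmt_format($handle, $value);

// Both calls return the same UTF-8 encoded string.
var_dump($oo === $proc);
echo $oo, "\n";
```

Either notation can be used on its own; the result is a plain PHP string in UTF-8, matching the encoding assumption stated above.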
The scope of the extension was defined as follows:
- Collation (http://www.icu-project.org/apiref/icu4c/ucol_8h.html)
- Number formatting (http://www.icu-project.org/apiref/icu4c/unum_8h.html)
- Date/time formatting (http://www.icu-project.org/apiref/icu4c/udat_8h.html)
- Locales (http://www.icu-project.org/apiref/icu4c/uloc_8h.html)
- Calendars (http://www.icu-project.org/apiref/icu4c/ucal_8h.html)
- International domain names (http://www.icu-project.org/apiref/icu4c/uidna_8h.html)
- Message formatting (http://www.icu-project.org/apiref/icu4c/umsg_8h.html)
- Resource bundles (http://www.icu-project.org/apiref/icu4c/ures_8h.html)
We have initial implementations of the collation and number formatting
APIs for PHP 5, which will be publicly available soon. The PHP 6
implementation and the other APIs will follow. The project code will be
available through PECL.
The project is supported by LiveNation, Yahoo! and Zend Technologies.
We welcome all feedback about the project - especially suggestions about
what functionality is needed and comments about the existing implementation.
We intend to discuss it on the PHP Internationalization list -
php-i18n@lists.php.net. We welcome you to join the list and participate
in the discussion. We will publish API descriptions for the existing
functions on the i18n list in a couple of days, to get things rolling.
Regards,
PHP-ICU team
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
We have started a project to make it easier to support international
markets using PHP. A number of internationalization functions from IBM
ICU will be made available in PHP as an extension.
I realize that my natural state is the state of confusion, but...
Hunh?
So now there's going to be a PHP-ICU extension for PHP 5 and PHP 6,
and PHP 6 will have ICU built-in to such an extent that it's backwards
compatible with PHP 5?
And what in the world would you do with PHP-ICU extension in PHP 6?
I mean, unless you've typecast a string to binary or whatever, it's
already ICU, no?...
So then you'd be typecasting a string to binary so you can use ICU to
make it Unicode, rather than just leaving it as Unicode in the first
place?
I think one of your first documentation issues is going to be
explaining how this coexists with, replaces, or has zero effect on PHP 6
built-in Unicode :-)
Hopefully I'm not just being ignorant, though that's entirely possible
with this Unicode stuff... :-)
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
So now there's going to be a PHP-ICU extension for PHP 5 and PHP 6,
and PHP 6 will have ICU built-in to such an extent that it's backwards
compatible with PHP 5?
Both extensions would be (are being) written in such a way that code
that worked on PHP 5 would work on PHP 6, as far as the extension is
concerned. Of course, particular code could still fail for some other
reason not related to the ICU extension. For example, if you call
collator_sort($foo, $bar), collator_sort being an ICU extension function,
it would mean the same in PHP 5 and PHP 6.
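As an illustration of what such a call does, here is a small sketch of locale-aware sorting. It assumes the signatures of collator_create and collator_sort as they later shipped in the intl extension (collator first, array second and sorted in place); the helper sorted_with and the sample words are hypothetical.

```php
<?php
// Assumption: intl extension loaded; this file is saved as UTF-8.
// sorted_with() is a hypothetical helper, not part of any extension.
function sorted_with(string $locale, array $items): array
{
    $foo = collator_create($locale); // the collator
    $bar = $items;                   // the array to sort
    collator_sort($foo, $bar);       // sorts $bar in place
    return array_values($bar);
}

$words = ['zebra', 'ärlig', 'apa'];

// Swedish treats "ä" as a separate letter that sorts after "z".
print_r(sorted_with('sv_SE', $words));

// German treats "ä" as a variant of "a".
print_r(sorted_with('de_DE', $words));
```

The same array comes back in a different order depending on the locale, which is the kind of behavior the extension is meant to keep identical across PHP versions.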
And what in the world would you do with PHP-ICU extension in PHP 6?
I mean, unless you've type-casted a string to binary or whatever, it's
already ICU, no?...
I think you are confusing UTF-16 and ICU. ICU is a huge library of
Unicode text functions, which so far weren't supported in PHP except for
some collation support. There are a lot of functions to add there. UTF-16
is just a way to represent text in bits (ICU uses it, and so does
PHP 6). The extension would expose a part of ICU functionality to PHP
users - such as collators, formatters, resources, etc.
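To illustrate the "just a way to represent text in bits" point, the sketch below encodes one character both ways. It uses UConverter::transcode from the intl extension as it exists in later PHP versions, which is an assumption on my part and not something this thread describes.

```php
<?php
// Assumption: intl extension loaded (UConverter is available in later PHP).
$utf8 = "\xC3\xA9"; // the character U+00E9 (e with acute) encoded as UTF-8

// Same character, different byte representation.
$utf16 = UConverter::transcode($utf8, 'UTF-16BE', 'UTF-8');

echo bin2hex($utf8), "\n";  // c3a9
echo bin2hex($utf16), "\n"; // 00e9

// Converting back yields the original bytes: no information is added or lost.
$back = UConverter::transcode($utf16, 'UTF-8', 'UTF-16BE');
var_dump($back === $utf8);
```

Collation, formatting and the like operate on the text itself, whichever of these encodings the engine happens to store it in.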
HTH,
Stanislav Malyshev, Zend Software Architect
So (from another character-set-intricacy-challenged individual), would it
be analogous to the DOM functions for manipulating XML-like structures? (The
methods parentNode(), childNodes(), appendNode(), etc. are all supposed to
mean the same thing in every language.)
--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012
"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
So (from another character-set-intricacy-challenged individual),
would it be analogous to the DOM functions for manipulating
XML-like structures? (The methods parentNode(), childNodes(),
appendNode(), etc. are all supposed to mean the same thing in every
language.)
I'm not sure I understand the question correctly, but I think if you
mean that methods would do the same in PHP 5 and 6 - yes.
--
Stanislav Malyshev, Zend Software Architect
As far as I understand, he was referring to DOM as an example of
portability: a method has (nearly) the same name, behavior and signature
in every language, whether JS, C# or PHP. His question being: will it
be the same with this project?
--Pierre
His question being: will it
be the same with this project?
Maybe not exactly the same, since even the C, C++ and Java APIs differ
quite a bit, but close to it - i.e. the API set would follow what the ICU
APIs have.
Stanislav Malyshev, Zend Software Architect
Interesting question here: what about overloaded methods, especially
overloaded ctors in Calendar etc?
David
On 16.07.2007 at 17:44, Stanislav Malyshev wrote:
Maybe not exactly the same, since even C, C++ and Java API differ
enough, but close to it - i.e. the API set would follow what ICU
APIs have.
Stanislav Malyshev wrote:
Hi all!
We have started a project to make it easier to support international
markets using PHP. A number of internationalization functions from IBM
ICU will be made available in PHP as an extension.
I notice normalization is not on your list. Would you consider adding
it? In MediaWiki we currently have a pure PHP implementation of the NFC
algorithm, which is obviously rather slow. We tried writing our own
interface to the ICU normalization functions, but lost interest when we
hit some stability issues after deployment. Normalization is important
because client systems may apply either NFC or NFD, or just leave the
text unnormalized, leading to spurious instability of text in a version
tracking system like a wiki.
-- Tim Starling
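The instability described above can be shown in a few lines. This sketch uses the Normalizer class from the intl extension as it exists in later PHP versions; normalization is not in the scope announced at the top of this thread, so its availability and naming here are assumptions.

```php
<?php
// Assumption: intl extension loaded with Normalizer available.
$nfc = "caf\xC3\xA9";  // "café" with precomposed U+00E9
$nfd = "cafe\xCC\x81"; // "café" as "e" followed by combining acute U+0301

// The two strings look identical on screen but differ byte for byte,
// so a naive diff would report a change.
var_dump($nfc === $nfd); // bool(false)

// Normalizing both to NFC before storing or comparing removes the difference.
$a = Normalizer::normalize($nfc, Normalizer::FORM_C);
$b = Normalizer::normalize($nfd, Normalizer::FORM_C);
var_dump($a === $b); // bool(true)
```

Normalizing every submission to one form at input time is one way a wiki could avoid recording such invisible edits.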
Great news...
i was waiting for some movement on this front. i suppose there will be
web project space for this?
also, i'd like to see specs if available. i may have some spare cycles to
put some work in. in addition, i have some code sitting around
implementing various bits of i18n functionality that i'd like to submit
pending discussion and approval....
l0t3k
"Stanislav Malyshev" stas@zend.com wrote in message
news:4697D3EA.2090201@zend.com...
Hi all!
We have started a project to make it easier to support international
markets using PHP. A number of internationalization functions from IBM ICU
will be made available in PHP as an extension.
I totally lack the words to describe the awesomeness of this. Really,
really fantastic. At Agavi (http://www.agavi.org/), we've ported
parts of ICU (locale, calendar, date) to PHP and we're using it
together with CLDR data, but as you might imagine, it's awfully slow.
This new extension will save many kittens' lives. Keep it up!
David
On 13.07.2007 at 21:35, Stanislav Malyshev wrote:
Hi all!
We have started a project to make it easier to support
international markets using PHP. A number of internationalization
functions from IBM ICU will be made available in PHP as an extension.