Hi internals,
I came across this old bug: https://bugs.php.net/bug.php?id=34407
The desired behaviour for that particular person:
"the (world) now" => "The (World) Now"
Currently PHP adopts a very simple rule:
- Capitalise first character (no matter what it is)
- Capitalise character preceded by a space, tab, etc.
Using string.title() from Python you'd get the expected behaviour; they use:
- Capitalise first letter of a word
- Lowercase subsequent letters of a word
- Non-letters delimit words
Personally I find that the latter is too much of a departure from what we
currently have; a compromise could be to treat punctuation as a word
delimiter.
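To make the three behaviours concrete, here is a rough Python sketch. It is not PHP's implementation: `php_ucwords` and `ucwords_punct` are hypothetical helper names, and the whitespace delimiter set is an assumption standing in for "space, tab, etc.". Python's built-in `str.title()` is used as-is for comparison.

```python
import string

# Assumed delimiter set approximating "space, tab, etc."
WHITESPACE = " \t\r\n\f\v"

def php_ucwords(text, delimiters=WHITESPACE):
    """Approximate the current rule: uppercase the first character and any
    character that follows a delimiter; leave everything else untouched."""
    out = []
    at_word_start = True
    for ch in text:
        out.append(ch.upper() if at_word_start else ch)
        at_word_start = ch in delimiters
    return "".join(out)

def ucwords_punct(text):
    """Hypothetical compromise: punctuation also counts as a word delimiter."""
    return php_ucwords(text, WHITESPACE + string.punctuation)

sample = "the (world) now"
print(php_ucwords(sample))         # The (world) Now
print(ucwords_punct(sample))       # The (World) Now
print(sample.title())              # The (World) Now

# Only title() lowercases the rest of each word.
print(php_ucwords("hello wORLD"))  # Hello WORLD
print("hello wORLD".title())       # Hello World
```

One side effect worth noting: with punctuation as a delimiter, "it's" becomes "It'S", which is also what `str.title()` produces.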
Thoughts?
--
Tjerk
Hmm. Why not make it follow what \b in a regex would do, looking for “word boundaries”?
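As a rough illustration of what following `\b` could mean, here is a small Python sketch using the `re` module. `ucwords_boundary` is a hypothetical name, and restricting the match to ASCII lowercase letters is an assumption made to keep the example short.

```python
import re

def ucwords_boundary(text):
    # Uppercase any lowercase ASCII letter that directly follows a regex
    # word boundary. Digits and underscores count as word characters, so
    # they do not start a new word.
    return re.sub(r"\b[a-z]", lambda m: m.group(0).upper(), text)

print(ucwords_boundary("the (world) now"))  # The (World) Now
print(ucwords_boundary("don't stop"))       # Don'T Stop
print(ucwords_boundary("foo_bar 2nd"))      # Foo_bar 2nd
```

The apostrophe case shows the trade-off: a word boundary falls between the apostrophe and the following letter, so that letter is capitalised too.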
Andrea Faulds
http://ajf.me/
2014-06-30 13:54 GMT+02:00 Tjerk Meesters tjerk.meesters@gmail.com:
> Hi internals,
> I came across this old bug: https://bugs.php.net/bug.php?id=34407
> The desired behaviour for that particular person:
> "the (world) now" => "The (World) Now"
> Currently PHP adopts a very simple rule:
> - Capitalise first character (no matter what it is)
> - Capitalise character preceded by a space, tab, etc.
> Using string.title() from Python you'd get the expected behaviour; they
> use:
> - Capitalise first letter of a word
> - Lowercase subsequent letters of a word
Actually, letting a function meant to "uppercase first (letter)" also lowercase
all subsequent letters would be even more unexpected.
> - Non-letters delimit words
> Personally I find that the latter is too much of a departure from what we
> currently have; a compromise could be to treat punctuation as a word
> delimiter.
> Thoughts?
> --
> Tjerk