Hi Dennis,
Overall it sounds like a reasonable RFC.
Niels:
I'm not so sure that the name "decode_html" is self-descriptive enough,
it sounds very generic.
Dennis:
The name is not very important to me. For the sake of history, the reason
I have chosen “decode HTML” is because, unlike an HTML parser, this is
focused on taking a snippet of HTML “text” content and decoding it into a
“plain PHP string.”
Why not make it two methods called "decode_html_text" and
"decode_html_attribute"?
Consider the following reasons:
- The function doesn't actually decode html as such, it decodes either an
html text node string or an html attribute string.
- Saves the $context parameter and the constants/enums, making the call
significantly shorter.
- It feels like decoding either text or attribute are two significantly
different things. I admit I could be wrong, if code like
decode_html($e->isAttribute() ? HtmlContext::Attribute :
HtmlContext::Text, $e->getContent()) is likely to be seen. But I somehow
don't foresee a lot of situations where text and attribute strings end up
in the same code path?
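To make the comparison concrete, here is a rough sketch of the two-function
shape. Everything in it is an assumption on my side: the function names are
only my suggestion, the HtmlContext enum and the decode_html() signature are
guessed from my example above and may not match the RFC, and
html_entity_decode() is just a stand-in that does not apply the
context-specific rules the RFC is about.

```php
<?php
// Hypothetical sketch, not the RFC's API. Requires PHP 8.1+ for enums.

enum HtmlContext
{
    case Attribute;
    case Text;
}

// Stand-in for the proposed function. A real implementation would decode
// differently depending on $context; this placeholder ignores it.
function decode_html(HtmlContext $context, string $html): string
{
    return html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// The two suggested entry points: the context moves into the function name.
function decode_html_text(string $html): string
{
    return decode_html(HtmlContext::Text, $html);
}

function decode_html_attribute(string $html): string
{
    return decode_html(HtmlContext::Attribute, $html);
}

// Call sites no longer need the enum at all.
$title = decode_html_text('Fish &amp; Chips');
$href  = decode_html_attribute('?a=1&amp;b=2');
```

If the wrappers turn out to be what nearly everyone calls, that would be an
argument for making them the primary API.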
A couple of other options that would silence anyone opposed to implicitly
favouring utf-8:
html_text_to_utf8 and html_attribute_to_utf8
Best,
Jakob