Hi,
I opened a PR making FILTER_VALIDATE_URL more strict and more compliant 
with standards: https://github.com/php/php-src/pull/826
Can anyone review (and merge) this patch?
Thanks!
Kévin Dunglas 
Consultant and freelance developer
http://dunglas.fr 
Tel.: 06 60 91 20 20
Nice work man, it looks really good.
Daniel Ribeiro 
http://danielribeiro.org
Hi,
Following the discussion on GitHub, I've made some changes to this PR:
- Added a new FILTER_VALIDATE_DOMAIN filter validating domain names
- Added a FILTER_FLAG_HOSTNAME flag to allow checking hostnames (_ is
 forbidden in hostnames but not in domains)
- Changed FILTER_VALIDATE_URL to use this new validator
Once https://github.com/php/php-src/pull/890 is merged, it will be 
easy to add IDN support to this new domain validator.
-- 
Kévin Dunglas 
Hi,
This doesn't make any sense. A domain is a hostname and underscores 
are forbidden.
Cheers, 
Andrey.
Hi Andrey,
Sorry, but I think you're wrong. Domain != hostname. Underscores are allowed 
in domains (RFC 2181) but not in hostnames (RFC 1123 and later). To quote 
Wikipedia:
"While a hostname may not contain other characters, such as the underscore 
character (_), other DNS names may contain the underscore. Systems such 
as DomainKeys and service records use the underscore as a means to assure 
that their special character is not confused with hostnames. For 
example, _http._sctp.www.example.com specifies a service pointer for an SCTP 
capable webserver host (www) in the domain example.com." 
http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names
You can also see this StackOverflow answer: 
http://stackoverflow.com/a/2183140/1352334
-- 
Kévin Dunglas 
Hi,
I agree to an extent, but that is highly contextual.
Who said that 'domain' === 'DNS record' (which is a very broad term 
anyway)? And IF we assume this, why do you need FILTER_VALIDATE_DOMAIN 
for it if it's only going to check length?
Cheers, 
Andrey.
FILTER_VALIDATE_DOMAIN checks conformance with the DNS RFCs: total length, 
label length and allowed characters (_ is allowed in domain names, but many 
other characters, such as ~, / and +, are forbidden). I'll add IDN support too 
once IDN support for streams is merged.
FILTER_VALIDATE_URL checks conformance with the URL RFCs (and not the URI ones, as 
discussed on GitHub). Conformance of the URL's host part implies conformance 
with the DNS, IPv4 and IPv6 RFCs, plus some additional checks (no 
underscore allowed in hostnames and IPv6 addresses enclosed in brackets, for 
instance). That's why I've added the convenience flag FILTER_FLAG_HOSTNAME. 
Btw, there are many use cases for validating that a string is a valid domain 
(or a valid hostname): hosting and registrar apps, mail server management 
apps and anything else DNS-related.
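To make the host-part rules above concrete, here is a minimal Python sketch (not the C code from the PR). The helper name is_valid_url_host is hypothetical, and the 253/63 length limits are the commonly cited DNS limits, assumed here rather than taken from the patch. It accepts a bracketed IPv6 literal, a dotted IPv4 address, or a hostname whose labels contain only letters, digits and hyphens.

```python
import ipaddress
import re

# Hostname label (RFC 1123 style): letters, digits, hyphens; no leading or
# trailing hyphen; at most 63 characters. Underscores are rejected.
HOSTNAME_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_url_host(host: str) -> bool:
    """Hypothetical helper: check the host part of a URL."""
    # IPv6 literals must be enclosed in brackets inside a URL.
    if host.startswith("[") and host.endswith("]"):
        try:
            ipaddress.IPv6Address(host[1:-1])
            return True
        except ValueError:
            return False
    # Plain dotted-quad IPv4 address.
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        pass
    # Otherwise it must be a hostname: assumed total length limit, then
    # per-label syntax.
    if not host or len(host) > 253:
        return False
    return all(HOSTNAME_LABEL.fullmatch(label) for label in host.split("."))
```

With this sketch, "example.com" and "[::1]" pass, while "::1" (no brackets) and "my_host.example.com" (underscore) are rejected.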
Maybe we can find a better name for FILTER_VALIDATE_DOMAIN, such as 
FILTER_VALIDATE_DOMAIN_NAME or FILTER_VALIDATE_DNS_DOMAIN (a bit redundant, 
since DNS = Domain Name System), but please not something related to "DNS record", 
because a valid DNS record can have the following format:
les-tilleuls.coop. 3600 IN SOA monsite.nnx.com .root.monsite.nnx.com. ( 
2014092300 ; serial 
21600 ; refresh (6 hours) 
3600 ; retry (1 hour) 
604800 ; expire (1 week) 
86400 ; minimum (1 day) 
)
-- 
Kévin Dunglas 
Hi,
I am not trying to argue, but where does it say that ~/+ are 
disallowed, yet an underscore is allowed? The only rule allowing underscores 
that I've seen is the one that says any binary string is a valid DNS 
record value.
Cheers, 
Andrey.
FWIW, there is a practical in-use (de facto if nothing else) convention of using _ in hosts for DKIM:
$ dig +short -ttxt mandrill._domainkey.mailchimp.com 
"k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCrLHiExVd55zd/IQ/J/mRwSRMAocV/hMB3jXwaHH36d9NaVynQFYV8NaWi69c1veUtRzGt7yAioXqLj7Z4TeEUoOLgrKsn8YnckGs9i3B3tVFB+Ch/4mPhXWiNfNdynHWBcPcbJ8kjEQ2U8y78dHZj1YeRXXVvWob2OaKynO8/lQIDAQAB;”
S
_domainkey is actually in all the DKIM RFCs and in the formal STD 76, 
see § 3.6.2.1. Namespace, so it's more than a convention!
Hi all,
On Fri, Nov 7, 2014 at 6:48 AM, Sanford Whiteman figureonecpr@gmail.com wrote:
> > FWIW, there is a practical in-use (de facto if nothing else)
> > convention of using _ in hosts for DKIM:
> _domainkey is actually in all the DKIM RFCs and in the formal STD 76,
> see § 3.6.2.1. Namespace, so it's more than a convention!
"_" is used for service names. Active Directory uses "_" a lot, for example: 
_tcp, _sites, _ldap, etc.
https://tools.ietf.org/html/rfc2782
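As an illustration of the RFC 2782 naming scheme (_Service._Proto.Name), here is a small Python sketch that splits such a name into its parts. The names SrvName and parse_srv_name are hypothetical, and the sketch only looks at the underscore prefixes; it does not validate the remaining domain.

```python
from typing import NamedTuple, Optional


class SrvName(NamedTuple):
    service: str  # e.g. "ldap" (without the leading underscore)
    proto: str    # e.g. "tcp" (without the leading underscore)
    domain: str   # the domain the service belongs to


def parse_srv_name(name: str) -> Optional[SrvName]:
    """Hypothetical helper: split "_service._proto.domain" into its parts."""
    parts = name.split(".", 2)
    if len(parts) != 3:
        return None
    service, proto, domain = parts
    # Both leading labels must start with an underscore and be non-empty
    # after it; this is what keeps them apart from ordinary hostnames.
    if not (service.startswith("_") and proto.startswith("_")):
        return None
    if len(service) < 2 or len(proto) < 2 or not domain:
        return None
    return SrvName(service[1:], proto[1:], domain)
```

For instance, "_ldap._tcp.example.com" splits into service "ldap", protocol "tcp" and domain "example.com", whereas "www.example.com" is not recognised as a service name.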
Regards,
-- 
Yasuo Ohgaki 
yohgaki@ohgaki.net
Hi,
I'll change my PR according to the RFC I've quoted earlier:
- check for valid characters (excluding underscore) only when 
 FILTER_FLAG_HOSTNAME is set
- allow any character but check lengths by default
- use FILTER_FLAG_HOSTNAME to validate URLs
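The behaviour listed above could be sketched roughly as follows in Python. This is only an illustration, not the PR's C implementation: validate_domain and its hostname parameter are hypothetical stand-ins for FILTER_VALIDATE_DOMAIN and FILTER_FLAG_HOSTNAME, and the 253/63 limits are the commonly cited DNS length limits, assumed here.

```python
import re

MAX_NAME_LENGTH = 253   # assumed total limit, optional trailing dot excluded
MAX_LABEL_LENGTH = 63   # assumed per-label limit

# Hostname label: letters, digits and hyphens, no leading/trailing hyphen.
HOSTNAME_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def validate_domain(name: str, hostname: bool = False) -> bool:
    """Hypothetical sketch: length checks always, character checks on demand."""
    # A single trailing dot (fully qualified name) is ignored.
    if name.endswith("."):
        name = name[:-1]
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    for label in name.split("."):
        # Default mode: any character is accepted, only lengths are checked.
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            return False
        # Hostname mode: additionally restrict the allowed characters.
        if hostname and not HOSTNAME_LABEL.fullmatch(label):
            return False
    return True
```

Under these assumptions, "_domainkey.example.com" is accepted by default but rejected when hostname=True, which is the mode URL validation would use.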
What do you think about that?
Best regards,
-- 
Kévin Dunglas
Hi Kevin,
I haven't read the diff closely, but it seems OK to me. 
How is the email domain checked? I cannot see changes for it in the diff.
Validating hosts correctly is difficult, so I would like to have your PR.
Regards,
-- 
Yasuo Ohgaki 
yohgaki@ohgaki.net
Hi Yasuo,
I haven't changed (or even read) the email validator. I'll take a look at 
it.
-- 
Kévin Dunglas
I've just pushed some changes to the PR. FILTER_VALIDATE_DOMAIN now checks 
character validity only if FILTER_FLAG_HOSTNAME is set. I've also rebased 
and fixed some issues detailed on GitHub.
Yasuo, it's not trivial to use this new validator in FILTER_VALIDATE_EMAIL. 
Its current implementation uses a big regex that doesn't extract the domain 
part. Anyway, a good RFC-compliant email validator cannot be built 
with a regex. See https://github.com/egulias/EmailValidator for instance. I 
think that's work for another PR. I'll keep the email validator in its 
current state for now.
Are you guys OK with getting the current PR merged?
-- 
Kévin Dunglas 