Hello developers.
I saw some days ago that
there is need in rewriting of check_parameters.php to be less
false-positive and so on...
So i wrote new version - based on state machines and regex.
Base checks are the same, only reporting is improved.
There is simple comparision:
-- on old version my php-5.2.5 source old util gives ~180 problems
(excluding optional, non optional params initialization,
reporting_level is setted up to 5):
php ./scripts/dev/check_parameters.php /xxxx/php5.2-200803061530 |
grep -iv 'optional var not initialized' | grep -iv 'not optional var
is initialized' > ./old1.log
wc -l ./old1.log
181 ./old1.log
new gives ~150 problems :)
php /check_code.php -v 5 /xxxx/php5.2-200803061530 |grep -iv
'OPTIONAL var IS NOT' | grep -iv 'NOT OPTIONAL var IS initialized' >
/new1.log
wc -l /new1.log
147 /new1.log
Some problems are unavoided:
- external variable definition (~60)
- 'php_com_variant_class_entry' - 11 reports
- 'date_ce_timezone' - 2 reports
- 'oci_lob_class_entry_ptr' - 26 reports
- 'text**' - 7 reports
- 'oci_coll_class_entry_ptr' - 10 reports
- 'zend_ce_traversable' - 3 reports
- others ...
- too complex cases of parsing ( can be fixed by hacks but it's wrong...) (~5)
, others will be solved by more correct parsing
and some will be fixed in source code (i think :) ), for example:
ext/iconv/iconv.c [] iconv_mime_encode : field_value: expected
"char**" but got "const char**" [3]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : hostname_len: expected
"int*" but got "unsigned int*" [2]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : username_len: expected
"int*" but got "unsigned int*"[4]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : passwd_len: expected
"int*" but got "unsigned int*" [6]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : dbname_len: expected
"int*" but got "unsigned int*" [8]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : socket_len: expected
"int*" but got "unsigned int*" [11]
ext/openssl/openssl.c [] openssl_seal : the '/' specifier cannot be
applied to 'a'
ext/pgsql/pgsql.c [] pg_field_table : the '!' specifier cannot be applied to 'b'
ext/pgsql/pgsql.c [] pg_copy_from : the '/' specifier cannot be applied to 's'
ext/pgsql/pgsql.c [] pg_meta_data : table_name_len: expected "int*"
but got "uint*" [3]
ext/pgsql/pgsql.c [] pg_convert : option: expected "long*" but got "ulong*" [5]
ext/pgsql/pgsql.c [] pg_insert : option: expected "long*" but got "ulong*" [5]
ext/pgsql/pgsql.c [] pg_update : option: expected "long*" but got "ulong*" [6]
ext/pgsql/pgsql.c [] pg_delete : option: expected "long*" but got "ulong*" [5]
ext/pgsql/pgsql.c [] pg_select : option: expected "long*" but got "ulong*" [5]
ext/standard/streamsfuncs.c [] stream_socket_client : the '!'
specifier cannot be applied to 'd'
, so i believe final number will be ~70-80 ...
There are also need to do:
- write tests
- rewrite dropping comments from code
- return missed current_line function
- replace other regexes by statemachines where it is possible
- improve more maintainability
- write docu
- smth else ?
So it's not last version...
But please check it if you have time. You can see utility on [1]
(I don't creare patch because of it is the same as downloading new
version - too many changes)
Also i have questions:
-
in many problem reports there are unsigned int (or uint, or ulong
-> long or smth else) to int assigning. Is it safe ? If yes, then i
need to process this in code -
There are really many problem reports about "optional var is not initialized"
Is there a requrement for reporting about it ? And why ? -
There are really many problem reports about "not optional var is initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ? -
there is code like:
// separate_zval_if_not_ref
case '/':
//FIXME could not understand from why it so ????
if ( !in_array($prev_char, array('r', 'z')) ) {
self::error("the '/' specifier cannot be applied to
'$prev_char'");
}
break;
Why ? In readme.parameter_parsing_api docu i can't find any related to
it information.
Links:
[1] http://sawoy.mylivepage.com/file/?fileid=2830
--
Greetings,
Alexandr Savchuk
- There are really many problem reports about "not optional var is initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ?
I believe this notice should be disabled, as it's not an error or something bad.
Removing an initialization that is not required doesn't fix any problem,
even though it can be done without any harm.
--
Wbr,
Antony Dovgal
Hi,
Thanks for working on this. However I don't think more effort should be
wasted with this script. It's a bogus approach to the problem and it will
always generate many false-positives (disclaimer: I'm the author of the
original script and it was like a POC).
Thus my idea is to move along to use a real C/C++ front-end and perform the
things correctly. For example, the new LLVM's clang frontend is really easy
to work with. That's why I've proposed this for a gsoc project. It's fairly
easy for someone with little compiler knowledge, yet it gives enough work to
be a gsoc project.
Regards,
Nuno
----- Original Message -----
From: "Alexandr Savchuk" a.u.savchuk@gmail.com
To: internals@lists.php.net
Sent: Tuesday, March 11, 2008 5:00 AM
Subject: [PHP-DEV] new version of check_parameters.php
Hello developers.
I saw some days ago that
there is need in rewriting of check_parameters.php to be less
false-positive and so on...So i wrote new version - based on state machines and regex.
Base checks are the same, only reporting is improved.There is simple comparision:
-- on old version my php-5.2.5 source old util gives ~180 problems
(excluding optional, non optional params initialization,
reporting_level is setted up to 5):php ./scripts/dev/check_parameters.php /xxxx/php5.2-200803061530 |
grep -iv 'optional var not initialized' | grep -iv 'not optional var
is initialized' > ./old1.logwc -l ./old1.log
181 ./old1.log
new gives ~150 problems :)
php /check_code.php -v 5 /xxxx/php5.2-200803061530 |grep -iv
'OPTIONAL var IS NOT' | grep -iv 'NOT OPTIONAL var IS initialized' >
/new1.logwc -l /new1.log
147 /new1.log
Some problems are unavoided:
- external variable definition (~60)
- 'php_com_variant_class_entry' - 11 reports
- 'date_ce_timezone' - 2 reports
- 'oci_lob_class_entry_ptr' - 26 reports
- 'text**' - 7 reports
- 'oci_coll_class_entry_ptr' - 10 reports
- 'zend_ce_traversable' - 3 reports
- others ...
- too complex cases of parsing ( can be fixed by hacks but it's wrong...)
(~5), others will be solved by more correct parsing
and some will be fixed in source code (i think :) ), for example:ext/iconv/iconv.c [] iconv_mime_encode : field_value: expected
"char**" but got "const char**" [3]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : hostname_len: expected
"int*" but got "unsigned int*" [2]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : username_len: expected
"int*" but got "unsigned int*"[4]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : passwd_len: expected
"int*" but got "unsigned int*" [6]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : dbname_len: expected
"int*" but got "unsigned int*" [8]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : socket_len: expected
"int*" but got "unsigned int*" [11]
ext/openssl/openssl.c [] openssl_seal : the '/' specifier cannot be
applied to 'a'
ext/pgsql/pgsql.c [] pg_field_table : the '!' specifier cannot be applied
to 'b'
ext/pgsql/pgsql.c [] pg_copy_from : the '/' specifier cannot be applied to
's'
ext/pgsql/pgsql.c [] pg_meta_data : table_name_len: expected "int*"
but got "uint*" [3]
ext/pgsql/pgsql.c [] pg_convert : option: expected "long*" but got
"ulong*" [5]
ext/pgsql/pgsql.c [] pg_insert : option: expected "long*" but got "ulong*"
[5]
ext/pgsql/pgsql.c [] pg_update : option: expected "long*" but got "ulong*"
[6]
ext/pgsql/pgsql.c [] pg_delete : option: expected "long*" but got "ulong*"
[5]
ext/pgsql/pgsql.c [] pg_select : option: expected "long*" but got "ulong*"
[5]
ext/standard/streamsfuncs.c [] stream_socket_client : the '!'
specifier cannot be applied to 'd', so i believe final number will be ~70-80 ...
There are also need to do:
- write tests
- rewrite dropping comments from code
- return missed current_line function
- replace other regexes by statemachines where it is possible
- improve more maintainability
- write docu
- smth else ?
So it's not last version...
But please check it if you have time. You can see utility on [1](I don't creare patch because of it is the same as downloading new
version - too many changes)Also i have questions:
in many problem reports there are unsigned int (or uint, or ulong
-> long or smth else) to int assigning. Is it safe ? If yes, then i
need to process this in codeThere are really many problem reports about "optional var is not
initialized"
Is there a requrement for reporting about it ? And why ?There are really many problem reports about "not optional var is
initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ?there is code like:
// separate_zval_if_not_ref
case '/':
//FIXME could not understand from why it so ????
if ( !in_array($prev_char, array('r', 'z')) ) {
self::error("the '/' specifier cannot be applied to
'$prev_char'");
}
break;Why ? In readme.parameter_parsing_api docu i can't find any related to
it information.Links:
[1] http://sawoy.mylivepage.com/file/?fileid=2830--
Greetings,
Alexandr Savchuk
Yes, i understand.
It was interesting for me -
how i can decrease false positive and others by improvings on current script.
If there will be not GSoC participant for this project,
possibly i can do it.
Also base questions, which will be asked in any case, are still alive:
- in many problem reports there are unsigned int (or uint, or ulong
-> long or smth else) to int assigning. Is it safe ? If yes, then i
need to process this in code- There are really many problem reports about "optional var is not
initialized"
Is there a requrement for reporting about it ? And why ?- There are really many problem reports about "not optional var is
initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ?- there is code like:
// separate_zval_if_not_ref
case '/':
//FIXME could not understand from why it so ????
if ( !in_array($prev_char, array('r', 'z')) ) {
self::error("the '/' specifier cannot be applied to
'$prev_char'");
}
break;
Why ? In readme.parameter_parsing_api docu i can't find any related to
it information.
2008/3/12, Nuno Lopes nlopess@php.net:
Hi,
Thanks for working on this. However I don't think more effort should be
wasted with this script. It's a bogus approach to the problem and it will
always generate many false-positives (disclaimer: I'm the author of the
original script and it was like a POC).
Thus my idea is to move along to use a real C/C++ front-end and perform the
things correctly. For example, the new LLVM's clang frontend is really easy
to work with. That's why I've proposed this for a gsoc project. It's fairly
easy for someone with little compiler knowledge, yet it gives enough work to
be a gsoc project.Regards,
Nuno
----- Original Message -----
From: "Alexandr Savchuk" a.u.savchuk@gmail.com
To: internals@lists.php.net
Sent: Tuesday, March 11, 2008 5:00 AM
Subject: [PHP-DEV] new version of check_parameters.phpHello developers.
I saw some days ago that
there is need in rewriting of check_parameters.php to be less
false-positive and so on...So i wrote new version - based on state machines and regex.
Base checks are the same, only reporting is improved.There is simple comparision:
-- on old version my php-5.2.5 source old util gives ~180 problems
(excluding optional, non optional params initialization,
reporting_level is setted up to 5):php ./scripts/dev/check_parameters.php /xxxx/php5.2-200803061530 |
grep -iv 'optional var not initialized' | grep -iv 'not optional var
is initialized' > ./old1.logwc -l ./old1.log
181 ./old1.log
new gives ~150 problems :)
php /check_code.php -v 5 /xxxx/php5.2-200803061530 |grep -iv
'OPTIONAL var IS NOT' | grep -iv 'NOT OPTIONAL var IS initialized' >
/new1.logwc -l /new1.log
147 /new1.log
Some problems are unavoided:
- external variable definition (~60)
- 'php_com_variant_class_entry' - 11 reports
- 'date_ce_timezone' - 2 reports
- 'oci_lob_class_entry_ptr' - 26 reports
- 'text**' - 7 reports
- 'oci_coll_class_entry_ptr' - 10 reports
- 'zend_ce_traversable' - 3 reports
- others ...
- too complex cases of parsing ( can be fixed by hacks but it's wrong...)
(~5), others will be solved by more correct parsing
and some will be fixed in source code (i think :) ), for example:ext/iconv/iconv.c [] iconv_mime_encode : field_value: expected
"char**" but got "const char**" [3]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : hostname_len: expected
"int*" but got "unsigned int*" [2]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : username_len: expected
"int*" but got "unsigned int*"[4]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : passwd_len: expected
"int*" but got "unsigned int*" [6]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : dbname_len: expected
"int*" but got "unsigned int*" [8]
ext/mysqli/mysqli_nonapi.c [] mysqli_connect : socket_len: expected
"int*" but got "unsigned int*" [11]
ext/openssl/openssl.c [] openssl_seal : the '/' specifier cannot be
applied to 'a'
ext/pgsql/pgsql.c [] pg_field_table : the '!' specifier cannot be applied
to 'b'
ext/pgsql/pgsql.c [] pg_copy_from : the '/' specifier cannot be applied to
's'
ext/pgsql/pgsql.c [] pg_meta_data : table_name_len: expected "int*"
but got "uint*" [3]
ext/pgsql/pgsql.c [] pg_convert : option: expected "long*" but got
"ulong*" [5]
ext/pgsql/pgsql.c [] pg_insert : option: expected "long*" but got "ulong*"
[5]
ext/pgsql/pgsql.c [] pg_update : option: expected "long*" but got "ulong*"
[6]
ext/pgsql/pgsql.c [] pg_delete : option: expected "long*" but got "ulong*"
[5]
ext/pgsql/pgsql.c [] pg_select : option: expected "long*" but got "ulong*"
[5]
ext/standard/streamsfuncs.c [] stream_socket_client : the '!'
specifier cannot be applied to 'd', so i believe final number will be ~70-80 ...
There are also need to do:
- write tests
- rewrite dropping comments from code
- return missed current_line function
- replace other regexes by statemachines where it is possible
- improve more maintainability
- write docu
- smth else ?
So it's not last version...
But please check it if you have time. You can see utility on [1](I don't creare patch because of it is the same as downloading new
version - too many changes)Also i have questions:
in many problem reports there are unsigned int (or uint, or ulong
-> long or smth else) to int assigning. Is it safe ? If yes, then i
need to process this in codeThere are really many problem reports about "optional var is not
initialized"
Is there a requrement for reporting about it ? And why ?There are really many problem reports about "not optional var is
initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ?there is code like:
// separate_zval_if_not_ref
case '/':
//FIXME could not understand from why it so ????
if ( !in_array($prev_char, array('r', 'z')) ) {
self::error("the '/' specifier cannot be applied to
'$prev_char'");
}
break;Why ? In readme.parameter_parsing_api docu i can't find any related to
it information.Links:
[1] http://sawoy.mylivepage.com/file/?fileid=2830--
Greetings,
Alexandr Savchuk
--
Greetings,
Alexandr Savchuk
Yes, i understand.
It was interesting for me -
how i can decrease false positive and others by improvings on current
script.If there will be not GSoC participant for this project,
possibly i can do it.
OK, great! :)
Also base questions, which will be asked in any case, are still alive:
- in many problem reports there are unsigned int (or uint, or ulong
-> long or smth else) to int assigning. Is it safe ? If yes, then i
need to process this in code
well, it depends. an integer in PHP is a signed long, so probably you want
to stick to this, although in most cases using an unsigned long is safe.
So there's no real answer to that question: It depends.
- There are really many problem reports about "optional var is not
initialized"
Is there a requrement for reporting about it ? And why ?
yes! an optinal argument may not get initialized. So reading such an
optional argument that the user left unspecified may result in a crash.
However, most of these reports are bogus. The good news is that these
false-positives can be killed with a simple path-sensitive flow-analysis
(really simple in clang).
- There are really many problem reports about "not optional var is
initialized"
Also in most part of these cases not optional var is inialized by null
value. Why is this requrement ? And why ?
This warning is only displayed if the error reporting level is high, at it
produces too much reports. It is just a help to show dead assigments that
could be removed (because non-optional arguments are garanteed to be
initialized).
- there is code like:
// separate_zval_if_not_ref
case '/':
//FIXME could not understand from why it so ????
if ( !in_array($prev_char, array('r', 'z')) ) {
self::error("the '/' specifier cannot be applied to
'$prev_char'");
}
break;
Why ? In readme.parameter_parsing_api docu i can't find any related to
it information.
Well applying the '/' specifier to 'l' (long) or 'd' (double) doesn't really
make sense. It may even create a memleak on those cases (unconfirmed, but it
seems so).
Nuno
an off topic suggestion, why not add a shebang
#!/usr/bin/php
...