From: andrew@ashearer.com (Andrew Shearer)
To: internals@lists.php.net
Date: Sun, 17 Jun 2007 08:58:33 -0400
Subject: Proposal: array_get, a more palatable alternative to ifsetor

MOTIVATION

There is an unmet need for an accessor that doesn't generate
an E_NOTICE when the value is missing, as shown by ongoing discussions and repeated requests for an ifsetor operator. However, ifsetor had a special-case syntax and generally didn't fit very well with the rest of the language.

http://devzone.zend.com/node/view/id/1481#Heading2 has a brief summary. See the Related Functions and Proposals section for more.

Reading over those ideas (firstset(), coalesce(), :?, ifset(), and a workaround using settype()), most of the best uses boil down to retrieving values from arrays.

PROPOSAL

As a simpler alternative to constructs such as this common double array reference...

    $value = isset($_POST['command']) ? $_POST['command'] : '';

I propose an array_get function, like this...

    $value = array_get($_POST, 'command', '');

The third argument provides a default. This function would require no special syntax, and makes a very common construct easier to read and less error-prone to type. It's a concise way of saying that missing values can be handled gracefully.

Though request processing was used as an example, the function has wide applicability across many other uses of associative arrays.

GREAT, BUT WHY NOT ADD IT TO AN INCLUDE FILE, INSTEAD OF THE CORE?

One of the goals is to make everyday PHP code simpler and clearer. Writers of sample code snippets should be able to rely on array_get() being available. Otherwise, they will not use it. Clearer sample code particularly benefits beginners, who would probably find array_get easier to understand, but anyone else who has to read or maintain other people's code would benefit from its wide deployment in core as well.

The function is generally useful enough to be part of the language, and the implementation in C is also more efficient than a PHP version. That said, a compatibility function for older versions of PHP is given below.

SEMANTICS

    mixed array_get(array $array, mixed $key[, mixed $default = FALSE]);

If $array contains the key $key, $array[$key] is returned.
Otherwise $default is returned. If $default is not specified, it defaults to FALSE. (NULL would also be possible, and would more closely match other languages such as Python with its dict.get method, but other PHP functions tend to return FALSE to indicate no value.)

The semantics match

    array_key_exists($key, $array) ? $array[$key] : $default

... but for comparison,

    isset($array[$key]) ? $array[$key] : $default

is subtly different. The preferred array_key_exists version has these differences:

1. If $array[$key] exists but has been set to null, that null value will be returned instead of $default. This is likely to be the least surprising thing to do.

2. If $array itself is unset, an error is generated. This is good. The intention is to gracefully handle a missing $key. But if even $array itself doesn't exist, there may be another problem, such as misspelling the array variable. isset() ignores all errors, sweeping more under the rug than we typically want.

IMPLEMENTATION

A core C implementation of array_get() benchmarked between two and three times as fast as the implementation in PHP. I'll attach the patch after responding to feedback. See the last section for the code of the PHP implementation.

RELATED FUNCTIONS AND PROPOSALS

This function is different than the array_get function proposed and rejected in http://bugs.php.net/bug.php?id=28185. That function had no default value and threw a notice when the key didn't exist, eliminating the major purpose of this function.

The ?: operator doesn't serve the same purpose, because it causes an E_NOTICE for missing values. However, ?: and array_get can be used together to provide short-circuit evaluation, overcoming the limitations of both. See the LIMITATIONS section for an example.

ifsetor: as discussed above, ifsetor wasn't a regular function. It required special language syntax support because it attempted to test whether a direct parameter itself was set or unset, and was ultimately rejected.

ifset: a related proposal to ifsetor, with a simpler syntax, ifset was missing a way to control the default value.

See here for more discussion about the 'E_STRICT ternary pain-in-the-ass expression' and alternatives:

http://keithdevens.com/weblog/archive/2005/Nov/24
http://www.php.net/~derick/meeting-notes.html#id39
http://devzone.zend.com/node/view/id/1481#Heading2

LIMITATIONS

This proposal doesn't address every requested feature. The third parameter is always evaluated, so calling a slow function there would be undesirable. However, the limitation appears to be unavoidable without special language support, and there are workarounds. These snippets have approximately equal meanings (though they may differ in the handling of array values that convert to false):

    array_get($_GET, 'foo', slowDefaultCalculation())

    $val = array_get($_GET, 'foo');
    if (!$val) $val = slowDefaultCalculation();

    array_get($_GET, 'foo') ?: slowDefaultCalculation()

The last example uses the new PHP 6 ?: operator.

This function applies only to array elements. Unlike other proposed functions, it doesn't also attempt to determine whether variables are set. However, the practical uses suggested for the other functions generally ended up applying to array elements.

COMPATIBILITY FUNCTION FOR OLDER VERSIONS OF PHP

    // Define the fallback only if array_get() is not already available.
    if (!function_exists('array_get')) {
        function array_get($arr, $key, $default = false) {
            if (array_key_exists($key, $arr)) {
                return $arr[$key];
            } else {
                return $default;
            }
        }
    }

(This version turned in the fastest times out of several variants. Passing $arr by reference or attempting to return the result by reference had a huge negative impact, and using the ternary ? : operator instead of the if/else was slightly slower.)

Comments?

--
Andrew Shearer
http://ashearer.com/
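
The points made under SEMANTICS and LIMITATIONS can be tried out with a small sketch. It is only an illustration under stated assumptions: array_get_sketch() is a hypothetical stand-in with the same body as the compatibility function (renamed so it cannot clash with a real array_get), and the $request data is made up. It assumes a PHP version in which the short ?: operator is available.

```php
<?php
// Hypothetical stand-in for the proposed array_get(); same logic as the
// compatibility function, renamed to avoid clashing with a real one.
function array_get_sketch(array $arr, $key, $default = false)
{
    if (array_key_exists($key, $arr)) {
        return $arr[$key];
    }
    return $default;
}

// Made-up sample data: one key set to null, one set to a falsy string.
$request = array('command' => null, 'page' => '0');

// Key exists but holds null: array_key_exists semantics return the null.
$viaKeyExists = array_get_sketch($request, 'command', 'fallback');

// The isset() idiom treats the null as missing and returns the default.
$viaIsset = isset($request['command']) ? $request['command'] : 'fallback';

// Missing key: the default is returned and no E_NOTICE is raised.
$missing = array_get_sketch($request, 'nope', 'fallback');

// The ?: workaround defers the default, but also replaces any stored
// value that converts to false, such as the string '0'.
$page = array_get_sketch($request, 'page') ?: 'default';
```

Here $viaKeyExists is null while $viaIsset is 'fallback', which is difference 1 from the SEMANTICS section, and $page ends up as 'default' even though the key 'page' is present, which is the falsy-value caveat noted under LIMITATIONS.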