Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:22649 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 27488 invoked by uid 1010); 3 Apr 2006 00:07:37 -0000 Delivered-To: ezmlm-scan-internals@lists.php.net Delivered-To: ezmlm-internals@lists.php.net Received: (qmail 27473 invoked from network); 3 Apr 2006 00:07:37 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 3 Apr 2006 00:07:37 -0000 X-Host-Fingerprint: 69.12.155.130 69-12-155-130.dsl.static.sonic.net Linux 2.4/2.6 Received: from ([69.12.155.130:3400] helo=pigeon.alphaweb.net) by pb1.pair.com (ecelerity 2.0 beta r(6323M)) with SMTP id 76/10-09123-84760344 for ; Sun, 02 Apr 2006 20:07:36 -0400 Received: from localhost ([127.0.0.1] helo=OHRLVN4523SG) by pigeon.alphaweb.net with smtp (Exim 4.10) id 1FQC2e-0001C9-00 for internals@lists.php.net; Sun, 02 Apr 2006 16:31:32 -0700 Message-ID: <000e01c656b2$98e4b290$88051fac@OHRLVN4523SG> To: Date: Sun, 2 Apr 2006 17:07:29 -0700 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.2670 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2670 Subject: RETURN_RT_STRING() and family leakage From: pollita@php.net ("Sara Golemon") I was going through failing tests today and noticed some which failed because of memory leaks using code like the following: RETURN_RT_STRING(s, 0); } For non-unicode mode this is functionally identical to: RETURN_STRING(s, 0); } However for unicode mode, this binary value is translated to unicode using (effectively) this: { UChar *u; int u_len; UErrorCode status = U_ZERO_ERROR; zend_convert_to_string(ZEND_U_CONVERTER(UG(runtime_encoding_conv)), &u, &u_len, s, strlen(s), &status); RETURN_UNICODE(u, 0); } } The trouble with this, is that the original s value winds up getting leaked. I noticed a few other spots in the source where this is handled as such: RETVAL_RT_STRING(s, 0); if (UG(unicode)) { efree(s); } return; } Which is, of course, perfectly valid, but it feels a bit cludgy to me as it's a little inconsistent with the normal RETURN_STRING()/RETURN_UNICODE() semantics when it comes to "giving away" the variable. It's also a problem for the RETURN_RT_STRING(s, 0); usage in general as there's no point after the macro to free the original var. I see two solutions: (1) Modify the RETURN_RT_STRING(L)() macros to the following: #define RETURN_RT_STRING(t, duplicate) \ { RETVAL_RT_STRING(t, duplicate); if (duplicate && UG(unicode)) efree(t); return; } (2) Modify ZVAL_U_STRINGL() to: #define ZVAL_U_STRINGL(conv, z, s, l, duplicate, auto_free) \ if (UG(unicode)) { \ UErrorCode status = U_ZERO_ERROR; \ UChar *u_str; \ int u_len; \ zend_convert_to_unicode(conv, &u_str, &u_len, s, l, &status); \ if (auto_free && !duplicate) { \ efree(s); \ } \ ZVAL_UNICODEL(z, u_str, u_len, 0); \ } else { \ char *__s=(s); int __l=l; \ Z_STRLEN_P(z) = __l; \ Z_STRVAL_P(z) = (duplicate?estrndup(__s, __l):__s); \ Z_TYPE_P(z) = IS_STRING; \ } Along with changes to the (ZVAL|RETVAL)_RT_STRING() macros: to support the additional auto_free option (with RETURN_* assuming auto_free=1). The first option solves the one real problem with the current implementation by making the uncatchable RETURN_RT_STRING() macros free the original char* as needed, whereas the latter takes that work away from the calling scope at the cost of adding to the proto and requiring more work to go back and clean up existing uses. Thoughts? -Sara