Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:29374 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 37447 invoked by uid 1010); 9 May 2007 14:23:25 -0000 Delivered-To: ezmlm-scan-internals@lists.php.net Delivered-To: ezmlm-internals@lists.php.net Received: (qmail 37432 invoked from network); 9 May 2007 14:23:25 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 9 May 2007 14:23:25 -0000 Authentication-Results: pb1.pair.com smtp.mail=brianm@dealnews.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=brianm@dealnews.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain dealnews.com designates 129.41.69.185 as permitted sender) X-PHP-List-Original-Sender: brianm@dealnews.com X-Host-Fingerprint: 129.41.69.185 smtp.dealnews.com Linux 2.5 (sometimes 2.4) (4) Received: from [129.41.69.185] ([129.41.69.185:41238] helo=smtp.dealnews.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id D1/3A-60325-959D1464 for ; Wed, 09 May 2007 10:23:22 -0400 Received: (qmail 8741 invoked from network); 9 May 2007 10:23:18 -0400 Received: from unknown (HELO mail.dealnews.com) (10.1.1.7) by -H with (DHE-RSA-AES256-SHA encrypted) SMTP; 9 May 2007 10:23:18 -0400 Received: (qmail 11234 invoked from network); 9 May 2007 10:23:18 -0400 Received: from unknown (HELO ?10.1.6.4?) (brianm@75.91.28.233) by -H with ESMTPA; 9 May 2007 10:23:18 -0400 Message-ID: <4641D92C.4050208@dealnews.com> Date: Wed, 09 May 2007 09:22:36 -0500 User-Agent: Thunderbird 2.0.0.0 (Macintosh/20070326) MIME-Version: 1.0 To: internals@lists.php.net CC: maurice@phorum.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Tree sort in C From: brianm@dealnews.com (Brian Moon) A common issue in lots of applications is tree sorting with unlimited depth. Phorum has used a recursive function since 1999 to solve this problem. However, at MySQL Conference this year, we finally wrote a non recursive function for it and acheived both memory and time savings on very large data sets. I always knew it could be done, I had just never stopped and worked it out. Most examples I found involved a left/right model that our data does not use. The left/right model involves altering rows when new rows are inserted into the tree. That is not attractive. Phorum uses just a parent_id field to track child members. We would like to take this to another level and make a PHP internal funciton to do the work. In fact, since MySQL Conference, one of team members has written the function as a PHP extension. My questions is, if we made a generic tree sorting function, could this be a function that we could get considered for addition to the other array sorting functions. I have probably ported the Phorum PHP based for tree sorting over to 100 different other applications for sorting trees in PHP. It would be very nice to finally have an internal function for it. Speed is not our main benefit, although there is a noticable speed boost. In addition to sorting the function, each node would be assigned a depth value (optionally named by a function parameter if needed). This is where the biggest memory savings have been found for us. On large data sets (2000 members) in PHP, we have found that altering the input array would cause copy of the whole array, blowing up the memory usage of the script to as much as 10x. Doing this in C helps to remove that problem. Basically, an array would look like: $array = array( 1 => array( "id" => 1, "parent" => 0, "name" => "item 1" ), 2 => array( "id" => 2, "parent" => 0, "name" => "item 2" ), 3 =>array( "id" => 3, "parent" => 1, "name" => "item 1" ), ); The function call would look like: array_treesort($array, "id", "parent"); The returned array would look like: $array = array( 1 => array( "id" => 1, "parent" => 0, "name" => "item 1", "depth" => 0 ), 3 =>array( "id" => 3, "parent" => 1, "name" => "item 1", "depth" => 1 ), 2 => array( "id" => 2, "parent" => 0, "name" => "item 2", "depth" => 0 ), ); Its a little funky to have a function take names of fields like this, but this is the only way we could think of to make this work for people without them having to change their data structure. You can see our current work at http://www.phorum.org/tracfcgi/browser/phorum5/trunk/extension_src That code has been written specifically for Phorum to replace our existing PHP function. It does more than the PHP internal function would do. But, the extra parts are not the things that we were trying to overcome. They are just gravy. -- Brian Moon Senior Developer ------------------------------ http://dealnews.com/ It's good to be cheap =)