Newsgroups: php.internals
Date: Sat, 15 Aug 2009 10:29:34 +0100
From: paul.biggar@gmail.com (Paul Biggar)
To: Stefan Marr
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Design of the Zend Engine's Instruction Set

Hi Stefan,

On Thu, Aug 13, 2009 at 1:42 PM, Stefan Marr wrote:
> Hello internals:
>
> I had a look at the Zend Engine to understand some
> details about its internal design with respect
> to its opcodes and machine model.

To start with, the best reference on the Zend Engine that I know of is
a presentation by Andy Wharmby at IBM:
www.zapt.info/PHPOpcodes_Sep2008.odp. It should answer a lot of your
questions.

> Would like to ask you for some comments if the
> following sounds wrong or misinterpreted to you:
>
>
> So, the basic design of the Zend Engine is a
> a stack-based interpreter for a fixed length

No, it's a register-based interpreter. There is a stack, but that's
used only for calling functions. Each opcode refers to its operands
explicitly: it points to them directly in the case of compiled
variables, and they live in symbol tables otherwise. That's as close to
a register machine as we can get, I think, and it's not very close to a
stack machine. In a stack-based VM, the operands to an opcode would be
implicit (add, for example, would use the top two values on the stack),
and that's not the case at all.

> instruction set (76byte on a 32bit architecture),

Andy's presentation says 96 bytes, but that might be on 64-bit. I
presume this means sizeof(struct _zend_op)?

> where the instruction encoding
> is much more complex then for instance for the
> JVM, Python, or Smalltalk.

Yes, definitely.

> Even so, the source code is compiled to a linearized
> instruction stream, the instruction itself contain not just opcode and
> operands.
>
> The version I looked at had some 136 opcodes encoded
> in one byte, but the rest of the instruction has
> many similarities with a AST representation.

Are you referring to the IS_TMP_VAR type of a znode?

> Instructions encode:
>  - a function pointer to the actual handler which is
>    used to execute it

The type of interpreter dispatch used can be chosen at configure time
using the --with-vm-kind flag. The call-based interpreter is the
default. I've heard the others are buggy, but I'm not certain where I
heard that.

> However, its not a simple, single stack model,
> but uses several purpose-specific stacks.

How so?

> What I am not so sure about is especially the
> semantics of the result field and the pointer
> to the other function (op_array).
>
> Would be grateful if someone could comment on that.

I'm not sure what's confusing about the result field. It points to a
zval, same as op1 and op2. I _think_ that op_array is used by special
extensions to attach extra information to the opcode. I can't think of
an example off the top of my head.

> I am also not really sure with these complexity,
> whether is not actually some kind of abstract syntax
> tree instead of a instruction set like Java
> bytecode. Thats not a technical problem, but merely
> an academic question to categorize/characterize PHP.

I think the result field of a znode can make it seem like that, but I
would characterize it as you did before: an instruction set, just like
Java bytecode. Way more complicated, obviously, but I don't think it's
very close to an AST. Certainly the interpreter does not really
resemble an AST walker.

I hope I answered what you were looking for. I'm not certain about a
few of my answers, since I've really avoided the interpreter in my
work, but I think most of it is OK.

Best of luck,
Paul

-- 
Paul Biggar
paul.biggar@gmail.com
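
The stack-versus-register distinction discussed in this message can be
made concrete with a minimal sketch in Python. This is a toy
illustration only, not Zend Engine code: the opcode names, the tuple
layout (opcode, result, op1, op2), and the slot names "a", "b" and "t0"
are all assumptions made up for the example. It shows an add whose
operands are implicit (taken from the top of a stack) next to an add
whose operands and result are named by the instruction itself.

```python
# Toy illustration only; none of these names come from the Zend Engine.

def run_stack(program):
    """Stack machine: ADD carries no operands; it pops the top two values."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(program, slots):
    """Register-style machine: each ADD names its inputs and its result slot."""
    for op, result, op1, op2 in program:
        if op == "ADD":
            slots[result] = slots[op1] + slots[op2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return slots


# Both programs compute 2 + 3.
stack_result = run_stack([("PUSH", 2), ("PUSH", 3), ("ADD",)])
register_result = run_register([("ADD", "t0", "a", "b")], {"a": 2, "b": 3})
```

The stack version needs three instructions because the operands must be
pushed before ADD can find them; the register-style version needs one,
because the instruction itself says where its operands and result live.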