Newsgroups: php.internals
Message-ID: <4D5D7C88.1090204@liip.ch>
Date: Thu, 17 Feb 2011 20:52:40 +0100
From: christian.stocker@liip.ch (Christian Stocker)
To: James Devine
CC: Pierre Joye, internals@lists.php.net
Subject: Re: [PHP-DEV] PHP Patch for loadHTML options

Hi,

Looks good to me.
But can you also add the constants from:

typedef enum {
    HTML_PARSE_RECOVER   = 1<<0,  /* Relaxed parsing */
    HTML_PARSE_NODEFDTD  = 1<<2,  /* do not default a doctype if not found */
    HTML_PARSE_NOERROR   = 1<<5,  /* suppress error reports */
    HTML_PARSE_NOWARNING = 1<<6,  /* suppress warning reports */
    HTML_PARSE_PEDANTIC  = 1<<7,  /* pedantic error reporting */
    HTML_PARSE_NOBLANKS  = 1<<8,  /* remove blank nodes */
    HTML_PARSE_NONET     = 1<<11, /* Forbid network access */
    HTML_PARSE_NOIMPLIED = 1<<13, /* Do not add implied html/body... elements */
    HTML_PARSE_COMPACT   = 1<<16  /* compact small text nodes */
} htmlParserOption;

so that they are available from PHP? I don't like guessing integers :)

Maybe we don't have to separately define the ones that already exist in
xmlParserOption, e.g. XML_PARSE_NOERROR, but there's no companion to
HTML_PARSE_NODEFDTD and HTML_PARSE_NOIMPLIED.

chregu

On 17.02.11 16:29, James Devine wrote:
> Will do, thanks!
>
> On Thu, Feb 17, 2011 at 6:43 AM, Pierre Joye wrote:
>> hi,
>>
>> Can you make a patch against trunk instead please?
>>
>> Also pls follow the CS:
>>
>> if (foo) {
>> }
>>
>> Ideally attach your patch to a feature request at bugs.php.net, so we
>> won't lose it :)
>>
>> thanks for your work!
>>
>> Cheers,
>>
>> On Thu, Feb 17, 2011 at 12:57 AM, James Devine wrote:
>>> I've included a patch for review adding the ability to optionally pass
>>> options to the DOMDocument::loadHTML[File] functions
>>>
>>> diff -ru php-5.3.5.orig/ext/dom/document.c php-5.3.5.new/ext/dom/document.c
>>> --- php-5.3.5.orig/ext/dom/document.c 2010-04-02 14:08:15.000000000 -0600
>>> +++ php-5.3.5.new/ext/dom/document.c 2011-02-16 16:49:20.000000000 -0700
>>> @@ -149,10 +149,12 @@
>>>
>>>  ZEND_BEGIN_ARG_INFO_EX(arginfo_dom_document_loadhtml, 0, 0, 1)
>>>      ZEND_ARG_INFO(0, source)
>>> +    ZEND_ARG_INFO(0, options)
>>>  ZEND_END_ARG_INFO();
>>>
>>>  ZEND_BEGIN_ARG_INFO_EX(arginfo_dom_document_loadhtmlfile, 0, 0, 1)
>>>      ZEND_ARG_INFO(0, source)
>>> +    ZEND_ARG_INFO(0, options)
>>>  ZEND_END_ARG_INFO();
>>>
>>>  ZEND_BEGIN_ARG_INFO_EX(arginfo_dom_document_savehtml, 0, 0, 0)
>>> @@ -2157,10 +2159,11 @@
>>>      char *source;
>>>      int source_len, refcount, ret;
>>>      htmlParserCtxtPtr ctxt;
>>> +    int options = 0;
>>>
>>>      id = getThis();
>>>
>>> -    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &source, &source_len) == FAILURE) {
>>> +    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|l", &source, &source_len, &options) == FAILURE) {
>>>          return;
>>>      }
>>>
>>> @@ -2180,6 +2183,9 @@
>>>          RETURN_FALSE;
>>>      }
>>>
>>> +    if(options)
>>> +        htmlCtxtUseOptions(ctxt, options);
>>> +
>>>      ctxt->vctxt.error = php_libxml_ctx_error;
>>>      ctxt->vctxt.warning = php_libxml_ctx_warning;
>>>      if (ctxt->sax != NULL) {
>>>
>>
>> --
>> Pierre
>>
>> @pierrejoye | http://blog.thepimp.net | http://www.libgd.org
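
To illustrate the point about guessing integers, here is a minimal sketch in C of what the named flags resolve to. The flag values are copied from the htmlParserOption enum quoted in this message; the lookup table and the helpers html_option_value() and html_quiet_options() are hypothetical names made up for this example and are not part of libxml2 or of the patch.

```c
#include <string.h>

/* Flag values copied from the htmlParserOption enum quoted in this message,
 * so the sketch builds without libxml2. Real code would include
 * <libxml/HTMLparser.h> instead of redefining them. */
enum {
    HTML_PARSE_RECOVER   = 1 << 0,
    HTML_PARSE_NODEFDTD  = 1 << 2,
    HTML_PARSE_NOERROR   = 1 << 5,
    HTML_PARSE_NOWARNING = 1 << 6,
    HTML_PARSE_PEDANTIC  = 1 << 7,
    HTML_PARSE_NOBLANKS  = 1 << 8,
    HTML_PARSE_NONET     = 1 << 11,
    HTML_PARSE_NOIMPLIED = 1 << 13,
    HTML_PARSE_COMPACT   = 1 << 16
};

/* Hypothetical name-to-value table: roughly the mapping a user gets for
 * free once the flags are exposed as named constants. */
struct html_option {
    const char *name;
    long value;
};

static const struct html_option html_options[] = {
    {"HTML_PARSE_RECOVER",   HTML_PARSE_RECOVER},
    {"HTML_PARSE_NODEFDTD",  HTML_PARSE_NODEFDTD},
    {"HTML_PARSE_NOERROR",   HTML_PARSE_NOERROR},
    {"HTML_PARSE_NOWARNING", HTML_PARSE_NOWARNING},
    {"HTML_PARSE_PEDANTIC",  HTML_PARSE_PEDANTIC},
    {"HTML_PARSE_NOBLANKS",  HTML_PARSE_NOBLANKS},
    {"HTML_PARSE_NONET",     HTML_PARSE_NONET},
    {"HTML_PARSE_NOIMPLIED", HTML_PARSE_NOIMPLIED},
    {"HTML_PARSE_COMPACT",   HTML_PARSE_COMPACT}
};

/* Hypothetical helper: return the integer behind a flag name,
 * or -1 if the name is not in the table. */
long html_option_value(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof html_options / sizeof html_options[0]; i++) {
        if (strcmp(html_options[i].name, name) == 0) {
            return html_options[i].value;
        }
    }
    return -1;
}

/* Hypothetical helper: flags are single bits, so they combine with
 * bitwise OR. This one suppresses both error and warning reports. */
long html_quiet_options(void)
{
    return HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING;
}
```

Without the names, a caller who wants the combination built by html_quiet_options() has to pass 96 by hand, and 8192 for HTML_PARSE_NOIMPLIED.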