Newsgroups: php.internals
Xref: news.php.net php.internals:120904
From: rowan.collins@gmail.com (Rowan Tommins)
To: internals@lists.php.net
Subject: Re: [PHP-DEV] SimpleXML and JSON
Date: Mon, 14 Aug 2023 22:56:03 +0100
Message-ID: <9d3b5634-4958-0351-f5a2-2b35e2ec1d3b@gmail.com>
In-Reply-To: <0ea64cea-a2d8-44ad-bb54-8ed321716ac8@gmail.com>
References: <0ea64cea-a2d8-44ad-bb54-8ed321716ac8@gmail.com>

On 14 August 2023 13:40:40 BST, Niels Dossche wrote:
>And you load it into simpleXML, the result of calling json_encode($the_simplexml_object)

My usual reaction to this is "why would you take an object designed for accessing parts of an XML document, and serialise it to JSON?" Often, the answer turns out to be "because I don't understand SimpleXML objects, and have copied and pasted a weird hack to get a less useful array representation by round-tripping to JSON".

On the other hand, the fact that the *debug* representation of SimpleXML objects misses out some parts causes a lot of confusion, and I've actually considered the *opposite* of what you suggest - leave the JSON alone, because people will have written production code based on it, but make the debug array more descriptive of how to use the object.

Either way, the challenge is coming up with something that's concise for simple structures, but comprehensive for more complex ones, particularly if you want it to be consistent. For instance:

- Do you assume tag names are unique within a parent, so use key=>value directly; or assume they're not, so use key=>[list,of,values]; or dynamically switch between the two?
- Do you care about the order of elements with different names, or prefer to group by name?
- Do you have any elements with both child tags and text, or attributes and text, or all three?
- Do you need to retain the order of text in relation to child elements (important for markup languages like HTML or DocBook)? Or is it enough to have a representation of "all text content" (the behaviour of SimpleXML's string cast)?
- Do you have any elements with namespaces? If so, do you want to use local prefixes (and include the xmlns attributes somewhere), or repeat the full namespace URI?

There's a reason why both the DOM and SimpleXML provide object-oriented APIs for accessing the document, not a representation flattened to native types, and why both APIs are useful for different jobs - XML just isn't designed for flattening, and different patterns make sense for different documents / use cases.

Ultimately, I'm not that interested in trying to come up with a JSON or array representation that covers every possibility, because I think the only consistent answer would be horribly verbose - basically, describe every property that DOM would expose on each node.

For debug output, the main concern is showing what you'll get with various styles of access in SimpleXML, so a single "@text" => "foobarbaz" would make sense; or maybe even "(string)" => "foobarbaz" and rename "@attributes" to "->attributes()".

Regards,

-- 
Rowan Tommins
[IMSoP]
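
To make the first question in the list above concrete, here is a minimal sketch. The two sample documents are invented for illustration and do not come from the thread. It prints the JSON round-trip for a parent with one <item> and for a parent with two, so the shapes can be compared, and then reads the same data through SimpleXML's own accessors. The exact JSON output is best checked on the PHP version in use.

```php
<?php
// Hypothetical sample documents, made up for illustration only.
$one = simplexml_load_string('<list><item id="1">foo</item></list>');
$two = simplexml_load_string(
    '<list><item id="1">foo</item><item id="2">bar</item></list>'
);

// Round-trip through JSON: compare the shape of "item" in each output,
// and check whether the "id" attributes are still present.
echo json_encode($one), "\n";
echo json_encode($two), "\n";

// Object-style access is written the same way for both documents.
foreach ([$one, $two] as $doc) {
    foreach ($doc->item as $item) {
        // Attribute via array syntax, text content via string cast.
        echo $item['id'], ' => ', (string) $item, "\n";
    }
}
```

The loop at the end needs no special case for one item versus many, which is the property a flattened array has to pick a convention for.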