Newsgroups: php.internals
Xref: news.php.net php.internals:120895
Message-ID: <0ea64cea-a2d8-44ad-bb54-8ed321716ac8@gmail.com>
Date: Mon, 14 Aug 2023 14:40:40 +0200
To: PHP internals
Subject: SimpleXML and JSON
From: dossche.niels@gmail.com (Niels Dossche)

Hi internals!

While browsing through bugsnet I encountered this SimpleXML issue with 252 votes:
https://bugs.php.net/bug.php?id=54632

TLDR: when you have an XML document (modified a bit from the example in the bugtracker):

    <a><b id="foo">foo</b>bar</a>

and you load it into SimpleXML, the result of calling json_encode($the_simplexml_object) on it is:

    {"b":{"@attributes":{"id":"foo"}}}

There are two strange things here:
- Where is a?
- Where is the text for b (and a)?

What's going on here is that json_encode() gives the JSON representation of what var_dump() gives you.
This behaviour is perceived as a bug, given the number of votes and the comment section.
It's possible to change the JSON encoding without affecting var_dump() or the way you access SimpleXML objects.

One comment suggests the following JSON representation for the above XML:

    {"a":{"b":{"@attributes":{"id":"foo"},"@text":"foo"},"@text":"bar"}}

This seems reasonable. Let's take a look at how multiple tags are handled right now and how that would work for text nodes.

SimpleXML currently handles multiple tags with the same name by placing them in an array. Given:

    <a><b id="foo"/><x/><x/><y/></a>

You'll get:

    {"b":{"@attributes":{"id":"foo"}},"x":[{},{}],"y":{}}

We could do the same for text nodes. Given the same document with three text nodes (foo, bar and baz) interleaved with the child elements, the result could be:

    {"a":{"b":{"@attributes":{"id":"foo"}},"x":[{},{}],"y":{},"@text":["foo","bar","baz"]}}

This would still not allow reconstructing the document from the JSON, however, as the ordering between tags and text is lost (just as the ordering between different tags is lost today).

I'm not sure what the community specifically wants here. Are there opinions on how this should behave?

Kind regards
Niels
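
P.S. To make the proposed shape concrete, here is a rough sketch of the mapping discussed above. It is written in Python with xml.etree.ElementTree only to pin down the output format; it is not SimpleXML code and says nothing about how ext/simplexml would implement this. The helper names (element_to_dict, xml_to_json) are made up for the illustration, both input documents are assumed examples, and the placement of the three text nodes in the second input is just one possible arrangement. Collapsing a single text node to a plain string and several to a list is also an assumption, mirroring how repeated tags are handled.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Map one element to the proposed shape: @attributes, child tags, @text."""
    out = {}
    if elem.attrib:
        out["@attributes"] = dict(elem.attrib)

    # Text that belongs directly to this element: its leading text, plus the
    # "tail" that follows each child (ElementTree stores that on the child).
    texts = []
    if elem.text and elem.text.strip():
        texts.append(elem.text.strip())

    for child in elem:
        value = element_to_dict(child)
        if child.tag in out:
            # Repeated tag names collapse into a list.
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
        if child.tail and child.tail.strip():
            texts.append(child.tail.strip())

    # Assumption: one text node stays a string, several become a list.
    if len(texts) == 1:
        out["@text"] = texts[0]
    elif texts:
        out["@text"] = texts
    return out


def xml_to_json(xml):
    """Encode a document, keeping the root element name as the top-level key."""
    root = ET.fromstring(xml)
    return json.dumps({root.tag: element_to_dict(root)}, separators=(",", ":"))


if __name__ == "__main__":
    print(xml_to_json('<a><b id="foo">foo</b>bar</a>'))
    # {"a":{"b":{"@attributes":{"id":"foo"},"@text":"foo"},"@text":"bar"}}
    print(xml_to_json('<a>foo<b id="foo"/>bar<x/><x/>baz<y/></a>'))
    # {"a":{"b":{"@attributes":{"id":"foo"}},"x":[{},{}],"y":{},"@text":["foo","bar","baz"]}}
```

Note that the second output is the same for any placement of foo, bar and baz that keeps them in that order, which is exactly the ordering loss mentioned above.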