Date: Thu, 8 Feb 2024 08:58:52 +0100
To: Sanford Whiteman
Cc: PHP Internals
Subject: Re: [PHP-DEV] Why are serialized strings wrapped in double quotes? (s::"")
From: michal.brzuchalski@gmail.com (Michał Marcin Brzuchalski)

Hi Sandy,

On Tue, 6 Feb 2024 at 21:19, Sanford Whiteman wrote:

> Howdy all, haven't posted in ages but good to see the list going strong.
>
> I'd like a little background on something we've long accepted: why
> does the serialization format need double quotes around a string, even
> though the byte length is explicit?
>
> Example:
>
> s:5:"hello";
>
> All else being equal I would think we could have just
>
> s:5:hello;
>
> and skip forward 5 bytes. Instead we need to be aware of the leading
> and trailing " in our state machine but I'm not sure what the
> advantage is.
>

You inspired me to play with the serialization format to spot even more
unnecessary characters: https://3v4l.org/DLh1U

From my PoV there are more candidates for removal that still keep the
format safe. E.g., by removing the ':' before the opening '{' of an
array/object and the trailing ';' before the closing '}', you save 2 bytes.

a:4:{i:0;i:123;i:1;b:1;i:2;d:1.1;i:3;s:3:"baz";}

Could be simply

a:4{i:0;i:123;i:1;b:1;i:2;d:1.1;i:3;s:3:baz}

This example saves 4 bytes: the two double quotes, one ';' and one ':'.

If you go further, all types that carry a size/length also don't need the
extra colon after the type letter, meaning:

a:4 could become a4
s:3 could become s3

The same could apply to O: and E:

O3:Foo:5{s4:date;O17:DateTimeImmutable:3{s4:date;s26:2024-02-08 08:41:10.009742;s13:timezone_type;i:3;s8:timezone;s16:Europe/Amsterdam}s6:*foo;s11:Foo bar baz;s8:Foobar;i:123456789;s3:tbl;a4{i:0;i:123;i:1;b:1;i:2;d:1.1;i:3;s3:baz}s8:*color;E12:Color:Yellow}

This is still readable by humans and keeps the size/length in all places
where it is needed.
My attached example is poor but shows up to ~20% size reduction.

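
To make the proposed rules concrete, here is a minimal sketch of an encoder for the compact notation. The function name compactSerialize() is hypothetical and exists only for illustration, it is not part of PHP, and it covers only scalars, null and arrays (no O: or E: payloads). It also assumes the default serialize_precision = -1, so that 1.1 is written as d:1.1.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical encoder for the compact notation discussed above.
 * Not part of PHP; handles only null, bool, int, float, string and array.
 */
function compactSerialize(mixed $value): string
{
    if (is_null($value) || is_bool($value) || is_int($value) || is_float($value)) {
        // Reuse the native token (N; b:1; i:123; d:1.1;) and drop the trailing ';'.
        return rtrim(serialize($value), ';');
    }
    if (is_string($value)) {
        // No quotes and no ':' after the type letter: the byte length is enough.
        return 's' . strlen($value) . ':' . $value;
    }
    if (is_array($value)) {
        $body = '';
        $needSeparator = false;
        foreach ($value as $key => $item) {
            foreach ([$key, $item] as $part) {
                if ($needSeparator) {
                    $body .= ';';
                }
                $body .= compactSerialize($part);
                // A nested array already ends with '}', so no ';' follows it.
                $needSeparator = !is_array($part);
            }
        }
        return 'a' . count($value) . '{' . $body . '}';
    }
    throw new InvalidArgumentException('Unsupported type in this sketch: ' . get_debug_type($value));
}

$data = [123, true, 1.1, 'baz'];

$native  = serialize($data);        // a:4:{i:0;i:123;i:1;b:1;i:2;d:1.1;i:3;s:3:"baz";}
$compact = compactSerialize($data); // a4{i:0;i:123;i:1;b:1;i:2;d:1.1;i:3;s3:baz}

printf("native: %d bytes, compact: %d bytes\n", strlen($native), strlen($compact));
```

The separator is tracked with a flag rather than by looking at the last character, because a string payload may itself end in '}'. A matching decoder would read the digits after the type letter, then skip exactly that many bytes, which is the "skip forward" behaviour Sandy describes.
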
Interestingly, when an array is serialized as an object property, it is not
followed by ';' in the field list: https://3v4l.org/4p6ve

O:3:"Foo":2:{s:3:"foo";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}s:3:"bar";s:3:"baz";}

The missing ';' between '}' and 's' was a surprise to me.

Best regards,
Michał Marcin Brzuchalski
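
On the missing ';' after the nested array: the short script below reproduces that output and suggests the behaviour is not specific to object properties. In the current format scalar values end with ';' while arrays and objects end with '}', so a nested array inside another array shows the same thing. The class Foo here is an assumption reconstructed from the serialized string above, not the exact code behind the 3v4l link.

```php
<?php
declare(strict_types=1);

// Assumed shape of Foo, inferred from the serialized output quoted above.
final class Foo
{
    public array $foo = [1, 2, 3];
    public string $bar = 'baz';
}

// The array property closes with '}' and the next key follows immediately.
echo serialize(new Foo()), "\n";
// O:3:"Foo":2:{s:3:"foo";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}s:3:"bar";s:3:"baz";}

// Same pattern for an array nested in an array: no ';' after the inner '}'.
echo serialize([[1], 'x']), "\n";
// a:2:{i:0;a:1:{i:0;i:1;}i:1;s:1:"x";}
```
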