Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:123045
X-Original-To: internals@lists.php.net
Delivered-To: internals@lists.php.net
Content-Type: multipart/alternative;
boundary="------------VlTkDBWUHXAv9ovfv8imiw0m"
Message-ID:
Date: Mon, 8 Apr 2024 20:21:38 +0100
Precedence: bulk
list-help:
list-post:
List-Id: internals.lists.php.net
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PHP-DEV] Native decimal scalar support and object types in
BcMath - do we want both?
To: internals@lists.php.net
References: <40553F28-2EC2-475A-BD8E-1D6517AA2A51@rwec.co.uk>
<2B518F62-B774-45C9-82A2-EF6653AAE34E@sakiot.com>
<0f3d0f89-3064-4d56-9fb2-801bb0cda8a5@rwec.co.uk>
Content-Language: en-GB
In-Reply-To:
From: imsop.php@rwec.co.uk ("Rowan Tommins [IMSoP]")
This is a multi-part message in MIME format.
--------------VlTkDBWUHXAv9ovfv8imiw0m
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
On 07/04/2024 23:50, Jordan LeDoux wrote:
> By a "scalar" value I mean a value that has the same semantics for
> reading, writing, copying, passing-by-value, passing-by-reference, and
> passing-by-pointer (how objects behave) as the integer, float, or
> boolean types.
Right, in that case, it might be more accurate to talk about "value
types", since arrays are not generally considered "scalar", but have
those same behaviours. And Ilija recently posted a draft proposal for
"data classes", which would be objects, but also value types:
https://externals.io/message/122845
> As I mentioned in the discussion about a "scalar arbitrary precision
> type", the idea of a scalar in this meaning is a non-trivial
> challenge, as the zval can only store a value that is treated in this
> way of 64 bits or smaller.
Fortunately, that's not true. If you think about it, that would rule out
not only arrays, but any string longer than 8 bytes!
The way PHP handles this is called "copy-on-write" (COW), where multiple
variables can point to the same underlying value until one of them needs
to write to it, at which point a copy is transparently created.
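As a minimal sketch of what that looks like from userland (the variable and function names here are purely illustrative), arrays and long strings behave as values even though neither fits in 64 bits:

```php
<?php
// An array far larger than anything that could fit in a 64-bit slot.
$a = range(1, 100000);

// Assignment has value semantics: $b behaves as an independent copy.
$b = $a;
$b[0] = 'changed';        // writing to $b does not affect $a

var_dump($a[0]);          // int(1)
var_dump($b[0]);          // string(7) "changed"

// The same holds for strings longer than 8 bytes.
$s = 'longer than eight bytes';
$t = $s;
$t .= '!';
var_dump($s);             // string(23) "longer than eight bytes"

// Passing by value into a function: the caller's array is untouched.
function append_item(array $items): array
{
    $items[] = 'extra';   // modifies only the local value
    return $items;
}

$c = append_item($a);
var_dump(count($a));      // int(100000)
var_dump(count($c));      // int(100001)
```

Whether and when the engine physically copies the data is an implementation detail; the script only ever sees value semantics.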
> The pointer for this value would fit in the 64 bits, which is how
> objects work, but that's also why objects have different semantics for
> scope than integers. Objects are potentially very large in memory, so
> we refcount them and pass the pointer into child scopes, instead of
> copying the value like is done with integers.
Objects are not the only thing that is refcounted. In fact, in PHP 4.x
and 5.x, *every* zval used a refcount and COW approach; changing some
types to be eagerly copied instead was one of the major performance
improvements in the "PHP NG" project which formed the basis of PHP 7.0.
You can actually see this in action here: https://3v4l.org/oPgr4
This is all completely transparent to the user, as are a bunch of other
memory/speed optimisations, like interned string literals, packed
arrays, etc.
So, there may be performance gains if we can squeeze values into the
zval memory, but it doesn't need to affect the semantics of the new type.
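A rough way to observe the deferred copy, separate from the 3v4l example linked above, is to watch memory usage around an assignment and a subsequent write. This is only a sketch: the exact byte counts depend on PHP version, platform and configuration, so none are quoted here.

```php
<?php
// Illustrative only: exact numbers vary by PHP version and platform.
$before = memory_get_usage();

$a = range(1, 100000);            // allocate a large array
$afterCreate = memory_get_usage();

$b = $a;                          // assignment: the array data is shared
$afterAssign = memory_get_usage();

$b[0] = 0;                        // first write to $b forces a real copy
$afterWrite = memory_get_usage();

printf("create: %d bytes\n", $afterCreate - $before);
printf("assign: %d bytes\n", $afterAssign - $afterCreate);
printf("write:  %d bytes\n", $afterWrite - $afterAssign);
```

On a typical build the "assign" figure should be negligible next to the "write" figure, while `$a` and `$b` behave as independent values throughout.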
> In general I would say that libbcmath is different enough from other
> backends that we should not expect any work on a BCMath implementation
> to be utilized in other implementations. It *could* be that we are
> able to do that, but it should not be something people *expect* to
> happen because of the technical differences.
>
> Some of the broader language design choices would be transferable
> though. For instance, the standard names of various calculation
> functions/methods are something that would remain independent, even
> with the differences in the implementation.
Yes, that makes sense. Even if we don't have an interface, it would be
annoying if one class provided $foo->div($bar), and another provided
$foo->dividedBy($bar).
> For money calculations, scale is always likely to be a more useful
> configuration. For mathematical calculations (such as machine learning
> applications, which I would say is the other very large use case for
> this kind of capability), precision is likely to be the more useful
> configuration. Other applications that I have personally encountered
> include: simulation and modeling, statistical distributions, and data
> analysis. Most of these can be done with fair accuracy without
> arbitrary precision, but there are certainly types of applications
> that would benefit from or even require arbitrary precision in these
> spaces.
This probably relates quite closely to Arvid's point that for a lot of
uses, we don't actually need arbitrary precision, just something that
can represent small-to-medium decimal numbers without the inaccuracies
of binary floating point. That some libraries can be used for both
purposes is not necessarily evidence that we could ever "bless" one for
both use cases and make it a single native type.
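To make the "inaccuracies of binary floating point" concrete, here is a small sketch. The integer-pence workaround is just one common userland approach, shown for contrast; it is not something proposed in this thread, and the variable names are illustrative.

```php
<?php
// Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
var_dump(0.1 + 0.2 === 0.3);          // bool(false)

// One userland workaround: hold amounts as integer minor units (pence).
$priceInPence = 10;                   // represents 0.10
$taxInPence   = 20;                   // represents 0.20
$totalInPence = $priceInPence + $taxInPence;
var_dump($totalInPence === 30);       // bool(true)

// Convert to a decimal string only for display, using integer arithmetic.
$display = sprintf('%d.%02d', intdiv($totalInPence, 100), $totalInPence % 100);
echo $display, "\n";                  // 0.30
```

This covers the small-to-medium decimal case without needing arbitrary precision at all, which is the distinction being drawn above.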
> My intuition at the moment is that a single number-handling API would
> be challenging to do without an actual proposed implementation on the
> table for MPDec/MPFR.
I think it would certainly be wise to experiment with how each library
can interface to the language as an extension, before spending the extra
time needed to integrate it as a new zval type.
> But even with these extensions available in PHP, they are barely used
> by developers at all because (at least in part) of the enormous
> difference between PECL and PIP. For PHP, I do not think that
> extensions are an adequate substitute like PIP modules are for Python.
Yes, this is something of a problem. On the plus side, a library doesn't
need to be incorporated into the language to be widely installed,
because we have the concept of "bundled" extensions; and in practice,
Linux distributions add a few "popular" PECL extensions to their list of
installable binary packages. On the minus side, even making it into the
"bundled" list doesn't mean it's installed by default everywhere, and
userland libraries spend a lot of effort polyfilling things which would
ideally be available by default.
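The kind of check userland code ends up carrying looks roughly like this. The helper name and the 'userland' fallback label are hypothetical; extension_loaded() and the extension names 'bcmath' and 'gmp' are the real ones.

```php
<?php
/**
 * Hypothetical helper: report which numeric backend this installation offers.
 * A real library would go on to select a matching implementation class.
 */
function pick_numeric_backend(): string
{
    // Bundled extensions are still optional at build or package time.
    foreach (['bcmath', 'gmp'] as $extension) {
        if (extension_loaded($extension)) {
            return $extension;
        }
    }

    // Nothing available: fall back to a pure-PHP implementation.
    return 'userland';
}

echo 'Using backend: ', pick_numeric_backend(), "\n";
```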
> This is, essentially, the thesis of the research and work that I have
> done in the space since joining the internals mailing list.
Thanks, there's some really useful perspective there.
Regards,
--
Rowan Tommins
[IMSoP]
--------------VlTkDBWUHXAv9ovfv8imiw0m--