New DateTime[Immutable]::createFromTimestamp

1 year ago by Marc

Hi internals,

I have opened a small PR to add the following functions at
https://github.com/php/php-src/pull/12413

DateTime::createFromTimestamp(int|float $timestamp): static
DateTimeImmutable::createFromTimestamp(int|float $timestamp): static
date_create_from_timestamp(int|float $timestamp): DateTime
date_create_immutable_from_timestamp(int|float $timestamp): DateTimeImmutable

I hope this does not require an RFC, but I'm not sure so I'm writing
here to get some thoughts.

Thanks,
Marc

1 year ago by Saki Takamachi

Hi Marc,

Personally, I don't think these are necessarily necessary since there is a "from format".

However, there is certainly an issue of convenience, so I'm not against it if there's a demand for it.

I have one concern. float is imprecise, so I don't think it's suitable for this kind of use. For timestamps, 16 digits are used, which exceeds the guaranteed precision of 15 digits on non-IBM platforms.

Regards.

Saki

1 year ago by Rowan Tommins

> Hi Marc,
>
> Personally, I don't think these are necessarily necessary since there is a "from format".

As noted on the PR, both the default constructor and createFromFormat require the input to be converted to a string, which tends to make code messy. It's then converted back to an integer internally, which isn't very efficient.
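
To make the string round trip concrete, here is a minimal sketch of the two existing approaches. The helper name `timestampToDateTime` is hypothetical and not part of the PR, and the sketch assumes a non-negative timestamp.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the PR): builds a DateTimeImmutable from a
// numeric Unix timestamp using only APIs that exist today.
function timestampToDateTime(int|float $timestamp): DateTimeImmutable
{
    // The number has to be formatted as a string first ...
    $asString = sprintf('%.6F', $timestamp);

    // ... only for createFromFormat() to parse it back into numbers.
    $dateTime = DateTimeImmutable::createFromFormat('U.u', $asString);
    if ($dateTime === false) {
        throw new InvalidArgumentException("Cannot parse timestamp: $asString");
    }

    return $dateTime;
}

// Integer-only alternative via the constructor, again through a string.
$fromConstructor = new DateTimeImmutable('@' . 1698510347);

echo timestampToDateTime(1698510347.25)->format('Y-m-d H:i:s.u'), "\n";
echo $fromConstructor->format('Y-m-d H:i:s'), "\n";
```

Both calls work, but each one needs a number-to-string step that a numeric factory method would make unnecessary.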

I would be in favour of adding this method.

> I have one concern. float is imprecise, so I don't think it's suitable for this kind of use. For timestamps, 16 digits are used, which exceeds the guaranteed precision of 15 digits on non-IBM platforms.

I'm not sure where you got those numbers; on a 64-bit architecture (surely the vast majority of PHP installs), a float can precisely represent any whole number from -2**53 up to 2**53 - 1. As a Unix timestamp, that's a one-second accuracy for any time 285 million years into the past or future. https://www.wolframalpha.com/input?i=2**53+as+unix+timestamp
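
The whole-number limit can be checked directly. This is a small sketch that assumes a 64-bit PHP build with IEEE 754 doubles, and a year of 365.25 days for the conversion.

```php
<?php
declare(strict_types=1);

// Assumes a 64-bit PHP build, where int is 64 bits and float is an IEEE 754 double.
$limit = 2 ** 53; // int(9007199254740992)

// A whole number just below the limit survives a round trip through float.
var_dump((int) (float) ($limit - 1) === $limit - 1); // bool(true)

// Just above the limit, neighbouring integers collapse onto the same float.
var_dump((float) ($limit + 1) === (float) $limit); // bool(true)

// 2**53 seconds expressed in years of 365.25 days (an assumption of this sketch).
$years = $limit / (365.25 * 86400);
printf("%.1f million years\n", $years / 1e6);
```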

Possibly you're thinking of a representation that counts integers as milliseconds or microseconds, instead of seconds?

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Saki Takamachi

Hi Rowan,

> I'm not sure where you got those numbers; on a 64-bit architecture (surely the vast majority of PHP installs), a float can precisely represent any whole number from -2**53 up to 2**53 - 1. As a Unix timestamp, that's a one-second accuracy for any time 285 million years into the past or future. https://www.wolframalpha.com/input?i=2**53+as+unix+timestamp
>
> Possibly you're thinking of a representation that counts integers as milliseconds or microseconds, instead of seconds?

Yes, I'm assuming a timestamp that includes up to microseconds. This is because in the last example of the PR description, the microsecond timestamp was expressed as a float.

For information on the number of digits that guarantee float accuracy, please see the following documentation:
https://www.php.net/manual/en/reserved.constants.php

The constant PHP_FLOAT_DIG is its value. This is set to 16 for IBM and 15 for others.

Regards.

Saki

1 year ago by Rowan Tommins

> Yes, I'm assuming a timestamp that includes up to microseconds. This
> is because in the last example of the PR description, the microsecond
> timestamp was expressed as a float.

Floating point numbers don't suddenly become unpredictable after a
certain point, nor is there a strict relationship with any number of
decimal digits. Rather, their absolute precision varies based on their
magnitude - for each power of 2 in magnitude, they lose a power of 2 in
absolute precision.

More specifically, an IEEE float64 / double precision has 53 significant
bits, which can be allocated to either the whole or fractional part of a
number. It has a precision of at least 1 up to 2**53-1, at least 0.5 up
to 2**52-1, and so on - a precision of 1/2**(53-n) for values up to (2**n)-1.

Plugging that formula into a spreadsheet, I get:

accuracy to nearest microsecond up to the year 2242 (1/2**20 up to
2**33-1 seconds)
accuracy to nearest 10 microseconds up to the year 4147 (1/2**17 up to
2**36-1 seconds)
accuracy to nearest millisecond up to the year 280701 (1/2**10 up to
2**43-1 seconds)

An application that needs to represent a timestamp more than 200 years
in the future, with a precision of 1 microsecond, is going to be very rare.
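
For anyone who wants to reproduce those figures without a spreadsheet, the formula above can be sketched as follows. The function names are hypothetical, and the year length of 365.25 days is an assumption chosen for this sketch.

```php
<?php
declare(strict_types=1);

const SECONDS_PER_YEAR = 365.25 * 86400; // 365.25-day year, an assumption of this sketch

// Precision from the formula above: 1/2**(53-n) for values up to (2**n)-1.
function floatStep(int $n): float
{
    return 1 / 2 ** (53 - $n);
}

// Calendar year in which a Unix timestamp reaches 2**$n seconds.
function yearReached(int $n): int
{
    return (int) floor(1970 + 2 ** $n / SECONDS_PER_YEAR);
}

foreach ([33, 36, 43] as $n) {
    printf("n=%d: step %.3e s, good until year %d\n", $n, floatStep($n), yearReached($n));
}
```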

> Another problem is that there are "millisecond" and "microsecond" timestamp formats that do not include dots, how should they be handled? In reality, such a use case is unlikely, but it is virtually impossible to tell the difference between the two

Distinguishing between the two is up to the user, just as it is up to
the user to feed the right format to createFromFormat(). If they know
they have a number of microseconds, they could use
DateTimeImmutable::createFromTimestamp($value / 1_000_000) - as
discussed above, this will not lose any accuracy for dates in the next
200 years or so.
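
As a quick check of that claim, here is a sketch that converts an arbitrary present-day microsecond count to fractional seconds and back. It assumes a 64-bit PHP build; the sample value is made up for illustration.

```php
<?php
declare(strict_types=1);

// Assumes a 64-bit PHP build; the value is an arbitrary example in microseconds.
$microseconds = 1698510347123456;

// Convert to fractional seconds, as suggested above.
$seconds = $microseconds / 1_000_000;

// Scaling back and rounding recovers the original integer.
$roundTrip = (int) round($seconds * 1_000_000);

var_dump($roundTrip === $microseconds); // bool(true)
```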

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Saki Takamachi

Hi, Rowan

I may have been thinking a little too much on the "technical side". Admittedly, my concern is in rare cases.

OK, I understood that the problem is unlikely to occur in a realistic use case.

Now, I'll just quietly watch this discussion.

Regards.

Saki

1 year ago by Marc

>> Hi Marc,
>>
>> Personally, I don't think these are necessarily necessary since there is a "from format".
> As noted on the PR, both the default constructor and createFromFormat require the input to be converted to a string, which tends to make code messy. It's then converted back to an integer internally, which isn't very efficient.
>
> I would be in favour of adding this method.

Thanks for your support :)

@Saki, if you are dealing with Unix timestamps, you do (or should) handle
them as numbers; being forced to stringify these numbers only to
parse them back as numbers is error-prone and not performant at all.

>> I have one concern. float is imprecise, so I don't think it's suitable for this kind of use. For timestamps, 16 digits are used, which exceeds the guaranteed precision of 15 digits on non-IBM platforms.
> I'm not sure where you got those numbers; on a 64-bit architecture (surely the vast majority of PHP installs), a float can precisely represent any whole number from -2**53 up to 2**53 - 1. As a Unix timestamp, that's a one-second accuracy for any time 285 million years into the past or future. https://www.wolframalpha.com/input?i=2**53+as+unix+timestamp
>
> Possibly you're thinking of a representation that counts integers as milliseconds or microseconds, instead of seconds?

On a 64-bit system you obviously get higher precision if you handle
the integer and fractional parts separately (as timelib does), but this
doesn't help if you already have a floating-point number at hand. Also,
JS, the most used language together with PHP, uses a double-precision
float for timestamps in milliseconds, and there is tons of code out
there that does a simple * 1000 or / 1000 to pass a date-time to/from
JS, which I wouldn't necessarily reject.

One thing that could be done is adding an optional second argument for
the microseconds part, but forcing PHP users to extract seconds and
microseconds from a float isn't user-friendly, so I would still allow a
float as the first argument, which then raises the question of what to
do if both arguments have fractions. Additionally, this would break
Carbon's createFromTimestamp($timestamp, $tz = null).
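
To show what that extraction step would look like in user code, here is a rough sketch. `splitTimestamp` is a hypothetical helper, not something proposed in either PR.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: splits a float timestamp into integer seconds and
// integer microseconds, the step a two-argument signature would push onto users.
function splitTimestamp(float $timestamp): array
{
    $seconds = (int) floor($timestamp);
    $microseconds = (int) round(($timestamp - $seconds) * 1_000_000);

    // Rounding can produce a full second; carry it over.
    if ($microseconds === 1_000_000) {
        $seconds++;
        $microseconds = 0;
    }

    return [$seconds, $microseconds];
}

[$seconds, $microseconds] = splitTimestamp(1698510347.25);
var_dump($seconds, $microseconds); // int(1698510347), int(250000)
```

Even this short version needs a carry check for the rounding edge case, which is the kind of detail that is easy to get wrong when every caller writes it by hand.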

Instead, it would probably be better to add [get|set]Microseconds
methods as well. I'm open to adding another PR for this.

PS: I didn't add a timezone argument here because a Unix timestamp is in
UTC by definition, and changing the TZ is just one setTimezone call
away, without having to deal with any default/system/ini TZ setting.
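
The existing `@timestamp` constructor syntax already behaves this way, so it can serve as an illustration. The timezone `Asia/Tokyo` is just an example chosen for this sketch.

```php
<?php
declare(strict_types=1);

// '@<timestamp>' always yields an instant with a +00:00 offset,
// regardless of the configured default timezone.
$utc = new DateTimeImmutable('@1698510347');

// One extra call moves the same instant into any other timezone.
$tokyo = $utc->setTimezone(new DateTimeZone('Asia/Tokyo'));

echo $utc->format('Y-m-d H:i:s P'), "\n";   // 2023-10-28 16:25:47 +00:00
echo $tokyo->format('Y-m-d H:i:s P'), "\n"; // 2023-10-29 01:25:47 +09:00
```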

Regards,

Best,
Marc

1 year ago by Saki Takamachi

Hi Marc,

> On a 64-bit system you obviously get higher precision if you handle the integer and fractional parts separately (as timelib does), but this doesn't help if you already have a floating-point number at hand. Also, JS, the most used language together with PHP, uses a double-precision float for timestamps in milliseconds, and there is tons of code out there that does a simple * 1000 or / 1000 to pass a date-time to/from JS, which I wouldn't necessarily reject.

I think there is no problem if it is up to milliseconds. This becomes a problem when it includes up to microseconds.

Another problem is that there are "millisecond" and "microsecond" timestamp formats that do not include dots, how should they be handled? In reality, such a use case is unlikely, but it is virtually impossible to tell the difference between the two:

timestamp: 1698510347
=> second: 2023-10-28 16:25:47(UTC)
=> millisecond: 1970-01-20 15:48:30.347(UTC)
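
The two readings above can be reproduced with a few lines. This is only a sketch of the ambiguity, using `gmdate()` so the output is always in UTC.

```php
<?php
declare(strict_types=1);

$value = 1698510347;

// Read as whole seconds since the Unix epoch.
$asSeconds = gmdate('Y-m-d H:i:s', $value);

// Read as milliseconds: split into whole seconds and the millisecond remainder.
$asMilliseconds = sprintf(
    '%s.%03d',
    gmdate('Y-m-d H:i:s', intdiv($value, 1000)),
    $value % 1000
);

echo $asSeconds, "\n";      // 2023-10-28 16:25:47
echo $asMilliseconds, "\n"; // 1970-01-20 15:48:30.347
```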

> PS: I didn't add a timezone argument here because a Unix timestamp is in UTC by definition, and changing the TZ is just one setTimezone call away, without having to deal with any default/system/ini TZ setting.

Right. There is no need to consider time zones for timestamps.

Regards.

Saki

1 year ago by Marc

Hi Saki,

> Hi Marc,
>
>> On a 64-bit system you obviously get higher precision if you handle the integer and fractional parts separately (as timelib does), but this doesn't help if you already have a floating-point number at hand. Also, JS, the most used language together with PHP, uses a double-precision float for timestamps in milliseconds, and there is tons of code out there that does a simple * 1000 or / 1000 to pass a date-time to/from JS, which I wouldn't necessarily reject.
> I think there is no problem if it is up to milliseconds. This becomes a problem when it includes up to microseconds.
>
> Another problem is that there are "millisecond" and "microsecond" timestamp formats that do not include dots, how should they be handled? In reality, such a use case is unlikely, but it is virtually impossible to tell the difference between the two:
>
> timestamp: 1698510347
> => second: 2023-10-28 16:25:47(UTC)
> => millisecond: 1970-01-20 15:48:30.347(UTC)

I'm not sure I fully understand what you mean.

Obviously if you have a string to be parsed you need to use the existing
createFromFormat.

>> One thing that could be done is adding an optional second argument for
>> the microseconds part, but forcing PHP users to extract seconds and
>> microseconds from a float isn't user-friendly, so I would still allow a
>> float as the first argument, which then raises the question of what to
>> do if both arguments have fractions. Additionally, this would break
>> Carbon's createFromTimestamp($timestamp, $tz = null).
>>
>> Instead, it would probably be better to add [get|set]Microseconds
>> methods as well. I'm open to adding another PR for this.

Just opened another PR for adding [get|set]Microseconds to support
handling seconds + microseconds as separate integers.

> Regards.
>
> Saki

Best,
Marc

1 year ago by Marc

> Just opened another PR for adding [get|set]Microseconds to support
> handling seconds + microseconds as separate integers.

Missed the link: https://github.com/php/php-src/pull/12557

Best,
Marc