11 years ago by Yasuo Ohgaki

Hi all,

Session module can be faster in several ways.

It can ignore writing session data when session data is not changed.
It also can remove reading session data by caching as most web
servers support keep alive. Current session save handlers lock
session data, but it could be unlocked.

There are many applications for which transactional consistency of session
data is not mandatory.

Session module could have ini settings

session.read_cache = int (by default 0 = no cache. 5 for 5 seconds)
Read cache will be updated if session data has changed.
session.lock = On/Off (On by default. Some save handlers already have
this)
session.write_short_circuit = On/Off (Off by default)

Users may verify save handler feature by viewing phpinfo() and
source code level compatibility will be kept.

These features boost web application performance a lot by ignoring
session data consistency. It's just like transactional vs.
non-transactional SQL.

session.read_cache and session.lock may be consolidated since
enabling session.read_cache disables session.lock. There is a slight
difference between them, but it may be ignored.
(With session.lock=On, writes from other web servers (PHP) can be blocked
even when session.read_cache=5, but reading the session cache on the same
keep-alive session will not be blocked, for example.)

Any comments?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki

Hi all,

Session module can be faster in several ways.

It can ignore writing session data when session data is not changed.
It also can remove reading session data by caching as most web
servers support keep alive. Current session save handlers lock
session data, but it could be unlocked.

There are many applications for which transactional consistency of session
data is not mandatory.

Session module could have ini settings

session.read_cache = int (by default 0 = no cache. 5 for 5 seconds)
Read cache will be updated if session data has changed.

session.lock = On/Off (On by default. Some save handlers already have
this)

session.write_short_circuit = On/Off (Off by default)

Users may verify save handler feature by viewing phpinfo() and
source code level compatibility will be kept.

These features boost web application performance a lot by ignoring
session data consistency. It's just like transactional vs.
non-transactional SQL.

session.read_cache and session.lock may be consolidated since
enabling session.read_cache disables session.lock. There is a slight
difference between them, but it may be ignored.
(With session.lock=On, writes from other web servers (PHP) can be blocked
even when session.read_cache=5, but reading the session cache on the same
keep-alive session will not be blocked, for example.)

Any comments?

I would also like to add a "session deletion delay" to mitigate the session
deletion race condition by storing a deletion time in the session data. (I.e.,
session_regenerate_id(true) may create multiple valid sessions. This can be
mitigated by having this.)

session.deletion_delay = int (by default 0 = delete immediately. 10 for 10 sec later)

If $_SESSION['SESSION_EXPIRE'] exists and has expired, a new session ID is
created automatically.
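
A rough userland sketch of the expiry check described above. Only the key name 'SESSION_EXPIRE' comes from the proposal; the function name, the treatment of the boundary (expired once the timestamp is reached), and the sample values are assumptions for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: a session counts as expired once the timestamp
// stored under 'SESSION_EXPIRE' has been reached. Sessions without the
// key are never treated as expired.
function sessionIsExpired(array $session, int $now): bool
{
    return isset($session['SESSION_EXPIRE'])
        && $session['SESSION_EXPIRE'] <= $now;
}

// Instead of deleting the old session at once, mark it for deletion
// 10 seconds later (the session.deletion_delay = 10 case).
$now = 1000; // fixed timestamp so the example is deterministic
$old = ['user' => 42, 'SESSION_EXPIRE' => $now + 10];

var_dump(sessionIsExpired($old, $now + 5));  // still inside the delay window
var_dump(sessionIsExpired($old, $now + 10)); // delay has passed
```

A request that arrives with the old session ID inside the delay window would still find valid data, which is the race this setting is meant to soften.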

I would like to hear comments for this also.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Adam Harvey

I would also like to add a "session deletion delay" to mitigate the session
deletion race condition by storing a deletion time in the session data. (I.e.,
session_regenerate_id(true) may create multiple valid sessions. This can be
mitigated by having this.)

session.deletion_delay = int (by default 0 = delete immediately. 10 for 10 sec later)

If $_SESSION['SESSION_EXPIRE'] exists and has expired, a new session ID is
created automatically.

I would like to hear comments for this also.

As a nitpick, I think I'd rather this was controlled by a function
than a magic $_SESSION key, at least in userland — conceptually, it's
simpler to explain to users if everything in $_SESSION is always
persisted and it never affects behaviour.

+1 on the idea, though.

Adam

11 years ago by Yasuo Ohgaki

Hi Adam,

I would also like to add a "session deletion delay" to mitigate the session
deletion race condition by storing a deletion time in the session data. (I.e.,
session_regenerate_id(true) may create multiple valid sessions. This can be
mitigated by having this.)

session.deletion_delay = int (by default 0 = delete immediately. 10 for 10 sec later)

If $_SESSION['SESSION_EXPIRE'] exists and has expired, a new session ID is
created automatically.

I would like to hear comments for this also.

As a nitpick, I think I'd rather this was controlled by a function
than a magic $_SESSION key, at least in userland — conceptually, it's
simpler to explain to users if everything in $_SESSION is always
persisted and it never affects behaviour.

+1 on the idea, though.

Thank you for your comment.

I was wondering which is better: adding a new API for the expire time, or just
adding a magic expire-time value to $_SESSION.

Adding the expire time to $_SESSION is easy and needs no new API, but
it looks bad, I agree.

Adding a new API is clean, but it will require additional operations in save
handlers, which might slow down performance.

The magic value looks bad, but it's simpler and faster. This could be a voting
option.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Adam Harvey

It can ignore writing session data when session data is not changed.
It also can remove reading session data by caching as most web
servers support keep alive. Current session save handlers lock
session data, but it could be unlocked.

How do you propose to check if session data was changed? For scalar
types it's pretty easy, but it's possible for objects to alter their
properties (including the ones they're persisting) on __wakeup() — I
presume it would effectively be a comparison of what's about to be
written versus what was read initially when it comes time to write the
session?

Session module could have ini settings

session.lock = On/Off (On by default. Some save handlers already have
this)

I've got some concerns on this. I agree that it's a real issue — we do
get support issues in ##php caused by people attempting to
concurrently access open file sessions, for instance — but I'm worried
that this might be a shoot-yourself-in-the-foot option if we're not
careful. I'm thinking mostly of the files handler here: I presume
there would still be locking around initial read and write operations?

Also, as you note, there also isn't any requirement for session
handlers to implement locking right now: if I'm implementing a custom
session handler that doesn't require locking, do I just ignore this
setting if it's not relevant? It seems slightly odd having it in the
session.* namespace if it's really only relevant for files.

I wonder if a better approach would be to implement an improved files
handler (under a different name) that had options for locking, caching
and the like. Say it was called "awesome_files"[0]; you might have
options like this:

session.save_handler = awesome_files
session.awesome_files.lock = On/Off

I'd love to have a more flexible files handler, but I don't think we
want to overspecialise the session implementation around it.

Adam

11 years ago by Yasuo Ohgaki

It can ignore writing session data when session data is not changed.
It also can remove reading session data by caching as most web
servers support keep alive. Current session save handlers lock
session data, but it could be unlocked.

How do you propose to check if session data was changed? For scalar
types it's pretty easy, but it's possible for objects to alter their
properties (including the ones they're persisting) on __wakeup() — I
presume it would effectively be a comparison of what's about to be
written versus what was read initially when it comes time to write the
session?

Session data is serialized, so if the content has changed, the serialized data
has changed. Sessions could be large, so I'm thinking of using an md5 hash to
save memory. Is there any better way?

Session module could have ini settings

session.lock = On/Off (On by default. Some save handlers already have
this)

I've got some concerns on this. I agree that it's a real issue — we do
get support issues in ##php caused by people attempting to
concurrently access open file sessions, for instance — but I'm worried
that this might be a shoot-yourself-in-the-foot option if we're not
careful. I'm thinking mostly of the files handler here: I presume
there would still be locking around initial read and write operations?

For the files save handler, it should at least lock while reading/writing
session data. Otherwise, there would be a reader-writer issue. Thank you for
pointing it out.

For other save handlers, authors should take concurrency into account.

Also, as you note, there also isn't any requirement for session
handlers to implement locking right now: if I'm implementing a custom
session handler that doesn't require locking, do I just ignore this
setting if it's not relevant? It seems slightly odd having it in the
session.* namespace if it's really only relevant for files.

Right. Files save handler locks data, but mm save handler does not.
User save handler leaves locking to users.

I wonder if a better approach would be to implement an improved files
handler (under a different name) that had options for locking, caching
and the like. Say it was called "awesome_files"[0]; you might have
options like this:

session.save_handler = awesome_files
session.awesome_files.lock = On/Off

I'd love to have a more flexible files handler, but I don't think we
want to overspecialise the session implementation around it.

The behavior can be controlled by a setting, so I would like to keep a
single code base for the files/mm/user and other save handlers.

I don't mind creating new handlers too much, but I'm concerned about having
separate settings for each save handler. (E.g., the Memcached/MongoDB save
handlers have their own lock settings. It would be better to have a single
setting for all save handlers that support it.)

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki

Hi Adam,

How do you propose to check if session data was changed? For scalar
types it's pretty easy, but it's possible for objects to alter their
properties (including the ones they're persisting) on __wakeup() — I
presume it would effectively be a comparison of what's about to be
written versus what was read initially when it comes time to write the
session?

Session data is serialized, so if the content has changed, the serialized data
has changed. Sessions could be large, so I'm thinking of using an md5 hash to
save memory. Is there any better way?

I changed my mind.
Since session data may be cached, it's better to save as it is and compare
byte by byte. Having too large session data is not a good practice anyway.
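
A minimal sketch of that byte-by-byte check, assuming an in-memory array stands in for the real save handler. The class ShortCircuitStore and its methods are hypothetical and not part of the session module; the sketch only shows the idea of keeping the serialized payload seen at read time and skipping the write when nothing changed.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a save handler with "write short circuit".
final class ShortCircuitStore
{
    /** @var array<string, string> session ID => serialized data in storage */
    private array $storage = [];

    /** @var array<string, string> session ID => payload seen at read time */
    private array $snapshot = [];

    /** Number of writes that actually reached storage. */
    public int $writes = 0;

    public function read(string $id): string
    {
        $data = $this->storage[$id] ?? '';
        $this->snapshot[$id] = $data; // keep the bytes as they are, no hash
        return $data;
    }

    public function write(string $id, string $data): bool
    {
        // Direct comparison of the serialized strings.
        if (isset($this->snapshot[$id]) && $this->snapshot[$id] === $data) {
            return false; // unchanged: skip the storage write
        }
        $this->storage[$id] = $data;
        $this->snapshot[$id] = $data;
        $this->writes++;
        return true;
    }
}

$store = new ShortCircuitStore();

$store->read('abc');
$store->write('abc', serialize(['user' => 42])); // changed: written

$store->read('abc');
$store->write('abc', serialize(['user' => 42])); // unchanged: skipped

echo $store->writes, PHP_EOL; // prints 1
```

Keeping the full string costs memory in proportion to the session size, which is the trade-off against the md5 idea mentioned earlier.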

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki

Hi Adam,

I wonder if a better approach would be to implement an improved files
handler (under a different name) that had options for locking, caching
and the like. Say it was called "awesome_files"[0]; you might have
options like this:

session.save_handler = awesome_files
session.awesome_files.lock = On/Off

I'd love to have a more flexible files handler, but I don't think we
want to overspecialise the session implementation around it.

The behavior can be controlled by a setting, so I would like to keep a
single code base for the files/mm/user and other save handlers.

I don't mind creating new handlers too much, but I'm concerned about having
separate settings for each save handler. (E.g., the Memcached/MongoDB save
handlers have their own lock settings. It would be better to have a single
setting for all save handlers that support it.)

It seems we need to think of a name for the new files save handler.

Perhaps, "files_ext"?
Any better names?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

How do you propose to check if session data was changed? For scalar

I think the best way is not to check, but to let the user tell you.
I.e., I often found it useful to have some way to load session data, but
then to drop the lock and not write anything. It may be useful for an
application which keeps the state in the session but changes the state
rarely and only at some well-defined points - e.g. when the app is
authorized (needs session for auth state) but only login/logout actually
change this state. That would reduce network traffic by such app for
session storage by almost 2x, even ignoring the fact that writes are
usually more expensive than reads on the server end.

Session module could have ini settings

session.lock = On/Off (On by default. Some save handlers already have
this)

I've got some concerns on this. I agree that it's a real issue — we do

We could have read-with-no-lock semantics, but taking into account the
above it's almost the same as read-with-lock+release-lock semantics, and
given the protocols for some storage engines, may internally probably do
exactly the same. Convenience function though may be useful.

read-with-no-lock + write, however, is a very dangerous proposition and I
don't think we should support that - it makes, as you correctly pointed
out, shooting oneself in the foot extremely easy.

Which means for me that an INI setting doesn't make a lot of sense, since
you don't want ALL of your sessions to be read-only - you need to write
something to read something - and write without locks is a nightmare.

Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Yasuo Ohgaki

Hi Stas,

On Fri, Nov 15, 2013 at 5:56 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:

How do you propose to check if session data was changed? For scalar

I think the best way is not to check, but to let the user tell you.
I.e., I often found it useful to have some way to load session data, but
then to drop the lock and not write anything. It may be useful for an
application which keeps the state in the session but changes the state
rarely and only at some well-defined points - e.g. when the app is
authorized (needs session for auth state) but only login/logout actually
change this state. That would reduce network traffic by such app for
session storage by almost 2x, even ignoring the fact that writes are
usually more expensive than reads on the server end.

We need a new user function that closes the session without writing session
data. The current session module always writes session data. I'll add that.

session_close() - Close session and discard data.

Session module could have ini settings

session.lock = On/Off (On by default. Some save handlers already have
this)

I've got some concerns on this. I agree that it's a real issue — we do

We could have read-with-no-lock semantics, but taking into account the
above it's almost the same as read-with-lock+release-lock semantics, and
given the protocols for some storage engines, may internally probably do
exactly the same. Convenience function though may be useful.

read-with-no-lock + write, however, is a very dangerous proposition and I
don't think we should support that - it makes, as you correctly pointed
out, shooting oneself in the foot extremely easy.

Which means for me that an INI setting doesn't make a lot of sense, since
you don't want ALL of your sessions to be read-only - you need to write
something to read something - and write without locks is a nightmare.

I think you wrote this before I sent my new mail. For the files save handler,
when session.lock=Off, it will be

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

There may be a race for writing, but the full data is written to the data file.

Current behavior is

open-and-lock -> read session data ->
script executed
-> write session data -> close-and-unlock

Therefore, other script execution is blocked until the lock is released
(i.e. until session_commit() is called or script execution ends, as you
already know).

New behavior

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

would not be a problem with write short circuit, since unchanged session
data is not written to session storage. Users have to know well what the
session module is doing to avoid problems, though.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki

Hi all,

We need a new user function that closes the session without writing session
data. The current session module always writes session data. I'll add that.

session_close() - Close session and discard data.

session_discard() may be better?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

I think you wrote this before I sent my new mail. For the files save handler,
when session.lock=Off, it will be

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

I don't think this is a very good idea, since what happens if somebody
else changed the state while script was executing? That change would be
killed by the write. The lock is not for writing, the lock is for data
consistency between reading and writing. If you gave up the lock, you
should not write afterwards because your data may be stale.

would not be a problem with write short circuit, since unchanged session
data is not written to session storage. Users have to know well what the
session module is doing to avoid problems, though.

That's exactly the problem. Using this method would mean anything you
write to a session can magically vanish and you wouldn't have any idea
why, just because some other script was executing at the same time and
has overwritten your data. "Read and never write" is OK, but read and
then write without locking is very dangerous.
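
The lost update can be reproduced in a few lines. This is only a simulation under stated assumptions: a string variable stands in for session storage, the two "requests" are plain variables rather than concurrent processes, and the key names are made up. No real session functions are involved.

```php
<?php
declare(strict_types=1);

// Simulated storage holding one serialized session.
$storage = serialize(['cart' => [], 'flash' => null]);

// Both requests read (and release the lock) before either one writes.
$requestA = unserialize($storage);
$requestB = unserialize($storage);

$requestA['cart'][] = 'book';   // request A only touches 'cart'
$requestB['flash']  = 'Saved!'; // request B only touches 'flash'

// Each write replaces the whole session, not just the changed key.
$storage = serialize($requestA);
$storage = serialize($requestB); // written from B's stale copy

$final = unserialize($storage);
var_dump($final['cart']);  // empty again: A's change was lost
var_dump($final['flash']); // B's change survived
```

Request B never touched 'cart', yet its write removed the item A added, because the whole session is written back as one unit.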

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Adam Harvey

I think you wrote this before I sent my new mail. For the files save handler,
when session.lock=Off, it will be

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

I don't think this is a very good idea, since what happens if somebody
else changed the state while script was executing? That change would be
killed by the write. The lock is not for writing, the lock is for data
consistency between reading and writing. If you gave up the lock, you
should not write afterwards because your data may be stale.

I can see situations where that might be useful — where it's not so
important if the write gets clobbered so long as there's minimal
locking to ensure that the session file is at least valid.

I'm still on the fence about whether it's actually a useful thing to
support in php-src, though. It might be enough of a corner case to not
make it worth supporting in a bundled session handler, and I'm worried
that people might start giving advice that using it will "speed up your
application" and users who copy-paste configurations will silently
lose data.

As an aside, if we do implement it: Yasuo, you've convinced me that
session.lock is a reasonable name (instead of being handler-specific).
:)

Adam

11 years ago by Yasuo Ohgaki

Hi Adam and Stas,

I think you wrote this before I sent my new mail. For the files save handler,
when session.lock=Off, it will be

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

I don't think this is a very good idea, since what happens if somebody
else changed the state while script was executing? That change would be
killed by the write. The lock is not for writing, the lock is for data
consistency between reading and writing. If you gave up the lock, you
should not write afterwards because your data may be stale.

I can see situations where that might be useful — where it's not so
important if the write gets clobbered so long as there's minimal
locking to ensure that the session file is at least valid.

I'm still on the fence about whether it's actually a useful thing to
support in php-src, though. It might be enough of a corner case to not
make it worth supporting in a bundled session handler, and I'm worried
that people might start giving advice that using it will "speed up your
application" and users who copy-paste configurations will silently
lose data.

As an aside, if we do implement it: Yasuo, you've convinced me that
session.lock is a reasonable name (instead of being handler-specific).
:)

The Memcached/Memcache/MongoDB session save handlers already have
a lock option.

I think quite a few users are used to it already.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki

Hi Stas,

On Fri, Nov 15, 2013 at 6:19 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:

I think you wrote this before I sent my new mail. For the files save handler,
when session.lock=Off, it will be

open-and-lock -> read session data -> close-and-unlock ->
script executed
-> open-and-lock -> write session data -> close-and-unlock

I don't think this is a very good idea, since what happens if somebody
else changed the state while script was executing? That change would be
killed by the write. The lock is not for writing, the lock is for data
consistency between reading and writing. If you gave up the lock, you
should not write afterwards because your data may be stale.

would not be a problem with write short circuit, since unchanged session
data is not written to session storage. Users have to know well what the
session module is doing to avoid problems, though.

That's exactly the problem. Using this method would mean anything you
write to a session can magically vanish and you wouldn't have any idea
why, just because some other script was executing at the same time and
has overwritten your data. "Read and never write" is OK, but read and
then write without locking is very dangerous.

I agree.
This behavior could be dangerous, just like SQL without transactions.
Therefore, only users who know what they are doing should use it.

It's dangerous, but if users know what they are doing, it could be safe and
the application runs faster. This should be documented well, IMO.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

I agree.
This behavior could be dangerous, just like SQL without transactions.

No, it is way, way more dangerous. SQL only changes the rows it touches,
session saves the whole session. It would be as if every SQL statement
read the whole database when you start the application, changed
something and then wrote the whole database - even the data you never
intended to touch - back to the disk, wiping out all the changes in all
other tables other applications changed since you have read it. Unlike
SQL, which usually operates only on a small piece of data at a time,
session writes the whole thing in one bunch. There are no partial reads
or writes. So the situation here is not even close to SQL - it is way worse.
The frame of breakage is much larger and the breakage is practically
guaranteed to occur.

It's dangerous, but if users know what they are doing, it could be safe and
the application runs faster. This should be documented well, IMO.

I don't think there's a user that knows what they are doing well enough to
use such a thing (because that means they have to know exactly how every
request in their app ever is serialized and what the exact timing of
every request is, now and forever in the future, and I do not see how it
is feasible). There might be ones that think they do, but that would be
a dangerous delusion, and I don't think we should support it.

Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Yasuo Ohgaki

Hi Stas,

On Fri, Nov 15, 2013 at 6:34 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:

I agree.
This behavior could be dangerous, just like SQL without transactions.

No, it is way, way more dangerous. SQL only changes the rows it touches,
session saves the whole session. It would be as if every SQL statement
read the whole database when you start the application, changed
something and then wrote the whole database - even the data you never
intended to touch - back to the disk, wiping out all the changes in all
other tables other applications changed since you have read it. Unlike
SQL, which usually operates only on a small piece of data at a time,
session writes the whole thing in one bunch. There are no partial reads
or writes. So the situation here is not even close to SQL - it is way worse.
The frame of breakage is much larger and the breakage is practically
guaranteed to occur.

It's dangerous, but if users know what they are doing, it could be safe and
the application runs faster. This should be documented well, IMO.

I don't think there's a user that knows what they are doing well enough to
use such a thing (because that means they have to know exactly how every
request in their app ever is serialized and what the exact timing of
every request is, now and forever in the future, and I do not see how it
is feasible). There might be ones that think they do, but that would be
a dangerous delusion, and I don't think we should support it.

I agree with your concern, and I know well what can go wrong, since
I forgot to set the "serializable" transaction option when I made the
session_pgsql save handler.

However, I think this would be nice as an optional feature with a
BIG caution.

I hope many users are already used to "unlocked" session data thanks
to the memcached/memcache/mongodb session save handlers.

I'll make this a voting option.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

Session cache, lock and write
