Newsgroups: php.internals
Date: Mon, 8 Jun 2020 09:01:52 -0400
From: kohler@seas.harvard.edu (Eddie Kohler)
To: PHP Internals
Cc: Sara Golemon
Subject: New functions `hash_serialize` and `hash_unserialize`?

Hello internals! Thanks for PHP!

I'm writing to gauge interest in adding two new functions to the PHP `hash` extension, `hash_serialize` and `hash_unserialize`. These functions would serialize and unserialize the internals of a HashContext object, allowing a partially-computed hash to be saved, then restored and completed in a later run.

EXAMPLE: Multi-part upload. Say that a very large file is uploaded in pieces, `big.001` through `big.999`, and it is necessary to compute the SHA256 of the final concatenated file. Current PHP must compute the hash in one go, within a single run:

    $ctx = hash_init("sha256");
    for ($i = 1; $i <= 999; ++$i) {
        hash_update_file($ctx, sprintf("big.%03d", $i));
    }
    $hash = hash_final($ctx);

This in turn requires that all pieces be on the filesystem simultaneously.

With `hash_serialize` and `hash_unserialize`, the hash can be computed gradually, allowing pieces to be deleted as they are uploaded elsewhere.

    // SAVE_TO_DATABASE and LOAD_FROM_DATABASE are placeholders for
    // whatever persistent storage the application uses.
    $ctx = hash_init("sha256");
    hash_update_file($ctx, "big.001");
    SAVE_TO_DATABASE(hash_serialize($ctx));
    ...
    $ctx = hash_unserialize(LOAD_FROM_DATABASE());
    hash_update_file($ctx, "big.002");
    SAVE_TO_DATABASE(hash_serialize($ctx));
    ... etc.

***

I am happy to write up an RFC for these functions.
An initial implementation with tests is visible here:
https://github.com/kohler/php-src/commit/5a3a828f90b88cd7f660babec7db531cfc04b0a1

New functions `hash_serialize` and `hash_unserialize` appear to fit the existing API well and simplify the implementation, but it's possible that `__serialize`/`__unserialize` or the internal `serialize`/`unserialize` functions would be preferred. I'd be grateful for any feedback.

Thanks!
Eddie Kohler