Newsgroups: php.internals
In-Reply-To: <5004775D.601@gmail.com>
References: <5004775D.601@gmail.com>
Date: Tue, 17 Jul 2012 13:34:48 +0200
To: Ángel González
Cc: Anthony Ferrara, Andrew Faulds, Nikita Popov, PHP internals
Subject: Re: [PHP-DEV] Random string generation (á la password_make_salt)
From: alex.aulbach@gmail.com (Alex Aulbach)

2012/7/16 Ángel González:
>> 1a) If you want to support character classes, you can do it with pcre:
>> http://www.php.net/manual/en/regexp.reference.character-classes.php
> That's more or less what I have thought.
> If it's a string surrounded by square brackets, it's a character class,
> else treat as a literal list of characters.
> ] and - can be provided with the old trick of provide "] as first
> character", "make - the first or last one".

Good thinking. But introducing a new scheme of character-class identifiers, or a new way of describing character classes, is confusing. As a PHP developer I think "Oh no, not new magic charsets again".

I again suggest using PCRE for that. The difference from your proposal is not that big.

Examples:

"/[[:alnum:]]/" will return "abc...XYZ0123456789". We can also do this with "/[a-zA-Z0-9]/". Or "/[a-z0-9]/i". Or "/[[:alpha:][:digit:]]/".

You see: you can express the same thing in many more ways with PCRE, and you keep using this "standard". [And PCRE supports UTF-8. Not important right now, but who knows?]

Maybe we can think about dropping the leading "/[" and the trailing "]/", but a trailing "/" should remain optionally possible, so that regex modifiers (like "/i") can be added.

> Having to detect character limits makes it uglier.

Exactly. That's why I think the second parameter should not get that much magic. The character list is just a list of characters. No magic. We can extend this with a third parameter that tells the function which charset (encoding) the list is in.
And maybe a fourth to select the random algorithm, but I think it is probably better to have one function per algorithm, because that is how the random functions currently work.

If I had to write it in PHP, it would look like this:

    pseudofunction str_random($len, $characters, $encoding = 'ASCII', $algo)
    {
        $result = '';
        $chlen = mb_strlen($characters, $encoding);
        for ($i = 0; $i < $len; $i++) {
            // myrandom() is a placeholder; its bounds are assumed to be inclusive
            $result .= mb_substr($characters, myrandom(0, $chlen - 1, $algo), 1, $encoding);
        }
        return $result;
    }

Without testing anything. It's just an idea.

This is a working PHP function that does not use $algo yet. Note that $encoding has to be passed to mb_substr() as well; without that it has no effect there (that was the stupid error in my first attempt):

    function str_random($len, $characters, $encoding = 'ASCII', $algo = null)
    {
        $result = '';
        $chlen = mb_strlen($characters, $encoding);
        for ($i = 0; $i < $len; $i++) {
            // rand() includes its upper bound, so the last valid index is $chlen - 1.
            // rand() is not cryptographically secure; it only stands in for the real algorithm.
            $result .= mb_substr($characters, rand(0, $chlen - 1), 1, $encoding);
        }
        return $result;
    }

> About supporting POSIX classes, that could be cool. But you then need a way
> to enumerate them. Note that isalpha() will be provided by the C
> library, so you can't count on having its data. It's possible that PCRE,
> which we bundle, contains the needed unicode tables.

In PHP code it works without much thought, as written above, but I don't know whether it can be done the same way in C.

>> 3. Because generating a string from character-classes is very handy in
>> general for some other things (many string functions have it), I
>> suggest that it is not part of random_string(). Make a new function
>> str_from_character_class(), or if you use pcre like above
>> pcre_str_from_character_class()?
> How would you use such function? If you want to make a string out of them,

Oh, there are many use cases.
For example (I renamed the function to "str_charset()", because it is just a string of a charset):

    // Search for whitespace characters
    strpbrk("Hello World", str_charset('/[\s]/'));

    // Remove invisible chars at the beginning or end
    // (does not make much sense here, because a regex is maybe faster in this case)
    trim("\rblaa\n", str_charset('/[^[:print:]]/'));

    // Remove invisible chars: with very big strings this could be
    // much faster than with a regex.
    str_replace(str_split(str_charset('/[^[:print:]]/')), '', "\rblaa\n");

There are many other more or less useful things you can do with a charset string. :)

-- 
Alex Aulbach
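
A minimal sketch of how the proposed str_charset() could behave, assuming it only has to cover single-byte ASCII characters: it tests every code point from 0 to 127 against the given PCRE pattern and keeps the ones that match. The function does not exist in PHP; its name comes from this mail, and the $max parameter is an assumption added for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): expand a PCRE pattern such as
 * '/[[:alnum:]]/' into the literal single-byte characters it matches.
 *
 * $max is an illustrative assumption: the highest byte value to test.
 */
function str_charset($pattern, $max = 127)
{
    $chars = '';
    for ($code = 0; $code <= $max; $code++) {
        $ch = chr($code);
        // preg_match() returns 1 when the one-character string matches
        if (preg_match($pattern, $ch) === 1) {
            $chars .= $ch;
        }
    }
    return $chars;
}

// Characters come out in code-point order.
echo str_charset('/[a-c]/i'), "\n"; // prints "ABCabc"
```

The result is an ordinary string, so it can be passed directly as the character list of str_random(), or to strpbrk(), trim() and str_replace() as in the examples in this mail. Multibyte character sets would need a different enumeration strategy; this sketch does not attempt that.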