[MySQL]如何从表中随机选取一行

经常我们都要从表中随机选出一行(或者几行),但是怎么选才会快点呢?这个囧,下面我特地做了个实验:

准备:

生成100,000条数据到speed.t3表中,包括两列,第一列AI不多说,第二列是md5(i)。用PHP生成:

1
2
3
4
5
6
7
8
9
10
<?php
set_time_limit(0);
$conn=mysql_connect("localhost","username","userpwd");
mysql_select_db("speed",$conn);
define("MAXN",1000000);
for ($i=1;$i<MAXN;++$i) {
    mysql_query("insert into `t3` (data) values('".md5($i)."')");
}
mysql_close($conn);
?>

法一:

1
SELECT * FROM `t3` ORDER BY RAND() LIMIT 0,1;
优点:易于理解,简单,纯SQL语句,不需要有ID列 缺点:可想而知,速度不会快

法二:

1
2
3
4
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `t3` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `t3` WHERE `id` >= $random LIMIT 0,1 ");
原理:从列中找出ID的最大值和最小值,然后用PHP得到一个[最小,最大]区间内的随机值。由于通过delete,ID可能不连续,所以选取>=该随机值的第一行 优点:比较好理解,速度还不错 缺点:非纯SQL,依赖于外部,需要有ID列

法三:

1
2
3
4
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `t3` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `t3` LIMIT $offset, 1 " );
表示不懂,大概是先得到偏移量然后取偏移量后第一条

法四:

1
SELECT * FROM `t3` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `t3` ) ORDER BY id LIMIT 1
表示完全不懂,用到了一个子查询。

法五:

1
SELECT t3.* FROM (SELECT FLOOR (RAND() * (SELECT count(*) FROM t3)) num ,@num:=@num+1 from (SELECT @num:=0) a , t3 LIMIT 1) b ,  t3 WHERE b.num=t3.id
有点囧了,更看不懂了

测试

经测试,法四速度极慢,所以注释掉。下面先给出PHP代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
<?php
class runtime
{
    var $StartTime = 0;
    var $StopTime = 0;

    function get_microtime()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }

};
function show()
{
    global $result;
    while($row = mysql_fetch_array($result))
    {
        echo $row[0] . " " . $row[1];
        echo "<br />";
    }
}
set_time_limit(0);
date_default_timezone_set("Asia/Shanghai");
$runtime=new runtime;
$conn=mysql_connect("localhost","username","userpwd");
mysql_select_db("speed",$conn);

/*
 * 法一
 */
$runtime->start();
$result=mysql_query("SELECT * FROM `t3` ORDER BY RAND() LIMIT 0,1");
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";

/*
 * 法二
 */
$runtime->start();
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `t3` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `t3` WHERE `id` >= $random LIMIT 0,1 ");
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";

/*
 * 法三
 */
$runtime->start();
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `t3` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `t3` LIMIT $offset, 1 " );
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";

/*
 * 法四
 */
/*$runtime->start();
$result = mysql_query( "SELECT * FROM `t3` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `t3` ) ORDER BY id LIMIT 1" );
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";*/

/*
 * 法五
 */
$runtime->start();
$result = mysql_query( "SELECT t3.* FROM (SELECT FLOOR (RAND() * (SELECT count(*) FROM t3)) num ,@num:=@num+1 from (SELECT @num:=0) a , t3 LIMIT 1) b ,  t3 WHERE b.num=t3.id" );
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";
?>

得出的结果是:

[plain] 88443 dbb7f0097fe15075a19396229957e7a7 Time: 3625.1 ms

90233 0fd7b7d88bf1d997aec06d8d800bee5a Time: 1.1 ms

514251 9846559706dac43ac7f0546a292be1b5 Time: 847.9 ms

29668 4db3b4270aca22aa23c78c4acf712915 Time: 3.6 ms [/plain] 看来对于有ID的数据,法二法五还是比较方便的。

Comments