经常我们都要从表中随机选出一行(或者几行),但是怎么选才会快点呢?这个囧,下面我特地做了个实验:
准备:
生成100,000条数据到speed.t3表中,包括两列,第一列AI不多说,第二列是md5(i)。用PHP生成:
1
2
3
4
5
6
7
8
9
10
<?php
set_time_limit ( 0 );
$conn = mysql_connect ( "localhost" , "username" , "userpwd" );
mysql_select_db ( "speed" , $conn );
define ( "MAXN" , 1000000 );
for ( $i = 1 ; $i < MAXN ; ++ $i ) {
mysql_query ( "insert into `t3` (data) values('" . md5 ( $i ) . "')" );
}
mysql_close ( $conn );
?>
法一:
1
SELECT * FROM ` t3 ` ORDER BY RAND () LIMIT 0 , 1 ;
优点:易于理解,简单,纯SQL语句,不需要有ID列 缺点:可想而知,速度不会快
法二:
1
2
3
4
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `t3` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `t3` WHERE `id` >= $random LIMIT 0,1 ");
原理:从列中找出ID的最大值和最小值,然后用PHP得到一个[最小,最大]区间内的随机值。由于通过delete,ID可能不连续,所以选取>=该随机值的第一行 优点:比较好理解,速度还不错 缺点:非纯SQL,依赖于外部,需要有ID列
法三:
1
2
3
4
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `t3` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `t3` LIMIT $offset, 1 " );
表示不懂,大概是先得到偏移量然后取偏移量后第一条
法四:
1
SELECT * FROM ` t3 ` WHERE id >= ( SELECT FLOOR ( MAX ( id ) * RAND ()) FROM ` t3 ` ) ORDER BY id LIMIT 1
表示完全不懂,用到了一个子查询。
法五:
1
SELECT t3 . * FROM ( SELECT FLOOR ( RAND () * ( SELECT count ( * ) FROM t3 )) num , @ num : =@ num + 1 from ( SELECT @ num : = 0 ) a , t3 LIMIT 1 ) b , t3 WHERE b . num = t3 . id
有点囧了,更看不懂了
测试
经测试,法四速度极慢,所以注释掉。下面先给出PHP代码:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
<?php
class runtime
{
var $StartTime = 0 ;
var $StopTime = 0 ;
function get_microtime ()
{
list ( $usec , $sec ) = explode ( ' ' , microtime ());
return (( float ) $usec + ( float ) $sec );
}
function start ()
{
$this -> StartTime = $this -> get_microtime ();
}
function stop ()
{
$this -> StopTime = $this -> get_microtime ();
}
function spent ()
{
return round (( $this -> StopTime - $this -> StartTime ) * 1000 , 1 );
}
};
function show ()
{
global $result ;
while ( $row = mysql_fetch_array ( $result ))
{
echo $row [ 0 ] . " " . $row [ 1 ];
echo "<br />" ;
}
}
set_time_limit ( 0 );
date_default_timezone_set ( "Asia/Shanghai" );
$runtime = new runtime ;
$conn = mysql_connect ( "localhost" , "username" , "userpwd" );
mysql_select_db ( "speed" , $conn );
/*
* 法一
*/
$runtime -> start ();
$result = mysql_query ( "SELECT * FROM `t3` ORDER BY RAND() LIMIT 0,1" );
show ();
$runtime -> stop ();
echo "Time: " . $runtime -> spent () . " ms<br /><br /><br />" ;
/*
* 法二
*/
$runtime -> start ();
$range_result = mysql_query ( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `t3` " );
$range_row = mysql_fetch_object ( $range_result );
$random = mt_rand ( $range_row -> min_id , $range_row -> max_id );
$result = mysql_query ( " SELECT * FROM `t3` WHERE `id` >= $random LIMIT 0,1 " );
show ();
$runtime -> stop ();
echo "Time: " . $runtime -> spent () . " ms<br /><br /><br />" ;
/*
* 法三
*/
$runtime -> start ();
$offset_result = mysql_query ( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `t3` " );
$offset_row = mysql_fetch_object ( $offset_result );
$offset = $offset_row -> offset ;
$result = mysql_query ( " SELECT * FROM `t3` LIMIT $offset , 1 " );
show ();
$runtime -> stop ();
echo "Time: " . $runtime -> spent () . " ms<br /><br /><br />" ;
/*
* 法四
*/
/*$runtime->start();
$result = mysql_query( "SELECT * FROM `t3` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `t3` ) ORDER BY id LIMIT 1" );
show();
$runtime->stop();
echo "Time: ".$runtime->spent()." ms<br /><br /><br />";*/
/*
* 法五
*/
$runtime -> start ();
$result = mysql_query ( "SELECT t3.* FROM (SELECT FLOOR (RAND() * (SELECT count(*) FROM t3)) num ,@num:=@num+1 from (SELECT @num:=0) a , t3 LIMIT 1) b , t3 WHERE b.num=t3.id" );
show ();
$runtime -> stop ();
echo "Time: " . $runtime -> spent () . " ms<br /><br /><br />" ;
?>
得出的结果是:
[plain] 88443 dbb7f0097fe15075a19396229957e7a7 Time: 3625.1 ms
90233 0fd7b7d88bf1d997aec06d8d800bee5a Time: 1.1 ms
514251 9846559706dac43ac7f0546a292be1b5 Time: 847.9 ms
29668 4db3b4270aca22aa23c78c4acf712915 Time: 3.6 ms [/plain] 看来对于有ID的数据,法二 和法五 还是比较方便的。