空间管理您的位置: 51Testing软件测试网 » 笑遍世界的测试技术 » 日志

All things are difficult before they are easy. 没有软件的裸机是一具僵尸，没有硬件的软件是一个幽灵。2012，专注于Linux和C语言，关注自动化、性能测试，关注开源社区和开源测试工具、方法，尝试测试团队管理！

嵌套查询与连接查询的性能

上一篇 / 下一篇 2010-10-26 13:42:18 / 个人分类：数据库

查看( 4652 ) / 评论( 0 ) / 评分( 0 / 0 )

   嵌套查询与连接查询的性能：连接查询一般较快；子查询很难被优化。（当然和DB优化有关，也可能子查询比连接查询快）其实不能一概而论的~~ 不过，问了下DBA同学，他建议是能用join的，尽量不要用嵌套查询。以下内容，部分是来自网上的一些观点，自己稍加整理的。
   子查询是实现关联式计算的一种实例，连接实现了关联式代数。关于关联式计算（relational calculus）和关联式代数（relational algebra），貌似比较数学或者理论性的东西理论，我也还没深入弄明白。
  很多人也说到，这个性能的比较，不能一概而论的，需要具体的每个Case具体分析。
   也有人说（并有在某种DBMS上用一个例子证明），子查询可以也可能被优化为与join一样的执行计划，性能可以一样的。
　说一下概念吧：在一个SELECT语句的WHERE子句或HAVING子句中嵌套另一个SELECT语句的查询称为嵌套查询，又称子查询。一个select...From...Where查询语句块可以嵌套在另一个select...From...Where查询块的Where子句中，称为嵌套查询。外层查询称为父查询，主查询。内层查询称为子查询，从查询。子查询可以嵌套多层，子查询查询到的结果又成为父查询的条件。子查询中不能有order by分组语句。先处理子查询，再处理父查询。子查询除非能确保内层select只返回一个行的值,否则应在外层where子句中用一个in限定符,即要返回多个值，要用in或者not in哦，所以当在编译过程中出现“子查询只返回一个值”的错误时，就要考虑是不是要用in和not in.
   可以自己用写个PL/SQL写一段来看执行时间的差异，不过由于dbms的优化，我执行了几次，结果都不每次效率一致，因为查询后可能在dbms端由缓存、优化之类的：

declare
time1 timestamp(3);
time2 timestamp(3);
cou number;
begin
select current_timestamp into time1 from dual;
select count(p.id) into cou from product p where p.company_id in (select c.id from company c);
--select count(p.id) into cou from product p inner join company c on p.company_id=c.id;
select current_timestamp into time2 from dual;
dbms_output.put_line(time1);
dbms_output.put_line(time2);
dbms_output.put_line(time2-time1);
dbms_output.put_line(cou);
end;

关于连接查询，以前总结过一下：http://www.51testing.com/index.php?uid-225738-action-viewspace-itemid-210222
下面是我所查到的网页中的一些摘录：
Joining should always be faster - theoretically and realistically. Subqueries- particularly correlated - can be very difficult to optimise. If you think
about it you will see why - technically, the subquery could be executed oncefor each row of the outer query.
Subqueries such as that described are one instance of the way that SQL implements relational calculus (you will see that it is basically an "Exists" type of
operation). Joins are an implementation of relational algebra. The optimisation of relational algebraic operations is *very* well understood, while the calculus is much more difficult to optimise...
Realistically, most good DBMSs will optimise a query such as yours to use a join instead, thus converting the implementation from calculus to algebra.
In general, subqueries - particularly correlated - should be avoided unless absolutely necessary. It makes the query harder to read/maintain, pushes more work onto the server, and is generally just a far less appropriate style. of SQL.
subquery is faster when we have to retrieve data from large number of tables.Because it becomes tedious to join more tables. join is faster to retrieve data from database when we have less number of tables
In general there is no reason to assume that a subquery will be faster or slower than a join. Specific cases can point one way or the other, but there are too many variables for a general answer. There are cases where a subquery should be faster - an EXISTS test against a JOIN, where the EXISTS stops at the first match but the JOIN has to deal with every match.
Note that in many cases the optimizer rewrites queries with correlated subqueries as outer joins, so in many cases the performance is the same.
更多相关讨论：
http://forums.devx.com/showthread.php?t=24593
http://www.eggheadcafe.com/software/aspnet/32705705/join-vs-subquery.aspx
http://blog.csdn.net/zxs820329/archive/2008/02/14/2094927.aspx

嵌套查询与连接查询的性能

相关阅读:

用户菜单

我的栏目

标题搜索

日历

我的存档

数据统计

RSS订阅