MySQL IN/EXISTS Optimization

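When the WHERE clause contains an IN whose operand is a subquery, the query is slow. The SQL (<conditions> stands for the real filter predicates):

SELECT a.id
FROM A a
INNER JOIN B b ON a.id = b.id
WHERE <conditions>
  AND a.id IN (SELECT c.id FROM C c LEFT JOIN D d ON c.id = d.id WHERE <conditions>)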

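Trying EXISTS instead is just as slow. The SQL (<conditions> stands for the real filter predicates):

SELECT a.id
FROM A a
INNER JOIN B b ON a.id = b.id
WHERE <conditions>
  AND EXISTS (
    SELECT c.id
    FROM C c LEFT JOIN D d ON c.id = d.id
    WHERE c.id = a.id
      AND <conditions>
  )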

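With both of the approaches above, once the IN in the condition has to match many values, query time was over 20 s in every case, and I have seen some statements of this kind run for more than 60 s.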
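One way to see where the time goes is to prefix the slow statement with EXPLAIN. The following is only a sketch: the filter columns `a.status` and `c.flag` are hypothetical stand-ins for the elided conditions, and the plan you get depends on your MySQL version and data.

```sql
-- Hypothetical filters (a.status, c.flag) stand in for the elided conditions.
EXPLAIN
SELECT a.id
FROM A a
INNER JOIN B b ON a.id = b.id
WHERE a.status = 1
  AND a.id IN (
      SELECT c.id
      FROM C c
      LEFT JOIN D d ON c.id = d.id
      WHERE c.flag = 1
  );
```

If the select_type column reports DEPENDENT SUBQUERY for the inner SELECT, the subquery is being re-evaluated for rows of the outer query, which would be consistent with the timings described above.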

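Optimization 1: rewrite the subquery as a derived table and INNER JOIN it (<conditions> stands for the real filter predicates)

SELECT a.id
FROM A a
INNER JOIN B b ON a.id = b.id
INNER JOIN (
    SELECT DISTINCT c.id
    FROM C c LEFT JOIN D d ON c.id = d.id
    WHERE <conditions>
) con ON con.id = a.id
WHERE <conditions>

With this version the query time is close to 0 s.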
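The DISTINCT in the derived table is what keeps the join rewrite equivalent to IN/EXISTS: without it, one c.id that matches several D rows would appear several times and duplicate the outer rows. A small sketch of the effect, using hypothetical toy tables `toy_c` and `toy_d` that are not part of the original schema:

```sql
-- Hypothetical toy tables, only to illustrate the row-count effect.
CREATE TABLE toy_c (id INT);
CREATE TABLE toy_d (id INT);
INSERT INTO toy_c VALUES (1), (2);
INSERT INTO toy_d VALUES (1), (1);  -- two toy_d rows match toy_c.id = 1

-- Without DISTINCT: id 1 is returned twice (three rows in total),
-- so joining against this result would duplicate outer rows.
SELECT c.id FROM toy_c c LEFT JOIN toy_d d ON c.id = d.id;

-- With DISTINCT: each id is returned once (two rows in total),
-- which matches the membership semantics of IN / EXISTS.
SELECT DISTINCT c.id FROM toy_c c LEFT JOIN toy_d d ON c.id = d.id;
```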

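Optimization 2: wrap the subquery in one more layer so that it becomes a derived table (<conditions> stands for the real filter predicates)

SELECT a.id
FROM A a
INNER JOIN B b ON a.id = b.id
WHERE <conditions>
  AND a.id IN (
    SELECT id
    FROM (
        SELECT DISTINCT c.id
        FROM C c LEFT JOIN D d ON c.id = d.id
        WHERE <conditions>
    ) AS temp
  )

With this version the query time is about 1 s.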

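Summary:

As a rule of thumb, when the outer (parent) table's data set is larger than the subquery's table, EXISTS is slower than IN, and vice versa;

For an IN over a list of constants, MySQL stores all the values in the parentheses in a sorted array and matches by binary search, so each lookup is O(log n);

MySQL handled IN with a subquery very poorly here, so in project code it is better to avoid subqueries where possible and use a JOIN, which performed very well;

Wrapping the subquery inside IN in one extra derived-table layer also helps, but it is the second-best option.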
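To make the constant-list point in the summary concrete, here is a minimal sketch of the two forms of IN side by side. The literal values and the `c.flag` filter are illustrative assumptions, not taken from the original queries.

```sql
-- Constant list: the values are known up front, so they can be sorted once
-- and each a.id is checked against the sorted values by binary search.
SELECT a.id
FROM A a
WHERE a.id IN (3, 7, 11, 42);

-- Subquery operand: the set is produced by another query, which is the
-- case the summary recommends rewriting as a JOIN.
SELECT a.id
FROM A a
WHERE a.id IN (SELECT c.id FROM C c WHERE c.flag = 1);  -- c.flag is hypothetical
```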