mysql建表、索引以及SQL性能优化 – 因为跑不快不敢删库跑路的DBA

1 前言
参考了部分网上资料整理而成。

2 设计部分
2.1 设计表注意事项
2.1.1 定义字段类型
尽可能精确地定义字段类型，包括类型和长度:如不要以字符类型声明纯数字字段，业务上tinyint够用的情况避免定义为int等。
2.1.2 尽可能使用not null
null需要更多的代码，更多的检查和特殊的索引逻辑。所以大多数时候应该使用not null，或者使用一个特殊的值，如0，-1作为默认值。
2.1.3 关于char 和varchar的选择
char类型定长，varchar类型变长。列长度不定，对空间要求高的情况下多使用varchar；列定长，对查询性能高的情况下多用char。
2.1.4 主键与外键
尽可能使用长度短的主键，在主键上无需建单独的索引。
外键会影响插入和更新性能，对于批量可靠数据的插入，建议先屏蔽外键检查。对于数据量大的表，建议去掉外键，改由应用程序进行数据完整性检查。
2.2 设计索引注意事项
2.2.1 不同表之间的相同属性值的字段需一致：
不同表之间的相同属性值的字段，列类型，类型长度，是否非空，是否默认值，需保持一致，否则无法正确使用索引进行关联对比。
2.2.2 适合建索引的情况：
1、在经常需要搜索的列上，可以加快搜索的速度。
2、在经常用在连接的列上，可以加快连接的速度。
3、在经常需要根据范围进行搜索的列上创建索引，因为索引已经排序，其指定的范围是连续的。
4、在经常需要排序的列上创建索引，因为索引已经排序，这样查询可以利用索引的排序，加快排序查询时间。
5、在经常使用在WHERE子句中的列上面创建索引，加快条件的判断速度。
2.2.3 不适合建索引的情况：
1、散列度很低的列，如性别。
2、有大量空值的列。
3、很少使用的列（很少作为条件或表连接条件的）。
4、频繁更新的列。
5、在一张表上不要建超过6个索引。

3 开发部分
3.1 不正确的使用索引列，可能会导致不使用索引，进行全表扫描
3.1.1 对索引列进行计算
不建议的写法：
Select col1,col2 from test where index_col/100 > 10;
应写为：
Select col1,col2 from test where index_col > 10*100;
3.1.2 对索引列进行拼接
不建议的写法：
Select col1,col2 from test where concat(first_name,’ ’,last_name = ‘ZHAN SAN’;
应写为：
Select col1,col2 from test where first_name = ‘ZHAN’ and last_name = ‘SAN’;
3.1.3 在索引列上使用is null或is not null
不建议的写法：
Select col1,col2 from test where price is not null;
逻辑上允许的情况下应写为类似如下的sql：
Select col1,col2 from test where price >=0;
3.1.4 在索引列上使用or
不建议的写法：
Select col1,col2 from test where first_name = ‘ZHAN’ or last_name = ‘SAN’;
应写为：
Select col1,col2 from test where first_name = ‘ZHAN’ union Select col1,col2 from test where last_name = ‘SAN’;
3.1.5 对索引列进行类型转换，或隐式的类型转换
不建议的写法：
如定义col1为char型
Select col1,col2 from test where col1 = 3;
应写为：
Select col1,col2 from test where col1 = ‘3’;
3.1.6 尽可能避免索引列在like的首字符使用通配符
不建议的写法：
Select col1,col2 from test where first_name like ‘%HA%’;
逻辑上允许的情况下应写为类似如下的sql：
Select col1,col2 from test where first_name like ‘ZHA%’;
3.1.7 如果索引是复合索引，那么必须使用到该索引中的第一个字段作为条件时才能保证系统使用该索引
Create index com_index on test (col1,col2,col3);
相当于建立了index(col1,col2,col3)、index(col1,col2)、index(col1) 3个索引。在查询时单独使用col2或col3是不会使用到该索引，必须使用col1才会使用到该索引。
3.2 一条SQL语句只能使用一个表的一个索引
在where条件中多个and的条件中，必须都是一个多列索引的key_part属性而且必须包含key_part1。各自单一索引的话，只使用遍历最少行的那个索引。
3.3 只查询需要的列，避免使用select *
Select * from test;
应写为：
Select col1,col2,col3 from test;
3.4 使用[not] exists 代替 [not] in
select col1,col2 from test where col1 in (select col3 from t2);
应写为：
select col1,col2 from test where exists (select 1 from t2 where test.col1 = t2.col3);
3.5 逻辑上能用union all的时候不要用union
union会对结果进行排序去重，union all则不会。如果已经确定数据不会包括重复行，或者你不在乎是否会出现重复的行时使用union all。
Select col1,col2 from test where col1 >=10 union Select col1,col2 from test where col2 >=10；
逻辑上允许的情况下应写为：
Select col1,col2 from test where col1 >=10 union all Select col1,col2 from test where col2 >=10；
3.6 没有必要时不要使用distinct 和order by
使用distinct 或order by会对结果集进行排序，在结果集较大的情况下会占用大量资源。
3.7 使用case when来避免多次扫描表
Select sum(col1) from test where col2 = 1;
Select sum(col1) from test where col2 = 2;
应写为：
Select sum(case when col2 = 1 then col1 else 0 end) as sum1, sum(case when col2 = 2 then col1 else 0 end) as sum2 from test;
3.8 大数据全表清空时使用truncate代替delete
Delete from test;
应写为：
Truncate table test;
3.9 查询少量记录时使用limit
加上 limit n可以增加性能。MySQL数据库会在找到n条数据后停止搜索，而不是继续往后查找下一条符合记录的数据。

发表评论 取消回复

发表评论取消回复