千万级数据分区存储方案测试

发表于:2021-9-01 09:36

字体: | 上一篇 | 下一篇 | 我要投稿

 作者:lingxinggo    来源:CSDN

  场景
  基于项目中的业务场景,对数据量大,对时间比较敏感(比如历史数据基本不用,热点数据集中在近期)的数据采用分区存储。 本次测试流程主要是创建测试表、插入千万条数据(本次两千万),查看分区表执行情况。
  创建表
  CREATE TABLE test (
    id BIGINT(20) auto_increment,
  area_id INT (8) NOT NULL DEFAULT 0,
  create_time datetime NOT NULL ,
  num INT (8) NOT NULL DEFAULT 0,
  PRIMARY KEY (id, area_id, create_time)
  ) ENGINE = INNODB DEFAULT CHARSET = utf8 PARTITION BY RANGE (TO_DAYS(create_time))(
  PARTITION p0
  VALUES
  LESS THAN (TO_DAYS('2019-11-01')) ENGINE = INNODB,
  PARTITION p20191101
  VALUES
  LESS THAN (TO_DAYS('2019-11-02')) ENGINE = INNODB,
  PARTITION p20191102
  VALUES
  LESS THAN (TO_DAYS('2019-11-03')) ENGINE = INNODB,
  PARTITION p20191103
  VALUES
  LESS THAN (TO_DAYS('2019-11-04')) ENGINE = INNODB,
  PARTITION p20191104
  VALUES
  LESS THAN (TO_DAYS('2019-11-05')) ENGINE = INNODB,
  PARTITION p20191105
  VALUES
  LESS THAN (TO_DAYS('2019-11-06')) ENGINE = INNODB,
  PARTITION p20191106
  VALUES
  LESS THAN (TO_DAYS('2019-11-07')) ENGINE = INNODB,
  PARTITION p20191107
  VALUES
  LESS THAN (TO_DAYS('2019-11-08')) ENGINE = INNODB,
  PARTITION p20191108
  VALUES
  LESS THAN (TO_DAYS('2019-11-09')) ENGINE = INNODB,
  PARTITION p20191109
  VALUES
  LESS THAN (TO_DAYS('2019-11-10')) ENGINE = INNODB
  );

  创建存储过程
  DELIMITER //
      create PROCEDURE insert_test(in num INT, in time varchar(10))
      BEGIN
          DECLARE rowid INT DEFAULT 0;
          DECLARE create_time datetime;
  DECLARE area_id INT;
  SET @exedata = "";
          WHILE rowid < num DO
              SET create_time = (select str_to_date(CONCAT(time,' ',LPAD(FLOOR(0 + (RAND() * 23)),2,0),':',LPAD(FLOOR(0 + (RAND() * 59)),2,0),':',LPAD(FLOOR(0 + (RAND() * 59)),2,0)), "%Y-%m-%d %H:%i:%s") from dual);
  SET area_id = (select FLOOR(910000 + (RAND() * 9)));
  SET rowid = rowid + 1;
              IF length(@exedata)>0 THEN
              SET @exedata = CONCAT(@exedata,',');
              END IF;
              SET @exedata=concat(@exedata,"('",area_id,"','",create_time,"','",rowid,"')");
              IF rowid%5000=0
              THEN 
                  SET @exesql =concat("insert into test(area_id,create_time,num) values ", @exedata);
                  prepare stmt from @exesql;
                  execute stmt;
                  DEALLOCATE prepare stmt;
                  SET @exedata = "";
              END IF;
          END WHILE;
          IF length(@exedata)>0 
          THEN
              SET @exesql =concat("insert into test(area_id,create_time,num) values ", @exedata);
              prepare stmt from @exesql;
              execute stmt;
              DEALLOCATE prepare stmt;
          END IF; 
      END //
  DELIMITER ;


  调用存储过程插入数据
  CALL insert_test (1000000, '2019-10-31');
  CALL insert_test (1000000, '2019-11-01');
  CALL insert_test (1000000, '2019-11-02');
  CALL insert_test (1000000, '2019-11-03');
  CALL insert_test (1000000, '2019-11-04');
  CALL insert_test (1000000, '2019-11-05');
  CALL insert_test (1000000, '2019-11-06');
  CALL insert_test (1000000, '2019-11-07');
  CALL insert_test (1000000, '2019-11-08');
  CALL insert_test (1000000, '2019-11-09');

  分析
  1、两千万数据下,根据日期按小时平均在1s左右;保持单分区数据不变情况下,后期sql性能影响基本不会太大。
  2、此执行计划为全表扫描,经测试后发现分区后where已经问题不大,后期可尝试将主键优化(去掉ID)后,执行计划将来到range级别;
  3、未分区表,同样数据走索引情况下,多次执行平均在2.5s左右。
  select PARTITION_NAME as "分区",TABLE_ROWS as "行数" from information_schema.partitions where table_schema="test" and table_name="test";
  explain partitions  select  *  from  test  where  area_id = 910005 and  create_time > '2019-11-02 00:00:00' and create_time < '2019-11-02 23:59:59';
  explain 
  select avg(num) , DATE_FORMAT(create_time,'%Y%m%d%H') time   from  test  
  where  area_id = 910005 and  create_time > '2019-11-06 00:00:00' and create_time < '2019-11-06 23:59:59'
  group by time

  原文链接:https://blog.csdn.net/lingxinggo/article/details/102951673

  本文内容不用于商业目的,如涉及知识产权问题,请权利人联系51Testing小编(021-64471599-8017),我们将立即处理

《2023软件测试行业现状调查报告》独家发布~

关注51Testing

联系我们

快捷面板 站点地图 联系我们 广告服务 关于我们 站长统计 发展历程

法律顾问:上海兰迪律师事务所 项棋律师
版权所有 上海博为峰软件技术股份有限公司 Copyright©51testing.com 2003-2024
投诉及意见反馈:webmaster@51testing.com; 业务联系:service@51testing.com 021-64471599-8017

沪ICP备05003035号

沪公网安备 31010102002173号