python批量插入到达梦数据库的三种方式性能对比

JAY.LIN 收录于未分类

2024-01-21 约 1147 字预计阅读 3 分钟

https://bing.ee123.net/img/rand?artid=135729794

python批量插入到达梦数据库的三种方式性能对比

最近工作中，遇到需要批量导入测试数据到达梦数据库表中。使用python程序，完成一定规模数据插入到达梦数据库表中，尝试了三种方式（for循环单条，批量插入，分批批量插入），对三种数据插入方案，进行了一次性能测试比对，记录一下测试心得

。

环境

Win11+python

虚拟机centos7+DM8数据库

编写python数据导入程序

编写以下python程序，分别实现for循环单条插入，批量插入，分批批量插入数据到DM8达梦数据库，数据量插入4万条，再分别统计三种方式的用时。

#创建一个数据库连接
import dmPython
import time
conn = dmPython.connect(user=‘SYSDBA’, password=‘SYSDBA’, server=‘192.168.31.112’, port=5236, autoCommit=True)
cursor = conn.cursor()
cursor.execute(“select sysdate()”)
res = cursor.fetchall()
print(res)
创建测试表
cursor.execute(‘CREATE TABLE IF NOT EXISTS test1 (id INTEGER PRIMARY KEY, value varchar)’)
#清空表，表可能存在，初始化测试环境
cursor.execute (‘TRUNCATE table test1’)
conn.commit()
print(‘python: 清空 success!’)
######### for 循环，单条插入性能测试########
统计总时间
start_time = time.time()
#插入数据
for i in range(1,40001):
cursor.execute (‘INSERT INTO test1 (id, value) values (?, ?)’,(i,f’Value{i}’))
conn.commit()
打印总时间
end_time = time.time()
total_time = end_time - start_time
print(f’for 循环，单条插入，Total time: {total_time} seconds’)
print(‘打印插入数据量’)
cursor.execute (" select COUNT(*) from test1")
res = cursor.fetchall()
print(res)
print(‘python: Single insert table success!’)
#########批量插入性能测试########
#清空表，初始化测试环境
cursor.execute (‘TRUNCATE table test1’)
conn.commit()
print(‘python: 清空 success!’)
准备要插入的数据
data = [(i, f’Value{i}’) for i in range(1,40001)]
统计总时间
start_time = time.time()
#插入数据
cursor.executemany (‘INSERT INTO test1 (id, value) values (?, ?)’,data)
conn.commit()
打印总时间
end_time = time.time()
total_time = end_time - start_time
print(f’批量插入,Total time: {total_time} seconds’)
print(‘打印插入数据量’)
cursor.execute (‘select COUNT(*) from test1’)
res = cursor.fetchall()
print(res)
print(‘python: Batch insert table success!’)
#########分批批量插入性能测试########
#清空表，初始化测试环境
cursor.execute (‘TRUNCATE table test1’)
conn.commit()
print(‘python: 清空 success!’)
data.clear()
统计总时间
start_time = time.time()
for i in range(1,40001):
data.append((i,f’Value{i}’))
if len(data) ==1000:#每批为1000条数据
cursor.executemany (‘INSERT INTO test1 (id, value) values (?, ?)’,data)
data.clear()
conn.commit()
打印总时间
end_time = time.time()
total_time = end_time - start_time
print(f’分批批量，Total time: {total_time} seconds’)
print(‘打印插入数据量’)
cursor.execute (‘select COUNT(*) from test1’)
res = cursor.fetchall()
print(res)
print(‘python: batched batch insert table success!’)
conn.close()

测试过程

执行程序：

结果显示：

分析总结

测试结果显示：for循环单次插入性能最低效，而分批批量插入，比单次批量插入的性能更高。

分批批量插入性能高于单次批量插入的性能，与我预期不符，查阅达梦数据库相关资料，大概理解了其中原因：

分批插入将大量数据分成小批次进行插入，可以减少数据库的压力，提高插入性能。

分批插入

性能高的几个点

：

可以降低数据库IO：一次插入大量数据时，数据库需要处理大量数据，可能导致IO瓶颈。分批插入可以将数据分成小批次，降低每次插入的数据量，从而降低数据库IO。
可以减少锁竞争：在并发访问的情况下，大量数据同时插入可能导致锁竞争，影响数据库性能。分批插入可以降低锁竞争，提高数据库并发性能。
可以提高插入速度：分批插入可以充分利用数据库的批量处理能力，提高数据插入速度。

达梦技术社区：eco.dameng.com

目录

python批量插入到达梦数据库的三种方式性能对比

python批量插入到达梦数据库的三种方式性能对比

环境

编写python数据导入程序

创建测试表

统计总时间

打印总时间

准备要插入的数据

统计总时间

打印总时间

统计总时间

打印总时间

测试过程

分析总结