因为 int32 溢出导致的 peewee get_or_create时出现 peewee.IntegrityError错误事故记录
详细看这里:https://github.com/coleifer/peewee/issues/2948
关于使用 peewee 的 get_or_create 时候,出现 peewee.IntegrityError 报错
下面的是完整的代码日志
error handling worker <WorkerContext [taisan_task_ocr_service.start_ocr] at 0x7efd9bd31410>: (1062, "Duplicate entry '2147483647' for key 'found_video_cover_image_ocr_count.foundedvideocoverimageocrcounttable_video_id'")
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 7285, in get
return clone.execute(database)[0]
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 4581, in __getitem__
return self.row_cache[item]
~~~~~~~~~~~~~~^^^^^^
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6875, in get_or_create
return query.get(), False
^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 7288, in get
raise self.model.DoesNotExist('%s instance matching query does '
core.mysql.models.FoundedVideoCoverImageOcrCountTableDoesNotExist: <Model: FoundedVideoCoverImageOcrCountTable> instance matching query does not exist:
SQL: SELECT `t1`.`id`, `t1`.`video_id`, `t1`.`search_count`, `t1`.`created_at`, `t1`.`updated_at` FROM `found_video_cover_image_ocr_count` AS `t1` WHERE (`t1`.`video_id` = %s) LIMIT %s OFFSET %s
Params: [2397088498, 1, 0]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/nameko/containers.py", line 392, in _run_worker
result = method(*worker_ctx.args, **worker_ctx.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/code/services/taisan_task_ocr_service.py", line 25, in start_ocr
fvcisc, fvcisc_created = FoundedVideoCoverImageOcrCountTable.get_or_create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6877, in get_or_create
raise exc
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6872, in get_or_create
return cls.create(**kwargs), True
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6741, in create
inst.save(force_insert=True)
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6951, in save
pk = self.insert(**field_dict).execute()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2036, in inner
return method(self, database, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2107, in execute
return self._execute(database)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2912, in _execute
return super(Insert, self)._execute(database)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2625, in _execute
cursor = database.execute(self)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3330, in execute
return self.execute_sql(sql, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 256, in execute_sql
return self._reconnect(super(ReconnectMixin, self).execute_sql, sql, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 268, in _reconnect
raise exc
File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 263, in _reconnect
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3320, in execute_sql
with __exception_wrapper__:
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3088, in __exit__
reraise(new_type, new_type(exc_value, *exc_args), traceback)
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 196, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3322, in execute_sql
cursor.execute(sql, params or ())
File "/usr/local/lib/python3.11/site-packages/pymysql/cursors.py", line 153, in execute
result = self._query(query)
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymysql/cursors.py", line 322, in _query
conn.query(q)
File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 563, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 825, in _read_query_result
result.read()
File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 1199, in read
first_packet = self.connection._read_packet()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 775, in _read_packet
packet.raise_for_error()
File "/usr/local/lib/python3.11/site-packages/pymysql/protocol.py", line 219, in raise_for_error
err.raise_mysql_exception(self._data)
File "/usr/local/lib/python3.11/site-packages/pymysql/err.py", line 150, in raise_mysql_exception
raise errorclass(errno, errval)
peewee.IntegrityError: (1062, "Duplicate entry '2147483647' for key 'found_video_cover_image_ocr_count.foundedvideocoverimageocrcounttable_video_id'")
下面是 peewee 的模型
class FoundedVideoCoverImageOcrCountTable(Model):
id = BigAutoField(primary_key=True)
video_id = IntegerField(unique=True)
search_count = IntegerField(default=0)
created_at = DateTimeField(
null=False,
constraints=[SQL('DEFAULT CURRENT_TIMESTAMP')],
help_text='使用数据库时间'
)
updated_at = DateTimeField(
null=False,
constraints=[
SQL('DEFAULT CURRENT_TIMESTAMP'),
SQL('ON UPDATE CURRENT_TIMESTAMP'),
]
)
class Meta:
database = super_taisan_db
table_name = 'found_video_cover_image_ocr_count'
数据库里面的 ddl
-- super_taisan.found_video_cover_image_ocr_count definition
CREATE TABLE `found_video_cover_image_ocr_count` (
`id` bigint NOT NULL AUTO_INCREMENT,
`video_id` int NOT NULL,
`search_count` int NOT NULL,
`created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `foundedvideocoverimageocrcounttable_video_id` (`video_id`)
) ENGINE=InnoDB AUTO_INCREMENT=76871889 DEFAULT CHARSET=utf8mb3;
执行的部分代码
class TaiSanTaskSubmitService:
name = 'taisan_task_ocr_service'
def start_ocr(self, parse_result_json: str):
# logger.debug(parse_result_json)
parse_result = ParseResult.model_validate_json(parse_result_json)
video_id: int = parse_result.extra['video_id']
fvcisc, fvcisc_created = FoundedVideoCoverImageOcrCountTable.get_or_create(video_id=video_id)
fvcisc: FoundedVideoCoverImageOcrCountTable
if fvcisc.search_count >= 1 and timedelta_seconds(get_utc_now_timestamp(), fvcisc.updated_at) < 4*HOUR:
logger.warning(f'当前图片重试次数过多,抛弃 video_id: {video_id}')
fvcisc.search_count += 1
fvcisc.save()
return
很显然问题出现在 FoundedVideoCoverImageOcrCountTable.get_or_create(video_id=video_id)
阶段
让我感到困惑的是二次读取的时候,为什么会出现 WHERE (
t1.
video_id = %s) LIMIT %s OFFSET %s Params: [2397088498, 1, 0]
?这和 peewee.IntegrityError: (1062, "Duplicate entry '2147483647' for key 'found_video_cover_image_ocr_count.foundedvideocoverimageocrcounttable_video_id'")
的 2147483647 对应不起来。我不知道为什么会出现 2397088498 这个值?
我一开始觉得是 peewee 出问题了,然后提了 issues 给 peewee 的作者
在作者回复之前,我自己也开始了排查,最后发现的原因是
video_id 的长度超过 int32 ,所以出现了溢出循环导致的
至于为什么会超出 int32 ,是因为 video 表的 id 用的是 int64(即 mysql 的 bigint)
但是 found_video_cover_image_ocr_count 表的 video_id 不小心设置为 int 了
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。