Skip to content

--sql 参数解析idb数据 varchar、text、int 、bigint类型解析出的数据有问题! #12

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yikela1990 opened this issue Apr 10, 2024 · 15 comments

Comments

@yikela1990
Copy link

yikela1990 commented Apr 10, 2024

1、表结构
image

2、先查一下真实数据是什么
image

3、使用ibd2sql解析数据
$ python3.6 main.py /data0/mysqlsandbox/data/sandboxes/msb_8_0_33/data/testdb/dts_test_table.ibd --sql > /data0/dts_test_table.sql

4、看一下dts_test_table.sql里的id=100063数据是什么
image

@yikela1990 yikela1990 changed the title --sql 参数解析idb数据 varchar、text、bigint类型解析出的数据有问题! --sql 参数解析idb数据 varchar、text、int 、bigint类型解析出的数据有问题! Apr 10, 2024
@yikela1990
Copy link
Author

版本:
image

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

1、表结构 image

2、先查一下真实数据是什么 image

3、使用ibd2sql解析数据 $ python3.6 main.py /data0/mysqlsandbox/data/sandboxes/msb_8_0_33/data/testdb/dts_test_table.ibd --sql > /data0/dts_test_table.sql

4、看一下dts_test_table.sql里的id=100063数据是什么 image

  1. ibd2sql是否为最新版呢 ? (我好像处理过类似的问题).
  2. 提供下DDL和测试数据, 我复现下. 非图片版的(我懒得敲-_-)
  3. 也可以提供下 --debug 的日志

@yikela1990
Copy link
Author

1、代码是直接clone的main分支的代码。
2、测试数据不好提供,因为我使用相同的表结构、插入了一条数据,然后再使用ibd2sql 导出data是没有任何问题的,您留一个邮箱,我把ibd文件发给你,这个表是纯测试数据,没有我们自己的数据。
3、debug日志:
debug.zip

@yikela1990
Copy link
Author

已发送邮箱

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

已发送邮箱

收到. 我看下先.

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

用ibd2sql 0.3 版本可以解析, 看来是新版本引入的BUG . 我尽快修改 1.0 的这个BUG
image

已发送邮箱

@yikela1990
Copy link
Author

好的 修复了以后麻烦update一下这个issuse,我再测一下

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

好的 修复了以后麻烦update一下这个issuse,我再测一下

修复了, 你重新下载后, 再试下呢
修复记录:

# ibd2sql/innodb_page_index.py::record_header
self.instant = True

@yikela1990
Copy link
Author

python3.6 main.py /data0/dts_test_table.ibd --sql > /data0/dts_test_table.sql

Traceback (most recent call last):
File "main.py", line 5, in
from ibd2sql.ibd2sql import ibd2sql
File "/root/ibd2sql-main/ibd2sql/ibd2sql.py", line 7, in
from ibd2sql.innodb_page_index import *
File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 53
self.instant = True
^
TabError: inconsistent use of tabs and spaces in indentation

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

python3.6 main.py /data0/dts_test_table.ibd --sql > /data0/dts_test_table.sql

Traceback (most recent call last): File "main.py", line 5, in from ibd2sql.ibd2sql import ibd2sql File "/root/ibd2sql-main/ibd2sql/ibd2sql.py", line 7, in from ibd2sql.innodb_page_index import * File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 53 self.instant = True ^ TabError: inconsistent use of tabs and spaces in indentation

你clone的还是自己修改的呢, 修改的话, 我代码里面是tab 不是空格

@yikela1990
Copy link
Author

再次运行,跑了一会就挂了
image

Traceback (most recent call last):
File "main.py", line 177, in
ddcw.get_sql()
File "/root/ibd2sql-main/ibd2sql/ibd2sql.py", line 229, in get_sql
for x in _tdata:
File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 226, in read_row
self._read_all_row()
File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 396, in _read_all_row
_data[colno],_expage[colno] = self._read_field(col)
File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 162, in _read_field
data = self.read(size).decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 53: invalid start byte

@ddcw
Copy link
Owner

ddcw commented Apr 10, 2024

再次运行,跑了一会就挂了 image

Traceback (most recent call last): File "main.py", line 177, in ddcw.get_sql() File "/root/ibd2sql-main/ibd2sql/ibd2sql.py", line 229, in get_sql for x in _tdata: File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 226, in read_row self._read_all_row() File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 396, in _read_all_row _data[colno],_expage[colno] = self._read_field(col) File "/root/ibd2sql-main/ibd2sql/innodb_page_index.py", line 162, in _read_field data = self.read(size).decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 53: invalid start byte

已修复, 请重新下载最新版本试下呢,
我测试了下, 全部解析出来了. 3.3GB左右, 700W行左右数据 (可以看 /proc/PID/io 的rchar 来预估速度哈)
修复记录: recorde header 1-2bit is instant flag

@yikela1990
Copy link
Author

greate! 👍🏻

@yikela1990
Copy link
Author

yikela1990 commented Apr 11, 2024

https://cloud.tencent.com/developer/article/2407144?pos=comment

这个文章有一个地方有问题,就是推荐使用import tablespace的方式将ibd文件导入到现有数据库中
1、缺少 表名.cfg 文件
2、alter table xxx import tablespace的时候会报lsn错误

@ddcw
Copy link
Owner

ddcw commented Apr 11, 2024

  1. 因为不一定是flush export 出来的ibd文件, 所以不一定有.cfg文件. (无.cfg的话, 会有warning提示, 不影响导入): 官方教程: https://dev.mysql.com/doc/refman/8.0/en/innodb-troubleshooting-datadict.html
  2. 如果目标环境是新环境的话, 是可能存在lsn的问题. 而且ibd文件本身也可能存在问题. 这时候就可以使用ibd2sql来解析了. --page-start 100 --page-count 1 解析指定的页

https://cloud.tencent.com/developer/article/2407144?pos=comment

这个文章有一个地方有问题,就是推荐使用import tablespace的方式将ibd文件导入到现有数据库中 1、缺少 表名.cfg 文件 2、alter table xxx import tablespace的时候会报lsn错误

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants