Posts tagged load data

MySQL 执行load data infile时同步原理及注意事项

0
   Master上执行
      mysql> load data INFILE ‘/tmp/del_proid.txt’ INTO table del_proid;
   在Binlog中记录的SQL是这样的
      load data LOCAL INFILE ‘/tmp/SQL_LOAD_MB-1-0′ INTO table del_proid
   并且在/tmp目录中也确实出现了这个文件,内容和del_proid.txt一模一样
   同步时,Master的Binlog Dump进程会读取SQL_LOAD_MB-1-0文件的数据传,送给Slave的IO进程写入Relaylog,当Slave同步到该语句时,SQL进 程从Relaylog中解出数据,仍以文件名SQL_LOAD_MB-1-0写入slave-load-tmpdir参数打定的目录下,然后执行
   load data LOCAL INFILE ‘/tmp/SQL_LOAD_MB-1-0′ INTO table del_proid
   完成后,删除该文件
   slave-load-tmpdir参数,在默认或未设置的情况下,使用tmpdir参数值文档:

It appears that the file you load with LOAD DATA INFILE on the master are automatically transferred via the replication log from the master to the slave. The slave loads these files when it gets to the LOAD DATA INFILE in the statement-based replication queue.

I’m inferring this from a couple of statements in the docs:

16.1: Backing Up Replication Slaves

If your MySQL server is a slave replication server, then regardless of the backup method you choose, you should also back up the master.info and relay-log.info files when you back up your slave’s data. These files are always needed to resume replication after you restore the slave’s data. If your slave is subject to replicating LOAD DATA INFILE commands, you should also back up any SQL_LOAD-* files that may exist in the directory specified by the –slave-load-tmpdir option.

16.1.2.3: Replication Slave Options and Variables

When the slave SQL thread replicates a LOAD DATA INFILE statement, it extracts the file to be loaded from the relay log into temporary files, and then loads these into the table. If the file loaded on the master is huge, the temporary files on the slave are huge, too. Therefore, it might be advisable to use this option to tell the slave to put temporary files in a directory located in some filesystem that has a lot of available space. In that case, the relay logs are huge as well, so you might also want to use the –relay-log option to place the relay logs in that filesystem.

mysql load data infile[zt]

0

如何提高mysql load data infile的速度
测试数据2.5G,共有数据9427567条。用的mysql的large服务器的配置。
load一次需要大概10分钟左右。
建的表用的是MYISAM,调整了几个session的参数值

SET SESSION BULK_INSERT_BUFFER_SIZE=256217728;
SET SESSION MYISAM_SORT_BUFFER_SIZE=256217728;

运行结果如下

Query OK, 9427567 rows affected, 1558 warnings (3 min 55.21 sec)
Records: 9427567 Deleted: 0 Skipped: 0 Warnings: 0
google到的还可以

set global KEY_BUFFER_SIZE=256217728;
alter table tablename disable keys;

如何load数据里面带反斜杠(backslash)”\” 的数据
由于如果你没有指定FIELDS子句,则默认值为假设您写下如下语句时的值:
FIELDS TERMINATED BY ‘\t’ ENCLOSED BY ” ESCAPED BY ‘\\’
所以,如果你的数据里面有反斜杠(backslash)”\”的时候,数据会出现被截断的问题。出现这种问题,只要写上如下的fields子句即可
FIELDS TERMINATED BY ‘\t’ ENCLOSED BY ” ESCAPED BY ”

如何load不同编码的数据
原来用的4.X的mysql,我是select INTO OUTFILE ,只后用iconv,或者其他软件来做。可以参考这里,但是由于这次数据大,用ultraedit等软件打开都要半天。好在新版的mysql可以增加一个新的参数
CHARACTER SET gbk
我的文本数据是GBK的编码,数据表是utf8的,用这种方法测试成功。
如何load的时候只insert特定的列
比如表里面有比元数据多的列。可以在load的时候指定要插入的字段名字。

示例的代码如下:

LOAD DATA INFILE ‘~/data.txt’
INTO TABLE fb0505
CHARACTER SET  gbk
FIELDS TERMINATED BY ‘\t’ ENCLOSED BY ” ESCAPED BY ”
LINES TERMINATED BY ‘\n’ STARTING BY ”
(seller_id,fb_type,fb_date,item_url);

其中表fb0505里面还有一列是id。

Go to Top