I recently imported a large dataset using phpMyAdmin (no SSH) and learned a couple things along the way that I thought worth sharing (if only with my future self).
1) Use split
phpMyAdmin’s import has a maximum file size, which is significantly smaller than the dataset I was importing. My initial solution to this problem was to manually break up the text file containing the data. After doing this for a few minutes, I thought “I should make a tool to do this automatically.” Then I thought “Maybe someone already made that tool.” So I headed over to Google to find an application to download, only to discover it’s already on my machine. It’s a Unix command, aptly named “split.” Here’s how I used it:
split -l 250000 bigfile.csv
That gave me a bunch of files of 250,000 lines each, about 8MB in size, comfortably under my 10MB maximum.
2) Use gzip
phpMyAdmin can unpack compressed files. I got an error about file size headers when I tried using a zip file, so I went with gzip instead, and that worked fine, reducing my 8MB files to about 1.5MB. I used gzip like this:
gzip x*
You’re probably wondering: if my maximum upload size is 10MB, why didn’t I split into 30MB files and get 6MB gzips? I didn’t do that because of another limit: time. PHP scripts (including phpMyAdmin) can only run for so long at one go, and it takes a while to run 250,000 MySQL queries.
After that, it’s just a series of imports, which should run smoothly. If you’re willing and able to SSH to your database, of course, command-line MySQL makes this all a single step. But when that’s not an option, this will at least make the slow alternative less so.
One Comment
Is you MySpace blog export still working ?
I would love to export all my post (not just 10) to wordpress. I have an error when I try your blog export …
Thanks !
David
Post a Comment