sparse file WTFs on Linux
By joe
Create a big file …
[root@jr5-lab test]# dd if=/dev/zero of=big.file.1 bs=1 count=1 seek=1P
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.000159797 seconds, 6.3 kB/s
[root@jr5-lab test]# ls -alF
total 4
drwxrwxrwx 2 root root 23 Dec 19 17:33 ./
drwxr-xr-x 6 root root 73 Nov 6 11:07 ../
-rw-r--r-- 1 root root 1125899906842625 Dec 19 17:33 big.file.1
[root@jr5-lab test]# ls -alFh
total 4.0K
drwxrwxrwx 2 root root 23 Dec 19 17:33 ./
drwxr-xr-x 6 root root 73 Nov 6 11:07 ../
-rw-r--r-- 1 root root 1.1P Dec 19 17:33 big.file.1
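Note the `total 4` sitting next to a 1.1P file: `ls` shows the apparent size, while only a few blocks are actually allocated. Here is a minimal Python sketch of that comparison, assuming a filesystem that supports sparse files. The helper names `make_sparse` and `sizes` and the demo file name are made up for illustration, and the demo uses a much smaller hole than the 1P above.

```python
import os
import tempfile


def make_sparse(path, hole_bytes):
    """Create a file with a hole of hole_bytes followed by one zero byte,
    roughly what `dd bs=1 count=1 seek=...` from /dev/zero does."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)   # skip ahead without writing anything
        f.write(b"\0")       # one real byte at the very end


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units regardless of the
    # filesystem's own block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        demo = os.path.join(d, "big.file.demo")  # hypothetical name
        make_sparse(demo, 64 * 1024 * 1024)      # 64 MiB hole, not 1P
        apparent, allocated = sizes(demo)
        print("apparent bytes: ", apparent)
        print("allocated bytes:", allocated)
```

On a filesystem without sparse support the two numbers would be close to each other, so treat the gap as something to check rather than assume.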
Make a copy of the big file
[root@jr5-lab test]# cp --sparse=auto big.file.1 big.file.2
....
(Hangs … #fail) So strace this puppy
strace cp --sparse=auto big.file.1 big.file.2
...
lseek(4, 3145728, SEEK_CUR) = 5737807872
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3145728) = 3145728
lseek(4, 3145728, SEEK_CUR) = 5740953600
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3145728) = 3145728
lseek(4, 3145728, SEEK_CUR) = 5744099328
...
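To put the trace in perspective: cp is walking the hole in 3145728-byte reads. A quick back-of-the-envelope in Python, using only numbers from the output above (the variable and function names are mine), shows how many reads the whole file would take and how far the trace had gotten:

```python
FILE_SIZE = 1125899906842625   # bytes, from the ls output
CHUNK = 3145728                # bytes per read(), from the strace output
OFFSET_SEEN = 5744099328       # last lseek() return value in the trace


def reads_needed(size, chunk):
    """Number of read() calls of `chunk` bytes needed to cover `size` bytes."""
    return -(-size // chunk)   # ceiling division on integers


total_reads = reads_needed(FILE_SIZE, CHUNK)
done_reads = OFFSET_SEEN // CHUNK
fraction_done = OFFSET_SEEN / FILE_SIZE

print("reads needed for the whole file:", total_reads)
print("reads done at the last traced offset:", done_reads)
print("fraction of the file covered: {:.2e}".format(fraction_done))
```

That is hundreds of millions of reads of nothing but zeros, which is why it looks like a hang.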
Seriously? WTF? This is 2012 … our utilities can’t sanely deal with sparse files? Really? And no, rsync is just as bad.
[root@jr5-lab test]# rsync --sparse --progress big.file.1 big.file.2
big.file.1
298844160 0% 57.36MB/s 5324:47:26
Maybe my google-fu is broken, but it's looking like the xfs utilities may be the only current way to sanely handle this, and they only handle it at the file system level (sigh).
[root@jr5-lab test]# xfs_bmap big.file.1
big.file.1:
0: [0..2147483647]: hole
1: [2147483648..2147483655]: 517019648..517019655
A big hole up front, and a tiny bit of file at the end. Yet cp et al cannot handle this correctly. Does this not seem … wrong?
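For what it's worth, here is a rough Python sketch of what "handling this correctly" could look like: ask the kernel where the data is and copy only that. It assumes a kernel and filesystem that implement `lseek()` with `SEEK_DATA`/`SEEK_HOLE` (Linux 3.1 or later). Where the filesystem has no real support, the whole file is reported as one data extent, so the copy is still correct, just slow. The names `data_extents` and `sparse_copy` are made up for illustration, and this is not how cp or rsync are implemented.

```python
import errno
import os
import tempfile


def data_extents(fd, size):
    """Yield (start, end) byte ranges of fd that hold data, skipping holes."""
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:   # only a hole remains up to EOF
                return
            raise
        # There is always an implicit hole at EOF, so this cannot fail
        # for a start offset that lies inside the file.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end
        offset = end


def sparse_copy(src, dst, chunk=1 << 20):
    """Copy src to dst, reading and writing only the data extents."""
    fin = os.open(src, os.O_RDONLY)
    try:
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(fin).st_size
            for start, end in data_extents(fin, size):
                pos = start
                while pos < end:
                    buf = os.pread(fin, min(chunk, end - pos), pos)
                    if not buf:          # file shrank underneath us
                        break
                    view = memoryview(buf)
                    written = 0
                    while written < len(buf):   # cope with short writes
                        written += os.pwrite(fout, view[written:], pos + written)
                    pos += len(buf)
            # Set the final length so trailing holes are preserved too.
            os.ftruncate(fout, size)
        finally:
            os.close(fout)
    finally:
        os.close(fin)
```

With something like this, `sparse_copy("big.file.1", "big.file.2")` would only need to touch the tiny extent at the end that xfs_bmap reports, instead of grinding through the hole.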