Compression results on the Canterbury Corpus

*All files underwent BWT before compression.

| File | Size (bytes) | B00 (bytes) | B00 rate (bits/byte) | gzip-9 (bytes) | gzip-9 rate (bits/byte) |
|---|---|---|---|---|---|
| xargs.1 | 4,227 | 1,636 | 3.096 | 1,756 | 3.323 |
| sum | 38,240 | 12,100 | 2.531 | 12,772 | 2.672 |
| ptt5 | 513,216 | 30,493 | 0.475 | 52,382 | 0.817 |
| plrabn12.txt | 481,861 | 138,040 | 2.292 | 194,277 | 3.225 |
| lcet10.txt | 426,754 | 101,351 | 1.900 | 144,429 | 2.707 |
| kennedy.xls | 1,029,744 | 24,259 | 0.188 | 209,733 | 1.629 |
| grammar.lsp | 3,721 | 1,184 | 2.546 | 1,246 | 2.679 |
| fields.c | 11,150 | 2,869 | 2.058 | 3,136 | 2.250 |
| asyoulik.txt | 125,179 | 37,767 | 2.414 | 48,829 | 3.131 |
| alice29.txt | 152,089 | 40,988 | 2.156 | 54,191 | 2.850 |
| cp.html | 24,603 | 7,259 | 2.360 | 7,981 | 2.595 |
| E.coli | 4,638,690 | 1,111,234 | 1.916 | 1,299,066 | 2.240 |
| bible.txt | 4,047,392 | 748,607 | 1.480 | 1,176,645 | 2.326 |
| world192.txt | 2,473,400 | 413,511 | 1.337 | 721,413 | 2.333 |
| Totals | 13,970,266 | 2,671,296 | 1.911 (1.530) | 3,927,856 | 2.483 (2.249) |
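
The compression rates in the table can be reproduced from the sizes as bits of compressed output per byte of input. The sketch below is only an illustration: the function name `bits_per_byte` is an assumption, not something from the source, and the figures are copied from the table. Judging from the numbers alone, the parenthesized total appears to be the size-weighted rate (total compressed bits over total original bytes), while the unparenthesized total is consistent with a plain average of the per-file rates. The source does not state this explicitly.

```python
def bits_per_byte(compressed_size: int, original_size: int) -> float:
    """Compression rate: bits of compressed output per byte of input."""
    return 8 * compressed_size / original_size


# (original size, B00 compressed size) in bytes, taken from three table rows.
rows = {
    "xargs.1": (4227, 1636),
    "sum": (38240, 12100),
    "ptt5": (513216, 30493),
}

for name, (original, compressed) in rows.items():
    print(f"{name}: {bits_per_byte(compressed, original):.3f} bits/byte")

# Size-weighted rate over the whole corpus, using the B00 "Totals" row.
overall_b00 = bits_per_byte(2671296, 13970266)
print(f"B00 overall: {overall_b00:.3f} bits/byte")
```

Run as written, this prints 3.096, 2.531 and 0.475 for the three files and 1.530 for the overall B00 rate, matching the corresponding table entries.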


