Convert All Word Documents to .odt


This post is not about a quick one-liner script that you can run to convert all Word documents on your PC to .odt files. But if that’s what you are looking for, here you go:

find ./ -type f \( -iname \*.doc -o -iname \*.docx \) -exec libreoffice --headless --convert-to odt {} \;

Just replace the ./ with the directory you want to search recursively. Note that this command leaves the original documents in place, and that LibreOffice writes the converted copies to the current working directory unless you pass --outdir.
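If you would rather have each .odt land next to its original, something like the following might work. It is only a sketch: convert_tree is a name made up for this example, and it assumes the libreoffice binary is on your PATH (on some systems it is called soffice instead).

```shell
# Hypothetical helper: convert every .doc/.docx under a directory,
# writing each .odt beside its source file. Originals are left untouched.
convert_tree() {
    find "${1:-.}" -type f \( -iname '*.doc' -o -iname '*.docx' \) -exec sh -c '
        for doc do
            # --outdir puts the .odt next to the original rather than in the cwd
            libreoffice --headless --convert-to odt --outdir "$(dirname "$doc")" "$doc"
        done
    ' sh {} +
}
```

Running convert_tree ~/Documents would then walk that directory. Passing the file names to sh as arguments keeps names with spaces intact.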

I have dozens of old Word documents on my computer dating back to my childhood days. I finally decided to do a bit of a cleanup and convert them all to .odt so that they are in an open file format that I can trust.

Instead of converting them all in one go with the command above, I decided to do them one by one, to see how well it worked and to make sure I didn’t lose anything.

I started off by seeing how many I had:

find ./ -type f \( -iname \*.doc -o -iname \*.docx \) | wc -l

Which came to 1996. Damn, way more than I thought. Looking through the results (without piping them to wc), I saw there were a lot of old school assignments and notes, some programming notes, a lot of music (mainly chord sheets), and finally just some miscellaneous documents.
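To get a rough idea of how that total splits between .doc and .docx, a small variation on the count could look like this. It is a sketch: count_by_ext is a made-up name, and it assumes no file names contain newlines.

```shell
# Hypothetical helper: count Word documents per extension (case-insensitive).
count_by_ext() {
    find "${1:-.}" -type f \( -iname '*.doc' -o -iname '*.docx' \) |
        sed 's/.*\.//' |                # keep only the text after the last dot
        tr '[:upper:]' '[:lower:]' |    # treat .DOC and .doc as the same
        sort | uniq -c
}
```

Each output line is a count followed by an extension.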

Most of the documents I no longer needed (like schoolwork) but I converted a lot of them anyway just to see how well it would work. I ended up deleting most of these documents.

What I learned

Practically all of the Word documents I had were just simple text documents with the occasional image. I had zero problems converting these.

A lot of my school assignments made use of math formulas and tables, and I’d say about 99% of them converted without any issue. Not that it mattered, because I deleted all my schoolwork afterwards.

What I found interesting after converting a whole bunch of files was that for small files (less than ~20KB) the .odt version was larger than the .docx version. But if the original file was any larger than that, then the file actually got smaller after converting to .odt.
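If you want to check this on your own files, a sketch like the one below lists each original alongside its converted copy. compare_sizes is a made-up name, and it assumes each .odt sits in the same directory as its source and shares its base name.

```shell
# Hypothetical helper: for every .doc/.docx that has an .odt beside it, print
# path, original size, .odt size and the difference in bytes (tab-separated).
compare_sizes() {
    find "${1:-.}" -type f \( -iname '*.doc' -o -iname '*.docx' \) -exec sh -c '
        for doc do
            odt="${doc%.*}.odt"
            [ -f "$odt" ] || continue   # skip documents not converted yet
            orig=$(( $(wc -c < "$doc") ))
            conv=$(( $(wc -c < "$odt") ))
            printf "%s\t%d\t%d\t%d\n" "$doc" "$orig" "$conv" "$((conv - orig))"
        done
    ' sh {} +
}
```

A positive number in the last column means the .odt came out larger than the original, a negative one means it shrank.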