Extracting images from Word documents and turning them into PDFs

So someone recently sent me a load of Word 2007 documents that had musical scores embedded as pictures. A .docx file is just a zip archive, so unzip gets the images out. After that it's nice and easy to convert these (losslessly) into PDFs, with one image per page, using this img2pdf script:

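# Run from the directory holding the .docx files; assumes proc/ exists
# and contains img2pdf.py. The PDFs are written into proc/.
for i in *.docx; do
  pushd proc
  mkdir t
  pushd t
  unzip ../../"$i"    # a .docx is a zip archive; images land in word/media/
  popd
  # sort -V (GNU sort) keeps image2.png ahead of image10.png
  python img2pdf.py -o "${i%.docx}.pdf" $(find t -name '*.png' | sort -V)
  rm -fr t
  popd
done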
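
Page order comes from the sort step in the loop above. Word typically names embedded images image1.png, image2.png and so on, so a plain lexical sort would put image10.png before image2.png and scramble any score longer than nine pages. Here is a minimal sketch of version-aware ordering. It assumes GNU sort (the -V option is not available everywhere), and natural_sort is just a hypothetical helper name, not part of img2pdf:

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): order image
# names read from stdin so that image2.png comes before image10.png.
# Assumes GNU sort, which provides -V (version sort).
natural_sort() {
  sort -V
}

# Example: names as they typically appear under word/media/ in a .docx
printf '%s\n' image10.png image2.png image1.png | natural_sort
```

If your sort lacks -V, you would need some other way to sort numerically on the digits in the file name.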
