Remove White Space from PDF file [Linux]
Tagged: Cropped PDF, Linux, PDF, Remove White Space PDF, Scanned PDF
March 30, 2025 at 3:55 pm #8029
thumbtak
Keymaster
I scanned a document of more than 30 pages and wanted an easy way to crop every page to remove the white space. This is how I did it.
Note: Create a new, dedicated folder and place only the PDF file you intend to process inside. The corrected PDF, named ‘cropped.pdf’, will be saved to this folder upon completion.
$ pdfimages -j input.pdf page
$ mogrify -trim page-*.jpg
$ mogrify -resize 1600x1200 page-*.jpg
$ convert page-*.jpg cropped.pdf
$ rm -f page-*.jpg
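The commands above come from two packages: pdfimages ships with the Poppler utilities (poppler-utils on Debian and Ubuntu), while mogrify and convert belong to ImageMagick. Package names on other distributions may differ. As a rough sketch, a small check like the following can report which of them are missing before you start. The helper name missing_tools is made up for this example.

```bash
#!/bin/bash
# Hypothetical helper: print the name of every command in the argument
# list that cannot be found on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the three commands used in the steps above.
# Prints nothing when all of them are installed.
missing_tools pdfimages mogrify convert | sed 's/^/Missing command: /'
```

If a line such as "Missing command: pdfimages" appears, install the corresponding package and run the check again.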
April 4, 2025 at 9:24 pm #8035
thumbtak
Keymaster
Here is a shell script you can save if you need to do this often.
#!/bin/bash
# Script to extract images from a PDF, process them, and create a new PDF.

# Ask the user for the input PDF filename
read -r -p "Enter the name of the input PDF file: " INPUT_PDF

# Check if the input PDF file was provided
if [ -z "$INPUT_PDF" ]; then
    echo "Error: No input PDF file specified."
    echo "Usage: $0"
    exit 1
fi

# Check if the input PDF file exists
if [ ! -f "$INPUT_PDF" ]; then
    echo "Error: Input PDF file '$INPUT_PDF' not found."
    exit 1
fi

OUTPUT_BASE="cropped" # Default base name for output files

echo "Extracting images from '$INPUT_PDF'..."
pdfimages -j "$INPUT_PDF" page

echo "Trimming whitespace from extracted images..."
mogrify -trim page-*.jpg

echo "Resizing images to a maximum of 1600x1200..."
mogrify -resize 1600x1200 page-*.jpg

# Ask the user for the desired output PDF filename
read -r -p "Enter the desired name for the final PDF file (without extension): " OUTPUT_NAME
if [ -z "$OUTPUT_NAME" ]; then
    FINAL_PDF="${OUTPUT_BASE}.pdf"
    echo "Using default output filename: '$FINAL_PDF'"
else
    FINAL_PDF="${OUTPUT_NAME}.pdf"
fi

echo "Creating the final PDF: '$FINAL_PDF'..."
convert page-*.jpg "$FINAL_PDF"

# Remove only the extracted page images, not every JPEG in the folder
echo "Removing temporary JPEG files..."
rm -f page-*.jpg

echo "Processing complete. The final PDF is: '$FINAL_PDF'"
How to use this script:
1. Save the code: Save the code above into a file, for example, process_pdf.sh.
2. Make it executable: Open your terminal and navigate to the directory where you saved the file. Then run the command:
$ chmod +x process_pdf.sh
3. Run the script: Execute the script without any arguments:
$ bash process_pdf.sh
The script will first ask you:
Enter the name of the input PDF file:
Type the name of your PDF file and press Enter. After processing the images, the script asks for the output file name (press Enter without typing anything to use the default, cropped.pdf):
Enter the desired name for the final PDF file (without extension):
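The script above does its work in the current folder, which is why the note at the top recommends a dedicated one. As a possible alternative, here is a sketch of a non-interactive function that keeps the intermediate images in a temporary directory and removes that directory afterwards. The function name crop_pdf and its argument order are my own choices, not part of the original script. It also assumes the scanned pages are stored as JPEG inside the PDF, since pdfimages -j writes other image types as PBM/PPM files instead.

```bash
#!/bin/bash
# Hypothetical helper: crop_pdf INPUT [OUTPUT]
# Extracts the page images from INPUT, trims and resizes them, and writes
# OUTPUT (default: cropped.pdf). A trailing ".pdf" on OUTPUT is optional.
crop_pdf() {
    local input="$1"
    local output="${2:-cropped}"
    local workdir
    local status

    # Avoid names like "report.pdf.pdf" when the extension was typed.
    output="${output%.pdf}.pdf"

    if [ ! -f "$input" ]; then
        echo "Error: Input PDF file '$input' not found." >&2
        return 1
    fi

    # Keep the extracted images out of the current folder.
    workdir=$(mktemp -d) || return 1

    # Same four steps as above; stop at the first one that fails.
    pdfimages -j "$input" "$workdir/page" &&
        mogrify -trim "$workdir"/page-*.jpg &&
        mogrify -resize 1600x1200 "$workdir"/page-*.jpg &&
        convert "$workdir"/page-*.jpg "$output"
    status=$?

    # Remove the temporary images whether or not the steps succeeded.
    rm -rf -- "$workdir"
    return "$status"
}
```

After loading the function (for example with source), a call such as crop_pdf scan.pdf trimmed would write trimmed.pdf next to your other files without touching any JPEGs already in the folder. If pdfimages produces no JPEG files, the mogrify step fails and the function returns a non-zero status.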