Add README, LICENSE, Makefile + use tabs

2026-04-17 22:27:06 +01:00
parent 38dfc87459
commit d89ffc2e9e
4 changed files with 805 additions and 116 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,60 @@
+# `recover-pdfs` recovers all PDFs from any kind of disk or filesystem.
+
+- Did you accidentally delete one or all of your PDF documents?
+- Do you need an important PDF bank statement from an old computer, but its 2008 hard drive won't boot?
+
+I built `recover-pdfs` as a filesystem-agnostic tool for recovering PDF documents.
+It scans every byte on your disk, and anything that looks
+like a PDF is written to a recovery directory.
+
+Usage: `recover-pdfs <DISK_OR_IMAGE_FILE> <BACKUP_DIRECTORY>`
+
+DISK_OR_IMAGE_FILE can be replaced with the actual disk device in /dev, e.g. /dev/sda1.
+Alternatively, DISK_OR_IMAGE_FILE can be an ISO or any similar image file.  Basically
+it can be anything made up of 1s and 0s and readable by C file APIs.
+
+BACKUP_DIRECTORY is the directory/folder where you'd like the
+recovered PDFs to go.  The program does not de-duplicate them.
+That can be done with other tools, such as `duff`.
+
+# Build, install, and run
+
+```{bash}
+# Compile the program
+make
+# Optionally, install it to /usr/bin
+# sudo cp recover-pdfs /usr/bin
+# Set up a backup directory
+mkdir -p recovered_pdfs
+# Run the program
+./recover-pdfs old_laptop.img recovered_pdfs
+```
+
+# Alternative workflow with pipe viewer
+
+If you'd like a progress bar, you can use pipe viewer to see how much of the
+disk/image file has been processed.
+
+```{bash}
+# Install pipe viewer if you don't have it already
+sudo apt install pv
+# Run the program with pipe viewer to see progress
+pv old_laptop.img | ./recover-pdfs - recovered_pdfs
+```
+
+# Recommended follow-up routine
+
+```{bash}
+# De-duplicate recovered PDFs
+(cd recovered_pdfs && rm $(duff -e *))
+# Sort by size to quickly identify 'runaway' PDFs that are hundreds of megabytes or more
+du -sh recovered_pdfs/* | sort -h
+```
+
+By default, only PDFs up to 100 MiB are recovered.  If you have a lot of large
+PDFs, you can increase this limit at the top of the `recover-pdfs.c` source file, and then recompile.
+
+Specifically, change this line:
+```{c}
+#define MAX_PDF_SIZE (100LL * 1024 * 1024)
+```
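
The README describes the tool as scanning the whole disk for anything that looks like a PDF, with a size cap set by `MAX_PDF_SIZE`. As a rough illustration of that kind of signature scan (not the actual contents of `recover-pdfs.c`, which this diff excerpt does not show), the C sketch below searches a buffer for the `%PDF-` header and then for an `%%EOF` trailer within `MAX_PDF_SIZE` bytes of it. The helper names `find_bytes` and `find_pdf` are hypothetical, and stopping at the first `%%EOF` is a simplifying assumption: a PDF with incremental updates can contain several trailers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same default cap as the line the README quotes from recover-pdfs.c. */
#define MAX_PDF_SIZE (100LL * 1024 * 1024)

/* Hypothetical helper: offset of the first occurrence of pat in
 * buf[from..len), or -1 if absent. Uses memcmp rather than strstr
 * because raw disk data may contain NUL bytes. */
static long long find_bytes(const unsigned char *buf, size_t len,
                            size_t from, const char *pat)
{
    assert(buf != NULL && pat != NULL);
    size_t plen = strlen(pat);
    if (plen == 0 || len < plen)
        return -1;
    for (size_t i = from; i + plen <= len; i++) {
        if (memcmp(buf + i, pat, plen) == 0)
            return (long long)i;
    }
    return -1;
}

/* Hypothetical helper: locate the first candidate PDF at or after `from`.
 * On success, stores its start offset and length and returns 1.
 * Returns 0 if no header is found, or if no trailer is found within
 * MAX_PDF_SIZE bytes of the header. */
static int find_pdf(const unsigned char *buf, size_t len, size_t from,
                    size_t *start, size_t *length)
{
    long long s = find_bytes(buf, len, from, "%PDF-");
    if (s < 0)
        return 0;

    /* Restrict the trailer search to MAX_PDF_SIZE bytes past the header. */
    size_t limit = len;
    if ((long long)(len - (size_t)s) > MAX_PDF_SIZE)
        limit = (size_t)s + (size_t)MAX_PDF_SIZE;

    long long e = find_bytes(buf, limit, (size_t)s, "%%EOF");
    if (e < 0)
        return 0;

    *start = (size_t)s;
    *length = (size_t)e + strlen("%%EOF") - (size_t)s; /* include trailer */
    return 1;
}
```

A caller would typically read the disk or image in chunks, call `find_pdf` repeatedly with `from` advanced past each match, and write `buf[start .. start + length)` to a new file in the recovery directory. Raising `MAX_PDF_SIZE` in this sketch widens the window in which a trailer is accepted.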