When the text to be deleted from multiple PDFs varies from file to file (for example, the month might be April or May, and the year could be any four-digit number), deleting each occurrence by hand is very inefficient. This article, using the actual interface of HeSoft Doc Batch Tool, explains how to import multiple PDFs, use formulas to perform fuzzy text searches, and leave the replacement content blank, thereby batch-deleting variable keywords in PDFs. This applies to office scenarios such as report redaction, cleaning dates before publishing materials, and batch processing contract text.
Many office workers, when dealing with PDF documents, encounter a task that seems simple yet is very time-consuming: a batch of PDFs all contain some text that needs to be deleted, but this text isn't exactly the same. For example, one file might contain "April 13, 2017," while another has "May 13, 2018," with the month, day, and year changing from file to file. Searching manually in a PDF reader could take several minutes per document, turning dozens of files into repetitive drudgery.
This article introduces a method better suited to batch processing: in HeSoft Doc Batch Tool, select "Find and Replace Keywords in PDF" within the PDF tools, then use formulas for fuzzy text searching and replace the matched content with nothing. This allows for the batch deletion of non-fixed text in multiple PDFs. The core logic of this workflow is: first, add all PDFs to the task list; second, describe the text to be deleted using formula rules; and finally, let the software process each file automatically.
Applicable Scenarios: Fuzzy Deletion is More Suitable for Non-Fixed Keywords Than Fixed Keywords
If you only need to delete a fixed word, such as a specific company name or a fixed project code, an exact search is sufficient. However, in actual document processing, the content to be deleted more often follows a pattern without being completely identical. For example, English months might have different values like April and May; a year could be any four-digit number like 2017, 2018, or 2020; report numbers might contain different serial numbers; and contract numbers could be composed of combinations of letters and digits.
For such content, entering fixed keywords one by one not only requires numerous rules but also easily leads to omissions. Using formula-based fuzzy text searching allows you to target a "category of text." For instance, the April|May shown in the screenshot can be understood as matching either April or May, and \d{4} can be understood as matching four-digit numbers. This method makes batch deletion of dates, years, numbers, and sensitive fields in PDFs more efficient.
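To get a feel for what the two rules target before running them in the tool, they can be tried on plain strings. The sketch below uses Python's `re` module and assumes the tool's formulas behave like ordinary regular expressions, which the article does not confirm; the sample strings and the `find_matches` helper are illustrative only.

```python
import re

# Assumption: the tool's "formula" rules behave like standard regular expressions.
MONTH_RULE = re.compile(r"April|May")  # matches either "April" or "May"
YEAR_RULE = re.compile(r"\d{4}")       # matches any run of exactly four digits

def find_matches(text):
    """Return every substring matched by the month rule, then by the year rule."""
    return MONTH_RULE.findall(text) + YEAR_RULE.findall(text)

# Illustrative sample strings, not taken from real PDFs.
for sample in ["April 13, 2017", "May 13, 2018", "June 1, 2020"]:
    print(sample, "->", find_matches(sample))
```

Note that "June 1, 2020" still loses its year under these rules, because `\d{4}` does not care which month precedes it.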
Typical application scenarios include: deleting date information before externally publishing a PDF report; batch-cleaning non-public numbers in contract PDFs; deleting some years or months from audit materials; cleaning old version fields in training materials, manuals, and information packets; and uniformly desensitizing multiple PDF samples.
Effect Preview: PDF Files Needing Batch Cleanup Before Processing
In this example, there are 4 PDF files in the pending folder, named 1.pdf, 2.pdf, 3.pdf, and 4.pdf respectively. Although the number of sample files is small, the operation method is equally applicable to many more PDFs. For dozens or hundreds of files, the efficiency advantage of batch processing becomes even more apparent.

Opening one of the PDFs reveals date content "April 13, 2017" on the page. The screenshot highlights the two positions needing deletion, "April" and "2017", with red boxes. The requirement here is not to delete the entire PDF page or an entire paragraph title, but specifically to delete the text content matching the rules.

Such PDFs are typically reports, manuals, archived files, or public materials. If a document has dozens of pages, certain keywords may be scattered in different positions, and manually searching page by page can easily miss them. Using the batch find and replace function allows the software to process them automatically according to unified rules.
Effect Preview: Target Keywords Deleted After Processing, Other Content Preserved
After processing, the "April" and "2017" in the PDF have disappeared, leaving blank spaces in their original positions, while "13," remains. This result demonstrates that the software did not simply delete an entire line or paragraph but removed the matched text based on the keyword rules.

For office scenarios requiring desensitization or cleanup of fixed-format information, this method is very practical. It allows for the deletion of keywords matching specified rules while preserving the overall PDF layout, titles, stamps, footers, and other content. It is especially useful when multiple PDFs have similar content structures, as rules can be configured once and applied repeatedly.
Step 1: Open the Keyword Find and Replace Feature in the PDF Tool
After opening HeSoft Doc Batch Tool, first select "PDF Tools" in the left navigation bar. In the main interface function list, find "1. Find and Replace Keywords in PDF". The description for this function is "Batch find and replace keywords in PDF file content," which is exactly the operation this article needs for batch-deleting PDF text.

The purpose of selecting this function is to enter the workflow for find and replace at the PDF content level. The interface also shows other PDF functions, such as Add PDF Password Protection, Remove PDF Password Protection, Add Watermark to PDF, and Convert PDF to Word. However, as this task only involves cleaning keywords from the PDF body text, be careful not to select the wrong module.
Step 2: Import Multiple PDFs and Confirm the Task List
After entering the function page, the progress bar indicates that you are at the tool's first stage, "Select records to process". The upper right area of the page provides two main entry points: "Add Files" and "Import Files from Folder". If files are scattered, use "Add Files"; if all PDFs are located in the same directory, "Import Files from Folder" saves more time.

The screenshot shows that 4 PDFs have been imported, with the list displaying information such as serial number, name, path, extension, creation time, and modification time. The extensions are all pdf, confirming that only PDF files have been added. The summary at the bottom shows the record count is 4. After confirming the files are correct, click the "Next" button at the bottom.
At this step, it is recommended to carefully check two points: first, whether all PDFs needing processing have been added; second, whether any files that should not be modified have been accidentally included. If the list contains files that do not need processing, they can be removed using the delete icon in the operation column. Batch processing is highly efficient, but only if the task scope is accurate.
Step 3: Enable Formula-Based Fuzzy Text Search
Clicking "Next" takes you to "Configure Processing Options". Under "Set Keyword Options", you can choose the search method. The interface offers two choices: "Exact Text Search" and "Fuzzy Text Search Using Formulas". This example selects "Fuzzy Text Search Using Formulas" because the months and years to be deleted are not completely fixed character strings.

If you only needed to delete the word "April", choosing exact text search would work. But to simultaneously match April and May, or even all four-digit years, fuzzy rules should be used. The advantage of fuzzy search is that it can uniformly describe similar but not identical content, reducing the number of rules.
In the screenshot, two lines have been entered in the "List of Keywords to Find": April|May and \d{4}. The first line searches for April or May, and the second searches for four-digit numbers. The "List of Keywords for Replacement" on the right is left blank, and the interface clearly prompts that "Leaving it blank means deletion." Therefore, the software will delete the content matched on the left side rather than replacing it with other text.
Step 4: Achieve Deletion by Empty Replacement, Not by Entering a Space
Many users using find and replace for the first time wonder: when deleting keywords, should a space be entered on the right side? Based on the screenshot's prompt, the answer is no. Leaving the right side blank signifies deletion. If a space is entered, the matched text might be replaced with a space character, which visually and functionally differs from true deletion and could affect subsequent text copying or layout judgments.
Therefore, this example keeps the replacement keyword list empty. The English months matched by the first line, April|May, will be deleted; the four-digit years matched by the second line, \d{4}, will be deleted. After processing, only the parts of the PDF not matched by the rules, like the "13," in the example, will remain.
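The difference between an empty replacement and a single space can be seen on a plain string. This is a minimal sketch using Python's `re.sub`, assuming the tool applies each rule in turn like a regular-expression substitution; it only models the text, not how the PDF page is redrawn, and `apply_rules` is a hypothetical helper.

```python
import re

# The two rules from the example, applied in order.
RULES = [r"April|May", r"\d{4}"]

def apply_rules(text, replacement):
    """Replace every match of each rule with `replacement`."""
    for rule in RULES:
        text = re.sub(rule, replacement, text)
    return text

original = "April 13, 2017"
deleted = apply_rules(original, "")   # empty replacement: matches are removed
spaced = apply_rules(original, " ")   # space replacement: one space left per match

print(repr(deleted))
print(repr(spaced))
```

With the empty replacement the string shrinks to the unmatched remainder around "13,", while the space variant carries two extra characters, one for each replaced match.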
After completing the settings, click "Next" to proceed with "Set Save Location" and "Start Processing". For the first attempt, it is recommended to choose a new output location to avoid directly overwriting the original PDFs. Once finished, spot-check the processed PDFs to confirm that the rules did not accidentally delete other four-digit numbers that should be kept.
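If you prefer to keep an untouched copy of the originals in addition to choosing a new output location, the copy can be scripted. The following is a sketch only: `backup_pdfs` and the folder names are hypothetical, and the demo runs inside a temporary directory so that no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path

def backup_pdfs(source_dir, backup_dir):
    """Copy every .pdf in source_dir into backup_dir and return the copied names."""
    source, backup = Path(source_dir), Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(source.glob("*.pdf")):
        shutil.copy2(pdf, backup / pdf.name)  # copy2 also preserves timestamps
        copied.append(pdf.name)
    return copied

# Demo with placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    pending = Path(tmp) / "pending"
    pending.mkdir()
    for name in ("1.pdf", "2.pdf", "notes.txt"):
        (pending / name).write_bytes(b"placeholder")
    copied_names = backup_pdfs(pending, Path(tmp) / "backup")
    print(copied_names)
```

Only the files ending in .pdf are copied, which mirrors the check in the task list that nothing other than PDFs has been added.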
Common Questions and Precautions
1. Is formula-based fuzzy search the same as using wildcards? In terms of purpose, both are used to match a category of non-fixed text. The interface calls it "Fuzzy Text Search Using Formulas." The rules in this example, April|May and \d{4}, use syntax familiar from regular expressions, so in practice you can treat it as a more flexible rule-matching method than exact search.
2. Why were only April and 2017 deleted after processing, but not 13? Because the search rules only included April|May and \d{4}. 13 is a two-digit number, which does not match the four-digit number rule, nor does it equal April or May, so it was preserved.
3. What if there are other four-digit numbers in the PDFs, will they also be deleted? Any text matching \d{4} could potentially be matched. Therefore, before formal batch processing, it is recommended to first test with a few copied files or write more specific rules to minimize accidental deletions.
4. Can this be used for doc, docx, Excel, or other files? This article demonstrates PDF keyword processing within the PDF Tools. The software interface also shows classifications like Word Tools, Excel Tools, and PowerPoint Tools on the left, but you should select the corresponding tool for different formats. Do not apply the PDF workflow directly to doc, docx, or xlsx files.
5. Why might deletion fail for scanned documents? If the text within a PDF is actually an image and cannot be selected or copied, text-based find and replace might not recognize it. In this case, first confirm whether the PDF contains an editable or searchable text layer.
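Regarding question 3, one way to judge how much a narrower rule would help is to compare it with \d{4} on sample text before touching any PDFs. The sketch below uses Python's `re` module; whether the tool accepts `\b` and `(?:...)` in its formulas is an assumption not confirmed by this article, and the sample sentence is invented.

```python
import re

# Broad rule from the article: any four consecutive digits.
BROAD_RULE = re.compile(r"\d{4}")
# Hypothetical narrower rule: a standalone four-digit number starting with 19 or 20.
NARROW_RULE = re.compile(r"\b(?:19|20)\d{2}\b")

# Invented sample text containing a year, an invoice number, and a longer reference.
text = "April 13, 2017, invoice 4821, ref 20231234"

broad_hits = BROAD_RULE.findall(text)
narrow_hits = NARROW_RULE.findall(text)

print("broad: ", broad_hits)
print("narrow:", narrow_hits)
```

The broad rule also catches the invoice number and both halves of the eight-digit reference, while the narrower rule keeps only the year. A number such as 1999 used as a quantity would still match, so spot-checking the output remains necessary.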
Summary: The Key to Batch Deleting Non-Fixed Text in PDFs Lies in Writing Good Rules
The crux of batch deleting non-fixed text from multiple PDFs is not repeatedly clicking delete, but abstracting the content to be deleted into rules. Using HeSoft Doc Batch Tool, you can import multiple PDFs, choose "Fuzzy Text Search Using Formulas" within the "Find and Replace Keywords in PDF" function, input rules like April|May and \d{4}, and leave the replacement content empty, thereby achieving batch fuzzy deletion.
For users who frequently handle reports, contracts, archival files, or publish PDFs externally, this method can significantly reduce repetitive work. It is recommended to prepare backup files before formal batch processing and validate the rules with a small number of samples. Once confirmed correct, proceed to process the entire batch of PDFs. This approach enhances efficiency and ensures more reliable file cleanup results.