If multiple PDFs contain similar dates, years, numbers, or fixed keywords, opening and editing them one by one is very inefficient. This article uses the example of batch deleting the month and four-digit year on PDF covers to explain how to use HeSoft Doc Batch Tool to perform fuzzy search and batch deletion in PDFs. The steps are: select the PDF tool category, open the "Find and Replace Keywords in PDFs" function, import multiple PDFs, enable "Use Formula for Fuzzy Text Search," fill in the rules April|May and \d{4} (four-digit numbers), and leave the replacement content empty to achieve deletion.
In many office scenarios, PDFs are not processed one by one, but in batches: a batch of audit reports, a batch of project documents, a batch of public materials, a batch of scanned and archived contracts, or multiple PDFs exported from the same template. As long as these PDFs contain identical or similar text, batch cleanup needs may arise. For example, all PDF covers have dates, where the month might be different, or the year might be different, but you want to uniformly delete this information.
Manual processing usually means a cycle of opening the PDF, locating the text, editing or redacting it, saving, closing, and then opening the next file. This is acceptable for a few files, but becomes pure repetitive labor when there are many. Worse, text such as dates, numbers, and years is often not completely identical from file to file, so a plain exact search cannot cover all cases at once.
This article introduces an approach better suited to batch office work: using "HeSoft Doc Batch Tool" to perform wildcard/formula fuzzy search across multiple PDFs, and setting the replacement content to empty, thereby achieving batch deletion of PDF keywords. In the example, there are 4 PDFs in the folder, and the goal is to delete the month word and the four-digit year from the cover date, for instance deleting "April" and "2017" while retaining the middle part "13,".
Applicable Scenarios: Batch Deleting Patterned but Not Completely Identical Text in PDFs
Wildcard fuzzy deletion is not only suitable for the date example in this article; it's better suited for handling PDF text with "patterned variations." The following scenarios are all common:
- Dates on PDF covers or in headers/footers that require batch deletion of the month, year, or complete date.
- Report numbers, project numbers, or customer numbers in multiple PDFs that require deletion of a fixed-format number.
- Occurrences of old company names, old department names, or old project codes within PDF content that need unified cleanup.
- Statistical periods from different years in documents, such as 2017, 2018, 2021, that need to be handled based on a four-digit number rule.
- Placeholder text in multiple PDFs exported from the same template that needs to be batch replaced with empty.
If the target text is completely identical, an exact search is sufficient; if the target text has multiple possible values or conforms to a certain format, then using "Use formula for fuzzy text search" is more appropriate. Its value lies in the user not having to list every single possible specific term, but rather describing a class of text using rules and letting the software batch process all PDFs.
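The difference between listing values and describing a class of text can be sketched in a few lines of Python. This is only an illustration: it uses Python's `re` module as a stand-in for the tool's formula search (that the two behave alike is an assumption), and the snippets are made-up sample text, not content from real files.

```python
import re

# Hypothetical text snippets from three PDFs (not taken from real files).
snippets = [
    "Statistical period: 2017",
    "Statistical period: 2018",
    "Statistical period: 2021",
]

# Exact search: every possible value has to be listed by hand.
exact_keywords = ["2017", "2018"]
exact_hits = [s for s in snippets if any(k in s for k in exact_keywords)]

# Rule-based search: one pattern describes the whole class "four digits".
year_rule = re.compile(r"\d{4}")
rule_hits = [s for s in snippets if year_rule.search(s)]

print(len(exact_hits))  # 2: the 2021 snippet is missed
print(len(rule_hits))   # 3: all snippets are matched
```

The exact list silently misses any value nobody thought to add, while the rule keeps working when a new year appears.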
HeSoft Doc Batch Tool is a batch document processing tool for office work. Its focus is not the fine-tuning of a single file, but helping users apply unified rules to a large number of files, reducing mechanical operations and improving processing efficiency.
Effect Preview: PDF Files Before Batch Processing and Content to be Deleted
Before processing, there are 4 PDF files in the sample folder, namely 1.pdf, 2.pdf, 3.pdf, and 4.pdf. They will all serve as the target files for this batch find and replace operation.

Opening one of the PDFs, you can see the date "April 13, 2017" on the cover. The parts "April" and "2017" are marked with red boxes in the screenshot, and these two parts are the content to be deleted this time. Since "April" is a month word and "2017" is a four-digit year, they can be handled with different fuzzy matching rules.

The key point here is that we do not want to delete the entire date string, only the parts matched by the specified rules. In other words, "13," stays, while the month and year are removed. By setting up rules, the software can precisely target the content that needs to be cleaned.
Post-Processing Effect: Matched Text in the PDF is Cleared
After the batch process is complete, the PDF page shows a blank where "April" was originally displayed and another blank where "2017" was, while the middle part "13," still exists. This is the expected result and indicates that the batch fuzzy deletion has taken effect.

Judging from the result, the software applies the logic of "find and replace with empty." As long as the rules on the left match text in the PDF and the replacement content on the right is empty, the matched text is deleted. For many scenarios requiring PDF keyword cleanup, this is more stable and easier to reuse than manual modification.
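The "find and replace with empty" logic can be reproduced on a plain string. The sketch below assumes the tool's formulas behave like Python regular expressions; the function name `delete_matches` is hypothetical and is not part of the tool.

```python
import re

# The two rules from this article, interpreted as Python regular
# expressions (an assumption about the tool's formula syntax).
RULES = [r"April|May", r"\d{4}"]

def delete_matches(text, rules):
    """Replace every match of each rule with an empty string."""
    for rule in rules:
        text = re.sub(rule, "", text)
    return text

cleaned = delete_matches("April 13, 2017", RULES)
print(repr(cleaned))  # ' 13, '
```

In a plain string the surrounding spaces remain, which corresponds to the blank areas left on the PDF page.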
Operation Step 1: Open the Find and Replace Function in the PDF Tool
After launching HeSoft Doc Batch Tool, first select "PDF Tools" from the tool categories on the left. The interface displays several PDF-related batch functions, including adding watermarks, deleting pages, converting formats, and so on. This time we need to process text within the PDF content, so select the first function, "Find and Replace Keywords in PDFs."

The purpose of this function is to batch find and replace keywords within the content of PDF files. Although the name includes "replace," when the replacement content is left blank, it can also achieve deletion. That is to say, deleting PDF keywords can be understood as a special kind of replacement: replacing the matched text with empty content.
After entering this function, the software will guide the operation according to a process, which includes selecting the records to process, setting processing options, setting the save location, and starting the process. This process design is suitable for batch processing because it separates file selection, rule setting, and output saving, making it convenient for users to confirm item by item.
Operation Step 2: Import Multiple PDFs and Verify the Processing List
After entering the "Find and Replace Keywords in PDFs" page, you first need to import PDFs. The upper-right area of the interface provides two common entry points: "Add Files" and "Import Files from Folder." If the number of PDFs is small, you can use "Add Files"; if all PDFs are in the same folder, using "Import Files from Folder" is usually more efficient.

The screenshot shows that 4 records have been successfully imported. The table lists the file name, path, extension, creation time, and modification time, with the bottom summary showing a record count of 4. Through this list, you can confirm whether the files to be processed this time are correct, avoiding the addition of irrelevant PDFs to the batch task.
At this step, it is recommended to carefully check two points: first, whether the file extensions are all pdf; second, whether the path is the directory you intend to process. The efficiency of batch processing is high, but it also means incorrect settings will affect multiple files, so it's very important to confirm the list before proceeding to the next step.
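For long lists, the same two checks can be done on a list of paths outside the tool. This is a minimal sketch in Python; `check_batch` and the sample paths are hypothetical and only stand in for the rows shown in the import table.

```python
from pathlib import Path

def check_batch(paths, expected_dir):
    """Return (path, reason) pairs for entries that should not be in the batch."""
    expected = Path(expected_dir)
    problems = []
    for p in map(Path, paths):
        if p.suffix.lower() != ".pdf":
            problems.append((p, "not a PDF"))
        elif p.parent != expected:
            problems.append((p, "unexpected folder"))
    return problems

# Hypothetical list, standing in for the rows shown in the import table.
batch = ["samples/1.pdf", "samples/2.pdf", "samples/notes.docx", "other/4.pdf"]
problems = check_batch(batch, "samples")
for path, reason in problems:
    print(path.name, "->", reason)
```

An empty result means every entry has the pdf extension and sits in the intended folder.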
After confirmation, click "Next" at the bottom to enter the keyword find and replace rule setting page.
Operation Step 3: Choose to Use Formula for Fuzzy Text Search
On the "Set Processing Options" page, you first need to set the "Search Method." The interface provides "Exact Text Search" and "Use Formula for Fuzzy Text Search." If you only need to delete a fixed word, for example, deleting the same name from all PDFs, you can choose Exact Search; however, the month and year to be processed in this article have variation patterns, so you need to select "Use Formula for Fuzzy Text Search."

In the screenshot, "Use Formula for Fuzzy Text Search" is already checked. This method can be understood as using rules to search for PDF text, suitable for wildcard batch keyword deletion. It can combine multiple possible content types into one rule, and can also match formatted text like numbers and years.
In "Additional Options," you can see "Ignore letter case." Whether to check this needs to be decided based on the actual files. If the PDF might contain case variations like April, april, APRIL, ignoring case can improve match coverage; if the case itself has distinguishing meaning, it should be used cautiously.
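The trade-off can be seen in a short Python sketch, where `re.IGNORECASE` plays the role of the "Ignore letter case" option (an analogy, not a statement about how the tool is implemented):

```python
import re

variants = ["April", "april", "APRIL"]

case_sensitive = re.compile(r"April|May")
case_insensitive = re.compile(r"April|May", re.IGNORECASE)

sensitive_hits = [v for v in variants if case_sensitive.search(v)]
insensitive_hits = [v for v in variants if case_insensitive.search(v)]

print(sensitive_hits)    # ['April']
print(insensitive_hits)  # ['April', 'april', 'APRIL']

# Caution: with case ignored, the ordinary verb "may" is matched as well.
may_as_verb = bool(case_insensitive.search("Results may vary"))
print(may_as_verb)  # True
```

The last line shows why the option needs care: ignoring case widens coverage, but it also turns the lowercase word "may" into a deletion target.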
Operation Step 4: Fill in Keyword Rules to Delete and Leave Replacement Content Empty
In the "Keyword List to Find," fill in two lines according to the screenshot example:
- April|May: means matching April or May. This is suitable for deleting multiple possible month words simultaneously.
- \d{4}: means matching four-digit numbers. For year-type content, such as 2017, 2020, 2026, this type of rule can be used for unified searching.
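To see what each of the two rules picks up, they can be tried on a few date strings in Python. The sketch assumes the tool's formula syntax matches Python's `re` syntax for these two patterns; only the first date comes from this article, the others are invented for illustration.

```python
import re

month_rule = re.compile(r"April|May")  # alternation: either word
year_rule = re.compile(r"\d{4}")       # four consecutive digits

# Hypothetical cover dates; only the first appears in the article.
dates = ["April 13, 2017", "May 2, 2020", "June 30, 2026"]
matches = {d: (month_rule.findall(d), year_rule.findall(d)) for d in dates}

for date, (months, years) in matches.items():
    print(date, months, years)
# April 13, 2017 ['April'] ['2017']
# May 2, 2020 ['May'] ['2020']
# June 30, 2026 [] ['2026']
```

The day numbers are never matched because they have fewer than four digits, and "June" is untouched because it is not listed in the first rule; further months would have to be added with another `|`.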
The area on the right is the "Replaced Keyword List." The red box in the screenshot marks the prompt "Leaving blank means deletion." Therefore, if the goal is to delete keywords, do not fill in the replacement content, just keep the right side empty.
This step is the core of the entire operation. The left side determines what to find, and the right side determines what to replace it with; when the right side is empty, the software will clear the text matched by the left side. Through this method, you can batch delete date segments, year numbers, or specified words in multiple PDFs.
It's important to note that the broader the rule, the larger the match scope. For example, \d{4} will match all four-digit numbers, not necessarily just years. If there are four-digit identifiers in the PDF, they might also be deleted. Therefore, in practical work, rules should be designed cautiously based on the document content, and tested on a small number of files first.
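The over-matching risk, and one possible way to narrow the rule, can be sketched in Python. The narrower pattern uses lookarounds; whether the tool's formula search supports them is not confirmed by this article, so treat it as an assumption to verify on test files. The sample text is invented.

```python
import re

broad = re.compile(r"\d{4}")
# Narrower: a number from 1900 to 2099 that is not part of a longer digit run.
narrow = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")

text = "Report No. 8841, ref 20171234, issued April 13, 2017"

broad_hits = broad.findall(text)
narrow_hits = narrow.findall(text)

print(broad_hits)   # ['8841', '2017', '1234', '2017']
print(narrow_hits)  # ['2017']
```

The broad rule also cuts pieces out of the longer reference number, while the narrow rule keeps only the standalone year. Even the narrow rule would still hit a standalone four-digit identifier that happens to fall between 1900 and 2099.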
Operation Step 5: Set Save Location and Start Processing
After completing the keyword rule settings, click "Next" at the bottom of the page. Following the interface flow, you will then go to "Set Save Location," and then to "Start Processing." When batch processing PDFs, it is recommended not to overwrite the original files directly, but to save the processed results to a separate directory. This way, even if rules need adjustment, you can go back to the original files and reprocess.
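Independently of the tool's own save-location setting, an untouched copy of the originals can be made beforehand. The following Python sketch shows one way to do that; `copy_pdfs` and the folder names are hypothetical, and the demo runs in a temporary folder with empty placeholder files.

```python
import shutil
import tempfile
from pathlib import Path

def copy_pdfs(source_dir, output_dir):
    """Copy every .pdf file in source_dir into output_dir; originals stay untouched."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(Path(source_dir).glob("*.pdf")):
        copied.append(Path(shutil.copy2(pdf, out / pdf.name)))
    return copied

# Demo in a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "originals"
    src.mkdir()
    for name in ["1.pdf", "2.pdf", "3.pdf", "4.pdf"]:
        (src / name).touch()
    copied_names = [p.name for p in copy_pdfs(src, Path(tmp) / "backup")]
    originals_left = sorted(p.name for p in src.iterdir())

print(copied_names)  # ['1.pdf', '2.pdf', '3.pdf', '4.pdf']
```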
After starting the process, the software will perform the find and replace on each PDF in the imported list one by one. For the 4 PDFs in the example, the software will find April or May, and all text matching the four-digit number rule, and replace these matches with empty. After processing is complete, open the output PDF to check, and you can see that the month and year have been deleted.
If the number of files processed is large, you can spot-check a few typical files first: ones containing April, ones containing May, ones with different years, and ones with different layouts. After confirming that the rule hits are stable, you can apply the same method to a larger batch of files.
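A spot check can also be rehearsed on text copied from a few sample PDFs before running the real batch. The Python sketch below counts what each rule would delete; the sample text is invented, `preview_hits` is a hypothetical helper, and the rules are again assumed to behave like Python regular expressions.

```python
import re
from collections import Counter

RULES = [r"April|May", r"\d{4}"]

def preview_hits(text, rules):
    """Count the pieces of text that each rule would delete."""
    hits = Counter()
    for rule in rules:
        hits.update(re.findall(rule, text))
    return hits

# Hypothetical text copied from the covers of a few sample PDFs.
samples = {
    "1.pdf": "April 13, 2017",
    "2.pdf": "May 2, 2018",
    "3.pdf": "Contract 5521 signed June 9, 2021",
}
report = {name: dict(preview_hits(text, RULES)) for name, text in samples.items()}
for name, hits in report.items():
    print(name, hits)
```

In this made-up sample, the third file reveals two surprises at once: the contract number 5521 would be deleted, and "June" would be left behind.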
FAQ and Notes
1. What is the difference between wildcard fuzzy deletion and normal find deletion?
Normal find usually requires the keyword to be exactly the same, for example, only finding April. Wildcard or formula fuzzy search can match based on rules, for example, April|May can match two words, and \d{4} can match four-digit numbers. For multiple PDFs with not entirely identical content, fuzzy searching saves more time.
2. Why is only "13," left after processing?
Because the find rules in this example only cover April, May, and four-digit numbers, and "13," was not written into the deletion rules. The software only processes matched text and does not delete unmatched content, so "13," is retained. This also shows that the rules act only on what they describe.
3. How should I approach deleting a complete date?
You can design a more complete find rule based on the actual format of the date. But before formal processing, you should verify it with sample files first to avoid accidentally deleting numbers or words that shouldn't be removed. This article only explains the method for deleting months and years based on the rules shown in the screenshots, without expanding on other buttons or advanced features not reflected in the screenshots.
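As an illustration of what such a rule might look like, here is a Python sketch for dates written as "Month D, YYYY". It is an assumption that the tool accepts the same pattern, and other date formats would need a different rule; the sample text is invented.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Month name, space, 1-2 digit day, comma, space, four-digit year.
full_date = re.compile(rf"(?:{MONTHS}) \d{{1,2}}, \d{{4}}")

text = "Annual Report April 13, 2017 Version 1024"
without_date = full_date.sub("", text)
print(repr(without_date))  # 'Annual Report  Version 1024'
```

Because the year is matched only as part of a full date, the unrelated number 1024 survives, which a bare four-digit rule would have removed.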
4. What if the PDF text cannot be deleted?
If the content in the PDF is in the form of an image, rather than selectable and copyable text, the find and replace might not be able to match it. It is recommended to try selecting the text with a PDF reader first. If it cannot be selected, it means it might not be a normal text layer, and other processing methods should be considered based on the file type.
5. Will batch processing affect the original layout?
After find and replace with empty, the original text position will become blank, and other page content usually remains in place. Due to the complexity of PDF layouts, effects may vary between different files, so processed pages should be spot-checked afterwards, especially covers, headers, footers, and areas near tables.
Summary: Leave Repetitive PDF Text Cleanup to Batch Processing Tools
The key to batch deleting keywords from multiple PDFs is not how to modify a single file, but how to stably apply the same set of rules to a batch of files. The "Find and Replace Keywords in PDFs" function provided by HeSoft Doc Batch Tool can achieve wildcard-style matching through "Use Formula for Fuzzy Text Search," and achieve deletion by leaving the replacement content empty.
In this article's example, importing 4 PDFs, filling in the two find rules April|May and \d{4}, and leaving the replaced keyword list empty was enough to batch delete the months and four-digit years in the PDFs. For users who frequently process reports, contracts, archived materials, and externally published PDFs, this method can significantly reduce the time spent on repeated opening and manual editing.
It is recommended that you prepare a backup of the original files before use, select a small number of PDFs to test the wildcard rules, and then batch process the complete folder after confirming they are correct. This allows you to leverage the batch processing efficiency of the office software while reducing the risk of accidental deletion.