When the content to be deleted from multiple PDFs is not a fixed term but variable text such as English month names, 4-digit years, or serial numbers, you can use formula-based fuzzy matching to process the files in a batch. This article uses screenshots of HeSoft Doc Batch Tool to demonstrate how to open the PDF tools, select "Find and Replace Keywords in PDF," import 1.pdf through 4.pdf, choose "Use Formula for Fuzzy Text Search," enter April|May and \d{4} in the find list, and leave the replace list empty so that the matched keywords are deleted. It also covers the before-and-after comparison, the risks of overly broad rules, and what to check before running a batch.
In PDF file management, one of the most troublesome scenarios is needing to delete content that "follows a pattern but isn't identical." For example, some PDFs contain "April," while others have "May"; some years are 2017, others 2020; some number prefixes are the same, but the subsequent digits differ. When facing this kind of variable text, relying solely on a standard search means repeatedly entering different keywords, resulting in very low processing efficiency.
This article uses a practical case study to show how the batch PDF find-and-replace feature in HeSoft Doc Batch Tool, combined with formula-based fuzzy matching, can batch-delete variable keywords from multiple PDFs. The example involves four PDF files whose pages contain the text "April 13, 2017." The goal is to delete the month and the year: the month could be April or May, and the year is a four-digit number. After processing, the month and year disappear, while the day "13," is preserved.
This type of operation is highly suitable for office scenarios in corporate administration, HR, finance, legal, and project management. Examples include batch-cleaning internal dates from external-facing PDF reports, removing old year identifiers, and deleting variable document numbers from contract templates. Compared to manual editing, the advantage of batch file processing is that rules are set up once and executed uniformly across multiple files, reducing repetitive work and lowering the probability of missed items.
Applicable Scenarios: Deleting Variable Text from PDFs, Not Just Fixed Words
If you only need to delete a fixed keyword like "Draft" or "Internal Use," standard exact find-and-replace is sufficient. However, if you want to delete a class of text—such as all English months, all four-digit years, or fixed-format document numbers—you need a more flexible fuzzy matching approach.
The example in this article represents a typical case of variable text deletion. The date in the original PDF is "April 13, 2017," where "April" might change between files, and "2017" could change to another year. Manually listing all complete dates would be very tedious; using a formula for fuzzy search allows you to use rules to describe "what to look for."
Common applicable scenarios include:
- Batch-deleting English months from PDFs, such as April, May, etc.
- Batch-deleting four-digit years from PDFs, such as 2017, 2024, 2026.
- Batch-deleting date fields from PDFs based on the same template, while preserving other body text.
- Batch-cleaning project numbers, version numbers, and batch numbers from old PDFs.
- Batch-processing unified markers in reports, manuals, and notification files.
It must be emphasized that more powerful fuzzy matching places higher demands on rule accuracy. A rule that is too broad might delete extra content, while a rule that is too narrow might miss some targets. Therefore, always verify the effect on a sample file before actual operation.
Result Preview: Before-and-After Comparison of Batch Processing
Before Processing: Four PDF Files Requiring the Same Rule
The folder before processing contains four PDFs named 1.pdf, 2.pdf, 3.pdf, and 4.pdf. These are the targets of this batch task. With office software like HeSoft Doc Batch Tool, batch processing usually starts not by opening files one by one but by adding all target files to the same task list.

As the PDF content screenshot shows, the page prominently displays the date "April 13, 2017." A red box marks the "April" and "2017" to be deleted. The two items are representative: one comes from a small set of candidate words, and the other follows a digit pattern.

After Processing: Content Matching the Rules Has Been Cleared
In the processed PDF, the original positions of the month and year are now blank, but the "13," in the middle was not deleted. This indicates that the software did not delete the entire date as a whole segment, but rather located and cleaned the content based on the keyword rules set by the user.

This result is very important for refined PDF content cleanup. Often, users do not want to delete an entire page or line, only specific variable fields. Using formula-based fuzzy matching and then replacing with nothing can achieve this goal more precisely.
Steps: Batch-Deleting PDF Keywords Using Formula-Based Fuzzy Matching
Step 1: Open the PDF Tools Category
After launching HeSoft Doc Batch Tool, the left side of the interface provides multiple file processing categories, including File Name, Folder Name, File Organization, Word Tools, Excel Tools, PowerPoint Tools, PDF Tools, etc. Since the processing target this time is PDF files, click "PDF Tools" on the left.
In the list of PDF tool functions, select "Find and Replace Keywords in PDF." The feature is described as batch finding and replacing keywords within PDF file content, which suits this "find then delete" requirement well.

The goal of this step is to enter the batch function related to PDF text content processing. Once done, the software takes you to a step-by-step task page, rather than having you open PDFs for editing one by one.
Step 2: Import the PDF Files to Process
After entering the function, the top of the page shows the current task is "Find and Replace Keywords in PDF." The first step is "Select records to process." In the upper right corner, you can see action buttons like "Add Files," "Import Files from Folder," "Clear," and "More."
If you have only a few PDFs, "Add Files" is enough; if the PDFs are already organized in one directory, "Import Files from Folder" is more efficient. In the example, four PDFs have been added to the list: 1.pdf, 2.pdf, 3.pdf, and 4.pdf. Each has the .pdf extension, and the list shows the path and time information for each file.

The goal of this step is to confirm the scope of files involved in the batch process. The expected result is a list containing only the PDFs needing processing and no extraneous files. Before formal processing, check the record count and file paths to avoid accidentally processing other PDFs.
Step 3: Select "Use Formula for Fuzzy Text Search"
Once files are confirmed without errors, click "Next Step" at the bottom to enter "Set processing options." Under "Search Method," the interface offers "Exact Text Search" and "Use Formula for Fuzzy Text Search."
In this case, select "Use Formula for Fuzzy Text Search." What we want to delete is not a single fixed word but text that follows a pattern: the month could be April or May, and the year can be any four consecutive digits. Exact Text Search cannot easily cover all of these variations at once.

The goal of this step is to enable formula-based (pattern) search. The expected result is that the software will match PDF content based on the rules entered next, rather than searching only for textually identical strings.
Step 4: Enter the Keyword Expressions to Find
In the "List of Keywords to Find," the example has two lines: the first is "April|May" and the second is "\d{4}". These two rules correspond to the two types of deletion targets.
"April|May" can be understood as matching April or May, handling the inconsistency of English months across different PDFs. "\d{4}" can be understood as matching four consecutive digits, used to delete the year. This way, even if the year differs across PDFs, as long as it fits the four-digit rule, it can be found.
If the content you want to delete in your own files is different, replace the example rules with your own. For instance, to delete more months, expand the expression accordingly; to delete ID numbers, write an expression that fits the ID format. Do not apply the rules without understanding your file content, especially a construct like "\d{4}", which can match far more than you intend.
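As an illustration of expanding the month rule, the following sketch builds an alternation covering all twelve English month names. It uses Python's re module and assumes the tool's formula syntax behaves like standard regular expressions, which the article's examples suggest but do not confirm. The helper name build_alternation and the sample sentence are hypothetical.

```python
import re

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

def build_alternation(words: list[str]) -> str:
    """Join candidate words with '|', escaping regex metacharacters.

    Longer words are placed first so that a short candidate can never
    shadow a longer one that starts with the same letters.
    """
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(re.escape(word) for word in ordered)

month_rule = build_alternation(MONTHS)
print(month_rule)  # the text you would paste into the find list
print(re.findall(month_rule, "Issued June 5, revised December 1"))
# ['June', 'December']
```

The resulting expression is a single line that can be compared against your own files before it is entered in the tool.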
Step 5: Leave the "List of Keywords to Replace With" Empty
On the right side is the "List of Keywords to Replace With," where the interface prompts, "Leave blank to delete." This sentence is crucial, as it explains that deletion is not a separate button but achieved by "replacing with nothing."
In this example, we want April, May, and four-digit years to disappear from the PDF, so we enter nothing in the replacement column on the right. When the software runs, it will replace the matched content on the left with nothing, thus achieving the deletion effect.
The goal of this step is to transform the batch find-and-replace feature into a batch deletion feature. The expected result is that the matched keywords no longer appear in the output PDFs.
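The "replace with nothing" idea can be sketched outside the tool with Python's re.sub. This is only an illustration on a plain string, under the assumption that the tool's formulas follow standard regular-expression behavior; it does not reproduce how the tool edits the PDF itself. The function name delete_matches is hypothetical.

```python
import re

# The two rules from the find list; the replacement is an empty string.
RULES = [r"April|May", r"\d{4}"]

def delete_matches(text: str, rules: list[str]) -> str:
    """Apply each rule in turn, replacing every match with nothing."""
    for rule in rules:
        text = re.sub(rule, "", text)
    return text

print(repr(delete_matches("April 13, 2017", RULES)))
# ' 13, '
```

Note that the spaces around the deleted words remain, which is consistent with the blank positions seen in the processed PDF.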
Step 6: Continue Setting the Save Location and Start Batch Processing
After completing the keyword settings, continue by clicking "Next Step." The page flow shows the subsequent stages are "Set Save Location" and "Start Processing." Although the screenshot does not show the specific options for these two pages, it is clear from the flow names that the user needs to first specify the save location for the processed PDFs and then start the task.
Here, it is recommended not to mix the processing results directly with the original files. A safer approach is to create a new output folder, such as "PDFs After Keyword Deletion" or "Output Results," and save the processed files there. This facilitates comparing original and new files and allows for a quick rollback if a rule is found to be unsuitable.
After starting the process, wait for the software to complete the batch task. Once finished, open at least a few PDFs for sample checks, especially examining pages containing the target fields, to confirm that the months and years are deleted and other content is preserved.
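For larger batches, the sample check can be partly automated. The sketch below assumes you have already extracted the text of each output PDF by some other means (text extraction is outside its scope) and that the rules behave like standard regular expressions. The function name leftover_matches and the sample data are hypothetical.

```python
import re

RULES = [r"April|May", r"\d{4}"]

def leftover_matches(text_by_file: dict[str, str], rules: list[str]) -> dict[str, list[str]]:
    """Map each file name to any text that still matches one of the rules."""
    report = {}
    for name, text in text_by_file.items():
        hits = [match for rule in rules for match in re.findall(rule, text)]
        if hits:
            report[name] = hits
    return report

# Hypothetical extracted text: 1.pdf was cleaned, 2.pdf still contains a year.
sample = {"1.pdf": " 13, ", "2.pdf": " 13, 2017"}
print(leftover_matches(sample, RULES))
# {'2.pdf': ['2017']}
```

An empty report means none of the rules still match the extracted text; it does not replace opening a few files and checking the layout by eye.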
FAQ and Notes
1. What is the difference between formula-based fuzzy matching and normal keyword search?
Normal keyword search is suitable for completely identical text, while formula-based fuzzy matching is suitable for text with regular variations. For example, "April" is a fixed word, while "April|May" can match two words; "2017" is a fixed year, while "\d{4}" can match any four-digit number.
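The difference can be seen in a small Python sketch that compares a literal substring search with a pattern search. It assumes the tool's formulas work like standard regular expressions; the function names and sample strings are hypothetical.

```python
import re

def exact_hits(text: str, keyword: str) -> int:
    """Count occurrences of one fixed keyword."""
    return text.count(keyword)

def pattern_hits(text: str, rule: str) -> list[str]:
    """Return every piece of text that matches the rule."""
    return re.findall(rule, text)

for page in ["April 13, 2017", "May 2, 2020"]:
    print(page, "|", exact_hits(page, "April"), "|",
          pattern_hits(page, r"April|May"), "|", pattern_hits(page, r"\d{4}"))
# April 13, 2017 | 1 | ['April'] | ['2017']
# May 2, 2020 | 0 | ['May'] | ['2020']
```

The fixed keyword "April" finds nothing in the second string, while the two rules cover both strings without being rewritten.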
2. Will replacing with nothing affect the PDF layout?
Judging from the example results, the deleted positions leave blank spaces, and other content remains displayed. The layout structure of different PDFs may vary, so the final effect depends on the actual file. It is recommended to test on a sample PDF before batch processing.
3. How can I avoid accidentally deleting IDs or other numbers?
Do not blindly use rules that are too broad. For example, "\d{4}" matches any run of four consecutive digits, not only years, and under standard regular-expression behavior it also matches four digits inside a longer number. If the PDF contains report numbers, contract numbers, monetary amounts, etc., they could also be matched. Check the file content first and narrow the rule scope if necessary.
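One possible way to narrow the year rule is sketched below in Python. The narrower expression uses a word boundary (\b) and a non-capturing group, both standard regular-expression features; whether HeSoft Doc Batch Tool accepts them is an assumption that should be tested on a sample file. The sample text is hypothetical.

```python
import re

BROAD = r"\d{4}"
# Narrower variant: a standalone number from 1900 to 2099.
# Support for \b and (?:...) in the tool is an assumption, not confirmed.
NARROW = r"\b(?:19|20)\d{2}\b"

text = "April 13, 2017 | Contract No. 20231234 | Total: 15000 | Ref 8841"

print(re.findall(BROAD, text))
# ['2017', '2023', '1234', '1500', '8841']
print(re.findall(NARROW, text))
# ['2017']
```

The broad rule cuts into the contract number and the amount, while the narrow rule leaves them intact. The narrow rule would still match a standalone four-digit ID that happens to fall in the 1900 to 2099 range, so checking the file content remains necessary.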
4. Do I need to check "Ignore Letter Case"?
The screenshot shows an "Ignore letter case" option, which is unchecked in the example. If capitalization is inconsistent across your PDFs, for example both "April" and "april" appear, you can enable it. Whether to check it depends on how broad you want the match to be.
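The effect of ignoring case can be sketched with Python's re.IGNORECASE flag. This assumes the tool's option behaves like a standard case-insensitive regex match, which the article does not confirm; the function name and sample strings are hypothetical.

```python
import re

RULE = r"April|May"

def month_hits(text: str, ignore_case: bool = False) -> list[str]:
    """Find month matches, optionally ignoring letter case."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.findall(RULE, text, flags)

print(month_hits("april 13 and APRIL 14"))                    # []
print(month_hits("april 13 and APRIL 14", ignore_case=True))  # ['april', 'APRIL']

# Side effect: the ordinary word "may" now matches as well.
print(month_hits("You may apply in May", ignore_case=True))   # ['may', 'May']
```

The last line shows why the option widens the match scope: with case ignored, the verb "may" is indistinguishable from the month.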
5. Why should you back up before batch processing?
Batch processing is highly efficient, but once a rule is set incorrectly, it affects a batch of files, not just one. Backing up the original PDFs or outputting to a new directory is a fundamental practice to mitigate risk, especially for important files like formal reports, contracts, and archived materials.
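If you prefer to script the backup rather than copy files by hand, a minimal sketch with Python's standard library might look like this. The function name back_up_pdfs and the folder names are hypothetical, and the demo runs inside a temporary directory with placeholder files instead of real PDFs.

```python
import shutil
import tempfile
from pathlib import Path

def back_up_pdfs(source_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy every .pdf in source_dir into backup_dir and return the copies."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for pdf in sorted(source_dir.glob("*.pdf")):
        target = backup_dir / pdf.name
        shutil.copy2(pdf, target)  # copy2 also tries to preserve timestamps
        copies.append(target)
    return copies

# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    originals = Path(tmp) / "originals"
    originals.mkdir()
    for name in ("1.pdf", "2.pdf"):
        (originals / name).write_bytes(b"%PDF-1.4 placeholder")
    copies = back_up_pdfs(originals, Path(tmp) / "backup")
    print([copy.name for copy in copies])
    # ['1.pdf', '2.pdf']
```

Keeping the backup in a separate folder from both the originals and the tool's output makes it easier to compare versions and roll back.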
Summary: Replace Repetitive Work with Rules for More Efficient PDF Batch Cleanup
This article demonstrated a typical workflow for batch-deleting PDF keywords: access the PDF tools in HeSoft Doc Batch Tool, select "Find and Replace Keywords in PDF," import multiple PDF files, choose "Use Formula for Fuzzy Text Search," enter "April|May" and "\d{4}" in the search list, and leave the replacement keyword list empty. Finally, the software deletes the matched months and years.
The value of this method lies in not requiring users to open PDFs one by one, nor requiring the target text to be completely consistent across files. As long as the content has a pattern, it can be batch-matched using rules. For office scenarios involving batch deletion of variable PDF text, batch cleanup of date fields, or batch processing of content in multiple files, this approach can save a significant amount of time.
If you are processing many PDFs, Word documents (.docx, .doc), or other office files, consider delegating highly repetitive cleanup tasks to batch processing tools. In practice, test your rules on a small number of files first, then expand to batch execution for an entire folder. This keeps the efficiency gain while making the results more reliable.