When many PDF files contain text that follows the same format but differs in content, such as English month names, years, numbers, or dates, opening each PDF to delete that text by hand is very time-consuming. This article uses the batch deletion of "April" and four-digit years as an example to show how the "Find and Replace Keywords in PDF" feature in HeSoft Doc Batch Tool locates keywords in bulk through wildcard or formula-based fuzzy matching, and deletes them by leaving the replacement content blank.
When organizing contracts, reports, archives, data packages, or public-version PDFs, a practical problem often arises: many PDF files contain a type of text that needs to be deleted, but the text is not exactly identical from file to file. For example, some files contain "April 13, 2017," others have "May 20, 2018," and others include different years, serial numbers, or dates. A plain find operation handles only one fixed term at a time, and opening each PDF for manual deletion is both time-consuming and prone to omissions.
This article addresses the problem of "batch fuzzy deletion of PDF keywords." We will use the office software HeSoft Doc Batch Tool and its "Find and Replace Keywords in PDF" function to add multiple PDFs to a processing list at once, and then match the variable content with a formula-based fuzzy text search. The key point is that leaving the "Replacement Keyword List" empty deletes the matched text. In the screenshot example, the goal is to delete the English month "April" and the four-digit year "2017" from the date on the PDF cover. After processing, those positions on the page are cleared, leaving only the middle part "13,".
Applicable Scenarios: What PDF Content is Suitable for Batch Deletion with Wildcards
Wildcards or formula-based fuzzy searches are suitable for PDF text that follows a pattern but is not identical from file to file. Examples include batch-deleting dates, years, months, serial numbers, version numbers, report numbers, contact details, and fixed-format codes from many PDF files. Unlike deleting a fixed keyword, fuzzy matching does not require the text to be identical in every file. As long as it can be described by a rule, it can be processed in batches.
For example, a four-digit year can be located using a pattern like "match 4 consecutive digits"; multiple English months can be matched simultaneously using a pattern like "April or May"; certain prefixes with numbers can also be located through formula rules. The advantage is clear: you don't need to enter different keywords for each PDF individually, nor repeatedly open files to check and delete.
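The patterns described above can be sketched with Python's `re` module. This is only an illustration: Python's regex syntax stands in for the tool's formula syntax, which may differ in details, and the `No.` prefix rule and the sample strings are hypothetical.

```python
import re

# Illustrative patterns in Python's regex syntax (an assumption; the tool's
# own formula syntax may differ in details).
YEAR = re.compile(r"\d{4}")          # four consecutive digits
MONTH = re.compile(r"April|May")     # either of two month names
REPORT_NO = re.compile(r"No\.\d+")   # hypothetical "No." prefix followed by digits

# Hypothetical sample texts standing in for content found in different PDFs.
samples = [
    "April 13, 2017",
    "May 20, 2018",
    "Report No.305, issued May 2, 2019",
]

# Show what each rule would locate in each sample.
for text in samples:
    print(text, "->", MONTH.findall(text), YEAR.findall(text), REPORT_NO.findall(text))
```

One rule per kind of content covers all three samples, even though no two of them share the same month, day, or year.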
HeSoft Doc Batch Tool is a batch document processing software designed for office scenarios. Its core value lies in centralizing repetitive file operations. It is suitable not only for individual PDFs but also for batch processing multiple PDFs in folders, making it particularly useful for high-frequency office tasks in administration, HR, legal, finance, archive organization, and data anonymization.
Effect Preview: Date Keywords Needing Deletion Exist in PDF Before Processing
In this example, the folder to be processed contains 4 PDF files, named 1.pdf, 2.pdf, 3.pdf, and 4.pdf. This means we are not processing just a single PDF, but applying the same find-and-delete rules to multiple PDFs at once.

Opening one of the PDFs shows that the date on the cover reads "April 13, 2017". The red boxes mark the parts to be processed: one is the English month "April," and the other is the four-digit year "2017". This content can vary from one PDF to another (a different month, a different year), which is why a simple exact-match search is too inflexible.

If you only need to delete the fixed word "April," you can use an exact text search; but if you want to delete all four-digit years, or match multiple possible months simultaneously, using a formula-based fuzzy text search is more suitable. This allows "fixed words" and "variable words" to be processed within the same batch task.
Post-Processing Effect: Matched PDF Keywords Are Batch Deleted
After processing is complete, opening the PDF again shows that the location where "April 13, 2017" was originally displayed has changed. The English month "April" and the four-digit year "2017" have been deleted. Only the unmatched middle part "13," remains on the page. This indicates that the software has completed the PDF keyword deletion according to the set rules.

It should be noted that the red box in the screenshot marks the blank space after deletion. Since no replacement content was filled in, the software did not replace the text with other characters but directly deleted the matched content. This method is suitable for operations like localized PDF information cleanup, date anonymization, and version information removal.
Operation Step 1: Enter the PDF Tool and Select the Find and Replace Function
After opening HeSoft Doc Batch Tool, select "PDF Tools" from the tool categories on the left. The main interface will display various PDF-related functions, such as adding watermarks to PDFs, converting PDFs to Word, deleting pages from PDFs, etc. The function to use this time is the first item, "Find and Replace Keywords in PDF."

The purpose of clicking this function is to enter the specific workflow for batch finding, replacing, or deleting PDF body content. For the needs of this article, we want to find the month and year in the PDF and leave the replacement content empty, thereby achieving batch deletion.
Before going further, confirm that the text in your PDFs is real, recognizable text. If a PDF consists purely of scanned images and its text cannot be selected, standard text find-and-replace usually cannot target it, so first check whether the file has undergone text recognition. This batch find-and-replace function works best on PDFs whose body text can be copied and searched.
Operation Step 2: Add the PDF Files to be Batch Processed
After entering "Find and Replace Keywords in PDF," you can see buttons like "Add File," "Import Files from Folder," "Clear," and "More" at the top of the interface. The example has already imported 4 PDF files, and the list shows their file names, paths, extensions, creation times, and modification times.

If the number of files is small, you can click "Add File" to select them one by one; if a folder contains a large number of PDFs needing unified processing, you can use "Import Files from Folder." After importing, it is recommended to check the record count and file paths in the list to ensure no incorrect files were selected. The bottom of the screenshot shows "Record Count: 4," indicating that the same batch of processing rules will be applied to the current 4 PDFs.
The expected outcome of this step is: all PDFs from which keywords need to be deleted are added to the pending list. Only files in this list will participate in the subsequent processing. Therefore, before clicking "Next," it's best to check if the file names and paths are correct.
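The check recommended above (confirm the file list and record count before moving on) can be mimicked outside the tool. The following is a minimal sketch using only Python's standard library; the function names `collect_pdfs` and `describe` are hypothetical, and it does not reproduce how the tool itself imports a folder.

```python
from pathlib import Path

def collect_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )

def describe(pdfs):
    """Build a short summary: one line per file path plus a record count."""
    lines = [str(p) for p in pdfs]
    lines.append(f"Record Count: {len(pdfs)}")
    return "\n".join(lines)
```

Calling `print(describe(collect_pdfs("some_folder")))` lists every PDF that would enter the batch, which makes stray or missing files easy to spot before any rule is applied.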
Operation Step 3: Select Formula-based Fuzzy Text Search and Fill in Deletion Rules
Click "Next" to enter "Set Processing Options." In "Set Keyword Options," you can see that the "Search Method" includes "Exact Text Search" and "Formula-based Fuzzy Text Search." The content to be deleted in this example includes variable years, so "Formula-based Fuzzy Text Search" is selected.

In the "Keyword List to Find" on the left, the example fills in two lines of rules. The first line is "April|May", which matches either April or May. The second line is "\d{4}", which matches four consecutive digits and is commonly used to match years. With these rules, the software searches the PDF content for any text that fits either pattern.
On the right is the "Replacement Keyword List," where the interface clearly prompts "Leave empty to delete." Therefore, if the goal is to delete keywords rather than replace them with new text, do not fill in any content on the right side. Keeping the replacement list empty will cause the software to remove the text matched on the left from the PDF.
This step is very critical: if you want to batch delete keywords in a PDF, do not fill in spaces or any other characters on the right side; simply leave it empty. Filling in spaces might leave extra gaps on the page, and filling in other characters will turn it into a replace operation, not a delete operation.
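The difference between an empty replacement, a space, and other characters can be seen in a small sketch. It assumes Python's `re` module as a stand-in for the tool's matching engine and works on a plain string rather than a real PDF; `apply_rules` is a hypothetical helper.

```python
import re

RULES = [r"April|May", r"\d{4}"]   # the two rule lines from the example

def apply_rules(text, replacement=""):
    """Replace every match of every rule with `replacement`."""
    for rule in RULES:
        text = re.sub(rule, replacement, text)
    return text

original = "April 13, 2017"
print(repr(apply_rules(original)))        # ' 13, '    empty replacement: matches are deleted
print(repr(apply_rules(original, " ")))   # '  13,  '  a space leaves extra gaps
print(repr(apply_rules(original, "X")))   # 'X 13, X'  any other text is a replace, not a delete
```

Only the empty replacement leaves the unmatched middle part "13," without introducing new characters.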
Operation Step 4: Set the Save Location and Start Batch Processing
After completing the keyword rule settings, continue by clicking "Next." As seen from the process bar, there are two subsequent steps: "Set Save Location" and "Start Processing." The purpose of setting the save location is to decide where the processed PDFs will be output, to avoid overwriting the original files or causing file management chaos.
When batch processing PDFs, it is recommended to save the results in a separate output folder, such as "PDFs with Keywords Deleted" or "Processed PDFs." This makes it easy to compare the effects before and after processing and keeps the original files as a backup. For office scenarios involving important documents like contracts, reports, and archives, a safer practice is to keep the originals first and then inspect the output files.
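The idea of writing results to a separate folder so that originals are never overwritten can be expressed as a small path check. This is a sketch under assumptions: `output_path_for` is a hypothetical helper, and the folder names are examples, not settings taken from the tool.

```python
from pathlib import Path

def output_path_for(source, output_dir):
    """Map a source PDF to a same-named file inside a separate output folder."""
    source, output_dir = Path(source), Path(output_dir)
    target = output_dir / source.name
    # Refuse a target that resolves to the original file itself.
    if target.resolve() == source.resolve():
        raise ValueError(f"{target} would overwrite the original file")
    return target

print(output_path_for("input/1.pdf", "Processed PDFs"))
```

Keeping the file name and changing only the folder makes it easy to open the original and the processed copy side by side.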
After confirming the save location, enter the "Start Processing" step to execute the task. Once processing is complete, open the output PDFs and check the key areas to confirm whether the target months, years, or other keywords have been deleted. If the rules were set correctly, multiple PDFs will be automatically processed according to the same rules, eliminating the need for manual page-by-page searching.
Frequently Asked Questions and Notes
1. Why use formula-based fuzzy search instead of exact search?
If the keyword is exactly the same in every PDF, exact search is sufficient. But content like dates, years, and serial numbers often changes; for instance, 2017, 2018, and 2019 might all appear. Formula-based fuzzy search can match all such similar content in one go, which makes it better suited to batch-deleting non-fixed keywords from many PDFs.
2. Why should the replacement keyword list be left empty?
Because the goal here is deletion, not replacement. The interface prompt says "Leave empty to delete," so do not input any content on the right side. If new text is entered, the software will replace the matched content with that text.
3. Do I need to back up the PDFs before processing?
Backup is recommended. Batch processing is highly efficient, but if the rules are set too broadly, it might delete content that shouldn't be removed. Saving to a new folder first, then spot-checking the results, is a safer office workflow.
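To illustrate how a rule can be "too broad", the sketch below runs the year rule on a hypothetical line that also contains longer numbers. It uses Python's `re` module; whether the tool's formula syntax supports the `\b` word-boundary shown here is an assumption to verify on a test file.

```python
import re

# Hypothetical text containing longer numbers as well as a year.
text = "Invoice 20170413, order 12345, dated April 13, 2017"

broad = r"\d{4}"        # any four consecutive digits, even inside longer numbers
narrow = r"\b\d{4}\b"   # four digits standing alone as a whole token

print(re.findall(broad, text))    # ['2017', '0413', '1234', '2017']
print(re.findall(narrow, text))   # ['2017']
print(re.sub(broad, "", text))    # 'Invoice , order 5, dated April 13, '
```

The broad rule wipes out the invoice number and most of the order number along with the year, which is exactly the kind of damage that spot-checking the output folder is meant to catch.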
4. Can an entire folder be processed at once?
From the operation interface, you can see the "Import Files from Folder" button. Therefore, PDFs from a folder can be batch-imported into the list and then uniformly processed. This is especially useful for organizing dozens or hundreds of PDF documents.
Summary: Reduce Repetitive PDF Deletion Tasks with Batch Processing
The difficulty in batch-deleting PDF keywords lies not in deleting a single word, but in processing consistently and efficiently across many files, many pages, and a lot of variable content. With the "Find and Replace Keywords in PDF" function of HeSoft Doc Batch Tool, you can add multiple PDFs to a task list at once, use formula-based fuzzy text search to match dates, years, months, and other content, and delete the matches by leaving the replacement list empty.
If you are processing a large number of PDF reports, archives, contracts, or public materials and need to delete dates, serial numbers, sensitive terms, or formatted information, you can follow the steps in this article to first test the rules with a small number of files. Once the effect is confirmed, then batch process the entire folder. This not only reduces repetitive work but also lowers the risk of manual omission errors, making PDF content cleanup work more efficient and controllable.