Bulk Redaction Keyword Tutorial for PDFs: Use Wildcards to Clean Multiple PDFs of Months and Years at Once


Update Time: 2026-06-11 09:42:52

Disclaimer: All images, text, and video content on the website are for reference only and may not be the latest, correct, or accurate. In case of any dispute, please refer to the actual experience effect!

This article explains how to use HeSoft Doc Batch Tool to run a wildcard (fuzzy) search across multiple PDF files with the "Find and Replace Keywords in PDF" function and delete the matched content in bulk. The example processes 4 PDF files, with the goal of deleting the English month names April or May, as well as 4-digit years, from the PDF pages. Using screenshots taken before processing, after processing, and during each step, it shows how to import PDFs, select formula-based fuzzy search, fill in the keyword expressions, and leave the replacement content empty to achieve deletion. It also reminds users to back up their files and to be aware of the differences between text-based PDFs and scanned PDFs.

In daily office work, PDF files are often used to archive reports, contracts, notices, manuals, and project materials. The problem is that PDFs are not as easy to batch-edit as Word documents (docx or doc). When the same type of sensitive information, date field, version marker, or fixed keyword appears in dozens or even hundreds of PDFs, opening, searching, and deleting them one by one is very time-consuming and prone to omissions.

This article addresses a very typical batch office problem: using wildcards or formula-based fuzzy matching to batch delete keywords in multiple PDF files. The example includes 4 PDF files, each containing date content similar to "April 13, 2017". We want to delete the English month "April" or "May", and also delete the 4-digit year, such as "2017", while retaining the day number in the middle, "13,". If processed manually, you would need to open all 4 PDFs and locate the content in each one; as the number of files grows, the repetitive work grows in proportion.

With the office software "HeSoft Doc Batch Tool" shown in the screenshot, you can add multiple PDF files to a task at once, use the "Find and Replace Keywords in PDF" function, select "Use formula for fuzzy text search", and then keep the replacement keyword list empty to achieve the effect of batch deleting matched content. The core value of this type of tool is not single file editing, but batch processing files, reducing repetitive operations, and improving efficiency when handling office files like PDF, Word, Excel, and PowerPoint.

Applicable Scenarios: When is Batch Fuzzy Deletion of PDF Keywords Needed?

Batch deleting keywords in PDFs is suitable for scenarios where content formats have patterns, the number of files is large, and the cost of manual modification is high. Particularly when the content to be deleted is not a completely fixed word but a category of similar text, fuzzy searching with wildcards or formulas becomes more practical.

For example, many PDF covers or headers contain date information, which might be "April 13, 2017" or "May 08, 2020". If only ordinary exact search is used, all possible date variations need to be listed one by one; whereas using an expression like "April|May" can match multiple candidate words at once. Another example is that a year is typically 4 digits, and a pattern like "\d{4}" can be used to match 4 consecutive digits, thereby deleting different years in different files.
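The two expressions above can be tried outside the tool before they are used on real files. The following is a minimal sketch that assumes the tool's formula syntax behaves like standard regular expressions (the article does not state which engine is used); Python's `re` module is used only for illustration, and the sample strings and the `find_matches` helper are hypothetical.

```python
import re

# Hypothetical sample strings standing in for text found in different PDFs.
samples = ["April 13, 2017", "May 08, 2020", "June 01, 2021"]

month_rule = re.compile(r"April|May")  # alternation: "April" or "May"
year_rule = re.compile(r"\d{4}")       # exactly four consecutive digits


def find_matches(text):
    """Return the text matched by the month rule, then by the year rule."""
    return month_rule.findall(text) + year_rule.findall(text)


for sample in samples:
    print(sample, "->", find_matches(sample))
```

In this sketch "June" is not matched because it is not listed in the alternation, while every 4-digit year is matched by the single rule "\d{4}".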

This type of operation is applicable to the following office scenarios:

  • Batch delete variable fields like dates, years, and months in PDF report covers.
  • Batch clean up old version numbers, old project codes, or batch numbers in multiple PDF contracts.
  • Batch delete some fixed sensitive words, internal markers, or temporary notes in public materials.
  • Batch process repetitive keywords in English PDFs and Chinese PDFs, reducing manual searching and modification.
  • Perform unified content cleanup on multiple PDF files before archiving, external distribution, or data masking.

If your task is to "delete a fixed word", exact search is sufficient; if your task is to "delete a category of text with a pattern", such as English months, 4-digit years, numbers, amount formats, version numbers, etc., using formula-based fuzzy search is more suitable.

Effect Preview: Changes Before and After Processing

Before Processing: Multiple PDF Files Require Unified Cleanup

Before processing, there are 4 PDF files in the folder, named 1.pdf, 2.pdf, 3.pdf, and 4.pdf. All of them need the same content processing. If you open each PDF individually and manually search for and delete the month and year, the steps are not only repetitive but it is also difficult to guarantee consistent processing for each file.


From the PDF page content, the example files contain a date like "April 13, 2017". The screenshot uses red boxes to mark the two types of content to be deleted: one is the English month "April", and the other is the 4-digit year "2017". The middle "13," is not the target for deletion this time, so more precise rules are needed to only delete the matched month and year.


After Processing: Matched Month and Year Are Deleted

After processing is complete, open the PDF to check, and you can see that the original position of "April" is now blank, the original position of "2017" is also cleared, while the middle "13," remains. This indicates that this batch processing did not delete the entire date segment, but deleted the specified types of text according to the set fuzzy matching rules.


This effect is very suitable for PDF batch processing tasks requiring "partial deletion". Users can use formula matching to find content with common patterns, and then achieve deletion through empty replacement, avoiding manual modification file by file.

Operation Steps: Using Wildcards to Batch Delete Keywords in Multiple PDFs

Step 1: Enter the PDF Tool and Select Find and Replace Keywords in PDF

After opening HeSoft Doc Batch Tool, you can see different categories of office processing on the left, such as Word Tools, Excel Tools, PowerPoint Tools, PDF Tools, etc. Because we are processing PDF files this time, first enter the "PDF Tools" category.

In the PDF tools list, select "Find and Replace Keywords in PDF". From the interface description, it can be seen that this function is used to batch find and replace keyword content in PDF files. Although this article's example is called "delete keywords", the implementation method is essentially "find and replace with empty", meaning that after finding the target content, no new replacement text is written, thus achieving a deletion effect.


The purpose of this step is to enter the correct batch processing module. The expected result is a wizard-style processing page, where the remaining work follows the sequence "Select Records, Set Processing Options, Set Save Location, Start Processing".

Step 2: Add the PDF Files to Be Processed

After entering the function page, the first step is "Select records to process". On the top right of the interface, you can see buttons like "Add Files", "Import Files from Folder", "Clear", "More", etc. For a small number of PDFs, you can click "Add Files" to select them one by one; if the PDFs are all in the same folder, you can use "Import Files from Folder", which is more suitable for batch processing.

In the example, 4 PDF files have been imported, and the list shows the file name, path, extension, creation time, and modification time. The files include 1.pdf, 2.pdf, 3.pdf, 4.pdf, located under a test directory on the D drive. The summary below shows the record count as 4, indicating that this task will process these 4 PDFs simultaneously.


The purpose of this step is to add every PDF that needs keywords deleted to the task list. The expected result is that all target PDFs appear in the file list and the record count is correct. If a file is added by mistake, it can be removed using the delete icon on the right side of the list; to start over, use "Clear" and then re-import.

Step 3: Enter Processing Options and Select Use Formula for Fuzzy Text Search

After confirming the file list is correct, click "Next" at the bottom of the page to enter "Set Processing Options". In the "Set Keyword Options" area, you can see "Search Method". Here there are two choices: "Exact Text Search" and "Use Formula for Fuzzy Text Search".

Since this task is not just deleting one fixed text, but deleting "April or May" and any 4-digit year, you need to select "Use Formula for Fuzzy Text Search". This option is suitable for processing text with certain patterns, such as multiple candidate words, numbers with fixed digit counts, years in dates, etc.


The purpose of this step is to let the software search PDF content according to more flexible rules, rather than only finding identical strings. The expected result is that, once formula or wildcard expressions are filled in the keyword list, the software matches the corresponding text according to those rules.

Step 4: Fill in the Keyword Rules to Be Deleted

In the "Keyword List to Find", the example has two lines filled in. The first line is "April|May", and the second line is "\d{4}". In this scenario, "April|May" matches the English month April or May, with "|" meaning "or"; "\d{4}" matches 4 consecutive digits, which is the common year format, e.g., 2017, 2020, 2026, etc.

The key here is: do not just treat the screenshot's example as a fixed answer, but adjust the rules according to your own PDF content. If your PDF requires deleting January, February, March, you can also write the corresponding months into the rule; if you want to delete a certain type of number, you can also use an expression suitable for the number pattern.
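When the list of words to delete is longer, the "A|B|C" rule can be generated rather than typed by hand. The sketch below assumes regex-style alternation as in the example rule "April|May"; the helper name `build_alternation` is hypothetical and is not part of HeSoft Doc Batch Tool. It escapes special characters so that a word such as a version number is matched literally.

```python
import re


def build_alternation(words):
    """Join literal words into one 'A|B|C' rule, escaping special characters."""
    # Put longer words first so a shorter word cannot shadow a longer one.
    ordered = sorted(set(words), key=lambda w: (-len(w), w))
    return "|".join(re.escape(w) for w in ordered)


rule = build_alternation(["January", "February", "March"])
print(rule)  # paste the resulting rule into the keyword list
```

The generated string can then be copied into one line of the "Keyword List to Find".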

The example does not have "Ignore letter case" checked, meaning that case might affect the matching result. If the PDF has both "April" and "april", the user needs to decide whether to enable the ignore case option based on the actual situation, or write different case forms separately.
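The effect of case sensitivity can be illustrated with a short sketch. It assumes the "Ignore letter case" option behaves like the usual case-insensitive flag in regular expressions, which the article does not confirm; the sample text is hypothetical.

```python
import re

# Hypothetical text containing both a lowercase and a capitalized month.
text = "april 13, 2017 and April 14, 2017"

# Default behavior: only the exact capitalization is matched.
case_sensitive = re.findall(r"April|May", text)

# With case ignored: both spellings are matched.
case_insensitive = re.findall(r"April|May", text, flags=re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```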

Step 5: Keep the Replacement Keyword List Empty to Achieve Deletion

On the right side, you can see the "Replacement Keyword List", with a red prompt next to it saying "Leaving blank means deletion". This is the key operation of this article: if you want to delete the found content, you do not need to enter new replacement text, just keep the right side empty.

That is to say, the processing logic this time is: find "April or May" in the PDF and replace the found content with empty; then find 4 consecutive digits and also replace the found content with empty. After this processing, the original English month and year will be cleared, while content not matching the rules will remain.
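The processing logic described above can be sketched as two find-and-replace passes with an empty replacement. This is only an illustration on a plain string, assuming standard regular expression behavior; it does not edit PDF files, and the function name `delete_matches` is hypothetical.

```python
import re

rules = [r"April|May", r"\d{4}"]  # the two rules from the example
replacement = ""                  # an empty replacement means deletion


def delete_matches(text, patterns, repl=""):
    """Apply each rule in turn, replacing every match with repl."""
    for pattern in patterns:
        text = re.sub(pattern, repl, text)
    return text


result = delete_matches("April 13, 2017", rules, replacement)
print(repr(result))
```

Note that only the matched text is removed: in this sketch the spaces around "13," are kept, which corresponds to the blank positions seen in the processed PDF.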

The purpose of this step is to turn "batch replacement" into "batch deletion". The expected result is that, in the processed PDFs, text matching the rules is no longer displayed.

Step 6: Continue to Next Step, Set Save Location and Start Processing

After setting the search rules and deletion method, click "Next". Following the interface flow, there are two more stages: "Set Save Location" and "Start Processing". Although the screenshot does not expand the save location page, it can be reasonably inferred from the wizard steps that the user needs to choose the save location for the processed files according to the interface prompts, and then proceed to the start processing stage.

It is recommended to choose a new output directory before formal processing, or at least ensure there is a backup of the original files. The advantage of batch processing is processing multiple files at once, but it also means that if the rules are written incorrectly, multiple files will be affected simultaneously. Therefore, before processing a large number of PDFs, it is best to test the effect with 1 or 2 sample files first, confirm the deletion scope is correct, and then execute batch processing.
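The backup advice can also be followed with a short script run before the batch task. This is a minimal sketch that is independent of HeSoft Doc Batch Tool; the function name `backup_pdfs` and any folder paths passed to it are hypothetical.

```python
import shutil
from pathlib import Path


def backup_pdfs(source_dir, backup_dir):
    """Copy every .pdf in source_dir into backup_dir; return the copied names."""
    source = Path(source_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the backup folder if needed
    copied = []
    for pdf in sorted(source.glob("*.pdf")):
        shutil.copy2(pdf, target / pdf.name)  # copy2 also preserves timestamps
        copied.append(pdf.name)
    return copied
```

For instance, calling `backup_pdfs` with the folder that holds 1.pdf to 4.pdf and a separate backup folder would leave an untouched copy of each file to fall back on if a rule turns out to be too broad.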

After processing is complete, open the output PDF to check. The result in the example shows that the positions of the month and year have become blank, while "13," still remains, indicating the rule is effective.

Common Questions and Precautions

1. Why use formula-based fuzzy search instead of exact search?

Exact search is suitable for deleting completely identical content, such as deleting the phrase "Internal Material" from all PDFs. But if the content to be deleted varies, such as different months, years, or numbers, exact search would require listing every variant as a separate entry. Formula-based fuzzy search can describe a category of text with a single rule, making it suitable for batch deleting variable keywords in PDFs.

2. Why can the "Replacement Keyword List" be left blank?

As the screenshot prompt indicates, "Leaving blank means deletion". This means the software, after finding the target text, writes no replacement content, effectively clearing the target text. For batch deleting PDF keywords, this is a very direct method of operation.

3. Is it guaranteed to work on scanned PDFs?

If the text in the PDF is selectable and copyable, find and replace will usually work. If the PDF is a scanned image, the words on the page may be only picture content, which the text search function might not recognize. When dealing with scanned documents, it is recommended to test with a small number of files first to confirm whether the target text can be matched.

4. What is the impact of writing rules incorrectly?

If a rule is written too broadly, it may delete content that should not be deleted. For example, "\d{4}" matches any 4 consecutive digits, not only years: it also matches 4-digit amounts, codes, and page numbers, and it can match 4 digits inside a longer number such as an order number. Therefore, before processing, review the PDF content to confirm that such rules will not accidentally remove other important information.
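The over-matching risk can be seen in a small sketch. It assumes standard regular expression behavior, and the word-boundary form "\b\d{4}\b" is shown only as one possible tightening; whether the tool's formula syntax supports "\b" is not confirmed by this article and should be tested on a sample file. The sample text is hypothetical.

```python
import re

# Hypothetical text with an 8-digit order number and a real year.
text = "Order 20171234, issued April 13, 2017"

# The loose rule also matches 4-digit slices of the longer number.
loose = re.findall(r"\d{4}", text)

# Word boundaries restrict matches to standalone 4-digit numbers.
strict = re.findall(r"\b\d{4}\b", text)

print(loose)
print(strict)
```

With the loose rule, the order number would be deleted along with the year; the stricter rule leaves it intact in this sketch.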

5. Is a backup necessary before batch processing?

Backup is recommended. The efficiency of batch processing files is high, but caution is also needed. Especially for important materials like contracts, formal reports, and archived files, keeping the original files first and then outputting the processed new files is a safer office workflow.

Summary: Using Batch Processing Tools to Reduce Repetitive PDF Deletion Work

The core idea of batch deleting PDF keywords is not complicated: first add multiple PDF files to the task, then use "Find and Replace Keywords in PDF", select "Use Formula for Fuzzy Text Search", fill in matching rules in the keyword list to find, and finally keep the replacement content empty to achieve batch deletion.

In this article's example, the two rules "April|May" and "\d{4}" were used to batch delete the English months and 4-digit years in multiple PDFs. Compared to opening PDFs one by one and searching manually, this method significantly reduces repetitive work and is especially suitable for processing a large number of office files with similar content formats.

If you frequently need to clean up repetitive content in files like PDF, docx, doc, xlsx, pptx, you can prioritize using office software like HeSoft Doc Batch Tool to delegate repetitive operations to the batch processing workflow. It is recommended to test rules with sample files first, and then execute batch tasks on the complete folder, which can both improve efficiency and reduce the risk of accidental deletion.


Creation Time:2026-06-11 09:42:31
