When working with TXT files, we often need to clean up duplicate or near-duplicate content: system-generated marks, repeated data entries, or descriptive words that should be removed everywhere. Plain TXT files have no built-in deduplication, and searching for such content by hand is slow and easy to get wrong. Fuzzy matching solves this by identifying every passage that resembles the keywords to be deleted, so they can all be removed in one batch.
This article shows how to use formulas (regular expressions) to fuzzy search for multiple similar keywords in TXT files and then delete them in one batch, leaving the document more concise. Let's walk through the steps.
When should you delete structurally similar text or numbers in a TXT file?
Cleaning up duplicate data
Program-generated TXT files often contain many duplicate or identically formatted numbers and text records. To remove the useless duplicates, you can use fuzzy matching to clean up these structurally similar entries, making the TXT data file more concise and easier to analyze.
Batch deleting numbers
Some TXT data files contain many meaningless numbers or annotations, and deleting them one by one is tedious. A formula that fuzzy matches those numbers lets you delete them all at once, improving readability and processing efficiency.
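As a rough illustration of what such a formula does, here is a minimal Python sketch. The sample sentence and the bracketed-number format are assumptions made up for this example; the pattern simply matches any number in square brackets and replaces it with nothing.

```python
import re

# Hypothetical sample text: the bracketed numbers stand in for meaningless annotations.
text = "Order received [10233] and packed [10234] for shipping [7]."

# \s* absorbs the whitespace before each marker; \[\d+\] matches any bracketed number.
pattern = re.compile(r"\s*\[\d+\]")

# Replacing every match with an empty string deletes it.
cleaned = pattern.sub("", text)
print(cleaned)  # Order received and packed for shipping.
```

One pattern covers every bracketed number regardless of its digits, which is what makes the search "fuzzy" rather than a literal keyword match.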
Removing templated information
TXT files generated from emails or by systems usually contain structurally similar template content. To extract the core information, delete this repeated template text and keep only the content you need. A fuzzy search can batch delete any text or numbers that follow the corresponding structure.
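The same idea applies to whole template lines. The sketch below is only an assumption-based example: the ticket headers and the automated-message notice are invented sample data, and the pattern removes each matching line together with its line break.

```python
import re

# Hypothetical system export: template lines wrapped around the real content.
text = (
    "Ticket #4821 opened by Alice\n"
    "This is an automated message, please do not reply.\n"
    "Printer on floor 3 is offline\n"
    "Ticket #4822 opened by Bob\n"
    "This is an automated message, please do not reply.\n"
    "VPN drops every hour\n"
)

# With re.MULTILINE, ^ anchors at the start of every line.
# Each alternative describes one template line; \n? also removes its line break.
template = re.compile(
    r"^(?:Ticket #\d+ opened by \w+|This is an automated message.*)\n?",
    re.MULTILINE,
)

core = template.sub("", text)
print(core)
```

Only the two content lines remain, because the ticket numbers and names vary while the line structure stays the same.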
Preview: fuzzy search and batch deletion of keywords in a TXT file
Before processing:

After processing:

Steps for fuzzy searching and batch deleting keywords in TXT files
1. Open [HeSoft Doc Batch Tool], select [Text Tools] - [Find and Replace Keywords in Text].

2. Use [Add File] or [Import Files from Folder] to add the TXT files from which similar keywords should be deleted. You can also drag the files directly into the area below. After confirming the file list is correct, click Next.

3. On the options screen, select [Use Formula to Fuzzy Find Text], enter the regular expression formula in the find keyword list, and leave the replace keyword list blank so that every match is deleted rather than replaced. Click Next, then click Browse to choose where the new files will be saved.

4. When processing is complete, click the path shown in red to open the folder and view the TXT files with the keywords deleted.
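For readers who want to see the logic behind these steps, the workflow can be approximated in a few lines of Python. This is a sketch under stated assumptions, not how HeSoft Doc Batch Tool is implemented: the function name `delete_matches`, the folder arguments, and the UTF-8 encoding are all hypothetical choices for illustration. Like the steps above, it writes cleaned copies to a separate save location and leaves the originals untouched.

```python
import re
from pathlib import Path


def delete_matches(src_dir: str, dst_dir: str, pattern: str) -> int:
    """Write a cleaned copy of every .txt file in src_dir to dst_dir.

    All text matching the regular expression is deleted.
    Returns the total number of deleted matches.
    """
    regex = re.compile(pattern)
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the save location if needed

    total = 0
    for path in sorted(Path(src_dir).glob("*.txt")):
        # Assumption: the files are UTF-8 encoded.
        text = path.read_text(encoding="utf-8")
        # An empty replacement deletes each match; subn also reports the count.
        cleaned, count = regex.subn("", text)
        (out / path.name).write_text(cleaned, encoding="utf-8")
        total += count
    return total
```

For example, `delete_matches("input", "output", r"SN-\d+ ?")` would remove every serial-style token such as `SN-001` from the TXT files in a hypothetical `input` folder and save the results in `output`.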
