When working with TXT files day to day, we often need to clean up duplicate or near-duplicate content, such as system-generated markers, repeated data entries, or specific descriptive phrases that should all be removed in one pass. Because plain TXT files have no built-in way to detect duplicates, hunting for these similar entries by hand wastes time and makes omissions likely. Fuzzy matching solves this: it quickly identifies every passage in the document that resembles the keywords to be deleted, so they can all be removed in a single batch.
This article shows how to use a formula (a regular expression) to fuzzy-search for multiple similar keywords in a TXT file and then delete them in one batch, leaving the document more concise. Let's walk through the steps.
When do you need to delete multiple structurally similar pieces of text or numbers from a TXT file?
Clean up duplicate data
TXT files generated by programs often contain many duplicate records, or numbers and text that share the same format. When this redundant information needs to go, fuzzy matching can remove the similarly structured content, making the TXT data file more concise and easier to analyze.
Delete numbers in batches
Some TXT data files contain a large number of meaningless numbers or labels, and deleting them one by one is tedious. A formula-based fuzzy search matches these numbers so they can be deleted in one batch, improving both the readability of the data and processing efficiency.
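As a rough illustration of the idea (not the tool's own implementation), the sketch below uses Python's `re` module to delete every serial-number-like label from a piece of text. The label format `No.` followed by digits and the sample text are assumptions made up for this example.

```python
import re

# Hypothetical sample text; the "No.<digits>" label format is an assumption.
sample = "No.1001 apple\nNo.1002 banana\nNo.1003 cherry"

# "No\." matches the literal prefix, "\d+" matches one or more digits,
# and "\s*" also removes the whitespace that follows the label.
label_pattern = re.compile(r"No\.\d+\s*")

# Replacing every match with an empty string deletes the labels.
cleaned = label_pattern.sub("", sample)
print(cleaned)
```

The digits differ from label to label, which is why a plain keyword search would miss most of them while a single pattern catches them all.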
Remove templated information
Emails and system-generated TXT files often contain template content with a similar structure. To extract the core information, the repeated template content must be deleted so that only the specific keywords remain. A fuzzy search finds the text or numbers with the matching structure so they can be deleted in batches.
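A minimal Python sketch of template removal might look like the following. The template wording, the order numbers, and the customer lines are all invented for illustration; a real pattern would need to be adapted to the actual template text.

```python
import re

# Hypothetical system-generated text; the template wording is an assumption.
sample = (
    "Order 1001 has been created. Please do not reply to this message.\n"
    "Customer: Alice\n"
    "Order 1002 has been created. Please do not reply to this message.\n"
    "Customer: Bob\n"
)

# With re.MULTILINE, "^" anchors at the start of each line.
# "\d+" stands for the part that varies between lines, and "\n?" removes
# the line break as well, so no empty line is left behind.
template_pattern = re.compile(
    r"^Order \d+ has been created\. Please do not reply to this message\.\n?",
    re.MULTILINE,
)

core = template_pattern.sub("", sample)
print(core)
```

Only the fixed wording is written out literally; the varying part is described by its structure, which is what makes the search "fuzzy".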
Preview of fuzzy-searching and batch-deleting TXT keywords
Before processing:
After processing:
Steps to fuzzy-search TXT keywords and delete them in batches
1. Open [HeSoft Doc Batch Tool] and select [Text Tool] - [Find and Replace Keywords in Text].
2. Use [Add File] or [Import File from Folder] to add the TXT files that contain the similar keywords to delete, or drag the files directly into the area below. After confirming that the file list is correct, click Next.
3. On the option settings screen, select [Use Formula Fuzzy Search Text], enter the regular expression in the box under the list of keywords to search for, and leave the box under the list of replacement keywords blank. Click Next again, then click Browse and choose a location to save the new files.
4. When processing finishes, click the path shown in red to open the folder and view the TXT files from which the keywords have been deleted.
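For readers who want to see what such a batch operation amounts to, here is a hedged Python sketch of the same workflow: read each TXT file, replace every regular expression match with nothing, and save the result to a separate folder. It is not how HeSoft Doc Batch Tool is implemented; the function name `delete_matches`, the folder arguments, and the UTF-8 encoding are assumptions, and the tool's regular expression dialect may differ from Python's `re`.

```python
import re
from pathlib import Path


def delete_matches(src_dir: str, dst_dir: str, pattern: str) -> int:
    """Copy every .txt file from src_dir to dst_dir with regex matches removed.

    Returns the number of files written. The original files are not modified
    as long as dst_dir is a different folder from src_dir.
    """
    regex = re.compile(pattern, re.MULTILINE)
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")  # assumes UTF-8 encoded files
        # An empty replacement string corresponds to leaving the
        # replacement keyword list blank in step 3.
        (out / path.name).write_text(regex.sub("", text), encoding="utf-8")
        count += 1
    return count
```

A call such as `delete_matches("input", "output", r"No\.\d+\s*")` would write cleaned copies of every TXT file in a hypothetical `input` folder to an `output` folder, mirroring the way the steps above save new files instead of overwriting the originals.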