When you process data and organize spreadsheets in daily work, Excel worksheets can accumulate large amounts of text and numbers with similar names and repetitive structures, such as product codes, customer names, and addresses. Such data may expose private information. If it is not cleaned up, it not only creates a risk of data leakage but also undermines the accuracy of later statistical analysis. Faced with hundreds of redundant entries that are highly similar but not identical, searching for and deleting them by hand wastes time and risks removing key data, leaving the spreadsheet incomplete. Is there a way to quickly identify these structurally similar texts and numbers and delete them in batches automatically?
This article explains how to use fuzzy matching to quickly locate unneeded, structurally similar data in Excel and delete it in batches, leaving a cleaner and more professional spreadsheet. Let's walk through the steps together!
When to batch delete structurally similar text and numbers in an Excel worksheet?
Clearing redundant duplicate content
Excel files received from others often contain a large number of identically structured notes, serial numbers, or dates. Not only is this information useless, but it can also affect data statistics and sorting. Manual deletion takes a long time; we can use fuzzy search to batch delete these similar texts and numbers.
Cleaning up data before import
When you need to import Excel data into a system or other software, the table format must be uniform, but the original data may contain repeated serial numbers, identifiers, or other numbers. Batch deleting this structurally similar content avoids import errors and lets the system recognize the clean original content automatically.
Removing duplicate template content
When departments in a company compile project summaries or weekly reports, they often copy the same template many times, so every table contains similar explanatory text, examples, or numbers. Left in place, this content makes a large set of tables hard to read. The batch delete function removes the structurally identical content and keeps the real data, which makes the files cleaner and easier to submit, print, and merge.
Effect preview of using fuzzy matching to batch delete keywords in Excel
Before processing:

After processing:

Steps for using fuzzy search to batch delete structurally similar text and numbers in XLS and XLSX files
1. Open HeSoft Doc Batch Tool and select [Find and Replace Keywords in Excel] under [Excel Tools].

2. Use [Add File] or [Import Files from Folder] to add the Excel files from which you want to delete structurally similar text. You can also drag the files directly into the import area below. Then click Next.

3. On the Excel Options screen, check [Cell Text], then check any other options below that apply to your situation.

4. On the Set Keywords screen, select [Use Formula for Fuzzy Text Search]. Enter the regular expression in the keyword list under [Find], and leave the keyword list under [Replace with] blank. Then click Next to go to the save page, click Browse, and choose a save location for the new file.

5. After processing is complete, click the red path to open the folder and view the Excel files from which the keywords have been deleted.
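
Before entering a regular expression in step 4, it can help to check that it matches only the text you intend to delete. The following is a minimal Python sketch and is not part of HeSoft Doc Batch Tool. The sample cell values and the product-code format ("PC-" followed by four digits) are hypothetical, and the tool's regular expression syntax may differ from Python's `re` module, so treat the pattern as a starting point rather than a guaranteed match.

```python
import re

# Hypothetical cell values; the "PC-" plus four digits code format is an
# assumption made for this illustration, not taken from the tool.
cells = [
    "PC-1001 Blue mug",
    "PC-1002 Red mug",
    "Order note: PC-2040 returned",
    "No code here",
]

# Matches "PC-" followed by exactly four digits, plus any trailing whitespace.
pattern = re.compile(r"PC-\d{4}\s*")

# Replacing each match with an empty string mirrors leaving
# the [Replace with] list blank.
cleaned = [pattern.sub("", cell) for cell in cells]

for before, after in zip(cells, cleaned):
    print(f"{before!r} -> {after!r}")
```

Cells that do not contain the pattern pass through unchanged, which is a quick way to confirm that the expression will not remove real data.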
