File names for project materials, client files, test reports, and system-exported data often begin with a serial number. When such files are all mixed together in a single directory, later searching and archiving become inefficient. Using files numbered 101, 102, and 103 as examples, this article explains how to use the "Classify files by file name" function in HeSoft Doc Batch Tool with a custom regular expression that extracts the serial number, so that files are sorted into the corresponding folders in batches. The method applies to various office file types, such as txt, docx, xlsx, and pdf.
In office scenarios such as project management, customer document archiving, financial voucher organization, and inspection report distribution, file names often contain numbers. For example, project number 101, customer number 102, or batch number 103 may appear at the beginning of the file name. This naming convention is meant to make files easy to identify, but if all files are saved in the same folder, the folder becomes increasingly chaotic as the number of files grows.
Many people adopt the most direct method: sort by name, then drag files starting with 101 to the 101 folder and files starting with 102 to the 102 folder. The problem is that this method relies on manual judgment. A few dozen files are manageable, but hundreds quickly become inefficient. It gets worse when file types are mixed, for example txt, docx, doc, xlsx, xls, pdf, and images in the same directory, because manual archiving then becomes even more prone to omissions.
For tasks like this, where "file names follow a rule and the classification action is repetitive," office software can be used for batch processing. This article explains the operation method of HeSoft Doc Batch Tool with screenshots: using the "Classify files by file name" function, select classification by custom regular expression, extract the first three digits of the file name, and batch organize the files into folders like 101, 102, 103, etc.
Applicable Scenarios: Project Numbers, Customer IDs, and Batch Numbers Can All Serve as Classification Criteria
The example in this article extracts the first three digits of the file name, but those digits are rarely just arbitrary numbers. In practical work, they usually carry a clear business meaning:
- Project Number: 101 represents Project A, 102 represents Project B, 103 represents Project C.
- Customer ID: Contracts, attachments, and quotes for different customers are archived by their ID number.
- Department Code: Materials submitted by various departments are stored separately by department code.
- Batch Number: Inspection reports, production records, and logistics data are organized by batch.
- Region Code: Data files for different areas are grouped by region code.
As long as the classification information appears consistently in the file name, it can be extracted via rules. Compared to manual sorting, the advantage of wildcards or regular expressions lies in their reusability: once a rule is set, it can be directly applied to subsequent similar files, eliminating the need for re-evaluation each time.
Effect Preview: From Disordered Files to Number-Based Archiving
Before Processing: File Name Prefixes Differ, but All Piled Together
In the screenshot before processing, the file list already appears sorted by name, but files starting with 101, 102, and 103 are still located in the same directory. Red arrows point out these numbers, indicating that the file names themselves already provide the basis for classification.

If this is a project materials directory, files starting with 101 might belong to one project, files starting with 102 to another, and files starting with 103 to a third. Keeping them mixed makes later searches inconvenient and makes it harder to package and deliver files by project.
After Processing: Each Number Corresponds to a Folder
The screenshot after processing shows that the files have been organized into three folders: 101, 102, and 103. This is the expected result: files whose names begin with the same number are placed into the same classification folder.

After completing such organization, the directory structure becomes more suitable for team collaboration. Project leaders can directly view the corresponding project folder, customer materials can be delivered by customer ID, and archivists can quickly check whether the files under each number are complete.
Operation Steps: Use Regular Expressions to Extract Numbers and Classify in Batches
Step 1: Open the File Organization Tool and Select File Name Classification
Open HeSoft Doc Batch Tool and click "File Organization" in the left-side function navigation. The right side will display multiple functions related to file organization. This time, we need to archive by the project number or customer ID in the file name, so select "Classify files by file name".

Note that file name classification is selected here, not extension classification. Extension classification is suitable for separating file types like txt, docx, pdf, etc.; file name classification is suitable for separating business identifiers like 101, 102, 103. They solve different problems, so clarify your organization basis before choosing.
Step 2: Import Files to Be Organized and Check the List
After entering the function, the first step is to select the files to be processed. You can click "Add files" to add them individually, or click "Import files from folder" to batch import the files from an entire directory. In the example screenshot, files have been imported into the list, with name, path, extension, creation time, and modification time displayed for each.

In the table, you can see file names like 101LON05417.txt, 101LON09060.txt, etc., with the path located at D:\test and the extension being txt. The bottom record count is 20, indicating a total of 20 file records awaiting processing. The purpose of checking the list is to ensure the correct scope of files for batch processing. If there are temporary files, irrelevant files, or improperly named files in the directory, it's recommended to exclude them first to avoid affecting the classification results.
After confirming the file list is correct, click "Next" to enter the processing options settings.
Step 3: Select Classification by Custom Regular Expression
On the processing options settings page, the interface provides multiple classification methods. For simple scenarios, you can categorize by the first character, the first digit, or the first few characters; for more flexible office rules, it's recommended to use "Classify by custom regular expression". This option is selected in the screenshot.

The rule for this example is to extract the first three digits of the file name, so enter ^\d{3} in the regular expression input box. The expression has three parts: ^ anchors the match to the beginning of the file name, \d matches one digit, and {3} repeats that digit match exactly three times. For the example files, the extracted results are 101, 102, and 103.
Compared with ordinary wildcards, regular expressions are better at describing "where to start, which characters to take, and how many characters to take." For more complex file name rules, such as two letters followed by three digits, adjust the expression to match that structure. For this example, ^\d{3} is accurate enough.
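The expression can be tried out before it is entered in the tool. The following is a minimal sketch using Python's re module; the helper name extract_prefix and the file name report_102.txt are made up for illustration, and the tool's own regex engine may differ in details.

```python
import re

# The expression from this example: three digits at the start of the name.
PREFIX_PATTERN = re.compile(r"^\d{3}")

def extract_prefix(file_name):
    """Return the leading three digits of file_name, or None if there are none."""
    match = PREFIX_PATTERN.match(file_name)
    return match.group(0) if match else None

for name in ["101LON05417.txt", "101LON09060.txt", "report_102.txt"]:
    print(name, "->", extract_prefix(name))
```

The first two names yield 101. The last one yields None, because its digits are not at the beginning of the name.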
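As an illustration of adjusting the expression, the sketch below handles the "two letters followed by three digits" case in Python. The pattern ^[A-Za-z]{2}\d{3}, the helper name extract_code, and the sample name AB101_contract.docx are assumptions for demonstration only, not names taken from the tool.

```python
import re

# Assumed naming rule: two ASCII letters followed by three digits,
# e.g. "AB101_contract.docx" (a hypothetical file name).
LETTER_DIGIT_PATTERN = re.compile(r"^[A-Za-z]{2}\d{3}")

def extract_code(file_name):
    """Return the leading letter-digit code, or None if the name does not fit."""
    match = LETTER_DIGIT_PATTERN.match(file_name)
    return match.group(0) if match else None

print(extract_code("AB101_contract.docx"))
print(extract_code("101LON05417.txt"))
```

The first call returns AB101. The second returns None, since that name starts with digits rather than letters.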
Below the page, there are also "Letter case conversion" settings, including default, convert to uppercase, and convert to lowercase. Since the classification basis in this example is numbers and does not involve case changes, keep the default setting. Click "Next" after completing the settings.
Step 4: Set the Save Location and Start Processing
The remaining steps in the page flow are "Set save location" and "Start processing". Choose a new target directory as the save location, especially the first time you use a particular regex rule, so that the original files and the processing results are easy to compare. Avoid overwriting an important directory or mixing the output into it, since that makes verification harder.
After starting the process, the software will analyze the file names in the list one by one, extract the matched number, and create or use classification folders according to the number. For the example files, files starting with 101 will be placed into the 101 folder, files starting with 102 into the 102 folder, and files starting with 103 into the 103 folder. After processing is complete, you can view the final folder structure in the output directory.
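The processing logic described above can be sketched in a few lines of Python. This is not the tool's implementation, only an illustration of the same idea under stated assumptions: the function name classify_by_prefix is hypothetical, files are copied rather than moved so the originals stay in place, and names that do not match are skipped.

```python
import re
import shutil
from pathlib import Path

def classify_by_prefix(source_dir, target_dir, pattern=r"^\d{3}"):
    """Copy each file in source_dir into target_dir/<matched prefix>/.

    Returns a dict mapping each folder name to the file names copied into it.
    """
    regex = re.compile(pattern)
    source, target = Path(source_dir), Path(target_dir)
    copied = {}
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        match = regex.match(path.name)
        if match is None:
            continue  # names that do not match the rule are left alone
        folder = target / match.group(0)
        folder.mkdir(parents=True, exist_ok=True)  # create or reuse the folder
        shutil.copy2(path, folder / path.name)
        copied.setdefault(match.group(0), []).append(path.name)
    return copied
```

For the example directory, a call such as classify_by_prefix(r"D:\test", r"D:\test_sorted") would produce folders named 101, 102, and 103 under the output directory; D:\test_sorted is a made-up target path.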
Common Questions and Precautions
1. Can the regex rule only be used for txt files?
No. The example files are txt because the data in the screenshots was easy to display. File name-based classification does not read file content, so as long as the name matches the rule, Word documents (docx/doc), Excel sheets (xlsx/xls), PDF files, presentation files (pptx/ppt), and image files can all use similar rules.
2. What if the file name doesn't start with three digits?
You need to adapt the expression to the actual naming convention. For example, if names start with a four-digit project number, use ^\d{4}; if they start with two letters, use a letter rule such as ^[A-Za-z]{2}. The key is to keep the rule consistent with the file name structure.
3. Why might some files not enter the expected folder?
A common reason is that the file name does not conform to the expression. For example, there may be spaces, underscores, or descriptive text (such as Chinese characters) in front of the number, or the number may not be at the beginning at all. It is recommended to review the file list and confirm the naming structure before processing. If the names turn out to be inconsistent, rename the files to a uniform pattern first, then run the batch classification.
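One way to spot such names in advance is to list every name the expression fails to match. The Python sketch below does this; the function name find_unmatched and all sample names except 101LON05417.txt are hypothetical.

```python
import re

def find_unmatched(file_names, pattern=r"^\d{3}"):
    """Return the names that the pattern does not match at the start."""
    regex = re.compile(pattern)
    return [name for name in file_names if regex.match(name) is None]

# Sample names with a leading space, a leading underscore, and a number
# that is not at the beginning (all invented for illustration).
sample = ["101LON05417.txt", " 102report.txt", "_103data.txt", "draft-101.txt"]
print(find_unmatched(sample))
# prints [' 102report.txt', '_103data.txt', 'draft-101.txt']
```

Only the first name passes. The other three contain a number, but not in the first three characters, so they would not land in a numbered folder.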
4. What preparations should be made before batch organization?
Three preparations are recommended. First, back up important files, or set the output location to a new directory. Second, test the regular expression on a small sample before running it on the full set. Third, after processing, check that the resulting folder names match your expectations. These steps reduce the risk of errors in batch processing.
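The small-sample test can also be done without touching any file, by counting how many names would end up in each folder. The following Python sketch assumes a hypothetical helper named preview_folders and the label "(unmatched)" for names the rule does not cover; neither comes from the tool.

```python
import re
from collections import Counter

def preview_folders(file_names, pattern=r"^\d{3}"):
    """Count how many names would land in each folder, without moving files."""
    regex = re.compile(pattern)
    counts = Counter()
    for name in file_names:
        match = regex.match(name)
        # Names the rule does not cover are tallied under a separate label.
        counts[match.group(0) if match else "(unmatched)"] += 1
    return counts
```

If the preview shows an unexpected folder name or a non-zero "(unmatched)" count, the expression or the file names need another look before the real run.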
Summary: Use Office Software to Organize Rule-Based Files into Standard Directories
Organizing files in batches by project number or customer ID is essentially about converting the rule information in file names into a folder structure. With its "Classify files by file name" function, HeSoft Doc Batch Tool lets such repetitive work be completed with a one-time setup. After importing files, select "Classify by custom regular expression", enter ^\d{3}, set the save location, and start the process. Files with numbers like 101, 102, and 103 will then be automatically archived into their corresponding folders.
If your work frequently involves organizing project materials, customer files, batch reports, or system export data, this batch processing method is worth adopting first. It saves the time spent manually creating folders and dragging files, makes archiving standards more consistent, and reduces misplaced and omitted files.