Batch classify files by the first three digits of the file name: Use regular expressions to automatically sort TXT, DOCX, and PDF files into folders


Update Time: 2026-06-28 06:53:40

Disclaimer: All images, text, and video content on the website are for reference only and may not be the latest, correct, or accurate. In case of any dispute, please refer to the actual experience effect!

When many file names contain serial numbers, client codes, region codes, or project codes, creating folders by hand and moving files one by one is very time-consuming. Using file names that start with 101, 102, and 103 as an example, this article shows how to use the "Classify files by file name" feature in HeSoft Doc Batch Tool: a custom regular expression, ^\d{3}, extracts the first three digits of each file name, and the software then creates the corresponding classification folders in one batch and groups similar files together.

In daily office work, files are rarely piled together with no rules at all; their names often already contain information that can be used for classification. For example, in files like 101LON05417.txt, 102NYC53821.txt, and 103PAR08578.txt, the leading 101, 102, and 103 might represent departments, clients, batches, cities, projects, or order types. The problem is that once the number of files grows from dozens to hundreds or thousands, checking file names, creating folders, and dragging files by hand is not only inefficient but also makes it easy to put files in the wrong place.

This article addresses this typical problem of batch classifying files by file name. We will use the file organization feature in the office software "HeSoft Doc Batch Tool": a custom regular expression extracts the three leading digits from each file name, and the software then sorts the files into the corresponding folders automatically. In the example, files whose names start with 101 are placed in a 101 folder, files starting with 102 in a 102 folder, and files starting with 103 in a 103 folder. The approach applies not only to TXT text files but also to common office files such as Word documents (doc, docx), Excel spreadsheets (xls, xlsx), PDFs, images, and compressed archives, as long as the file names follow a stable classification rule.

Applicable Scenarios: Which Files Are Suitable for Batch Classification by File Name Using Regular Expressions

"Classify by file name" is suitable for handling materials whose file names have fixed patterns. For instance: the file name starts with a client code followed by the business type and serial number; the first few characters are a project number followed by a city abbreviation or date; the end of the file name contains a version number, month, or year; or there are department codes at a fixed position in the middle of the file name. As long as this information can be matched by wildcards or regular expressions, it can serve as a basis for classification.
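As a rough illustration of rules that are not tied to the start of the name, the following Python sketch pulls a key from the end of a name or from a fixed position in the middle. It does not show how HeSoft Doc Batch Tool works internally. The helper key_from_stem, the sample name invoice_ACME_2024.pdf, the decision to ignore the extension, and the use of a capture group are all assumptions made for this example.

```python
import re
from pathlib import Path
from typing import Optional


def key_from_stem(pattern: str, file_name: str) -> Optional[str]:
    """Search the name without its extension.

    Returns capture group 1 if the pattern defines one,
    otherwise the whole match, or None when nothing matches.
    """
    match = re.search(pattern, Path(file_name).stem)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


# Four digits at the end of the name (hypothetical year suffix).
print(key_from_stem(r"\d{4}$", "invoice_ACME_2024.pdf"))

# Three uppercase letters right after the first three characters.
print(key_from_stem(r"^.{3}([A-Z]{3})", "101LON05417.txt"))
```

Whether a given tool matches against the extension, or supports capture groups, is something to confirm in that tool before relying on such patterns.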

Taking the TXT files in the screenshot of this article as an example, the file name structure is roughly: three leading digits + English letter code + numeric serial number + extension. The classification basis we need happens to be the three leading digits of the file name, so we can use the regular expression ^\d{3} to match. "^" means to start matching from the beginning of the file name, "\d" represents a digit, and "{3}" means 3 consecutive digits. In other words, the software will extract 3 digits from the start of each file name as the classification folder name.
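To see what the expression extracts before using it in the software, it can be tried in any regex-capable language. The snippet below is a minimal Python sketch; the function name classification_key and the unmatched sample notes.txt are made up for illustration, and the software's own regex engine may differ in details from Python's re module.

```python
import re
from typing import Optional

# Same pattern as in the article: three digits at the start of the name.
PREFIX = re.compile(r"^\d{3}")


def classification_key(file_name: str) -> Optional[str]:
    """Return the three leading digits of file_name, or None if absent."""
    match = PREFIX.match(file_name)
    return match.group(0) if match else None


for name in ["101LON05417.txt", "102NYC53821.txt", "103PAR08578.txt", "notes.txt"]:
    print(name, "->", classification_key(name))
# 101LON05417.txt -> 101
# 102NYC53821.txt -> 102
# 103PAR08578.txt -> 103
# notes.txt -> None
```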

This method is particularly suitable for the following office scenarios: archiving financial bills by supplier code, archiving inspection reports by equipment number, archiving business attachments by client code, archiving project materials by project code, archiving photos or scans by batch number, and archiving log files by server or module number. Compared to manual sorting, batch processing files can reduce repetitive work, make the folder structure clearer, and facilitate subsequent search, backup, and handover.

Effect Preview: Files Mixed in the Same Directory Before Processing

Before processing, all files are in the same folder. Although the leading digits already show three groups (101, 102, 103), the files are still mixed together. With many files, finding all materials for a specific number means sorting, searching, or filtering them by hand.


From the pre-processing screenshot, you can see that the leading three digits are marked on the left side of the file name. For instance, 101LON05417.txt and 101SYD26137.txt belong to 101; 102LON48897.txt and 102NYC53821.txt belong to 102; 103LON23328.txt and 103PAR08578.txt belong to 103. A human can certainly see the pattern, but the problem is that the cost of manual operation quickly increases when the quantity grows. The value of using office software for batch organization lies here: letting the software perform repetitive actions according to rules, while the user only needs to set the classification rule once.

Effect Preview: 101, 102, 103 Classification Folders Automatically Generated After Processing

After processing is complete, the originally mixed files are categorized into corresponding folders based on the three leading digits of their file names. In the example results, you can see that the software generated three folders: 101, 102, and 103, representing the three classification values extracted from the file names.


This result is very intuitive: in the future, to view files corresponding to 101, just go to the 101 folder; for 102 or 103, go directly to the corresponding directory. For a repository needing long-term maintenance, such a directory structure is much easier to manage than a large folder piled with numerous files.

Operation Steps: Using HeSoft Doc Batch Tool to Classify Files by Regular Expression

Step 1: Enter "File Organization" and select "Classify Files by File Name"

After opening HeSoft Doc Batch Tool, select File Organization in the left function bar. On the File Organization page, you can see multiple tools related to file archiving, such as classify by file name, classify by extension, batch create new folders based on existing folders, etc. The function used in this article is the first one: Classify Files by File Name.


The purpose of this step is to tell the software that what we want to do is not renaming or format conversion, but to establish classification relationships based on a segment of the file name. After selecting this function, the software enters a step-by-step processing flow, where subsequent steps involve file import, classification rule setting, save location setting, and start processing.

Step 2: Add or Import Files to Be Classified from a Folder

After entering the "Classify Files by File Name" function, you can see buttons like Add Files, Import Files from Folder, Clear, More at the top of the interface. If files are scattered, you can use "Add Files"; if files are already centralized in a specific directory, it's more suitable to use "Import Files from Folder".


After importing, the software will display the serial number, name, path, extension, creation time, and modification time of the files to be processed in a list. In the screenshot, you can see that the file path is in the D:\test directory, the extension is txt, and the record count is 20. Through this step, users can first confirm that the imported files are correct and avoid adding files that don't need sorting into the batch task. If a file is found that should not be processed, it can be removed according to the operation column in the interface; if the import is wrong, you can also use the "Clear" button at the top to re-select.

The expected result of this step is: all files needing classification appear in the list, and the file name pattern matches the current classification goal. For example, this article aims to classify by the first three digits, so the imported file names should start with three-digit numbers like 101, 102, 103.

Step 3: Select "Classify by Custom Regular Expression" in Processing Options

After confirming the file list is correct, click Next at the bottom to enter "Set Processing Options". This is the key step for batch classification. The interface provides various classification methods, including classify by first character, by first digit, by first English letter, by last few characters, by first few characters, by characters within a custom position range, and Classify by Custom Regular Expression.


In this example, we choose Classify by Custom Regular Expression and fill in the regular expression input box with:

^\d{3}

This expression means: starting from the beginning of the file name, match 3 consecutive digits. For 101LON05417.txt, the match result is 101; for 102NYC53821.txt, it's 102; for 103LON23328.txt, it's 103. The software uses the matched content as the classification folder name, which archives the files in batches by file name prefix.

The lower part of the interface also provides letter case conversion options, including Default, Convert to uppercase, Convert to lowercase. The classification basis in this article is digits, so keep it as Default. If your classification basis involves English letters, for example, file names appearing in different cases like abc, ABC, Abc, you can choose whether to unify the case based on actual needs to reduce duplicate classification folders.
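The effect of the case option can be illustrated with a short Python sketch. It only models the idea of normalizing the extracted value before grouping; the function group_by_prefix, the rule ^[A-Za-z]+, and the sample file names are assumptions for this example, not the software's implementation.

```python
import re
from collections import defaultdict

# Hypothetical rule: all letters at the start of the file name.
LEADING_LETTERS = re.compile(r"^[A-Za-z]+")


def group_by_prefix(names, case="default"):
    """Group names by their leading letters, optionally unifying the case."""
    groups = defaultdict(list)
    for name in names:
        match = LEADING_LETTERS.match(name)
        if match is None:
            continue  # no leading letters, so no classification value
        key = match.group(0)
        if case == "upper":
            key = key.upper()
        elif case == "lower":
            key = key.lower()
        groups[key].append(name)
    return dict(groups)


names = ["abc001.docx", "ABC002.docx", "Abc003.docx"]
print(group_by_prefix(names))           # three keys: abc, ABC, Abc
print(group_by_prefix(names, "upper"))  # one key: ABC
```

With the default setting the three spellings produce three separate groups, while unifying the case collapses them into one.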

Step 4: Set the Save Location and Start Processing

After completing the regular expression setting, continue clicking Next to enter the "Set Save Location" step in the workflow. This step determines where the classified files will be saved. It is recommended to choose an easily identifiable new directory for convenient checking of results after processing. For important materials, you can also run a test in a test directory first to confirm the rules are correct before processing the official files.

After the save location is set, enter the "Start Processing" step. At this point, the software will automatically create corresponding classification folders based on the previously imported file list and the regular expression rule, and group files matching the same classification value together. After processing is complete, you will see a folder structure like 101, 102, 103.

The expected result of this step is: without manually creating folders one by one or dragging files individually, the software completes the batch organization based on the matching results in the file names. For a large number of TXT, docx, xlsx, PDF, and other files, this can significantly reduce repetitive work.
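For readers who want to see the logic behind this kind of batch organization, here is a minimal Python sketch of the same idea. It is not the software's code: the function classify_files and its parameters are hypothetical. It also copies files rather than moving them and skips names that do not match, which are choices made for this example only.

```python
import re
import shutil
from pathlib import Path


def classify_files(source_dir, target_dir, pattern=r"^\d{3}"):
    """Copy each file in source_dir into target_dir/<matched text>/.

    Returns a dict mapping each folder name to the file names copied there.
    Files whose names do not match the pattern are left untouched.
    """
    regex = re.compile(pattern)
    copied = {}
    for path in sorted(Path(source_dir).iterdir()):
        if not path.is_file():
            continue  # ignore subfolders
        match = regex.search(path.name)
        if match is None:
            continue
        folder = Path(target_dir) / match.group(0)
        folder.mkdir(parents=True, exist_ok=True)  # create 101, 102, ... on demand
        shutil.copy2(path, folder / path.name)
        copied.setdefault(match.group(0), []).append(path.name)
    return copied
```

Calling classify_files on a source folder such as D:\test, with a separate output folder of your choice, would produce subfolders named 101, 102, and 103 for the sample files in this article. Copying keeps the originals in place, so the result can be checked before anything is deleted.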

Regular Expression Explanation: Why Use ^\d{3}

Many users find "regular expressions" complex, but in batch file organization, you only need to master a few common patterns. The ^\d{3} used in this article is a very typical file name prefix match rule.

Here, "^" anchors the match to the start of the file name. Without it, the expression may match three digits anywhere in the file name; with it, only the beginning can match, so a later number sequence is never mistaken for the classification basis. "\d" represents any single digit from 0 to 9. "{3}" means the preceding element, here "\d", must occur exactly 3 times in a row; the three digits do not have to be the same. So the entire expression means: match the three digits at the very beginning of the file name.
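The difference the "^" anchor makes can be checked with a small Python sketch. The file name LON101_05417.txt is a made-up example in which the digits do not come first; Python's re.search is used here because it scans the whole string, which may or may not match how a particular tool applies the expression.

```python
import re

# Hypothetical name whose digits are not at the start.
name = "LON101_05417.txt"

anchored = re.search(r"^\d{3}", name)   # only the start of the name may match
unanchored = re.search(r"\d{3}", name)  # first run of three digits anywhere

print(anchored)             # None
print(unanchored.group(0))  # 101
```

With the anchor, this file yields no classification value at all, which is usually safer than silently classifying it by digits taken from the middle of the name.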

If the first four characters of your file name are a year or code, you can change the rule to ^\d{4}; if the first two are a region code, change it to ^\d{2}. If the file name starts with English letters, for example, ABC001.docx, consider a pattern that matches leading letters, such as ^[A-Za-z]+ for all leading letters or ^[A-Za-z]{3} for exactly three. The specific rule to use depends on your file naming pattern.
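The variants above can be compared side by side in a short Python sketch. The helper extract and the sample names 2024report.pdf and 07sales.xlsx are invented for illustration; ABC001.docx is the example from this article.

```python
import re
from typing import Optional


def extract(pattern: str, file_name: str) -> Optional[str]:
    """Return the text matched by pattern in file_name, or None."""
    match = re.search(pattern, file_name)
    return match.group(0) if match else None


print(extract(r"^\d{4}", "2024report.pdf"))    # 2024
print(extract(r"^\d{2}", "07sales.xlsx"))      # 07
print(extract(r"^[A-Za-z]+", "ABC001.docx"))   # ABC

# A shorter rule also matches longer digit prefixes:
print(extract(r"^\d{2}", "101LON05417.txt"))   # 10
```

The last line shows why the digit count should match the naming scheme: ^\d{2} applied to a three-digit prefix would group 101 and 102 together under 10.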

FAQ and Precautions

1. Can batch classification still be done if the file name has no fixed pattern?

If the file name is completely irregular, any batch tool will find it difficult to accurately determine which folder it should go to. It is recommended to first observe whether stable information exists in the file names, such as a leading number, date, client code, department abbreviation, or fixed separator. As long as a pattern can be found, you can try to extract it using existing classification methods in the software or custom regular expressions.

2. Can non-TXT files also be sorted using this method?

Yes. The example screenshots show txt files, but the basis for "Classify by File Name" is the file name, not the file content. Therefore, Word documents (doc, docx), Excel spreadsheets (xls, xlsx), PowerPoint files (ppt, pptx), PDFs, images, audio, video, etc., can all be batch organized using similar methods as long as the file names follow the rules.

3. What happens if the regular expression is written incorrectly?

If the expression cannot match the desired content, the classification results may not meet expectations. Therefore, it is recommended to test with a small number of files first. For instance, import 10 to 20 sample files first, confirm that the correct 101, 102, 103 folders can be generated, and then process large batches of materials. For important files, it is also advisable to make a backup before processing.
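One way to run such a test without touching any files is a dry run that only reports which folders would be created. The Python sketch below illustrates the idea; the function preview and the sample name draft.txt are hypothetical, and the sketch says nothing about how the software itself treats unmatched files.

```python
import re
from collections import Counter


def preview(names, pattern=r"^\d{3}"):
    """Return (files per folder, unmatched names) without moving anything."""
    regex = re.compile(pattern)
    folders = Counter()
    unmatched = []
    for name in names:
        match = regex.search(name)
        if match:
            folders[match.group(0)] += 1
        else:
            unmatched.append(name)
    return dict(folders), unmatched


sample = ["101LON05417.txt", "101SYD26137.txt", "102NYC53821.txt", "draft.txt"]
folders, unmatched = preview(sample)
print(folders)    # {'101': 2, '102': 1}
print(unmatched)  # ['draft.txt']
```

An unexpectedly long unmatched list, or folder names that look wrong, is a sign that the expression needs adjusting before the full batch is processed.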

4. What should be noted when classification criteria include uppercase and lowercase letters?

If the classification value in the file name includes English letters, case differences may lead to generating different folders. For example, abc and ABC might be treated as different classifications. The processing options in the screenshot provide a letter case conversion setting, allowing users to keep the default or unify to uppercase or lowercase as needed to make the classification results more standardized.

Summary: Let Software Handle Repetitive Tasks by Rules for More Efficient File Archiving

The core of batch organizing files is not making users click more, but handing repetitive, mechanical, error-prone actions over to office software. In this article's example, HeSoft Doc Batch Tool used the "Classify Files by File Name" function, combined with the custom regular expression ^\d{3}, to extract 101, 102, 103 from the beginning of file names and automatically generate corresponding folders for archiving.

If your file names also contain client codes, project numbers, department codes, dates, or batch numbers, it is recommended to first identify the most stable naming pattern, and then use the batch classification function for organization. For users who frequently process office files like TXT, docx, xlsx, PDF, this method can significantly reduce the time spent on manual filtering and dragging, making data management more standardized and subsequent file retrieval easier.


Keywords: Classify files by filename, batch classification with regular expressions, batch organize files, filename prefix classification, TXT file batch archiving
Creation Time:2026-06-28 06:53:13
