Leaving a large number of PDF files under temporary names such as 1.pdf, 2.pdf, and 3.pdf makes later searching, archiving, and delivery inefficient. Using contract PDFs as an example, this article shows how to use HeSoft Doc Batch Tool to read a number from each file's content with a wildcard-style (regular expression) matching rule and batch rename each PDF to its contract number, for example 10026877.pdf. The workflow covers selecting the function, importing PDFs, setting the matching expression, confirming the save location, and starting processing. It suits office scenarios such as contracts, orders, invoices, and reports, where files need to be named after information in the body text.
In daily office work, many PDF files are initially obtained from scanners, system exports, or temporary manual saves, and their file names are often sequential numbers like “1.pdf”, “2.pdf”, “3.pdf”. Manually opening a single file, viewing the number, and then renaming it is acceptable, but if a folder contains dozens or hundreds of PDF contracts, orders, reports, or receipts, processing them one by one can be extremely time-consuming, and it's easy to copy the wrong number or miss renaming a file.
The problem addressed in this article is clear: when the PDF body text contains a fixed-format number, such as a contract number, order number, or project number, we want to extract it and use it as that PDF's file name, for many files at once. The example in the screenshot identifies the 8-digit number following "Contract No." on the first page of each PDF, and then batch renames the original 1.pdf, 2.pdf, 3.pdf, 4.pdf to 10026877.pdf, 20036655.pdf, 20100511.pdf, and 33952100.pdf.
Below, following the interface screenshots of HeSoft Doc Batch Tool, we explain how to use the "Rename PDF Files Using File Content" function together with a wildcard-like matching expression (labeled "Regular Expression" in the interface) to batch rename PDFs quickly.
Applicable Scenarios: Which PDFs are suitable for batch renaming with content numbers?
This method is suitable for files where stable, recognizable text exists within the PDF body. Examples include the contract number on the first page of a contract, ticket numbers in invoices or receipts, report numbers in test reports, order numbers in order PDFs, and personnel or project numbers in archival materials. As long as these numbers can be recognized in the PDF text and have a relatively fixed format, a matching expression can be used for batch extraction.
Using the screenshot as an example, the PDF content contains a prominent "Contract No." followed by a string of 8 digits. A person would open the PDF, read the number in the red box, and rename the file to that number. Batch processing software instead needs a rule that lets it find the matching text in each PDF's content automatically.
If your files are not PDFs, choose the module that matches the file type, for instance docx or doc for Word documents and txt for text files. This article focuses on batch renaming of PDFs, but the same idea applies to many office tasks that organize file names based on file content.
Effect Preview: File name changes before and after processing
Before processing: PDF file names are simple sequential numbers, making it impossible to determine the content
Before processing, there were 4 PDF files in the folder, named 1.pdf, 2.pdf, 3.pdf, and 4.pdf. From the file names, it was impossible to tell which contract they corresponded to, nor to search or archive them directly by contract number.

After opening one of the PDFs, one could see the contract number at the top of the body text. The red box in the screenshot highlights the 8-digit number "10026877", indicating that the information truly suitable for a file name is actually inside the PDF content, not in the current file name.

After processing: File names directly become the numbers from the PDF body
After batch processing is complete, the 4 PDFs carry the numbers from their content as file names: 10026877.pdf, 20036655.pdf, 20100511.pdf, and 33952100.pdf. Each file can now be identified by its number directly in the folder, and the names are easy to copy into contract ledgers, project directories, or archiving systems.

This naming convention is more suitable for long-term management than simple sequential numbers. If you need to find a specific contract number later, you only need to search for the number in the folder without opening each PDF to verify.
Operation Steps: Using wildcards/regex for batch PDF renaming
Step 1: Enter the "File Name" category and select the PDF content renaming function
After opening HeSoft Doc Batch Tool, select "File Name" from the function categories on the left. This category groups the functions for batch file name modification, such as finding and replacing keywords in file names, inserting text, and adding prefixes and suffixes.
On this page, select "7. Rename PDF Files Using File Content". The interface description indicates that this function is used to "batch use certain text from the content of PDF files as the file name for that file". This corresponds exactly to the scenario in this article: extracting the contract number from a PDF and using it as the new PDF file name.

Selecting this function switches the software from working only on existing file names to a workflow that reads PDF content and applies naming rules. For contract, report, and order PDFs, this step can significantly reduce the manual effort of opening files to look up numbers.
Step 2: Add or import PDFs from a folder for processing
After entering the "Rename PDF Files Using File Content" function, the interface proceeds to step 1, "Select records to process". At the top, you can see buttons like "Add Files", "Import Files from Folder", "Clear", and "More". For a small number of PDFs, you can use "Add Files"; if there are many PDFs in one folder, "Import Files from Folder" is more suitable.
The screenshot shows 4 PDFs have been imported, with the list displaying information such as serial number, name, path, extension, creation time, and modification time. It can be seen that the file names are still 1.pdf, 2.pdf, 3.pdf, 4.pdf, the extensions are all pdf, and the path is located under a test directory on the D drive.

The purpose of this step is to confirm which PDFs will participate in the batch renaming. After importing, it is recommended to check whether the record count matches the number of target files in the folder, and confirm that no wrong PDFs are in the list. The bottom of the interface shows "Record count: 4", indicating that 4 files will be processed this time.
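The record count check can also be done outside the tool. The following is a minimal sketch, not part of HeSoft Doc Batch Tool: the helper name list_pdfs is hypothetical, and it assumes only the top level of the folder matters (whether "Import Files from Folder" also reads subfolders is not stated in this article).

```python
import tempfile
from pathlib import Path


def list_pdfs(folder: Path) -> list:
    """Return the PDF files directly inside folder, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


# Demo on a throwaway folder that stands in for the real import folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("1.pdf", "2.pdf", "3.pdf", "4.pdf", "notes.txt"):
        (folder / name).write_bytes(b"")
    pdf_names = [p.name for p in list_pdfs(folder)]

# Compare this count with the "Record count" shown in the interface.
print(len(pdf_names), pdf_names)
```

If the printed count differs from the record count in the list, some file was probably missed or imported by mistake.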
After confirming the files are correct, click "Next" at the bottom to enter the processing rule settings.
Step 3: Select custom matching text and fill in the expression
After entering step 2, "Set processing options", the interface provides options for "Search Area". The options visible in the screenshot include "First line of text", "First barcode image", and "Text matched by custom formula". In this case, we need to match a contract number from the PDF body, and the contract number is an 8-digit number, so select "Text matched by custom formula".
In the "Regular Expression" input box, fill in:
\d{8}

This expression can be understood as a more powerful wildcard rule. \d represents a digit and {8} means the preceding item repeats exactly 8 times, so \d{8} matches 8 consecutive digits in the PDF content. The contract number 10026877 in the screenshot fits this rule. Note that the pattern is not anchored, so inside a longer run of digits it would match the first 8.
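To see what \d{8} picks out, the expression can be tried on sample text before it is entered in the tool. The sketch below uses Python's re module, whose syntax for this pattern is the common one. The sample text is hypothetical and only stands in for what the first page of 1.pdf might contain.

```python
import re

# Hypothetical text, standing in for the first page of 1.pdf.
page_text = "Contract No. 10026877\nParty A: Example Co., Ltd."

# re.search returns the first place where 8 digits appear in a row.
match = re.search(r"\d{8}", page_text)
contract_number = match.group(0) if match else None

print(contract_number)
```

If no run of 8 digits exists, re.search returns None, which is why the result is checked before use.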
It is important to note that the interface name uses "Regular Expression", which is more precise than a typical wildcard. Standard wildcards are often used to match characters in file names, whereas here we are extracting specified content from the PDF body text. For scenarios like "8-digit contract numbers", "10-digit order numbers", or "fixed prefix plus numbers", regular expressions are more suitable.
Step 4: Set the naming position to overwrite the entire file name
On the same settings page, you can also see the "Position" options, which include "Overwrite entire file name", "To the left of file name", and "To the right of file name" in the screenshot. In this case, the goal is to have the final file name contain only the contract number, without retaining the original 1, 2, 3, 4, so "Overwrite entire file name" is selected.
The expected result of selecting "Overwrite entire file name" is: after the software finds the 8-digit number in the PDF content, it will replace the original file name body with this number, while retaining the PDF extension. For example, 1.pdf will become 10026877.pdf.
If you do not want a complete replacement and would rather add the number before or after the original file name, choose "To the left of file name" or "To the right of file name" instead. For contract archiving, however, using the contract number directly as the file name is usually clearer.
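The three position options can be pictured as simple string operations on the file name without its extension. The sketch below is an illustration only, not the tool's implementation: the function build_name, the position keywords, and the sep parameter are hypothetical, and this article does not state whether the tool inserts a separator between the number and the original name.

```python
from pathlib import PurePath


def build_name(original: str, extracted: str,
               position: str = "overwrite", sep: str = "") -> str:
    """Combine the extracted text with the original name, keeping the extension."""
    path = PurePath(original)
    stem, suffix = path.stem, path.suffix  # "1" and ".pdf" for "1.pdf"
    if position == "overwrite":
        new_stem = extracted
    elif position == "left":
        new_stem = extracted + sep + stem
    elif position == "right":
        new_stem = stem + sep + extracted
    else:
        raise ValueError(f"unknown position: {position!r}")
    return new_stem + suffix


print(build_name("1.pdf", "10026877"))                    # 10026877.pdf
print(build_name("1.pdf", "10026877", "left", sep="_"))   # 10026877_1.pdf
print(build_name("1.pdf", "10026877", "right", sep="_"))  # 1_10026877.pdf
```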
After completing the settings, click "Next" to proceed to the subsequent save location and processing confirmation workflow.
Step 5: Confirm the save location and start processing
The process bar shows that this function has two further steps: "Set Save Location" and "Start Processing". In the save location step, confirm the output location according to your own archiving habits. For important contracts or official archives, avoid overwriting the only original. Output to a new folder first, check the results, and only then replace or archive the files.
After confirming the save location, enter the "Start Processing" step to execute the batch rename. Once processing is complete, return to the folder to check the results. Consistent with the post-processing screenshot, the file names should become the 8-digit numbers extracted from the PDF content.
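The "output to a new folder first" advice can be sketched as a copy instead of an in-place rename. This is a minimal illustration under stated assumptions, not how the tool works internally: copy_renamed is a hypothetical helper, and the mapping from old names to numbers is written by hand here rather than extracted from real PDFs.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def copy_renamed(numbers: dict, src_dir: Path, out_dir: Path) -> list:
    """Copy each file into out_dir under its new name; originals stay untouched.

    numbers maps an old file name to the extracted number (or None).
    Returns the old names that were skipped.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    skipped = []
    for old_name, number in numbers.items():
        target: Optional[Path] = out_dir / f"{number}.pdf" if number else None
        # Skip files without a number and never overwrite an existing target.
        if target is None or target.exists():
            skipped.append(old_name)
            continue
        shutil.copy2(src_dir / old_name, target)
    return skipped


# Demo in a throwaway directory; 3.pdf stands for a file with no match.
with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp) / "original", Path(tmp) / "renamed"
    src.mkdir()
    for name in ("1.pdf", "2.pdf", "3.pdf"):
        (src / name).write_bytes(b"%PDF-")
    skipped = copy_renamed(
        {"1.pdf": "10026877", "2.pdf": "20036655", "3.pdf": None}, src, out
    )
    renamed = sorted(p.name for p in out.iterdir())
    originals = sorted(p.name for p in src.iterdir())

print(renamed, skipped)
```

Skipped files are reported rather than silently dropped, so they can be checked by hand before the originals are archived.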
Common Questions and Notes
1. Why is this called a wildcard expression, yet the interface reads Regular Expression?
Many users habitually refer to any "matching text by rules" as wildcard matching. Strictly speaking, the input box in the screenshot takes regular expressions. Regular expressions can achieve effects similar to wildcards and are better suited to structured text such as numbers, dates, and order numbers. The \d{8} in this article is the regular expression for 8 consecutive digits.
2. What if there are multiple 8-digit numbers in the PDF?
If multiple consecutive 8-digit numbers exist in a PDF, simply using \d{8} might match unwanted numbers. In this case, you need to adjust the expression based on the characteristics of the PDF content to make the rule as close to the target number as possible. For example, incorporate fixed text, number prefixes, or location to improve accuracy. Before formal batch processing, it is recommended to test with a small number of files first.
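One way to incorporate fixed text is to tie the digits to the "Contract No." label. The sketch below uses Python's re module with hypothetical sample text. This article does not state whether the tool uses the whole match or a capture group as the file name, nor which regex features its engine supports, so both variants are shown and should be tested on a few files first.

```python
import re

# Hypothetical page text containing two 8-digit runs.
page_text = "Date: 20240115\nContract No. 10026877\nTotal: 5,000.00"

# The plain pattern returns the first 8-digit run, here the date.
plain = re.search(r"\d{8}", page_text).group(0)

# Variant 1: capture group after the label; (?!\d) rejects longer digit runs.
grouped = re.search(r"Contract No\.\s*(\d{8})(?!\d)", page_text).group(1)

# Variant 2: lookbehind, so the whole match is just the number.
# Assumes the engine supports lookbehind and exactly one space after the label.
whole = re.search(r"(?<=Contract No\. )\d{8}(?!\d)", page_text).group(0)

print(plain, grouped, whole)
```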
3. If the PDF is a scanned image, can the number be recognized directly?
In this article's example, the PDF contains real text, so the software can match it with text rules. If the PDF is only a scanned image without a recognizable text layer, content extraction may not work as expected. In such cases, first confirm whether the text in the PDF can be selected and copied before deciding whether the file is suitable for renaming by content.
4. Do I need to back up before batch renaming?
Backups are recommended, especially for important documents like contracts, financial records, legal files, and project archives. The advantage of batch processing is speed, but it also means that if a rule is set incorrectly once, it could affect multiple files. Therefore, it is advisable to first copy a test directory, confirm that the expression and output results are correct, and then process the official files.
5. Which characters can be included in a file name?
This example extracts purely numeric values, which typically do not cause problems with illegal characters in file names. If you extract text such as contract names or customer names, be aware that Windows file names cannot contain the characters < > : " / \ | ? * or control characters. If renaming fails or gives an unexpected result, check whether the extracted text contains characters unsuitable for file names.
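A check for unsuitable characters can be sketched as follows. This is an illustration only, not something the tool is known to do: the helper safe_stem and the underscore replacement are assumptions. The character set and the reserved device names follow the Windows file naming rules.

```python
import re

# Characters Windows does not allow in file names, plus control characters.
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Device names that Windows reserves regardless of extension.
_RESERVED = {"CON", "PRN", "AUX", "NUL",
             *(f"COM{i}" for i in range(1, 10)),
             *(f"LPT{i}" for i in range(1, 10))}


def safe_stem(text: str, replacement: str = "_") -> str:
    """Return text made usable as a Windows file name (without extension)."""
    # Replace illegal characters, then drop trailing spaces and dots.
    cleaned = _ILLEGAL.sub(replacement, text).rstrip(" .")
    if not cleaned or cleaned.upper() in _RESERVED:
        cleaned = replacement + cleaned
    return cleaned


print(safe_stem("10026877"))        # unchanged
print(safe_stem("A/B Co.: 2024?"))  # slashes, colon, question mark replaced
```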
Summary: Using content matching rules to reduce repetitive renaming work
With the "Rename PDF Files Using File Content" function of HeSoft Doc Batch Tool, the repetitive workflow of manually opening a PDF, finding a number, copying it, and editing the file name becomes a one-time rule setup followed by batch execution. For contract, order, report, and receipt PDFs, using a wildcard-style regular expression to extract numbers from the body text can significantly improve file organization efficiency.
If your folder also contains a large number of PDFs with unmanageable names like 1.pdf, 2.pdf, scanned.pdf, export.pdf, it is recommended to first pick a few samples, confirm the number format within the body text, and then set up the matching expression according to the steps in this article. After verifying that the rules are error-free, batch import the entire folder for processing to complete the PDF batch renaming more safely and efficiently.