Document Analyzer
The Business Document Analyzer extracts values from unstructured documents and returns them in the data format you specify. It uses a large language model (LLM) combined with Retrieval-Augmented Generation (RAG) to extract important information from documents automatically, and it lets users request customized information as needed.
Retrieval-Augmented Generation (RAG) is a technique that combines information retrieval with text generation. Before it generates a response, the model searches external data or knowledge and adds what it finds to its context. Because the output is grounded in the retrieved material, the model can generate more accurate and useful sentences than a generation-only model.
For example, consider the query, "What is the highest mountain in the world?" A generation-only model might not reliably know the exact answer. A RAG model first retrieves information related to the question from an external database or knowledge graph and then generates an answer based on that information, so the response includes the information relevant to the user's question.
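The retrieve-then-generate idea can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the knowledge base, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for the example, and the final call to a language model is left out.

```python
import re

# Toy knowledge base standing in for an external database (illustrative only).
KNOWLEDGE_BASE = [
    "Mount Everest is the highest mountain in the world.",
    "The Nile is one of the longest rivers in the world.",
    "The Pacific Ocean is the largest ocean on Earth.",
]

def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list, top_k: int = 1) -> list:
    """Rank documents by how many words they share with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query: str, passages: list) -> str:
    """Put the retrieved passages in front of the question for the generator."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

question = "What is the highest mountain in the world?"
passages = retrieve(question, KNOWLEDGE_BASE)
print(build_prompt(question, passages))
```

A production system would typically replace the word-overlap scoring with a proper search index or embedding similarity; the overall shape (retrieve first, then generate from the retrieved context) stays the same.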
Biz Document Analyzer supports two modes:
Default Mode
When a document is uploaded, the AI automatically extracts up to 10 items deemed "important." This mode utilizes RAG LLM technology to analyze the overall content of the document and identify key information.
Single Predictive Mode (SPM)
Along with the document upload, you can request specific information. When the user inputs a particular query, the RAG LLM analyzes the document and provides an accurate answer to the query. For example, if the question is, "What is the invoice number?" the RAG LLM will find and extract the invoice number from the document.
Document Analyzer offers two powerful modes to streamline your document processing tasks. See how each mode works with our sample invoice data.
Default Mode is the simplest option. Upload your document, and our AI automatically extracts up to 10 items it deems important, which gives you the critical information quickly. However, if you need to find specific details across different business documents, it might not be enough. In those situations, SPM is more useful.
# | key | value | result |
1 | invoice_number | INV-00001 | Invoice number is INV-00001. |
2 | invoice_date | May 29,2024 | Invoice was issued on May 29, 2024. |
3 | due_date | June 26, 2024 | Payment is due by June 26, 2024. |
4 | amount_due | $4,410.00 | The total amount due is $4,410.00. |
5 | billing_address | Greenfield Technologies,Innovation Drive,Katy, Texas,77493,United States | Billing address is Greenfield Technologies, Innovation Drive, Katy, Texas, 77493, United States. |
6 | company_name | ARGOSIdentity | Company name is ARGOSIdentity. |
7 | company_address | Gallows Branch Road,VA, Virginia,22182,United States | Company address is Gallows Branch Road, VA, Virginia, 22182, United States. |
8 | phone_number | 17570000000 | Phone number is 17570000000. |
9 | terms | Payment is due within 30 days from the invoice date.,Late payments may incur a 5% late fee .. | Terms and conditions are provided. |
10 | items | [{description: 'Laptop(ModelX)', rate: '$1,000.00', quantity: 2, amount: '$2,000.00'}, {description: 'Monitor(24inch)', rate: '$200.00', quantity: 10, amount: '$2,000.00'}, {description: 'ShippingFee', rate: null, quantity: 1, amount: '$200.00'}] | Invoice contains items with descriptions, rates, quantities, and amounts. |
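Extracted values arrive as text, so amounts such as `$4,410.00` usually need parsing before they can be checked or stored. The sketch below copies part of the sample result into a Python dict and runs two sanity checks on it. The dict layout and the helper names `parse_money` and `mismatched_items` are assumptions for illustration; the actual response shape may differ.

```python
from decimal import Decimal

# Part of the sample Default Mode result above, written as a Python dict.
# The layout is an assumption for illustration; the real response may differ.
sample_result = {
    "invoice_number": "INV-00001",
    "amount_due": "$4,410.00",
    "items": [
        {"description": "Laptop(ModelX)", "rate": "$1,000.00", "quantity": 2, "amount": "$2,000.00"},
        {"description": "Monitor(24inch)", "rate": "$200.00", "quantity": 10, "amount": "$2,000.00"},
        {"description": "ShippingFee", "rate": None, "quantity": 1, "amount": "$200.00"},
    ],
}

def parse_money(text: str) -> Decimal:
    """Convert a string such as '$4,410.00' into an exact Decimal."""
    return Decimal(text.replace("$", "").replace(",", ""))

def mismatched_items(items: list) -> list:
    """Return descriptions of items where rate * quantity differs from amount.

    Items without a rate (such as the shipping fee) are skipped.
    """
    mismatches = []
    for item in items:
        if item["rate"] is None:
            continue
        expected = parse_money(item["rate"]) * item["quantity"]
        if expected != parse_money(item["amount"]):
            mismatches.append(item["description"])
    return mismatches

items_total = sum(
    (parse_money(item["amount"]) for item in sample_result["items"]),
    Decimal("0"),
)
remainder = parse_money(sample_result["amount_due"]) - items_total

print(mismatched_items(sample_result["items"]))
print(items_total)
print(remainder)
```

For the sample invoice, every line item is internally consistent, the items total $4,200.00, and the remaining $210.00 equals the tax amount reported in the SPM sample on this page.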
SPM lets you ask specific questions to get the exact information you need. For example, if you receive numerous invoices with varying service terms, you might wonder, “Are there any late fees or penalties?” When you ask this question, the AI searches the document and extracts the relevant terms.
To use SPM, enter a custom key, your question, and the data format for each piece of information you want. The answer to each question is returned under the key you provided, so you get precisely the information you need without reading the whole document.
# | Custom Question | Custom Key | Value |
1 | What is the invoice number? | invoice_q1 | INV-00001 |
2 | Who is the invoice billed to? | invoice_q2 | Greenfield Technologies |
3 | How many items are listed in the invoice? | item_q1 | 3 |
4 | What is the quantity of each item with names? | item_q2 | [{name: 'Laptop(ModelX)', quantity: 2}, {name: 'Monitor(24inch)', quantity: 10}, {name: 'ShippingFee', quantity: 1}] |
5 | What is the tax amount? | financial_q1 | $210.00 |
6 | Are there any late fees or penalties? | terms_q1 | 5% late fee for each month past due |
7 | What is the warranty period for the products? | terms_q2 | One year from the date of purchase |
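The three SPM inputs (key, question, and data format) can be modeled as a small record before they are sent anywhere. The sketch below is a minimal illustration: the class name `SpmQuery`, the field names, the format labels `"string"` and `"number"`, and the JSON layout are all hypothetical, so check the API Documentation for the real request shape.

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class SpmQuery:
    """One SPM request item. Field names are hypothetical."""
    key: str          # custom key the answer is returned under
    question: str     # natural-language question about the document
    data_format: str  # hypothetical label such as "string" or "number"

# Three of the sample questions from the table above.
queries = [
    SpmQuery("invoice_q1", "What is the invoice number?", "string"),
    SpmQuery("item_q1", "How many items are listed in the invoice?", "number"),
    SpmQuery("terms_q1", "Are there any late fees or penalties?", "string"),
]

def find_duplicate_keys(items: list) -> list:
    """Return keys used more than once; answers would collide on these."""
    seen, duplicates = set(), set()
    for item in items:
        if item.key in seen:
            duplicates.add(item.key)
        seen.add(item.key)
    return sorted(duplicates)

def to_json(items: list) -> str:
    """Serialize the queries to JSON (the layout is an assumption)."""
    return json.dumps([asdict(item) for item in items], indent=2)

print(find_duplicate_keys(queries))
print(to_json(queries))
```

Checking for duplicate keys up front is a design choice worth keeping whatever the real request format is, since each answer is identified by its key.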
The examples above use our sample invoice data. Apply the same techniques to your own business documents to streamline your workflow and boost efficiency! 😀
Search Stage
The user inputs a question, such as "What is the invoice number?" The model interprets the question and searches for relevant information in external data sources (e.g., document databases, knowledge graphs).
Information Retrieval
The model finds the specific information, such as "invoice number," from the retrieved data. For example, it might find the invoice number "INV-00001" in the document.
Information Integration
Based on the retrieved information, the model decides on the answer to generate. For example, it may generate the response, "The invoice number is INV-00001."
Answer Generation
The model generates the answer based on the retrieved information and returns it to the user.
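The four stages above can be sketched as four small functions. This is a toy walk-through under stated assumptions, not the product's pipeline: the document chunks are hand-written, word overlap stands in for the search step, a regular expression stands in for the model's extraction, and a string template stands in for LLM generation. All function names are hypothetical.

```python
import re
from typing import Optional

# Hand-written chunks standing in for an indexed document (illustrative only).
DOCUMENT_CHUNKS = [
    "Invoice number: INV-00001",
    "Invoice date: May 29, 2024",
    "Amount due: $4,410.00",
]

def words(text: str) -> set:
    """Lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search_stage(question: str, chunks: list) -> list:
    """Stage 1: keep chunks that share words with the question, best first."""
    question_words = words(question)
    scored = [(len(question_words & words(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored if score > 0]

def retrieve_invoice_number(candidates: list) -> Optional[str]:
    """Stage 2: pull the specific value out of the retrieved chunks."""
    for chunk in candidates:
        match = re.search(r"INV-\d+", chunk)
        if match:
            return match.group(0)
    return None

def integrate(value: Optional[str]) -> str:
    """Stage 3: decide what the answer should say."""
    if value is None:
        return "The invoice number was not found in the document."
    return f"The invoice number is {value}."

def answer(question: str, chunks: list) -> str:
    """Stage 4: run the stages in order and return the answer to the user."""
    candidates = search_stage(question, chunks)
    value = retrieve_invoice_number(candidates)
    return integrate(value)

print(answer("What is the invoice number?", DOCUMENT_CHUNKS))
```

In the real system a language model performs the extraction and generation steps, so it is not limited to a fixed pattern such as `INV-` followed by digits.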
For more details, please refer to our API Documentation.