Integrated with AI
Cloud and on-premises AI support using Large Language Models (LLMs)
ChronoScan AI - What is it and what can it do for you?
ChronoScan Capture Advanced and Enterprise editions now support AI using Large Language Models (LLMs).
LLMs are a type of artificial intelligence system that uses deep learning techniques and a large corpus of training data to understand and generate natural language. They can perform various tasks, such as translation, sentiment analysis, text classification, and even document analysis.
LLMs can help you with document analysis by using their natural language understanding capabilities to parse and interpret the meaning of the text, and their natural language generation capabilities to produce the output in a desired format.
We have integrated two different implementations as follows:
OpenAI GPT
- OpenAI GPT - ChronoScan can be custom configured to extract and structure data from any type of document using the online OpenAI GPT service; for example, metadata can be extracted from an accounts payable invoice (including line items) using this method. Other uses could also include classification or summarisation. We have also included functions that may be called from Visual Basic Script for use with snippets of text, for example address blocks.
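To give a flavour of what such a request looks like, below is a minimal Python sketch of an extraction call against the OpenAI API. ChronoScan drives this through its own configuration; the model name, prompt wording and field list here are illustrative assumptions, not the product's actual settings.

```python
# Illustrative sketch of an invoice-extraction request to the OpenAI API.
# The prompt and field list are assumptions, not ChronoScan's configuration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_invoice_fields(ocr_text: str) -> dict:
    """Ask the model to return invoice header data and line items as JSON."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You extract structured data from accounts payable "
                        "invoices. Reply with JSON only, using the keys: "
                        "vendor, invoice_number, invoice_date, total, line_items."},
            {"role": "user", "content": ocr_text},
        ],
        temperature=0,  # deterministic output suits data capture
    )
    return json.loads(response.choices[0].message.content)
```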
These new tools can dramatically decrease the amount of configuration work required to capture data from unstructured documents, as well as provide a solution for capturing data that is particularly difficult or time consuming with current capture objects. All of this results in faster project implementation times at a reduced cost.
In addition, LLMs allow you to easily perform tasks that are particularly complex to achieve in other ways, such as summarisation and classification.
For now, there are some limitations you need to be aware of when deciding whether to use an LLM for data capture and extraction:
- LLMs are limited in the amount of data that can be processed at any one time. This constraint is known as the “context window” and is generally defined by the number of tokens that can be processed; typically you will see figures of 4K, 16K or 32K tokens. Tokens are the basic units of text or code that an LLM uses to process and generate language. As a rule of thumb, 1,000 tokens equate to roughly 750 words. As an example, a single-page AP invoice will consume about 600 tokens for basic header/footer data, while an invoice with line items will consume 1,500 to 2,000 tokens for a single page. A quick way to check token counts is shown in the sketch below.
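The following sketch estimates token usage for a page of OCR text using OpenAI's tiktoken library; the file name is a hypothetical placeholder.

```python
# Rough token estimate for a page of OCR text, using OpenAI's tiktoken library.
import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    """Return the number of tokens this text occupies for the given model."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

# "invoice_page.txt" is a hypothetical OCR dump used for illustration.
page_text = open("invoice_page.txt", encoding="utf-8").read()
tokens = count_tokens(page_text)
print(f"{tokens} tokens ~= {tokens * 0.75:.0f} words")  # rule of thumb: 1,000 tokens ~ 750 words
```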
Currently for OpenAI we support GPT 3.5 Turbo with context windows of 4K and 16K and GPT 4 with context windows of 8K and 32K and 128k (gpt-4-1106-preview)
We also support GPT-4 Vision (gpt-4-vision-preview) and GPT-4o, which work directly with the document images and do not need any OCR processing (a minimal request sketch follows the list below).
Pros:
- No OCR needed
- Slightly faster than text-based requests
Cons:
- Considerable credit consumption
- Requires an internet connection, so the images are handled by the OpenAI servers
Open source LLMs
For open source LLMs such as the Meta Llama models, Mistral and Gemma 2, the context window you can use will likely be limited by your hardware, but for small tasks, or tasks where execution time isn't critical, they are a great addition. They also provide a low-cost way to learn about LLMs and their capabilities (see the sketch at the end of this section).
- LLMs require a significant amount of compute resource to process data, and at present the hardware to do so is expensive and in high demand. There is therefore a trade-off between using a service such as OpenAI, which utilises very expensive, high-powered, high-speed compute resources, and running LLMs locally on standard servers/workstations.
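For experimentation, one common way to run these models locally is through a server such as Ollama, which exposes an OpenAI-compatible endpoint. The following is a minimal local sketch under that assumption; the model name and prompt are illustrative, not part of ChronoScan's configuration.

```python
# Minimal local sketch, assuming an Ollama server on its default port with a
# Llama model already pulled. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # placeholder; local servers don't check the key
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user",
               "content": "Classify this text as invoice, statement or "
                          "letter: ..."}],
)
print(response.choices[0].message.content)
```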