Introduction
In part 1, we introduced how natural language access will lead to the democratization of analytics. Using a set of examples from Unacast’s analytics copilot, we illustrated the vision for democratization and discussed at a high level how retrieval-augmented generation (RAG) and generative AI work together to enable natural language access to analytics. We provided a few examples of why RAG is needed and why large language models (LLMs) alone don’t provide the full answer. We also explored how natural language access is changing, and will continue to change, the user interfaces and user experiences of analytics applications.
In part 2, we discuss generative AI and RAG in more detail. We’ll start by discussing several approaches to providing natural language access to data, along with the pros and cons of each. Along the way, we extend the definition of the democratization of analytics. After introducing the different approaches, we will go deeper into RAG with dynamic indexes. We’ll also discuss the need for AIOps and provide a sample process for continuous testing and verification of AI-based analytics applications.
A Generative AI Copilot Example
As we explored in part 1, Una is an analytics copilot embedded in Unacast’s location insights platform. The platform is a retail and real estate intelligence application based on Unacast’s location intelligence data — including place, visitation, trade area data, and other datasets. The location intelligence data that Una operates off of consists of relatively large amounts of structured data. Much of this data is generated via Unacast’s underlying proprietary processes. Unacast uses the location intelligence data it generates in its own analytics applications, and the data is available for use by other organizations in their applications.
Una provides natural language access to the data and analytics supporting the location insights product. Una is an example of a domain-specific approach to building a generative AI-based copilot. Later, we’ll discuss domain-specific versus general approaches for natural language-based access to data.
The screenshot above is from Una. The venue featured is a Domino’s Pizza store in California, USA.
For a more in-depth look, check out part 1 of the article here.
Approaches to Providing Natural Language Access to Data
The following sections review different approaches to providing natural language access to data and analytics. Approaches vary by scope and implementation, so we’ll discuss approaches in terms of those dimensions.
Approaches categorized by the scope of the target application of the model
There are two principal approaches based on the scope of the target application of the model that are relevant to the democratization of analytics: general and domain specific.
General approach
General approaches provide, as best as possible, broad solutions across information and business domains. A general approach does not seek to be an expert in every domain. Text -> SQL copilots, such as Azure Studio Copilot and Snowflake Copilot, are a good example of the general approach category. These models are trained and verified on very large datasets of table schemas, data, and expected results, for example, BIRD-SQL.
General approaches do not perform as well when domain-specific knowledge is required to provide an accurate answer. For example, translating the nuances of a business concept such as earnings before interest, taxes, depreciation, and amortization (EBITDA) to many datasets and datastores is challenging. There are several challenging aspects of this calculation:
- The data structures involved may not use the term EBITDA, and few of the table column names may map directly to the terminology of the calculation’s components (e.g. amortization)
- Integration and aggregation across different datasets in diverse datastores may be required
- Each organization may use different data structures as well as different variations of EBITDA and related calculations. This link provides good information on both the calculation of EBITDA and variations in the calculation.
General approaches are most useful for supporting well-versed data analysts in executing tasks more quickly, similar to how developers use code copilots. For this target audience, the general approach is quite effective.
A disadvantage of the general approach is that user queries with high business impact and complexity require technical execution and validation by skilled professionals, especially when both technical expertise and business domain knowledge are required. Following the above example, the risk and impact of posting incorrect EBITDA results on company valuations are so large that both expert execution and validation are essential.
To achieve the democratization of analytics, we need to reduce the need for expert/skilled validation as much as possible. Requiring skilled experts to effectively and safely use natural language to access data and analytics doesn’t meet the democratization objective.
Accordingly, we extend the definition of the democratization of analytics as follows:
The democratization of analytics occurs when users have access to data analytics for their business domains through non-technical interfaces, such as natural language-based queries, without requiring expert execution and validation of the results.
Domain-specific approach
The domain-specific approach focuses on specific information or a business domain. Domain-specific analytics are supported by models pre-trained or augmented with the semantics and data required to answer user queries specific to the domain. Part 1 of this article introduced us to the domain of retail and real estate analysis through our location insights platform. Other examples of domains include company financial analyses, logistics network planning, and optimization of infrastructure analyses.
In the example above, we made the case that the general approach is not well suited to complex financial analyses such as calculating EBITDA metrics. Here are the reasons why a domain-specific approach is better suited for supporting specific inquiries:
- The AI model can be trained or augmented on both the terminology and semantics of the domain and the data relevant to the domain. This allows more effective and accurate translation and mapping of the user query to the identity, structure, and versions of the data required
- The search space for domain-specific datasets is smaller and therefore more easily mapped, managed and verified
- Domain-specific calculations are more easily provided and maintained
The domain-specific approach better supports the democratization of analytics by incorporating specific knowledge into AI-based applications, thereby reducing the need for domain and technical expertise.
Una, the Unacast analytics copilot introduced in part 1, is based on the domain-specific approach. With Una, RAG is used to augment the LLM with the domain specifics required to support user queries on data, such as venues, visitations, demographics, and customer profiles for the domains of retail and real estate analytics. The domain specifics are provided through both structured data related to the longitudinal analytics required, as well as unstructured data supporting the semantics and mapping (e.g. to structured data assets) of business domain terminology such as foot traffic and customer profiles.
Implementation-related approaches
There are at least three different implementation approaches to integrating LLMs with your structured and unstructured data to power advanced search and analytics:
- Purpose-built LLM. Train your own LLM and include your structured and unstructured data in the training. Fine-tune the LLM to the domains you are targeting to support.
- RAG with a Fixed Index. Use RAG with a fixed index to identify documents needed for augmentation, along with retrieval mechanisms appropriate for the documents in the fixed index.
- RAG with a Dynamic Index. Use RAG with vector search to query an index to determine the documents needed prior to retrieving them.
The term “documents” as used above is meant to include both structured and unstructured data.
This article is focused on RAG techniques. We’ll quickly review the steps in RAG and the purpose-built LLM approach before diving into the two RAG implementation approaches.
The Steps of RAG
RAG augments the LLM with data and context relevant to the user query within a user session. Once again, the term “documents” is applied to data whether unstructured or structured in nature. The documents used for augmentation provide data not available to the model, typically organizational data that the LLM was not trained on, but they can also include data generated since the LLM was trained.
The basic steps of RAG are:
- Receipt of the user query
- Identification of the set of documents needed to augment the generative AI platform in support of the user query (using a fixed or dynamic index)
- Retrieval of the identified documents
- Preparation of a prompt using a prompt template that incorporates the documents retrieved, LLM guidance, and data related to context
- Execution of the prompt on behalf of the user
- Generation and delivery of the response back to the user
The steps above represent the retrieval, augmentation and generation elements of RAG.
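As a hedged illustration, the following Python sketch maps these steps onto code. The prompt template, file paths, and helper functions are stand-ins for your own index lookup, retrieval, and generative AI platform call, not a real API.

```python
# A minimal sketch of the RAG steps above, with stubbed helpers standing in for
# the index lookup, document retrieval, and generative AI platform call.

PROMPT_TEMPLATE = """You are a retail business analyst.
Context: {context}
Documents:
{documents}
User query: {user_query}
Answer the user query using only the documents and context above."""

def identify_documents(user_query: str) -> list[str]:
    # Step 2: a fixed index returns the same references for every query;
    # a dynamic index would run a vector search here instead.
    return ["docs/product_documentation.md", "metadata/venue_schema.sql"]

def retrieve(ref: str) -> str:
    # Step 3: retrieve by file read, database query, or API call (stubbed here).
    return f"<contents of {ref}>"

def llm_complete(prompt: str) -> str:
    # Steps 5 and 6: call the generative AI platform's API (stubbed here).
    return "<model response>"

def rag_answer(user_query: str, context: str) -> str:
    documents = [retrieve(ref) for ref in identify_documents(user_query)]
    prompt = PROMPT_TEMPLATE.format(                 # Step 4: prompt preparation
        context=context,
        documents="\n---\n".join(documents),
        user_query=user_query,
    )
    return llm_complete(prompt)

print(rag_answer("How many visitors did this venue receive last month?", "venue=8675309"))
```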
The two RAG approaches primarily differ in terms of how documents required for augmentation of the user query are identified.
Purpose-built LLM
The typical approach for purpose-built LLMs involves fine-tuning a pre-trained model rather than building one from scratch, because of the expense and expertise required for the latter. The advantages of fine-tuning your own LLM include building domain-specific knowledge directly into the model and tuning it for the domain before users interact with it. In addition, there is better control over the exposure of proprietary and sensitive data when hosting your own LLM versus using a generative AI platform.
The disadvantages include the cost of hosting and fine-tuning the LLM, along with the cost of keeping it up to date with the latest revisions to domain practices. Fine-tuning a pre-trained LLM also requires more AI expertise than using a RAG approach.
The purpose-built LLM approach is effective in certain scenarios, for example as a supporting diagnostic tool for a line of expensive equipment that contains significant intellectual property, or for extracting medical summaries and diagnoses from medical records where patient privacy is of the utmost importance. For many analytics use cases, this approach is more expensive and time consuming than needed. Key decision factors for choosing a purpose-built LLM include the need to protect proprietary and sensitive data and cost constraints.
In the future, the cost and time investment to fine-tune and maintain an LLM for a purpose-built model will likely decrease, making the approach more attractive for analytics purposes. Before we move on, we’ll note that even with a purpose-built LLM, RAG may still be used to augment the model with current data.
Exploring the Fixed and Dynamic Index RAG approaches
RAG with a Fixed Index
RAG with a fixed index means that a single set of fixed documents will be retrieved and used to augment our AI-based analytics application for every user query. The approach yields straightforward identification of required documents.
The RAG with a fixed index approach may be used for limited scope, domain-specific use cases, but its best use is for prototypes, quick proof of concepts, and as a stepping stone for learning more complex RAG approaches. The primary advantage of this approach is that it is quick to implement and straightforward to test and verify, since the documents used for augmentation are fixed. New (to some teams) technologies such as vector embeddings, vector stores, and vector search are not required, reducing the initial learning curve.
The disadvantages of this approach are:
- The scope of the analytics copilot is more constrained
- The total size of the supporting documents is limited by the token limitations of the LLM since the documents required are not sub-selected
- The ability to expand the scope of the analytics copilot is limited by a combination of the fixed index and the size of the token space of the generative AI model
The diagram above features two copilots, each supporting a different domain or subdomain. A fixed index of documents is used for each copilot. All user queries are supported by retrieving the indicated documents and creating an augmented prompt using a prompt template. The prompt submitted to the generative AI platform includes the documents retrieved, the original user query, useful instructions (e.g. ‘you are a retail business analyst’), and optional context-related data. Context data may include prior user queries and may also contain context data passed from an application, for example, the application that an analytics copilot is embedded within.
Let’s put this in practice by reviewing the analytics copilot, Una, and the examples presented in Part 1 of this article. Una is able to answer a number of questions related to retail and real estate analytics. Una uses unstructured data such as product documentation and customer profile/segment definitions, structured data related to venues, visitations, trade areas, and more, and makes use of context such as the current venue and time period passed from the location insights application. Una assembles this data using a prompt template and passes it forward to the generative AI platform. The response is returned through the copilot user interface.
A brief note about documents in this context
The documents that support domain-specific analyses can take multiple forms. The documents may be traditional unstructured documents such as product descriptions, user manuals, or metadata such as a URL that can be used to retrieve a document. The documents may also represent metadata describing structured data within a Database Management System (DBMS), object storage, or accessible via an API. If the unstructured or structured data is resident in a file, the file may be retrieved directly. If the data is retrieved by a database query or API call, the indexed data will include metadata that is relevant to retrieving the data. For example, the indexed data may include one or more data definition language (DDL) statements and dictionary information that describes tables in the database or the description of an API along with its request parameters and response format.
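As a hedged illustration of this, here is what two indexed “document” entries might look like: one referencing an unstructured file and one carrying DDL metadata for a structured table. The field names and layout are assumptions for the sketch, not a prescribed format.

```python
# Illustrative vector store entries: the "retrieval" field tells the application
# how to fetch the actual data once the entry is matched by a search.
INDEXED_DOCUMENTS = [
    {
        "type": "unstructured",
        "description": "Customer profile and segment definitions",
        "retrieval": {"method": "file", "uri": "docs/customer_profiles.md"},
    },
    {
        "type": "structured",
        "description": "Points of interest (venues) with brand/chain and region",
        "retrieval": {
            "method": "sql",
            "ddl": "table referencedata.chain.POI (place_id integer not null, "
                   "name varchar, city varchar, state varchar, chain varchar, ...)",
        },
    },
]
```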
RAG with a Dynamic Index
The RAG with a dynamic index approach determines the subset of documents needed to support a user query at run time. It does so by comparing the user query to an indexed document store and selecting the documents most relevant to the inquiry.
Determining the set of documents at runtime has several benefits, including:
- Enabling domain-specific models over larger topic areas by supporting larger amounts of documents applicable to the domain
- Selecting the set of documents used to augment the generative AI model to those most relevant to the user query
- With a greater number of documents and larger document sizes, this helps keep the augmented prompt size within the token space limit of the generative AI model
- Selecting the documents most relevant to the user query reduces the chance of augmenting the prompt with irrelevant material
- Providing the ability to add, remove, or update documents without having to modify a fixed index.
These features make RAG with a dynamic index a better approach for production applications. Further, we believe that a domain-specific, natural language-based approach using RAG with a dynamic index is the current preferred approach for building AI based analytics applications, while acknowledging there are specific scenarios where a purpose-built LLM may be desired.
RAG with a dynamic index requires some additional components over the use of a fixed index. These components include:
- Vector search (also referred to as semantic search), with support from vector embedding services and vector stores
- A vector embedding generation process which creates and persists the vectors for the domain documents to a vector store
- A process for adding or updating documents to the vector store
Vector search is favored over keyword-based searching because it is more attuned to the semantics of the user query. This provides better search results for the purposes of RAG. Getting the best match on relevant documents is vitally important, as including irrelevant documents in the augmented prompt may produce unpredictable and undesirable results from the generative AI model.
Vector search for our dynamic index is used to match the user query to relevant documents. A vector is created for the user query and the vector search performs a similarity match by computing and evaluating the distance between the user query vector and the set of embedding vectors for our documents in the vector store.
If you are not familiar with vector embeddings, vector stores, and vector search, that’s okay. Multiple organizations are producing products that support relatively easy searching of dynamic indexes based on vectors.
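As a minimal sketch of this matching step, the example below uses the open-source sentence-transformers library for embeddings and a plain cosine similarity over normalized vectors. In production, a managed embedding service and a dedicated vector store would typically fill these roles; the model name and the document snippets are illustrative.

```python
# Query-time vector search: embed the documents once, embed the user query,
# and return the most similar documents by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Stand-ins for the indexed metadata/descriptions held in the vector store.
documents = [
    "POI table: place_id, name, address, city, state, chain, region",
    "Visitation table: place_id, date, visitor counts by day",
    "Customer profile definitions for retail segments",
]
doc_vectors = model.encode(documents, normalize_embeddings=True)

def top_k_documents(user_query: str, k: int = 2) -> list[str]:
    query_vector = model.encode([user_query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector          # dot product == cosine similarity here
    best = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in best]

print(top_k_documents("foot traffic at Domino's Pizza stores in CA"))
```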
After identifying the documents needed for augmentation, we need to retrieve the documents, augment the prompt containing the user query, and generate the response via the AI model. We’ll cover that next.
Retrieval, Augmentation and Generation
The following diagram illustrates the operations involved in performing RAG using a dynamic index. We’ll walk through each step and dive deeper into the Dynamic Retrieval operation.
In step (1) in the diagram above, the user query is received by the analytics application and passed forward to the vector search component. In step (2) the vector search is executed and returns information on the documents that best match the user query. The data/metadata returned is passed along to (3) the dynamic retrieval step, which performs the actual retrieval of the documents. The documents in turn are passed to step (4) augmentation, which combines the user query and the retrieved documents using the prompt template. In step (5) the new prompt is passed to the generative AI platform through its API. The generative AI platform generates the response and returns the response to the analytics application.
For simplicity, the diagram and the steps described above represent a single user query rather than a session of multiple related inquiries. In understanding human language, context is king. A single statement or question can be more easily misunderstood or misconstrued than a statement within a longer dialogue. Most generative AI-based copilots will therefore retain a certain number of prior user queries and augment the prompt with them.
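A small sketch of how a copilot might retain prior queries within a session and include them in the prompt; the window size and prompt layout are illustrative choices rather than a prescribed design.

```python
# Keep the last N user queries and fold them into each new prompt as context.
from collections import deque

class CopilotSession:
    def __init__(self, max_turns: int = 5):
        self.prior_queries = deque(maxlen=max_turns)

    def build_prompt(self, documents: str, user_query: str) -> str:
        prior = "\n".join(self.prior_queries) or "(none)"
        prompt = (
            "You are a retail business analyst.\n"
            f"Prior user queries:\n{prior}\n"
            f"Documents:\n{documents}\n"
            f"Current user query: {user_query}"
        )
        self.prior_queries.append(user_query)
        return prompt

session = CopilotSession()
session.build_prompt("<venue documents>", "Show foot traffic for this venue in Q1 2024")
print(session.build_prompt("<venue documents>", "How does that compare to Q1 last year?"))
```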
Dynamic Retrieval: Simple to Complex
Dynamic retrieval means that our application doesn’t know what data it needs to retrieve until runtime. As discussed above, the identification of the data that supports the user query occurs during the vector search operation.
If all the documents are stored in object storage or within a single DBMS, then dynamic retrieval is more straightforward because we only need to retrieve from one type of data source. When there are multiple sources of data, there may be a trade-off between staging the data within one source and acquiring the data from multiple sources. For example, if the analytics copilot needs the latest weather, there is a trade-off between staging the daily weather data in the DBMS, accessing the data through a marketplace share, or calling a URL to retrieve the latest data.
With unstructured data and a DBMS or search engine that supports vector search, you can combine the vector search and the document retrieval in a single query. Products such as Snowflake, BigQuery and Elasticsearch provide this functionality.
Structured Data Retrieval Challenges
Earlier, we extended the definition of the democratization of analytics: “the democratization of analytics requires that expert execution and validation are not required to use the analytics provided.” Domain-specific models reduce the problem space to make this more achievable, but there are still challenges in querying structured data accurately.
For many organizations and domains, the structured data is too large to fit in the token space of the generative AI platform, so we can’t simply load the data into the prompt along with instructions and the user query and ask for the answer. Even if the data would fit into the AI model’s token space, this isn’t necessarily a good idea for the following reasons:
- Many user queries on structured data require set-oriented processing, and generative AI models are best at text-related tasks versus set operations — though they do possess math and logic skills.
- The generative AI model can still yield inaccurate results; we’ll provide an example of this a bit later in this article.
A better approach is to use the generative AI model to generate the database query or API request, and then execute the query or API call separately. We’ll describe the process flow for doing this a bit later, but there are some key considerations to be aware of, especially in terms of generating queries for databases.
Generating DBMS queries such as SQL from natural language is challenging for the following reasons:
- Large and complex database schemas
Database schemas often have large numbers of tables, datasets that represent different stages of data lifecycles, and potentially complex or encoded schema elements. Selecting the right set of tables to answer the user query is non-trivial, and selecting the wrong tables will lead to inaccurate results.
- Domain-unique terminology / ambiguous concepts / complex concepts
Elements of the user query may include concepts unique to the domain, such as ‘foot traffic.’ They may include terms that have multiple definitions, such as ‘region,’ or complex concepts like ‘EBITDA.’ For example, the user query “show me the foot traffic at Target stores in the Southwest region” may have a different answer depending on whether the region definition of the user query and the region definition of the brand match. There is no standard definition of region across brands, and we also may not be sure which definition of region was intended.
- Complex calculations and missing parameters
A user query may require a complex calculation, and there may be missing parameters that need to be handled. For instance, “show me foot traffic at Domino’s Pizza stores where sales were more than 10% below the average in the second quarter and whose square footage was less than 1,500 square feet.” The year associated with the quarter is not provided and we need to ensure the “stores where sales were more than 10% below the average” query component is calculated correctly.
- Complex queries
Fulfilling user queries may require complex queries in SQL or NoSQL languages, which the generative AI platform may predict incorrectly:
- This is a significant problem for approaches such as text -> SQL, and one where significant R&D has been and is being applied
- There are pitfalls including, but not limited to, non-additive and semi-additive facts, which cannot simply be summed and which may not be apparent from the database schema.
Reviewing these challenges, we are presented with many of the same disadvantages of general approaches such as text -> SQL. We need to exploit the advantages of a domain-specific approach to make the problems more tractable and the answers generated more reliable.
Addressing challenges with a domain-specific approach
1. Large and complex database schemas
Here the domain-specific approach shines. We carefully select the tables/datasets specific to the domain to reduce the probability that the tables will be selected incorrectly. We exclude development, test, and staging versions of the tables so that these datasets will not be inadvertently used. We also ensure that they are well documented, providing descriptions of each table/dataset and column. We can provide additional metadata such as defining in detail the ‘region’ scheme (as noted above) used in our tables and provide hints such as using the ‘state’ column to create different region definitions on the fly. Hints or tags can be used to flag non-additive and semi-additive facts.
The approach is to define effective descriptions of the data assets through database metadata. A good practice is to include as much metadata as possible in the database management system, so that it is widely available for different use cases and not just the analytics application, but metadata may also be stored in documents external to the database. Wherever the metadata resides, the relevant metadata will be identified through vector search and made available to the generative AI platform through augmentation.
2. Domain-unique terminology / ambiguous concepts / complex concepts: domain-specific documentation
The domain-specific approach provides both focus and scope that assists in addressing these potential issues. We define domain-specific terminology, describe concepts to make them less ambiguous, and document complex concepts through existing or purpose-built unstructured documents. These documents help form the backbone of the data used to augment the generative AI platform with our domain specifics. These documents should be tied into other metadata.
3. Missing parameters and complex calculations
Missing parameters can be handled by instructions in the prompt template. The prompt can instruct the generative AI platform to query the user for missing parameters or to assume default values and state them when returning the results.
Complex calculations can be described in the domain-specific documentation. Instructions in the prompt template may include a summary of the calculation, a link to a document with a more detailed description, and — where the calculation does not represent intellectual property — an instruction to disgorge pseudo-code or code for the calculation so that users can inspect the calculation. Describing the calculation to the generative AI platform helps the model to reproduce it accurately.
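As a hedged illustration, the fragment below shows the kind of prompt-template instructions described above. The wording, the referenced document name, and the EBITDA summary are assumptions for the sketch, not Unacast’s actual template.

```python
# Illustrative instructions appended to the prompt template for missing
# parameters and a documented complex calculation.
PROMPT_INSTRUCTIONS = """
If the user query omits a required parameter (for example, the year for a quarter),
either ask the user for the missing value or assume the most recent complete period
and state that assumption when returning the results.

EBITDA is calculated as net income plus interest, taxes, depreciation, and amortization.
A detailed description is available in the document 'ebitda_calculation.md'.
Where the calculation is not proprietary, include the pseudo-code you used so the user
can inspect the calculation.
"""
```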
4. Complex queries
In terms of accuracy and reliable results, this area is potentially one of the most challenging for an AI-based analytics application. The following are some suggested considerations and approaches.
First, provide the metadata, documentation, and prompt instructions to address items 1-3 above. Test the effectiveness of your efforts against a set of anticipated user queries.
Throughout the development process, create a set of well-documented user queries as prompts along with the expected results for each query and treat these as test cases. For structured data, identify intermediate results expected based on your approach. Use these test cases to continuously test your AI-based analytics application. As development continues, address issues and expand your test cases to cover extensions of issues observed and those that are foreseeable. Expand your test coverage to include domain knowledgeable users. Log their user queries and use them to augment your test cases. Test cases consisting of prompts and expected responses are a critical asset for the AIOps process, which we’ll discuss later.
The following are specific suggestions that you can either try up front or after you observe specific issues arise through your testing:
4.1 Review, assess and refine your structured analytics datasets
Source data is normally processed to make it more suitable for analytics, but there is often room for refinement. Think of the generative AI component as a reasonably technical resource that understands language but doesn’t know much about your data.
If you have a complex database and dataset structure and it appears to be causing issues with the translation of natural language into the query you expect, look at ways to reduce the complexity. Make sure that table relationships are defined, including primary-table to reference-table relationships. Reference tables are important because they contain text recognizable from the user query versus the numeric foreign keys that represent them in the primary table. If your target tables include encoding, such as bit encoded fields, add a reference table that includes the decoding values and create a database view that includes the translations and provides any flattening required. Add the view, versus the underlying tables, to your vector store.
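As a sketch of the kind of decoding view described above, assuming hypothetical visitation and status reference tables (all table and column names are illustrative):

```python
# The view joins the encoded status code to its reference table so the
# human-readable text is visible to the model; only the view, not the
# underlying tables, would be indexed in the vector store.
CREATE_DECODED_VIEW = """
CREATE OR REPLACE VIEW analytics.visitation_decoded AS
SELECT v.place_id,
       v.visit_date,
       v.visitors,
       s.status_name              -- decoded text instead of the numeric status_code
FROM   analytics.visitation v
JOIN   referencedata.visit_status s
  ON   v.status_code = s.status_code
"""
# Executed once via your database connection, e.g. conn.cursor().execute(CREATE_DECODED_VIEW)
```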
4.2 Consider dimensional data mart designs for your analytics data
Dimensional data models, such as star schemas, were built for analytics. Dimensional data marts have features that assist in natural language to query generation, including:
- Standard, understandable table design patterns for analytics use cases
- Standard analytics query patterns enabled by the standardized table design and the dimensional data warehousing practice
The field of data warehousing with dimensional data marts has query patterns for supporting a variety of analytics on detail and aggregate data across various levels of aggregation as well as patterns for handling semi-additive and non-additive facts. The scope of dimensional data mart design / data warehousing is too large to cover in this article, but the standard query patterns combined with good database metadata can reduce the potential of incorrect query generation by the AI model.
4.3 Consider pre-creating parameterized queries for complex domain-specific calculations
If the domain in question has a few complex calculations critical to your user base, consider creating parameterized queries that have been validated and tested for these cases. Pre-creating many calculations is too high of a workload, represents maintenance debt for your analytics application and reduces the value of natural language access. But for a few critical calculations that are needed to produce accurate, repeatable results, it may be worth the effort. The pre-created queries along with appropriate metadata will also reside in the vector store.
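A sketch of how one pre-created, validated query might be packaged with its metadata for the vector store. The SQL, parameter names, and placeholder style (which depends on your database driver) are illustrative.

```python
# A pre-created parameterized query for a critical, complex calculation;
# the description and parameter list help vector search match it to user queries.
PARAMETERIZED_QUERIES = {
    "below_average_sales_stores": {
        "description": "Stores whose quarterly sales were more than X% below "
                       "the chain average for that quarter.",
        "parameters": ["chain", "year", "quarter", "pct_below"],
        "sql": """
            WITH chain_avg AS (
                SELECT AVG(sales) AS avg_sales
                FROM analytics.store_quarterly_sales
                WHERE chain = %(chain)s AND year = %(year)s AND quarter = %(quarter)s
            )
            SELECT s.place_id, s.sales
            FROM analytics.store_quarterly_sales s, chain_avg a
            WHERE s.chain = %(chain)s AND s.year = %(year)s AND s.quarter = %(quarter)s
              AND s.sales < a.avg_sales * (1 - %(pct_below)s / 100.0)
        """,
    },
}
```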
4.4 Consider using APIs
If you have production APIs available for the domain that are relevant to your analytics, consider using them. APIs have several advantages:
- APIs have a fixed set of parameters. Correctly constructed API endpoint requests will return repeatable results
- For complex requests, APIs encapsulate static database queries and related post-processing that have been tested and verified, making them a good solution for complex calculations
- Production APIs should have already passed intellectual property, privacy, and security verification
The disadvantage of APIs is that the functionality that they provide is fixed and may be insufficient for your AI-based analytics objectives. Use of APIs may require additional tuning of the application to ensure that the generative AI component does not generate a response when API support is not available for the topic of the user query. APIs can be used in conjunction with database query generation and a mixed approach is possible, but will require careful attention to metadata, tuning, and testing to ensure the correct asset is chosen to support the user query.
Example using generative AI to produce SQL for dynamic retrieval
Below is a manual example of using a generative AI platform to assist with the creation of a SQL-based query to support the user query “return all the other Domino’s Pizza venues in CA.”
You are an SQL expert.
Formulate a query against this relational table schema:
table referencedata.chain.POI (place_id integer not null, name varchar, address1 varchar, city varchar, state varchar, zip varchar, polygon geometry, sq_meters integer, chain varchar, region varchar constraint pk_POI primary key (place_id))
in support of the user query: "return all the other Domino’s Pizza venues in CA"
The current venue has a place identifier of 8675309. Place is a synonym of venue. POI is a synonym of venue. Domino's Pizza is a brand. Chain is a synonym of brand
Certainly! To retrieve all the other Domino's Pizza venues in California (CA) based on the given relational table schema, you can use the following SQL query:
ChatGPT generates a workable SQL query for the relational database given the table metadata provided. There are a couple of important things to point out. The above augmented prompt (in italics) provides some hints on synonyms to help ensure that the query is successfully and correctly generated. In practice, we would retrieve and provide additional product documentation or a dictionary for this purpose. The table schema and the product documentation/dictionary would be returned as results from the vector search.
Dynamic retrieval and the dynamic retrieval process
We need to execute the query against the target database, retrieve the results, and generate the final response to the user query. The elements of this were already defined as part of the Dynamic Retrieval flow. What we have added is the loop to the generative AI platform for generation of the query and/or API requests needed. This additional loop is illustrated in the following diagram.
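To complement the diagram, here is a minimal Python sketch of that loop, assuming a DB-API style database connection and an llm_complete(prompt) callable for the generative AI platform; all names, prompts, and the example SQL in the comment are illustrative.

```python
from typing import Callable

def answer_with_dynamic_retrieval(
    user_query: str, schema_metadata: str, conn, llm_complete: Callable[[str], str]
) -> str:
    # First pass through the generative AI platform: generate the database query
    # from the schema metadata returned by the vector search.
    sql_prompt = (
        "You are an SQL expert.\n"
        f"Formulate a query against this relational table schema:\n{schema_metadata}\n"
        f'in support of the user query: "{user_query}"\n'
        "Return only the SQL statement."
    )
    generated_sql = llm_complete(sql_prompt)
    # e.g. SELECT * FROM referencedata.chain.POI
    #      WHERE chain = 'Domino''s Pizza' AND state = 'CA' AND place_id <> 8675309
    # (illustrative -- the exact SQL returned by the model may differ)

    # Execute the generated query against the target database.
    cursor = conn.cursor()
    cursor.execute(generated_sql)
    rows = cursor.fetchall()

    # Second pass: augment a new prompt with the retrieved results and generate
    # the final response for the user.
    answer_prompt = (
        "You are a retail business analyst.\n"
        f"User query: {user_query}\n"
        f"Query results: {rows}\n"
        "Answer the user query using only the query results above."
    )
    return llm_complete(answer_prompt)
```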
What the future holds in terms of support for RAG
Technical stacks and products are evolving to make it easier and simpler to develop and deploy generative AI-based applications with RAG. There is an arms race in this field among major players such as Snowflake, Google, and Microsoft. We will continue to see innovations such as in-line vector creation for text values within a DBMS query, vector search and retrieval of unstructured data within a single query, and improved support for dynamic retrieval of structured data with complex queries.
The need for AIOps
Earlier, we touched briefly on the need for AIOps. The term AIOps is relatively new and means different things to different readers. In the context of generative AI with RAG, it should encompass at least the following functions:
- A process for generating and persisting vector embeddings for new and modified documents
- Automated test and verification for AI-based analytics applications
Many readers are aware of incorrect answers and hallucinations produced by generative AI models. Test and verification will be performed prior to releasing AI-based analytics applications to production; however, the behaviors of generative AI models have been observed to change unexpectedly, necessitating monitoring and refinement of the analytics application over time. RAG with a dynamic index allows new documents to be added to the vector store without modifying the analytics application, and new or modified prompt templates can be added as well. Any of these modifications may affect the behavior of the analytics application.
A simple example highlighting the need for monitoring
Generative AI models can perform many application, data, and coding tasks for us, but currently they require effective testing and validation. For example, in part 1, we introduced a manual version of RAG, providing a sample of structured data for daily visitors to a mythical Chet’s diner and asking the model to compute the monthly visitors and disgorge its approach and calculation. Below, we see the results when we execute this query with retrieved data using a generative AI platform.
*******
You:
You are a retail analysis expert. Please provide the metric on the number of monthly visitors in January 2024 to Chet's Diner - 1034 Maple Ave using the following data:
Describe how you actually calculated the number of monthly visitors
To calculate the number of monthly visitors to Chet's Diner - 1034 Maple Ave in January 2024, you would simply sum up the number of visitors recorded for each day of January:
Using the provided data:
********
There are a couple of problems with the model’s response. The first is that the model left off the visitor count for the last day, so the calculation is off by 81 visitors, assuming the calculation itself is correct. With the error pointed out, the model correctly restates the result. On a different day, in a different session, ChatGPT gets it right the first time. If we switch the underlying AI model, we may also get a different answer. We tried this with Gemini, and for this user query, Gemini got it right on the first try.
The second issue is that it is generally not correct to sum visitors, customers, or devices over the time dimension. A person may visit a store multiple times in a day (and possibly be counted each time) and over multiple days in a month. Visitor counts are semi-additive, i.e. they should not be summed over the time dimension. We can provide additional information in the prompt, such as a direction to treat certain facts as semi-additive or non-additive, or provide additional data that allows the generative AI model to determine that the counts cannot simply be added.
This example is simple but exposes how behavioral changes may affect the results presented by an AI-based analytics application. A continuous monitoring and verification approach is needed to capture potential issues as they occur over time.
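To make the semi-additivity point concrete, here is a toy illustration with hypothetical visitors showing why summing daily counts over the time dimension overstates the number of distinct monthly visitors.

```python
# Three days of (hypothetical) visitors to a single venue.
daily_visits = {
    "2024-01-01": ["ann", "bob", "cam"],
    "2024-01-02": ["ann", "bob"],
    "2024-01-03": ["ann", "dee"],
}

summed_daily_counts = sum(len(visitors) for visitors in daily_visits.values())
distinct_monthly_visitors = len({v for visitors in daily_visits.values() for v in visitors})

print(summed_daily_counts)        # 7 -- what a naive sum over the time dimension reports
print(distinct_monthly_visitors)  # 4 -- the number of distinct visitors in the period
```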
Monitoring and verification
The following diagram illustrates application components and a process for the purpose of monitoring over time.
The continuous test and verification process works by executing a set of test cases consisting of test prompts and expected results. The tests are executed on a scheduled basis against the AI-based analytics application, which in turn invokes the LLM. A monitoring application reads the test cases and orchestrates the tests, collecting the response received for each test prompt. The assembled information, including the test prompt, the verification response, the LLM response, and the threshold verification score, is passed to a similarity scoring component.
The similarity scoring component compares the LLM response to the verification response for each test prompt and computes a similarity score. If the similarity score is lower than the threshold verification score, the test case is flagged for review. Each test case and result are written to the log along with a timestamp and run identifier.
The attention log is a view of the log which returns the set of test cases that were flagged for further review. The AIOps team reviews these test cases and tunes the AI-based analytics application accordingly. The tuning may involve modification of prompt template instructions, the inclusion of new or modified documents within the vector store, or modified instructions for handling parameters relevant to the dynamic retrieval of data.
The similarity scoring feature can be implemented using an LLM or via vector search, and different practitioners will likely prefer one approach over the other. Using vector search to determine the semantic similarity between the LLM response and the verification response has the following benefits (a sketch follows the list below):
- The vector search will not include the training bias of the LLM
- A similarity score can be generated via a mathematical function, which produces repeatable results for the same input
- Selecting threshold scores from the similarity scores observed in test runs with the chosen similarity function is straightforward
- The scoring function is performed by a different mechanism (vector search) than the generation mechanism (the LLM), reducing bias potentially introduced by using the same mechanism for both
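Below is a minimal sketch of the vector-based scoring check, reusing the sentence-transformers embeddings from the earlier search sketch; the test case, responses, and threshold are illustrative and not real Una test data.

```python
# Score the semantic similarity between the LLM response and the verification
# response; flag the test case when the score falls below its threshold.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def similarity_score(llm_response: str, verification_response: str) -> float:
    a, b = model.encode([llm_response, verification_response], normalize_embeddings=True)
    return float(a @ b)   # cosine similarity of normalized vectors; repeatable for the same input

test_case = {
    "prompt": "Which venue in this trade area had the highest foot traffic in Q1 2024?",
    "verification_response": "The Main St location had the highest foot traffic in Q1 2024.",
    "threshold": 0.85,
}
llm_response = "The Main St venue recorded the most Q1 2024 visits in the trade area."  # stand-in

score = similarity_score(llm_response, test_case["verification_response"])
if score < test_case["threshold"]:
    print("flag for review:", test_case["prompt"], score)   # written to the attention log in practice
```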
Test cases are assembled throughout the development and test cycles of the application. The test cases are extended and modified during the lifecycle of the application.
Summary
In part 1, we introduced and defined the democratization of analytics. We discussed the role of generative AI and retrieval-augmented generation (RAG) in supporting this democratization. We walked through examples of Una, a natural language analytics copilot built using RAG and generative AI. We mused about the impact of natural language access to data and analytics on current and future application design.
In part 2, we dove deeper into approaches for providing natural language access to analytics. We discussed where general approaches, such as text -> SQL, may not sufficiently support the democratization of analytics and we adjusted the definition of the democratization of analytics to not require expert execution and validation of results. We discussed three different approaches to implementing natural language access to analytics and settled on RAG with dynamic indexes as the best approach for many current use cases. We explored the operations supported by RAG with dynamic indexes. We dove into challenges with generating the correct queries for structured data. We made the case that AIOps is important to support AI-based analytics applications.
There are other topics to be explored related to the democratization of analytics. One topic is the role of data professionals in bringing the democratization of analytics to fruition while ensuring privacy and safeguarding intellectual property. Another topic is exploring in depth techniques for supporting complex queries on structured data for AI-based analytics.
If you are interested in building your own AI-based application utilizing location data, reach out to our team. We can help you with best practices for utilizing location intelligence and AI to power your software. Book a meeting with us today.