Why and when to use Generative AI?

Generative AI is a powerful Artificial Intelligence technology that creates new content such as text or images. The concept has been around for quite some time and works by predicting the next piece of information, such as the next word in a sentence. The emergence of ChatGPT has brought generative text AI into the spotlight, especially given how prevalent textual data is in business applications. Our Data Science team is watching the market movement around Generative AI closely, particularly with a view to AI for the SAP community. Florian Leicher, Data Scientist at sovanta, has summarized his thoughts on this topic here:

Top Performer ChatGPT

While ChatGPT’s underlying model is larger than its predecessors, it follows the same principles as earlier generative AI models. What sets ChatGPT apart is its exceptional performance, thanks to extensive and well-designed training data. This breakthrough has sparked a race among other large language models (LLMs) to reach a similar level of sophistication, as is evident from the leaderboard:

Leaderboard: chat.lmsys.org

In some ways these models are a game changer because they are so readily available to anyone. You can interact with them in natural language instead of having to know a programming language. The amazing thing is that, even though they are only trained to predict the most likely next word in a sentence, this concept is abstract enough to be used in a wide variety of contexts. The other big advantage is that you only need a very small number of examples (“training data”), because most of the knowledge is already implicitly contained in the model.
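
To make the idea of "predicting the next most likely word" more tangible, here is a deliberately tiny sketch in Python. It is not how ChatGPT or any real LLM is implemented: real models use large neural networks over tokens, whereas this toy simply counts which word follows which. The function names and the example sentences are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Invented toy "training data"; a real model sees vastly more text.
corpus = (
    "the invoice is due today . "
    "the invoice is paid . "
    "the order is due tomorrow ."
)
model = train_bigram_model(corpus)

print(predict_next_word(model, "invoice"))  # is
print(predict_next_word(model, "refund"))   # None (word never appeared)
```

The second call also hints at why rare or unseen information is a problem for such models: a counting model can admit it has nothing to offer, while a model trained to always continue the text will produce something anyway.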

Business Cases for Generative AI

Certain use cases are made much easier with these large language models. Here are some examples:

  • “Classification”
    You might receive many customer tickets and want to forward each one to the right place. If you only have a limited number of categories, it is straightforward to build a prompt that classifies them.
  • “Named Entity Recognition”
    Finding certain categories or structures in natural language text. For example, your customers might send you product orders in emails. It is quite simple to use an LLM to detect the customer’s name, product names and certain intents such as negotiation requests, etc. in the email. You can easily write a prompt that sends two or three correctly labelled examples and then asks for the incoming email to be annotated.
  • “Content Creation”
    Generative AI makes a huge difference in marketing, especially in content creation. Texts on any business-related topic can be created with clear prompts, which can also specify the length, tone and focus of the text. Summaries, suitable headlines and more are created in the blink of an eye.
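
The classification and entity recognition cases above both rely on the same trick: a prompt that contains a few labelled examples followed by the new input. The Python sketch below shows one possible way to assemble such a prompt for ticket classification and to check the answer against the allowed categories. The categories, example tickets and function names are invented for illustration, and the call to an actual LLM is left out on purpose, because it depends on the provider you use.

```python
def build_classification_prompt(categories, examples, ticket_text):
    """Assemble a few-shot prompt: instruction, labelled examples, new ticket."""
    lines = [
        "Classify the customer ticket into exactly one of these categories: "
        + ", ".join(categories) + ".",
        "Answer with the category name only.",
        "",
    ]
    for example_text, example_label in examples:
        lines.append(f"Ticket: {example_text}")
        lines.append(f"Category: {example_label}")
        lines.append("")
    lines.append(f"Ticket: {ticket_text}")
    lines.append("Category:")
    return "\n".join(lines)


def parse_category(model_answer, categories, fallback="unknown"):
    """Accept the model's answer only if it is one of the known categories."""
    cleaned = model_answer.strip().lower()
    for category in categories:
        if cleaned == category.lower():
            return category
    return fallback


# Hypothetical categories and labelled examples.
CATEGORIES = ["billing", "delivery", "technical issue"]
EXAMPLES = [
    ("I was charged twice for my last order.", "billing"),
    ("My parcel has not arrived yet.", "delivery"),
    ("The app crashes when I open the dashboard.", "technical issue"),
]

prompt = build_classification_prompt(
    CATEGORIES, EXAMPLES, "My invoice shows the wrong amount."
)
print(prompt)
```

Validating the answer with something like `parse_category` is a cheap safeguard: if the model replies with anything other than a known category, the ticket can be routed to a human instead of the wrong queue.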

Limitations of LLMs

There are also many cases where the use of LLMs is not suitable because they have several limitations:

  • LLMs are bad at math
    For numerical tasks such as forecasts or calculations, you will probably want to use a regression model or a dedicated neural network instead.
  • LLMs don’t like rules
    It is very hard to make an LLM reliably follow concrete rules, even rules that are very easy to implement in a conventional algorithm.
  • LLMs can’t read more than 3 pages
    All current LLMs have a fixed maximum number of tokens that they can process as input and output. In the case of GPT-3.5 this is 4096 tokens (~3 pages of single-spaced English text). For problems where you have to search through vast amounts of data, you can’t simply ask an LLM to do the work for you.
  • LLMs are great liars
    An LLM only learns information implicitly. Data points that appeared only very rarely (or not at all) in the training dataset are unlikely to be picked up reliably, so the model can produce a lot of wrong information. The reason for these so-called “hallucinations” is that the model is always trained to predict the next word, so it has to sound confident even without having actual information. If you need exact results, you could use a knowledge graph instead.
  • LLMs are always outdated
    An LLM draws its knowledge from training data that ends at a fixed cutoff date, so it is never up to date on a daily basis. This outdated knowledge cannot be used to answer questions about current events.
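
The token limit from the list above is the easiest of these limitations to check in code. The following Python sketch estimates how many tokens a text needs and splits longer texts into chunks that fit a given budget. It assumes a rough rule of thumb of about four characters per token for English text; a real tokenizer will give different counts, so treat the numbers as estimates. The function names and the reserved answer budget are illustrative choices, not part of any library.

```python
import math

CONTEXT_WINDOW = 4096      # GPT-3.5 limit quoted above (input and output)
RESERVED_FOR_ANSWER = 500  # hypothetical budget kept free for the model's reply


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; assumes ~4 characters per token (English text)."""
    return math.ceil(len(text) / chars_per_token)


def split_into_chunks(text, max_tokens, chars_per_token=4.0):
    """Split text on whitespace into chunks that stay within the token budget.

    A single word longer than the budget still ends up in its own,
    oversized chunk; this sketch does not split inside words.
    """
    max_chars = int(max_tokens * chars_per_token)
    chunks, current, current_len = [], [], 0
    for word in text.split():
        added_len = len(word) + (1 if current else 0)  # +1 for the space
        if current and current_len + added_len > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += added_len
    if current:
        chunks.append(" ".join(current))
    return chunks


max_input_tokens = CONTEXT_WINDOW - RESERVED_FOR_ANSWER
long_text = "Dear support team, please find our order details below. " * 500
chunks = split_into_chunks(long_text, max_input_tokens)
print(f"{estimate_tokens(long_text)} estimated tokens -> {len(chunks)} chunks")
```

Chunking only gets the text through the door. The model still sees each chunk in isolation, which is why searching through vast amounts of data usually needs a separate retrieval step rather than a bigger prompt.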

Generative AI vs. custom-trained models

It is certainly impressive what is possible with Generative AI these days, so it is not surprising that ChatGPT and co. are currently very present in the media. Everyone seems to have an opinion on LLMs. But can they really solve every challenge, especially business challenges? My answer is no. There are many business scenarios where using an externally hosted LLM is not the preferred approach. Why? Besides privacy concerns, you are locked in to a vendor and have no control over what training data went into the model. This is why, in addition to the generic models, which are very large and powerful, the classic approach of using smaller custom-trained natural language models is also justified. Such models are often sufficient and can be more cost-effective. It becomes clear that, although there are certain use cases where LLMs work great, they are not always the optimal choice for business applications. The ideal approach: companies should harness the benefits of open models while maintaining control over their data and optimizing their training pipeline.

Generative AI is on the right track

The developments in Generative AI are great and the hype is completely justified, considering all that is possible with LLMs. But we must always be aware that it is not a universal remedy. Generative AI complements and facilitates many work steps, and it remains exciting to see in which direction the use cases will develop. We are still at the beginning. What difference will partnerships like the one between SAP and Microsoft make? The technology is there, and now it remains to be seen how businesses will work with it. We will keep you up to date.

Florian Leicher
Data Scientist

Your Contact

Florian Leicher works as a Data Scientist at sovanta. He is an expert in process automation and ML-assisted information retrieval. He has a passion for visualizing machine learning results in captivating diagrams that offer a holistic understanding of the business data.