Tuesday, July 25, 2023

Generative AI, What Is All the Hype About Anyway?

 

Over the past few months, I have been delving into a subject called Generative AI. Unless you have been trapped on a desert island without any communication with the outside world, it is pretty difficult to miss all the talk and hype around it. As someone who makes his living off of technology, sometimes bleeding-edge technology, I felt that it would be wise to start learning a little bit about this. What I didn’t expect was the rabbit hole I would be going down as I began to learn. This post is the first in a series about what I have learned, in an attempt to bring you, my readers, up to speed on this subject without having to go through the hundreds of blog posts and papers that I have. My goal with this series is to shed light on Generative AI and, hopefully, give you a jumpstart into this space. I want to teach you about the components of Generative AI; show you how the magic works to produce the answers you receive when you ask ChatGPT, Claude 2 or Bard a question or give it a task; delve a little into the current implementations of Generative AI that are out there; and give you some insight into how to use these tools through something called prompt engineering.

What is Generative AI?

So, let’s get started by defining what this thing is that we call Generative AI. In a nutshell, Generative AI is a lot like having a genius friend who can consume and remember any piece of information that has been fed to them. In turn, that friend can use this knowledge to invent new and exciting things like art, music and stories, and even help scientists and experts solve problems. Okay, so that is a pretty high-level, generic definition that could have come from the marketing department. So, let’s take a deeper look into Generative AI and talk a little about how it works.

Where does all of this knowledge come from?

Behind a Generative AI application there are complex machine learning models that are trained to understand the large datasets that are fed to them. A machine learning model is basically a program that allows computers to learn without having to be explicitly programmed. In the days before AI (or before most of us knew about it anyway), computers had to be given explicit instructions via a program for everything you asked them to do. For example, when you click on an icon to launch a program, there is a set of instructions or code that tells the computer what it is supposed to do, i.e., go to the exact path where the executable file associated with the icon that you just clicked is stored and run it. Everything you ask your computer or a specific application to do requires explicit code that the computer runs to respond to your request. Machine learning takes things a step further and gives computers the ability to reason or, more accurately, to predict what they should do when you ask something of them. Now, for the foreseeable future, this won’t change the way your computer works. For the next few years, it will still need explicit code in programs to tell it how to respond to a certain user action.
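To make the difference between explicit instructions and learning a bit more concrete, here is a small Python sketch. It is only a toy: the temperature readings, the labels and the function names are all made up for illustration, and real machine learning models are far more elaborate. The first function has its rule typed in by a person; the second works its rule out from labeled examples.

```python
# Explicit programming: a person writes the rule, including the cutoff of 25.
def is_hot_explicit(temp_c):
    return temp_c > 25


# Learning: the cutoff is derived from examples instead of being typed in.
# Here the "model" is just the midpoint between the average "hot" reading
# and the average "cold" reading (a deliberately simple, assumed approach).
def learn_cutoff(examples):
    hot = [temp for temp, label in examples if label == "hot"]
    cold = [temp for temp, label in examples if label == "cold"]
    return (sum(hot) / len(hot) + sum(cold) / len(cold)) / 2


# Hypothetical labeled readings in degrees Celsius.
EXAMPLES = [(30, "hot"), (34, "hot"), (28, "hot"),
            (10, "cold"), (15, "cold"), (5, "cold")]

cutoff = learn_cutoff(EXAMPLES)


def is_hot_learned(temp_c):
    return temp_c > cutoff


print("learned cutoff:", round(cutoff, 1))
print("22 degrees, explicit rule says hot?", is_hot_explicit(22))
print("22 degrees, learned rule says hot?", is_hot_learned(22))
```

Change the examples and the learned cutoff moves with them, without anyone editing the code. That is the basic idea, just at a very small scale.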

What machine learning and Generative AI are starting to give us today are systems that can be used as the basis to solve complex problems; create automated systems, such as chatbots, that can answer questions and provide you contextual information, freeing up humans to do more high-value work; create websites or small applications based on a description of a few sentences that the user inputs; create photo-realistic artwork based on a few words describing what the user wants; create outlines for ad campaigns or papers, or even write stories or ad copy; analyze patterns of behavior to catch criminals; and even create formulas for new and exciting pharmaceutical medications. And this is just the start. We are in the very early days of this process, even though researchers, scientists and data engineers have been working on it for decades. In the long term, we can expect that AI will be used to offload even more mundane or low-level tasks, automate the monitoring of systems and, of course, bring us widespread adoption of the long-promised self-driving vehicle.

Now that we have some understanding of what machine learning is, let’s talk a little bit about how these models are developed and how they work.

Machine learning basics

Sidebar: The term machine learning was actually coined way back in the 1950s, which gives you an idea of just how long this whole process of developing so-called artificial intelligence has taken. What’s even more amazing is that this wasn’t even the beginning of the groundwork for artificial intelligence. The roots of AI actually go all the way back to the early 1900s. The biggest movement in AI didn’t really start to happen, though, until the 1950s.

Machine learning is basically the process of inputting large amounts of data into machines (we’ll refer to these as computers going forward) and having those computers analyze that data for patterns, which they can then use to make predictions when they are given a task or question relating to that data. This isn’t a one-and-done process. It is actually a very iterative process that usually requires several different phases and methods of machine learning. Initially, it might use something called supervised machine learning, which is where the computer receives labeled data, such as pictures that are labeled as cat or dog, or other things that have been classified or labeled by humans. This allows the computer to begin to recognize things. If you feed it enough different labeled pictures of cats and dogs, it will start to be able to recognize, or predict based on that information, whether a picture that you show it is a cat or a dog. Obviously, this is a very time-consuming endeavor, but it is really necessary to teach the model how to recognize basic things.
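Here is a minimal sketch of the supervised idea in Python. To keep it short, each "picture" is replaced by two made-up numbers (weight and ear length), and the learning method is a simple nearest-centroid classifier that I picked for illustration. Real image models learn from pixels with far more sophisticated techniques, so treat the data, the features and the function names as assumptions.

```python
from math import dist

# Hypothetical labeled examples: (weight in kg, ear length in cm) -> label.
# Two made-up numbers stand in for a picture that a human has labeled.
LABELED = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((25.0, 10.0), "dog"),
]


def train(examples):
    """Average the features of each label to get one 'typical' point per label."""
    sums = {}
    for features, label in examples:
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([t + f for t, f in zip(total, features)], count + 1)
    return {
        label: tuple(t / count for t in total)
        for label, (total, count) in sums.items()
    }


def predict(model, features):
    """Pick the label whose typical point is closest to the new example."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(LABELED)
print(predict(model, (4.2, 6.2)))    # a small animal the model has not seen
print(predict(model, (27.0, 11.5)))  # a large animal the model has not seen
```

The more labeled examples you feed into `train`, the better the "typical" cat and dog become, which is the same reason real models need so much labeled data.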

Unsupervised machine learning is where computers begin to recognize patterns in unlabeled data. This could be anything, such as data about your customers, marketing, stock prices, etc. Unsupervised machine learning often uses a technique known as clustering to segment this data and try to discern patterns. With clustering, the computer uses rules to determine the probability that a particular item should belong to a certain group or cluster.
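As a rough sketch of clustering, the Python below groups some made-up customer spending figures using k-means, one common clustering method. Note that k-means assigns each item to exactly one group rather than calculating a probability, so it is a simplification of what is described above. The numbers, the starting centers and the function name are assumptions for illustration.

```python
def k_means(values, centers, rounds=10):
    """Group numbers around the nearest center, then move each center
    to the average of its group, and repeat."""
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # An empty group keeps its old center so we never divide by zero.
        centers = [
            sum(group) / len(group) if group else centers[i]
            for i, group in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer. Nobody has labeled these.
spend = [12, 15, 14, 90, 95, 88, 13, 92]

centers, clusters = k_means(spend, centers=[0.0, 100.0])
print("group centers:", centers)
print("groups:", clusters)
```

Nobody told the program which customers are "low spenders" or "high spenders". The two groups fall out of the data itself.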

Reinforcement learning is basically the feedback mechanism that tells the model whether the choice it has made is correct or not. Again, this requires many iterations and quite a bit of human intervention along the way. Eventually, the model has enough data that it can begin evaluating its own decisions and learning from its mistakes. This can also involve additional models that act as a checkpoint for the model and provide feedback.
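The feedback loop can be sketched in a few lines of Python. This is a toy, not how production systems are built: the three choices, the `feedback` function that stands in for a human or a checker model, and the 10% exploration rate are all assumptions made up for this example.

```python
import random

ACTIONS = ["A", "B", "C"]


def feedback(action):
    """Hypothetical feedback signal: 1 if the choice was right, 0 if not."""
    return 1.0 if action == "B" else 0.0


def learn(steps=200, explore=0.1, seed=0):
    rng = random.Random(seed)
    value = {action: 0.0 for action in ACTIONS}  # running average reward
    tries = {action: 0 for action in ACTIONS}
    for step in range(steps):
        if step < len(ACTIONS):
            action = ACTIONS[step]              # try every choice once
        elif rng.random() < explore:
            action = rng.choice(ACTIONS)        # occasionally experiment
        else:
            action = max(ACTIONS, key=value.get)  # otherwise use the best so far
        reward = feedback(action)
        tries[action] += 1
        # Nudge the running average toward the reward just received.
        value[action] += (reward - value[action]) / tries[action]
    return value


value = learn()
best = max(value, key=value.get)
print("learned values:", value)
print("preferred choice:", best)
```

The program is never told that "B" is the right answer. It only receives a score after each choice, and over many iterations it settles on the choice that scores best.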

These models are all very complex and use probability and statistics to make their predictions and evaluations. One of the things to keep in mind with these models is that they are using data created by humans. As such, that data carries the biases of its human creators and can cause the models to behave in unintended ways. Therefore, these models have to be continually monitored and tuned by humans to try to correct for these inherent biases.

We have just scratched the surface of what is involved in machine learning, as this is an entire field unto itself, but we have laid a good foundation for you to begin to understand what is behind the Generative AI application that you may be using. In the next post, we will talk about neural networks and large language models.

