Over the past few months, I have been delving into this
subject called Generative AI. Unless you have been trapped on a desert island
somewhere without any communication with the outside world, it is pretty
difficult to miss all the talk and hype around this subject. As someone who
makes his living off of technology, and sometimes bleeding-edge technology, I
felt it would be wise to start learning a little bit about it. What I didn't
expect was the rabbit hole I would fall into as I began to learn. This post is
the first in a series about what I have learned, in an attempt to bring you, my
readers, up to speed on this subject without having to go through the hundreds
of blog posts and papers that I have. My goal with this series is to enlighten
you on the topic of Generative AI and, hopefully, give you a jumpstart into
this space: teach you about the components of Generative AI; show you how the
magic works to provide the answers you receive when you ask ChatGPT, Claude 2
or Bard a question or give it a task; delve a little bit into the current
implementations of Generative AI that are out there; and give you some insight
into how to use these tools through something called prompt engineering.
What is Generative AI?
So, let’s get started by defining what this thing is that we
call Generative AI. In a nutshell, Generative AI is a lot like having a genius friend
who can consume and remember any piece of information that has been fed to it.
In turn, it can use this knowledge to invent new and exciting things like art,
music and stories, and even to help scientists and experts solve problems. Okay,
so that is a pretty high-level, generic definition that could have come from the
marketing department. So, let's take a deeper look at Generative AI and talk a
little about how it works.
Where does all of this knowledge come from?
Behind a Generative AI application are complex machine
learning models that are trained to understand the large datasets that are
fed to them. A machine learning model is basically a program that allows
computers to learn without having to be explicitly programmed. In the days
before AI (or before most of us knew about it anyway), computers had to be given
explicit instructions via a program for everything you asked them to do. For
example, when you click on an icon to launch a program, there is a set of
instructions, or code, that tells the computer what it is supposed to do,
i.e., go to the exact path where the executable file associated with the icon
you just clicked is stored and run it. Everything you ask your computer or a
specific application to do requires explicit code that the computer runs to respond
to your request. Machine learning takes things a step further and gives computers
the ability to reason or, more accurately, to predict what they should do when
you ask something of them. Now, for the foreseeable future, this won't change
the way your computer works. For the next few years, it will still need explicit
code in programs to tell it how to respond to a given user action.
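To make that contrast concrete, here is a tiny, purely illustrative sketch in Python. The first function is the traditional approach, where a human hard-codes the rule; the second is a (very) simplified stand-in for machine learning, where the program works out the rule from examples. The "is this person an adult?" scenario and all of the numbers in it are made up just for this example.

```python
# Contrast: explicit instructions vs. learning a rule from data.
# All values below are invented purely for illustration.

# 1) Traditional programming: a human hard-codes the rule.
def is_adult_explicit(age: int) -> bool:
    return age >= 18  # the programmer wrote this rule by hand

# 2) A (very) simplified "learning" approach: the program is given
#    labeled examples and works out the threshold itself.
examples = [(12, False), (15, False), (17, False), (19, True), (25, True), (40, True)]

def learn_threshold(examples):
    # Find the boundary between the highest "False" age and the lowest "True" age.
    falses = [age for age, label in examples if not label]
    trues = [age for age, label in examples if label]
    return (max(falses) + min(trues)) / 2

threshold = learn_threshold(examples)
print(threshold)        # 18.0 -- inferred from the examples, not hard-coded
print(30 >= threshold)  # True: the "learned" rule applied to new input
```

Real machine learning models do something far more sophisticated than finding a single threshold, but the core idea is the same: the rule comes from the data rather than from a programmer.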
What machine learning and Generative AI are starting to give
us today are systems that can be used as the basis to solve complex problems;
create automated systems, such as chatbots, that can answer questions and
provide contextual information, freeing up humans to do higher-value
work; create websites or small applications from a few-sentence
description that the user inputs; create photo-realistic artwork from
a few words describing what the user wants; create outlines for
ad campaigns and papers, or even write stories and ad copy; analyze patterns of
behavior to catch criminals; and even create formulas for new and exciting
pharmaceutical medications. And this is just the start. We are in the very early
days of this process, even though researchers, scientists and data engineers
have been working on it for decades. In the long term, we can expect that AI
will be used to offload even more mundane or low-level tasks, automate
monitoring of systems and, of course, bring us widespread adoption of the
long-promised self-driving vehicle.
Now that we have some understanding of what machine learning
is, let’s talk a little bit about how these models are developed and how they
work.
Machine learning basics
Sidebar:
The term machine
learning was actually coined way back in the 1950s, which gives you an idea of
just how long this whole process of developing so-called artificial
intelligence has taken. What's even more amazing is that that wasn't even the
beginning of the groundwork for artificial intelligence. The roots of AI
actually go all the way back to the early 1900s. The biggest movement in AI
didn't really start to happen, though, until the 1950s.
Machine learning is basically the process of inputting large
amounts of data into machines (we'll refer to these as computers going forward)
and having those computers analyze that data for patterns, which they can then
turn around and use to make predictions when they are given a task or question
relating to that data. This isn't a one-and-done process. It
is actually a very iterative process that usually requires several different
phases and methods of machine learning. Initially, it might use something
called supervised machine learning, which is where the computer receives
labeled data, such as pictures that are labeled as cat or dog, or other things
that have been classified or labeled by humans. This allows the computer to begin
to recognize things. If you feed it enough different labeled pictures of cats and
dogs, it will start to be able to recognize, or predict based on that information,
whether a picture that you show it is either a cat or a dog. Obviously, this is a
very time-consuming endeavor, but it is really necessary to teach the model how to
recognize basic things.
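To make that a little more concrete, here is a minimal, illustrative sketch of supervised learning in Python using scikit-learn. The features (weight and ear length) and the tiny "cat vs. dog" dataset are invented for this example; a real model would learn from thousands of labeled images, not a handful of hand-picked numbers.

```python
# A toy example of supervised learning: the model is given labeled
# examples and learns to predict the label for new, unseen data.
# The features and values below are invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each example is [weight_kg, ear_length_cm], labeled by a human.
features = [
    [4.0, 4.5],    # cat
    [3.5, 5.0],    # cat
    [5.0, 4.0],    # cat
    [20.0, 9.0],   # dog
    [30.0, 11.0],  # dog
    [8.0, 7.5],    # dog
]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

# "Training" is the computer finding patterns in the labeled data.
model = DecisionTreeClassifier()
model.fit(features, labels)

# Prediction: the model has never seen this exact animal before,
# but it can guess the label based on the patterns it learned.
print(model.predict([[4.2, 4.8]]))    # likely ['cat']
print(model.predict([[25.0, 10.0]]))  # likely ['dog']
```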
Unsupervised machine learning is where computers begin
to recognize patterns in unlabeled data. This could be anything,
such as data about your customers, marketing, stock prices, etc. Unsupervised
machine learning often uses a technique known as clustering to segment
this data and try to discern patterns. With clustering, the computer uses
statistical rules to determine how likely it is that a particular item belongs
to a certain group or cluster.
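Here is a small, illustrative sketch of clustering using scikit-learn's KMeans. The "customer" numbers (annual spend and purchases per year) are invented for this example; the point is simply that no labels are supplied and the algorithm groups similar items on its own.

```python
# A toy example of unsupervised learning: no labels are given.
# The algorithm groups similar data points into clusters on its own.
# The "customer" numbers below are invented purely for illustration.
from sklearn.cluster import KMeans

# Each row is [annual_spend_dollars, purchases_per_year].
customers = [
    [200, 2], [250, 3], [300, 4],        # occasional buyers
    [5000, 40], [5200, 45], [4800, 38],  # frequent, high-spend buyers
]

# Ask for two clusters; KMeans decides which customers go together.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
assignments = kmeans.fit_predict(customers)

print(assignments)              # e.g. [0 0 0 1 1 1] -- two groups found
print(kmeans.cluster_centers_)  # the "typical" customer in each group
```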
Reinforcement learning is basically a feedback
mechanism: the model takes an action and receives a reward or penalty that tells
it how good that choice was. Again, this requires many iterations and quite a bit
of human intervention along the way. Eventually, the model has enough data that
it can begin evaluating its decisions and learn from its mistakes. This can also
involve additional models that act as a checkpoint for the main model and provide
feedback.
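Here is a deliberately tiny sketch of that reward-and-feedback loop, using a made-up "two slot machines" scenario in plain Python. Real reinforcement learning systems are far more sophisticated, but the cycle of act, receive feedback, adjust is the same.

```python
# A toy reinforcement-learning loop: try actions, get rewards,
# and gradually prefer the action that works out best.
# The reward probabilities are invented purely for illustration.
import random

random.seed(0)
true_reward_prob = {"A": 0.3, "B": 0.7}  # hidden from the learner
estimates = {"A": 0.0, "B": 0.0}         # the learner's running estimates
counts = {"A": 0, "B": 0}

for step in range(1000):
    # Mostly pick the action we currently think is best,
    # but sometimes explore the other one.
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(estimates, key=estimates.get)

    # Feedback: a reward of 1 or 0, depending on how the world responds.
    reward = 1 if random.random() < true_reward_prob[action] else 0

    # Learn from the feedback by nudging the estimate toward the reward.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # the estimate for "B" should end up near 0.7
```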
These models are all very complex and use probability and
statistics to make their predictions and evaluations. One of the things to keep
in mind with these models is that they are trained on data created by humans. As
such, that data carries the biases of its human creators and can cause the models
to behave in unintended ways. Therefore, these models have to be continually
monitored and tuned by humans to try to correct for these potential inherent
biases.
We have just scratched the surface of what is involved in
machine learning, as this is an entire field unto itself, but we have laid a good
foundation for you to begin to understand what is behind the Generative AI
applications that you may be using. In the next post, we will talk about neural
networks and large language models.