A Step-by-Step Guide on Using JavaScript Actions to Supercharge Your Custom GPTs
This tutorial is the second in a series of tutorials covering increasingly complex Custom GPTs. In this installment, you will learn how to power up your Custom GPT with custom code using serverless workers from Cloudflare. Our example GPT will be able to write content with exact word counts, while ChatGPT often fails at this task.
If you are comfortable with creating and publishing Custom GPTs on ChatGPT using normal text instructions, then you should be good to proceed with this tutorial. If, however, you'd like a refresher or a walkthrough on the process of creating a GPT, please refer to the first tutorial in the series: A Step-by-Step Guide on Building Custom GPTs to Make the Most Out of ChatGPT, where I cover the entire process of creating a Custom GPT up until the start of this tutorial.
Basic requirements checklist
- ChatGPT Plus account
- Cloudflare free account
- A reliable web browser to access ChatGPT
Overview of Contents
- Concept Definitions
- Creating custom actions
- Creating and deploying a Cloudflare worker
- Creating OpenAPI schema for ChatGPT
- Adding API key authentication
- Finalizing CGPT functionality
- Testing
Concept Definitions
To provide some background information, I will briefly review the technical terms that will appear in this tutorial. But if you have enough familiarity with any of these terms, feel free to jump ahead!
The Word Count Problem
It was soon after ChatGPT was released that people started realizing that it counts worse than a toddler. For whatever reason, someone decided to ask ChatGPT to count the number of R's in the word "strawberry." It got it right sometimes, but other times it guessed 4 or 2. Just now, I asked it to count a misspelled version of the word that has 4 Rs, and it corrected the word and then counted. Either way, counting letters and words is one thing ChatGPT is really bad at doing.
One common ramification of this problem is that you can't reliably ask ChatGPT for content with a certain word count. Ask for a 500-word post, and you may get 200 or 900. Ask again and the new response is just as wildly variable. In today's tutorial, we'll work on solving this problem with a Custom GPT that can count words.
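For contrast, here is how little code it takes to count letters reliably. This is just a minimal JavaScript sketch to illustrate the point; the function name countLetter is my own choice and is not part of anything we build later.

```javascript
// Count how many times a single letter appears in a word, ignoring case.
// countLetter is a hypothetical helper name used only for this illustration.
function countLetter(word, letter) {
  const target = letter.toLowerCase();
  let count = 0;
  for (const ch of word.toLowerCase()) {
    if (ch === target) count += 1;
  }
  return count;
}

console.log(countLetter("strawberry", "r")); // 3
```

Unlike an LLM, this gives the same answer every time you run it.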
LLM (ChatGPT)
To start with the basics, an LLM (Large Language Model) is a machine learning model that can perform what is technically referred to as “Natural Language Processing” tasks. These include classifying text, translating, generating, and identifying patterns in ways that allow answering questions and communicating in a natural, human-sounding way. For this use case, we will strictly work with ChatGPT as our LLM UI of choice. If you’re interested in more in-depth information about Machine Learning and LLMs, try out this custom GPT I created to help with that!
Custom GPT
Unlike the normal chat window you use when accessing ChatGPT, a custom GPT allows you to provide additional instructions that are retained in any future chat with this GPT. For example, you can create a custom GPT that can only speak in puns. The directions under the hood will simply look like this: “You only speak in puns.”
This is the most basic level of configuring a Custom GPT. As I mentioned earlier, if you're interested in a more detailed walkthrough of how to set up your first Custom GPT, check out the first tutorial in the series. For this installment, however, we will create the most basic Custom GPT and focus on the integration with Cloudflare workers, which will allow us to run JavaScript code on AI responses.
Since the term Custom GPT will be repeated frequently in this tutorial, I will use the abbreviations CGPT or GPT going forward.
Custom GPT Actions
This feature allows you to connect your CGPT to external APIs, which essentially means that you can allow your GPT to connect to external services and add more functionality. This will be our main area of focus today as we connect a CGPT to a serverless Cloudflare worker.
For example, you can add real-time weather information, real-time news data, search results acquired from various specific search engines, integrations with software that provides an API, and much more.
API (Application Programming Interface)
APIs, provided by software developers, allow applications to communicate with each other. For example, Google provides APIs for its various services. So if we wanted to build an application that shows a map with the users’ locations, we could use the Google Maps API to provide this to users without creating our own map software or hosting any of the code on our side. We would send the user’s address to the API, and the API would return the code necessary to display the map.
In today’s tutorial, we will write some custom code, host it on a serverless Cloudflare worker, and connect it to your CGPT via an API, opening the doors to endless possibilities.
A basic example of API usage can be seen in this application that I launched in 2023, which allows users to use natural language to describe what they want to watch, and the app uses a series of APIs to curate movies and TV shows.
As you see, I created a movie search application that displays a lot of information without needing a personal database of movies, which can take massive amounts of time and resources to develop. Instead, I pay about $5/month to use this API.
Serverless Workers (using Cloudflare)
Despite the name, serverless workers do run on servers; the provider simply manages those servers for you. In other words, they are functions, or code, that can run on demand without any infrastructure configuration on your part. On top of that, Cloudflare is free to start with and likely to remain so for most personal use cases. However, please note that heavily using a Cloudflare worker will result in on-demand charges from Cloudflare. I highly recommend that you familiarize yourself with their pricing if you like this method. In this tutorial, we will use Cloudflare workers to showcase how you can think up any custom functionality, convert it to a Cloudflare worker with ChatGPT, and then integrate the worker with your custom GPT using custom actions (API).
This diagram shows how a serverless worker can provide a simple solution for LLM hallucinations and areas where LLMs fall short, such as counting and doing math in general.
Why JavaScript?
JavaScript is excellent at processing text responses, working with math, and integrating with other services. Additionally, ChatGPT writes almost impeccable JavaScript, which will allow us to create robust functionality while not directly writing or modifying code ourselves.
Overall, however, it's important to note that any programming language would likely fit the bill. The idea here is to augment what LLMs can do with what programming languages have always been able to do. For example, programming languages are excellent at looping through data, organizing it, adjusting its formatting, outputting tables, files, etc. Oftentimes, ChatGPT falls short when it comes to tasks like this, or at the very least, fails at handling large amounts of data and ends up hallucinating.
Use this approach only as needed. In other words, it's always best to first try what you need to achieve directly in ChatGPT, evaluate the results you're getting, and then decide if you need to augment functionality with code.
Creating a Custom Action
Let's go ahead and create a blank CGPT, then give it the basic instructions: “You count the words in the pasted text.” Then, I’ll proceed to test the GPT by pasting in a “Lorem Ipsum” snippet that contains exactly 42 words.
ChatGPT guessed 61 words, which is a wildly incorrect guess. We do know, however, that counting words is very easy to do using pretty much any programming language, so it’s now time to look into integrating a serverless Cloudflare worker with our custom GPT to allow it to do things beyond ChatGPT’s capabilities!
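To show just how easy word counting is in code, here is a minimal JavaScript sketch. It assumes a "word" is any run of characters separated by whitespace, which is my own simplification; the code your GPT generates later may define words a little differently.

```javascript
// Count words by splitting on runs of whitespace.
// Assumption: a "word" is any run of non-whitespace characters,
// so "don't" and "well-known" each count as one word.
function countWords(text) {
  const trimmed = text.trim();
  if (trimmed === "") return 0;
  return trimmed.split(/\s+/).length;
}

console.log(countWords("This is a sample text to count words")); // 8
```

The trim step matters: without it, text that starts or ends with a space would produce an empty extra "word" and throw the count off by one.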
Let’s go ahead and click the button “Create new action” to start building the action that will help us get around this problem.
The next screen is where the magic happens. Here, we have a spot to paste “OpenAPI schema,” authentication information, and a link to a privacy policy. Here’s a brief explanation of what each of these does:
- OpenAPI schema: To allow communication with software from the outside world, APIs provide what are known as endpoints, which are most commonly URLs. Just like a regular web address, an API URL lets you pass additional information along with it. Google.com, for example, lets you pass a search parameter, such as google.com/search?q=pigeons, and it will immediately search for “pigeons” instead of taking you to the homepage. In the same way, you can pass information to an API and have it return responses based on that information. An OpenAPI schema is a formalized way of describing an API's endpoints and methods (available functionality) without code, using a simple structured format.
- Authentication: APIs, since they provide access from the outside world, are traditionally built with authentication requirements in order to protect them from public access. This is akin to your email account or your computer having a password. APIs can use various methods to authenticate. ChatGPT supports authentication using API keys or OAuth, as well as running without authentication. In this tutorial, we’ll be running our workers without authentication first. Then we will proceed to incorporate API key authentication after testing everything.
- Privacy Policy: If you’re offering your custom GPT for people to use and using an external API, whether it’s one that you create or a third-party API, you are responsible for researching their privacy practices, as well as defining your own comprehensive privacy policy and including a link to it in the GPT.
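To make the endpoint idea from the OpenAPI schema bullet a bit more concrete, here is a small JavaScript sketch that attaches parameters to a URL. The helper name buildUrl and the worker address are assumptions I made up for illustration; your own worker will have a different address.

```javascript
// Build API-style URLs by attaching query parameters to an endpoint.
// The URL class is built into browsers, Node.js, and Cloudflare workers.
// buildUrl is a hypothetical helper name used only for this illustration.
function buildUrl(endpoint, params) {
  const url = new URL(endpoint);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// The Google search example from the list above.
console.log(buildUrl("https://www.google.com/search", { q: "pigeons" }));
// https://www.google.com/search?q=pigeons

// A made-up worker address; spaces are encoded as "+" automatically.
console.log(
  buildUrl("https://wordcounter.example.workers.dev/", { text: "count these words" })
);
// https://wordcounter.example.workers.dev/?text=count+these+words
```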
So now that we’ve covered the general settings, let's go ahead and jump over to the Cloudflare side to start our setup there.
Creating your first Cloudflare Worker
To create a Cloudflare worker, all you need is a free account at cloudflare.com, and you will be ready to go. Once you’re logged in and looking at your dashboard, locate a menu item on the left side labeled “Compute (Workers).”
Next, click the blue “Create” button to create your first worker.
Then go ahead and click the “Create Worker” button under “Create a Hello World” worker because our goal is to create a blank JavaScript worker that we can use ChatGPT to populate.
And on the next window, all you’ll have to do is give your worker a name, which in this case will be “wordcounter,” and click the “Deploy” button at the bottom to publish your worker.
Once the worker is deployed, you will see the next screen, where you can click “Edit Code” to start working on the worker.
The next screen will give you a space to work on code and a preview area on the right-hand side.
And in order to make your life easier, I went ahead and created a free custom GPT that will take your request and immediately output the complete code for a Cloudflare worker, without any special instructions. Here’s the link to the GPT. We will be using this GPT now to run this very basic prompt: “Create a word counter.”
We will copy all the generated code and paste it into our newly created worker. Then, we will click the “Deploy” button to save our changes. And now it’s time to test our worker.
Under the code provided by ChatGPT, as you see in the screenshot above, you will see a test string; here, it looks like this:
?text=This+is+a+sample+text+to+count+words
Copy this text and paste it right after the URL in the preview window on the right side of the Cloudflare editor, then click the “Go” button right next to it.
If everything is configured correctly, which it is in this case, you should see a word count in the black preview area. This is the word count of the sample sentence: "This is a sample text to count words."
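If you're curious what kind of code sits behind that result, here is a minimal sketch of a word-counting worker. The code the GPT generates for you will likely differ; the names countWordsInUrl and handleRequest, the wordCount field, and the example address are all my own assumptions for illustration.

```javascript
// Pull the "text" query parameter out of a URL and count its words.
// URLSearchParams decodes "+" as a space, so the test string works as-is.
function countWordsInUrl(urlString) {
  const text = new URL(urlString).searchParams.get("text") || "";
  const trimmed = text.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  return { wordCount };
}

// Worker entry point: answer every request with the count as JSON.
async function handleRequest(request) {
  const body = JSON.stringify(countWordsInUrl(request.url));
  return new Response(body, {
    headers: { "Content-Type": "application/json" },
  });
}

// In the Cloudflare editor, the worker would end with this line:
// export default { fetch: handleRequest };

console.log(
  countWordsInUrl(
    "https://wordcounter.example.workers.dev/?text=This+is+a+sample+text+to+count+words"
  )
); // { wordCount: 8 }
```

Keeping the counting logic in its own small function makes it easy to check on its own before it is wired into the worker.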
With the functionality completed on Cloudflare, we’re now ready to integrate our worker with the custom GPT we created earlier. The first step is to generate an OpenAPI schema for our worker. For this, we will be using a custom GPT provided by OpenAI called “Actions GPT,” which allows us to generate an OpenAPI schema from any code or documentation. You can access it at this link. Create a new chat with Actions GPT, then paste the code we just added to our worker. Before submitting your prompt, also copy the worker’s URL exactly as it looked when we last tested it, with the sample text still attached, and paste it at the end of the prompt, below the worker code.
Leaving in the sample text will help Actions GPT generate the correct schema and give us the right instructions. No additional explanations in your prompt are necessary.
Now, we will copy all of the generated specifications and paste them into our custom GPT in the OpenAPI Schema box. As soon as you paste the schema, you should see one or more “Available actions” immediately pop up under the box. In this case, we get one action “getWordCount.” In some cases, you may get a red error message in this location. If you do, simply copy the error message and paste it as a follow-up in the Actions GPT chat, and it will correct the output. Now, we’re ready to test with our original Lorem Ipsum string. You will likely be prompted to “Allow” ChatGPT to communicate with your worker, and then voila! We get the correct result.
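For reference, a schema for a worker like this could look roughly like the sketch below. I wrote it as a JavaScript object that prints itself as JSON, a format the schema box should accept. Only the getWordCount action name comes from our example; the title, server address, and wordCount field are placeholders I made up, and the schema Actions GPT generates for you will likely differ in its details.

```javascript
// A hypothetical OpenAPI description of the word counter worker.
const schema = {
  openapi: "3.1.0",
  info: { title: "Word Counter", version: "1.0.0" },
  // Placeholder address: replace it with your own worker URL.
  servers: [{ url: "https://wordcounter.example.workers.dev" }],
  paths: {
    "/": {
      get: {
        operationId: "getWordCount",
        summary: "Count the words in a piece of text",
        parameters: [
          {
            name: "text",
            in: "query",
            required: true,
            schema: { type: "string" },
          },
        ],
        responses: {
          "200": {
            description: "The number of words found",
            content: {
              "application/json": {
                schema: {
                  type: "object",
                  properties: { wordCount: { type: "integer" } },
                },
              },
            },
          },
        },
      },
    },
  },
};

// Print the schema as JSON text.
console.log(JSON.stringify(schema, null, 2));
```

The operationId is what shows up under “Available actions,” and the parameters list tells the GPT to send the text as a query parameter named text.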
Additionally, the “[debug]” notes will no longer appear once we publish our custom GPT, so we can use it ourselves or share it with others.
This is all great so far, but this API endpoint is public. That means that if someone gets their hands on your secret link, they will be able to use it in their own software, potentially costing you money on Cloudflare or misusing your code, depending on what it does. So let's go ahead and set up authentication.
API Key Authentication
I left this step till the end because it introduces a complication, and testing with multiple unknowns can be confusing for beginners. However, if you're comfortable with these concepts, then I'd say set your worker up with API key authentication from the start.
Let's return to our Javascript Worker Creator GPT, type the sentence "Add API_KEY authentication," and follow it with the entire script from the worker we created earlier so it will look like the prompt in this screenshot. As you see, you will get a response with slightly different code, containing API key checks.
Copy this provided code and paste it into the code editor for the Cloudflare worker and deploy it.
Next, we will generate a new OpenAPI schema for the updated code using the Actions GPT. Simply paste the updated code, along with the test link, as we did earlier.
Now copy the generated OpenAPI schema and paste it in your CGPT in the custom action settings. Luckily, this time, we got an error, so I can show you how to deal with errors.
If anything is wrong with the schema you pasted, you will see a red message under the box telling you what the problem is. Simply copy this line of text and paste it as a follow-up message in the Actions GPT Chat. ChatGPT will swiftly apologize, explain what went wrong, and give you an updated, likely error-free schema. Should you get another error when you paste the schema, simply keep repeating this process as many times as it takes. Eventually, it will work. In most cases, however, it works the first or second time. Of course, this is the no-code approach, so if you have no idea what's happening here, that's ok, though it's recommended to at least skim the responses provided by ChatGPT.
Now that we have completed this part of the setup, let's finish setting up our API key on Cloudflare.
Generate and save your API key
For an API key, you can pretty much use any random alphanumeric string. However, I like to use this website, which simply lets me get a random key by clicking a button: https://generate-random.org/api-key-generator
Copy the generated "yellow" string, by clicking the button under it, then let's head back over to Cloudflare. Locate the part of the code where it says "YOUR_API_KEY" and replace that part of the text with the random string we just generated, then deploy the worker.
Add the API key to your Custom GPT
And the final step is back in our custom GPT. Right above the "OpenAPI Schema" box where we pasted the schema last, there is a small setting labeled "Authentication" with a little settings "gear" on the right edge. Click that to show the authentication window.
In the window that appears, you'll note that there are various methods available for authentication, which makes for a robust set of options. For this particular tutorial, we will choose "API Key" for Authentication Type, paste the API key in the second field where it says [Hidden], and then select "Bearer" for the Auth type option. In future tutorials, I will showcase other integrations that may require using the other types of authentication. Click Save, and we should be ready to make our final test!
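With "API Key" and "Bearer" selected, the GPT sends the key in an Authorization header in the form "Bearer" followed by the key. On the worker side, the check might look something like this minimal sketch. The name isAuthorized is hypothetical, and the code your GPT generated may handle this differently.

```javascript
// Placeholder only: use the random key you generated earlier.
const API_KEY = "YOUR_API_KEY";

// Return true when the Authorization header carries the expected key.
// isAuthorized is a hypothetical helper name used only for this illustration.
function isAuthorized(authorizationHeader, expectedKey) {
  if (typeof authorizationHeader !== "string") return false;
  const parts = authorizationHeader.trim().split(/\s+/);
  return parts.length === 2 && parts[0] === "Bearer" && parts[1] === expectedKey;
}

// Inside the worker, a failed check would return a 401 before counting:
// if (!isAuthorized(request.headers.get("Authorization"), API_KEY)) {
//   return new Response("Unauthorized", { status: 401 });
// }

console.log(isAuthorized("Bearer YOUR_API_KEY", API_KEY)); // true
console.log(isAuthorized("Bearer wrong-key", API_KEY)); // false
console.log(isAuthorized(null, API_KEY)); // false
```

Any request that arrives without the right key gets turned away before the word counting code ever runs.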
Again, I'm using a Lorem Ipsum string that's 42 words long. Let's see if we're still counting correctly:
Yep! We're doing great.
Taking it one step further
Now that we have a reliable word counter, in order to turn this into something substantial, all we have to do is adjust our CGPT instructions a little bit. In other words, we can choose when to invoke custom actions based on custom instructions. In this case, I will attempt to have this GPT try to produce content with accurate word counts, which is something that normal ChatGPT struggles with.
The instructions should be straightforward. Here's what I'll type:
Prompt: When you're asked to write something with a specific word count, first output the text and then send it to the word counter to see if it matches the requested number of words. If not, identify the number of words that need to be added or removed and rewrite the output based on that. Continue doing this until you get something that is within 10% of the requested length.
The 10% guidance is just a failsafe to give it something to check against. The truth is, it will not really know what 10% is, and most of the time, it will try to get an exact word count match. However, in certain cases, using instructions like this can result in an endless loop without allowing some leeway. This 10% will help the CGPT stop and exit infinite loops after a few attempts.
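The GPT carries out this check on its own, based on the plain-English instructions. Written as code, the stopping rule amounts to something like the sketch below. The name checkLength is hypothetical, and nothing like this runs in our worker; it's only here to show the arithmetic.

```javascript
// Decide whether a draft is close enough to the requested word count.
// tolerance is a fraction of the target: 0.1 means "within 10%".
// checkLength is a hypothetical helper name used only for this illustration.
function checkLength(actualWords, targetWords, tolerance = 0.1) {
  // Positive difference: words to add. Negative: words to remove.
  const difference = targetWords - actualWords;
  const closeEnough = Math.abs(difference) <= targetWords * tolerance;
  return { closeEnough, difference };
}

console.log(checkLength(41, 42)); // { closeEnough: true, difference: 1 }
console.log(checkLength(200, 500)); // { closeEnough: false, difference: 300 }
```

For a 42-word request, 10% works out to about 4 words, so a 41-word draft already passes, even though the GPT will usually keep going for an exact match.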
As you can see, the CGPT wrote 41 words first and then corrected the text to 42. Upon testing, sure enough, it got it right!
This concludes this installment of the tutorial. If you run into any issues or have any questions or comments, feel free to share them! Additionally, if you enjoyed this tutorial, please follow me for more in-depth walkthroughs and tutorials.
Other articles you might like
If you enjoyed this tutorial, you might like these other tutorials I wrote recently:
- Part 1 of this tutorial series: A Step-by-Step Guide on Building Custom GPTs to Make the Most Out of ChatGPT
- Get high-quality results from A.I. by using outlandish combinations of worlds in your prompts.
- Learn to create custom Chrome browser Extensions with ChatGPT and zero coding.
- Tutorial: A complete beginner's guide to building a website using AI - From Brand to Launch