How to create an e-commerce product description generator

Welcome! This workshop uses Google's Gemini LLM to build an e-commerce product description generator, as part of the Build With AI event series.

Gemini LLM

Gemini is a family of multimodal large language models developed by Google DeepMind. Unlike other LLMs, Gemini was said to be unique in that it was not trained on a text corpus alone and was designed to be multimodal, meaning it could process multiple types of data simultaneously, including text, images, audio, video, and computer code. - Source

Vertex AI

Vertex AI is a fully-managed, unified AI development platform for building and using generative AI. Access and utilize AI Studio, Agent Builder, and 130+ foundation models including Gemini 1.5 Pro—all from Vertex AI. - Source

E-commerce product description generator

The goal of this workshop is to build an e-commerce product description generator that focuses on fashion products. For many products published on an e-commerce website, the usual process is that photos of each product are taken in a studio, generally with a model wearing it.

With the photos of the product (with or without a model) and the product's details, which include brand, material, wash instructions, etc., a copywriter writes a description for each product. The description then goes through a copy edit process and, when the product is finally published on the website, it is included on the product page. Below is a sample product description of women's jeans by Guess (possibly written by a human):

Guess Jeans Product description

This workshop is about automating that process: the copywriter (or someone else) uploads the photos and asks Gemini to generate a product description. The photos used in this workshop are royalty-free images by Dmitriy Steinke from Pexels.

To begin, you will need the following prerequisites:

A working Google Cloud account (with some credit)
The repository cloned, or this zip file downloaded, on your machine for the product images. If you download the zip file, please decompress it.

Please be aware of the Vertex AI Pricing as well.

Go to your Google Cloud Console and Create a new project called gemini-ecomm or anything relevant as seen below:

Create a new GCP project

Make sure you have selected the project created in step 1 if you have multiple projects.
Go to Vertex AI from your Google Cloud Console. The easiest way to do it is to search for vertex in the search bar, as seen below:

Search vertex on GCP console

Click on Vertex AI
On the Vertex AI page, click "Enable all recommended APIs" as seen below (it will take some time):

Enable all related Vertex AI APIs

After the APIs are enabled, click on Freeform found on the left menu

Click on Multimodal on Vertex AI page

After that you will see a screen like below, where you can type your prompt and upload images:

Screen to type in your prompt and upload images

On the Prompt experiment page, please make sure you have the gemini-1.0-pro-vision-001 model selected. Then, paste the following prompt in the Prompt text box:

As an expert e-commerce copywriter, analyze the uploaded images of
women's jeans and write a product description for a low to mid-end 
fashion e-commerce website. Please include the details about the 
comfortable to wear jeans and do not include any details about the 
price. Make sure that the copy is written in an engaging and friendly tone.

Then upload the images found in the repository, or in the images folder of the unzipped images.zip file. After clicking the Insert Media option in the middle of the Prompt text box, navigate to the womens-jeans-photos folder. Then click Upload and upload all 8 images. Once they are uploaded, it will look something like the below:

The prompt with the images upload

After that, hit the > button to submit and test the prompt with the uploaded images. You should get a response similar to the following:

The response to the prompt with product description

At this point, it would be a good idea to save your prompt (with images). To do this, click the pen icon beside Untitled prompt above the prompt text box, type e-commerce-product-desc-generator, then click anywhere. It will look like the below while editing:

Name the prompt

Now that you have named the prompt, you can save it. To save the prompt, click Save on the top left part of the right sidebar as shown below:

Save button for the prompt

Then select the region (it is ok to choose us-central1) on the overlay window and save the prompt.

Save the prompt selecting a region

All saved prompts will be accessible in your Prompt management page. You can access it from the Prompt management link on the left sidebar.

All your saved prompts are on the prompt management page

Hurray! The basic e-commerce product description generator is working. Now, you will change some settings to make it better. You can go back to the prompt editing page by clicking the Prompt Name if you are on the Prompt management page.

In terms of configuration, Gemini 1.0 Pro Vision has four options you can set. They are explained in plain words below:

Temperature (Randomness/Creativity/Spice): Imagine a roulette wheel (randomness). A high temperature increases the spin's randomness, affecting the chosen word (output).
Output Token Limit (Length): This is like a set word limit (length) for your text. It caps how many tokens (pieces of words, not whole words) the LLM generates in total.
Top K (Choice): Think of this as picking from a shortlist (choice) of the most likely words. A lower K restricts the options for the next word.
Top P (Probability): This is like a probability wheel (probability). The LLM picks the next word from the smallest group of likely words whose combined probability reaches P, instead of always taking the single most likely word.
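
To make the four knobs a bit more concrete, here is a toy sketch in plain Python of how temperature, Top K, and Top P can shape the choice of the next word. It is an illustration only, not how Gemini is implemented: the scores in `toy_logits`, the function names, and the order in which the filters are applied are all assumptions made for this example.

```python
import math
import random


def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return filtered, renormalised next-token probabilities.

    `logits` maps each candidate token to a raw score (toy numbers here).
    `temperature` must be greater than 0 in this toy version.
    """
    # Temperature: divide the scores before the softmax. Low values sharpen
    # the distribution, high values flatten it (more randomness).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    biggest = max(scaled.values())
    weights = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top K: keep only the K most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches P.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    kept_total = sum(prob for _, prob in kept)
    return {tok: prob / kept_total for tok, prob in kept}


def sample_token(distribution, rng=random):
    """Draw one token according to the filtered probabilities."""
    tokens = list(distribution)
    weights = [distribution[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for four candidate next words.
toy_logits = {"comfy": 2.0, "stretchy": 1.5, "classic": 1.0, "purple": -1.0}

print(next_token_distribution(toy_logits, temperature=0.4, top_k=3, top_p=0.9))
print(sample_token(next_token_distribution(toy_logits, temperature=1.0)))
```

Try lowering `temperature` or `top_k` and watch the unlikely word "purple" drop out of the distribution. The Output Token Limit is not modelled here; it simply stops generation after a fixed number of such picks.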

Below is a configuration you can try out. The right settings depend on how you want Gemini to shape the output:

4 configs for the Gemini pro vision LLM

It is also important to set up the Safety Settings correctly for your use case. For now, we will set them to maximum safety (Responsible AI). As seen below, the safety settings (found on the right sidebar) are self-explanatory:

4 configs for the Gemini pro vision LLM

You can also tweak the prompt text to make it better. Below is another version of the prompt:

As an expert e-commerce copywriter, analyze the uploaded images of women's
jeans and write a product description for a low to mid-end fashion e-commerce
website. Please include the details about the comfortable to wear clothing and
do not include any details about the price. Make sure that the copy is written
in an engaging and direct tone.

You can play around with the prompt and make it more flexible or more specific as per your goals.
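
If you end up trying many variations, it can help to treat the prompt as a template. The sketch below is one possible way to do that in Python; `build_prompt` and its parameter names are made up for this example, and the defaults reproduce the first prompt of this workshop.

```python
def build_prompt(
    product="women's jeans",
    market="low to mid-end",
    highlight="comfortable to wear jeans",
    tone="an engaging and friendly tone",
):
    """Fill the workshop prompt with the parts you are most likely to vary."""
    return (
        "As an expert e-commerce copywriter, analyze the uploaded images of "
        f"{product} and write a product description for a {market} fashion "
        f"e-commerce website. Please include the details about the {highlight} "
        "and do not include any details about the price. Make sure that the "
        f"copy is written in {tone}."
    )


# The first prompt from this workshop.
print(build_prompt())

# The second version: more generic wording and a more direct tone.
print(build_prompt(highlight="comfortable to wear clothing",
                   tone="an engaging and direct tone"))
```

Keeping the fixed wording in one place makes it easier to compare outputs when only one part, such as the tone, changes.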

The optional code step is next.

If you want to create an API for the e-commerce description generator, or want more control over how the LLM is called, you can generate code and run it on a Google Cloud Platform service like Google Cloud Run. To generate code, click the <> Get Code link, which shows a slider on the right side as follows:

Get code for your Gemini experiment

For this workshop, you will use the Python code and try it out. For that, you will use Cloud Shell and the Cloud Shell Editor.

Click Activate Cloud Shell toward the top right corner of the screen as seen below:

Activate Cloud Shell

In the Cloud shell window, click Open Editor:

Open editor

This will take some time and open up the Google Cloud Shell Editor, which looks very similar to VS Code. In the Editor, click Hamburger Menu > Terminal > New Terminal as follows:

Open terminal in editor

In the editor's terminal, execute mkdir projects && cd projects && mkdir gemini-workshop && cd gemini-workshop and then pip3 install --upgrade google-cloud-aiplatform:

Commands executed

After the Vertex AI Python package is installed, it will look like the below:

Install Vertex AI Python package

After that, you will load the project folder in the editor. Go to Hamburger Menu > File > Open Folder:

Load project to Cloud shell editor

Then type in projects/gem and select the gemini-workshop option and click OK:

Load project to Cloud shell editor

It will load the folder in the Cloud Shell Editor. After that, to add a new file, click the file+ icon beside GEMINI-WORKSHOP and name it gemini.py

Create new index.js

For the contents of gemini.py, click <> GET CODE on the Vertex AI editor screen, select the Python option, and copy the code:

Get Node.js code for e-commerce description generator

Paste the code into the empty gemini.py file and save it:

Paste copied code
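
For orientation, here is a hand-written sketch of what a script like this can look like. It is not the code that Get Code produces, which may differ in structure and detail. It assumes the google-cloud-aiplatform package installed earlier, the gemini-1.0-pro-vision-001 model selected above, and the womens-jeans-photos folder copied next to the script. The function names, the placeholder project ID, and the configuration values are illustrative assumptions.

```python
import mimetypes
from pathlib import Path

PROMPT = (
    "As an expert e-commerce copywriter, analyze the uploaded images of "
    "women's jeans and write a product description for a low to mid-end "
    "fashion e-commerce website. Please include the details about the "
    "comfortable to wear jeans and do not include any details about the "
    "price. Make sure that the copy is written in an engaging and friendly tone."
)


def collect_images(folder):
    """Return (path, mime_type) pairs for the image files in `folder`, sorted by name."""
    pairs = []
    for path in sorted(Path(folder).iterdir()):
        mime_type, _ = mimetypes.guess_type(path.name)
        if path.is_file() and mime_type and mime_type.startswith("image/"):
            pairs.append((path, mime_type))
    return pairs


def generate_description(folder, project_id, location="us-central1"):
    """Send the prompt and every image in `folder` to Gemini and return the text."""
    # Imported here so collect_images can be tried without the SDK installed.
    import vertexai
    from vertexai.generative_models import (
        GenerativeModel,
        HarmBlockThreshold,
        HarmCategory,
        Part,
    )

    vertexai.init(project=project_id, location=location)
    model = GenerativeModel("gemini-1.0-pro-vision-001")

    image_parts = [
        Part.from_data(data=path.read_bytes(), mime_type=mime_type)
        for path, mime_type in collect_images(folder)
    ]

    # Strictest blocking level for every category (the "maximum safety" choice).
    strict = HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
    safety_settings = {
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: strict,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: strict,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: strict,
        HarmCategory.HARM_CATEGORY_HARASSMENT: strict,
    }

    response = model.generate_content(
        image_parts + [PROMPT],
        # Illustrative values only; use the ones you settled on in the UI.
        generation_config={
            "max_output_tokens": 2048,
            "temperature": 0.4,
            "top_p": 1,
            "top_k": 32,
        },
        safety_settings=safety_settings,
    )
    return response.text
```

To try it, add a line such as `print(generate_description("womens-jeans-photos", project_id="your-project-id"))` at the bottom of the file, replacing the placeholder with your own project ID.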

To run the code and test it out, again open the terminal from Hamburger Menu > Terminal > New Terminal and type in python gemini.py then hit enter. It will ask you to Authorise:

Paste copied code

After authorisation, the code will run and give an output like the below:

Code output

Congrats! You are a Gemini and Vertex AI novice now :). You can close the Cloud Shell Editor, and even shut down or delete the project if you like.

Further steps

The generated code is more like a proof of concept. You can add an API layer and UI on top of it to make it more useful. You can deploy that API on Google Cloud Run as serverless containers.
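
As a starting point for the API layer, here is a minimal sketch that uses only Python's standard library. Everything in it is an assumption for illustration: the /generate path, the JSON field names, and the placeholder `generate_description` function, which you would replace with the real Gemini call from gemini.py. Image upload is left out to keep the sketch short. It reads the port from the PORT environment variable, which is how Cloud Run tells a container where to listen.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_description(product_name, category):
    """Placeholder: swap in the real Gemini call from gemini.py."""
    return f"[description for {product_name} in {category} goes here]"


def handle_generate(body_bytes):
    """Turn a JSON request body into a (status_code, response_dict) pair."""
    try:
        payload = json.loads(body_bytes or b"{}")
    except ValueError:
        return 400, {"error": "body must be valid JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "body must be a JSON object"}
    product_name = payload.get("product_name")
    if not product_name:
        return 400, {"error": "product_name is required"}
    category = payload.get("category", "clothes")
    return 200, {"description": generate_description(product_name, category)}


class DescriptionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only one route in this sketch.
        if self.path != "/generate":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle_generate(self.rfile.read(length))
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve():
    """Start the server; blocks until interrupted."""
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), DescriptionHandler).serve_forever()
```

Calling `serve()` starts the server. Keeping the request logic in `handle_generate`, separate from the HTTP handler, makes it easy to test without opening a socket, and a web framework could replace the handler class later without touching that function.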

For instance, below is a basic UI generated with v0 using the prompt:

An internal tool for e-commerce websites to generate product descriptions,
it will have a product name text box, multi-file upload field, category
select box with clothes, shoes, accessory options, gender select box
with male, female, and unisex options and age select box with infants,
kids, teens, and adults options. Then a button that says Generate.

The UI is below:

Get Node.js code for e-commerce description generator

It would be a good idea to read more about LLMs in general and about Gemini in particular. You can also do courses or codelabs about Gemini on the Cloud Skills Boost platform.

Go back to the slides :).