build-with-ai-sydney AI for web devs Tamas Piros

Tamas Piros
Build with AI
Practical AI for Web Developers

Thank you for the standing
ovation!

Agenda
AI Terminology overview
Building sample apps

Artificial Intelligence
Machine Learning
Neural Networks
Deep Learning
Generative AI
Large Language Models

AI
Artificial Intelligence
Overall science of creating
machines that can perform tasks
that typically require human
intelligence

ML
Machine Learning
A subset of AI where machines
learn from data without being
explicitly programmed

Neural
Networks
A subset of ML inspired by the
structure of the human brain,
used for pattern recognition

Deep
Learning
Subset of neural networks with
many layers (deep) that excels in
processing complex data like
images, audio and text.

Gen(erative) AI
A part of deep learning focused
on creating new content like text,
images or music. In other words
it “generates” stuff.

LLM
Large Language Model
Type of Model trained on
massive amounts of text data. It
can answer questions and write
text.
Gemini is an LLM developed by
Google.

Model
A system that has been trained
to recognise patterns and make
predictions or decisions based
on data.
A tool that learns from examples
and uses that knowledge to
solve problems or perform a
task.

Model
Models can be tailored.
“Sit, stay, down” - your
average dog
But there are firemen’s
dogs, K-9s, guide dogs
etc.

Model
gemini-2.0-flash-001
gemini-1.5-flash
gemini-1.5-pro
text-embedding-004
aqa
+ others
https://ai.google.dev/gemini-api/docs/models/gemini

Multimodal
Model
A type of model that can
understand various inputs (e.g.
audio, image, video, text) and
generate various outputs (e.g.
audio, image and text)
gemini-2.0-flash-001 is a
multimodal model. (*coming soon)
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash

Transformer
Transformer architecture is a
super-smart text processor.
It uses an attention mechanism
with which it can figure out how
words relate to each other.
“The cat chased the mouse, and
it ran away”

Attention
Attention assigns a weight to
each word based on its
relevance to others, scoring their
importance.
This helps the system to
determine relationships between
words and use that context to
determine meaning.
https://www.youtube.com/watch?v=KJtZARuO3JY
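The weighting idea can be sketched in a few lines of JavaScript. This is a toy illustration only: the two-dimensional vectors for "cat", "mouse" and "it" are invented for the example, whereas a real Transformer learns much larger vectors and stacks many attention layers.

```javascript
// Toy attention weights: how strongly one word (the query) attends to others.
// All vectors below are made-up values for illustration, not real embeddings.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Softmax turns raw scores into weights that are positive and sum to 1.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Scaled dot-product scores between the query and every key, then softmax.
function attentionWeights(query, keys) {
  const scale = Math.sqrt(query.length);
  return softmax(keys.map((key) => dot(query, key) / scale));
}

const words = ['cat', 'mouse'];
const keys = [
  [1.0, 0.2], // hypothetical vector for "cat"
  [0.3, 1.0], // hypothetical vector for "mouse"
];
const itQuery = [0.2, 1.0]; // hypothetical vector for "it"

const weights = attentionWeights(itQuery, keys);
words.forEach((word, i) => console.log(word, weights[i].toFixed(2)));
```

With these invented vectors, "it" puts more weight on "mouse" than on "cat", which is the kind of signal the model uses to resolve what "it" refers to.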

Hallucination
Models can generate information
that is incorrect, irrelevant or
completely made up even
though it might sound plausible.
Always do fact-checking.

Hallucination
Gemini 2.0 is the least
hallucinating model.

https://github.com/tpiros/gdg-ai-workshop
There are several API keys that you will need.
Check the README in the repo for more information.

Transactional
These interactions generate an
answer based on an input but
they are one-off. They do not
“remember” the conversation.
There’s no actual conversation
here.

Transactional
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });
const prompt = 'What is Star Wars?';
const result = await model.generateContent(prompt);
console.log(result.response.text());

Transactional
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  const chunkText = chunk.text();
  process.stdout.write(chunkText);
}

Try it out
npm run prompt-stream

Temperature
Parameter that controls the
randomness (creativity) of a
model’s responses.
Different models support different
temperature ranges and defaults.
gemini-1.5-flash: 0.0 - 2.0 (default 1.0)
gemini-2.0-flash: 0.0 - 2.0 (default 0.7)

Temperature
What is the capital of Italy?
T0.2: “Rome”
T0.8: “Rome, the Eternal City”
T1.2: “Rome is the capital of Italy,
known for its ancient history and
landmarks like the Colosseum”
T1.6: “Rome, though Florence could claim
the title for its art and culture”
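Why a higher temperature produces more surprising answers can be sketched with a temperature-scaled softmax. This is a simplified model of what happens during sampling, not Gemini's actual implementation; the scores (logits) are invented, and the sketch rejects a temperature of 0 to avoid dividing by zero, whereas APIs typically treat 0 as "always pick the most probable token".

```javascript
// Temperature-scaled softmax: divide the scores by T before normalising.
// A low T sharpens the distribution, a high T flattens it.
function softmaxWithTemperature(logits, temperature) {
  if (temperature <= 0) {
    throw new RangeError('temperature must be greater than 0 in this sketch');
  }
  const scaled = logits.map((logit) => logit / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Made-up scores for three candidate next tokens.
const logits = [2.0, 1.0, 0.1];

const cold = softmaxWithTemperature(logits, 0.2);
const hot = softmaxWithTemperature(logits, 1.6);

console.log('T=0.2', cold.map((p) => p.toFixed(3)));
console.log('T=1.6', hot.map((p) => p.toFixed(3)));
```

At T=0.2 almost all probability sits on the top token; at T=1.6 the less likely tokens get a real chance of being sampled.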

Try it out
npm run temperature [float]

Chat
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });
// startChat() opens the session; the prompt is passed to sendMessage()
const chat = model.startChat();
const result = await chat.sendMessage('Hello');
console.log(result.response.text());

Chat
Allows for back-and-forth
discussion.
Do note that sendMessage()
keeps history by default, while
sendMessageStream() does not -
you need to manage it.

Chat
History can be maintained by
passing a history array to the
startChat method.
const history = [];
let result = await model
.startChat({ history })
.sendMessageStream(input);

Chat
Content needs to be associated
with a role. Gemini supports 2
roles:
user and model.
User is the role which provides
the prompts.
Model is the role that provides
the responses.

Chat
Note that the history array must
conform to the following
schema.
{
  role: 'user',
  parts: [{
    text: userInput
  }]
}
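A small helper can keep the history array in this shape. This is a minimal sketch: `addTurn` is a hypothetical helper name, not part of the SDK, and the model reply is a hand-written placeholder.

```javascript
// Append a turn to the history in the { role, parts: [{ text }] } shape.
// `addTurn` is a hypothetical helper, not an SDK function.
function addTurn(history, role, text) {
  if (role !== 'user' && role !== 'model') {
    throw new Error(`Unsupported role: ${role}`);
  }
  history.push({ role, parts: [{ text }] });
  return history;
}

const history = [];
addTurn(history, 'user', 'What is Star Wars?');
addTurn(history, 'model', 'A space-opera film franchise.'); // placeholder reply

console.log(JSON.stringify(history, null, 2));
```

The resulting array can then be passed as `history` to `startChat`, as on the previous slide.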

Try it out
npm run chat
npm run chat-stream
npm run chat-with-history

Token limits
Models have token limits on both
input and output.
For example gemini-2.0-flash-001
has a 1,048,576 input token limit
and an 8,192 output token limit.
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash

Model
parameters
Models take parameter values
that control how they generate a
response.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values

Max output
tokens
The maximum number of tokens the
model may return. Tokens are not
words: generally 100 tokens is
about 60-80 words.

Top-K
Top-K controls how a model
selects tokens by limiting the
choices to the K most probable
tokens.
Top-K = 1 means that the model
always selects the most probable
token.
Top-K = 3 means that the next token
is sampled from the 3 most
probable tokens.

Top-P
Top-P dynamically limits token
selection to the smallest set of
tokens whose cumulative
probability meets a threshold.
Lower values make responses
more predictable. Higher values
also consider less probable tokens.

Top-P
Top-K
“The dog is…”
Top-K = 3
Word      Probability
happy     0.35
running   0.25
barking   0.20
sleeping  0.10
jumping   0.05
flying    0.02
Top-P = 0.7
Word      Probability  Cumulative Probability
happy     0.35         0.35
running   0.25         0.60
barking   0.20         0.80
sleeping  0.10         0.90
jumping   0.05         0.95
flying    0.02         0.97
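Both filters can be sketched against the table for “The dog is…”. This is an illustrative implementation of the selection step only; the function names are made up for the example, and a real model would go on to sample one token from the filtered set.

```javascript
// Candidate next words with the probabilities from the slide.
const candidates = [
  { word: 'happy', p: 0.35 },
  { word: 'running', p: 0.25 },
  { word: 'barking', p: 0.2 },
  { word: 'sleeping', p: 0.1 },
  { word: 'jumping', p: 0.05 },
  { word: 'flying', p: 0.02 },
];

// Top-K: keep the K most probable candidates.
function filterTopK(items, k) {
  return [...items].sort((a, b) => b.p - a.p).slice(0, k);
}

// Top-P: keep the smallest set whose cumulative probability meets the threshold.
function filterTopP(items, threshold) {
  const sorted = [...items].sort((a, b) => b.p - a.p);
  const kept = [];
  let cumulative = 0;
  for (const item of sorted) {
    kept.push(item);
    cumulative += item.p;
    if (cumulative >= threshold) break;
  }
  return kept;
}

const topKWords = filterTopK(candidates, 3).map((c) => c.word);
const topPWords = filterTopP(candidates, 0.7).map((c) => c.word);

console.log('Top-K = 3  ', topKWords);
console.log('Top-P = 0.7', topPWords);
```

For this table both settings keep happy, running and barking: the cumulative probability only passes 0.7 once barking (0.80) is included.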

Safety
Block harmful content such as
harassment, hate, sexually
explicit language, dangerous
content, and content that could
jeopardise civic integrity.
https://ai.google.dev/gemini-api/docs/safety-settings

Safety
import {
  GoogleGenerativeAI,
  HarmBlockThreshold,
  HarmCategory,
} from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
  },
];
const model = genAI.getGenerativeModel({
  model: 'gemini-1.5-flash',
  safetySettings,
});
https://ai.google.dev/gemini-api/docs/safety-settings

Context
What happens in multi-turn
conversational systems with the
context?
Often there’s a cutoff: once the
conversation grows too long, we ask
the LLM to generate a summary and
store that summary, so that the
model stays aware of what was
discussed so far without running
out of context.
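The cutoff-and-summarise idea can be sketched as follows. This is one possible approach under stated assumptions: `compactHistory` is a hypothetical helper, the summary text is assumed to have been produced beforehand by asking the model, and `keepLast` should be an even number so that the kept turns start with a user turn.

```javascript
// Build a history entry in the { role, parts: [{ text }] } shape.
function turn(role, text) {
  return { role, parts: [{ text }] };
}

// Replace older turns with a summary exchange and keep the most recent turns.
// `summary` is assumed to come from an earlier "please summarise" request.
function compactHistory(history, summary, keepLast = 2) {
  if (history.length <= keepLast) return history;
  return [
    turn('user', 'Summarise our conversation so far.'),
    turn('model', summary),
    ...history.slice(-keepLast),
  ];
}

const longHistory = [
  turn('user', 'What is Star Wars?'),
  turn('model', 'A space-opera film franchise.'),
  turn('user', 'Who created it?'),
  turn('model', 'George Lucas.'),
  turn('user', 'When was the first film released?'),
  turn('model', 'In 1977.'),
];

const compacted = compactHistory(
  longHistory,
  'We discussed Star Wars and its creator, George Lucas.', // placeholder summary
);
console.log(compacted.map((t) => `${t.role}: ${t.parts[0].text}`).join('\n'));
```

The six original turns shrink to four: the summary exchange plus the latest question and answer.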

Structured
Output
Instruct the model to respond
using a specific format. The
format can be embedded in the
prompt, or a schema can be built
programmatically.
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node

Structured
Output
const schema = {
  description: 'Array of original Star Wars films with details',
  type: SchemaType.ARRAY,
  items: {
    type: SchemaType.OBJECT,
    properties: {
      title: {
        type: SchemaType.STRING,
        description: 'Title of the film',
        nullable: false,
      },
      // The types of the next three properties are not shown on the
      // slide; the ones below are assumptions.
      released: {
        type: SchemaType.STRING,
        description: 'Release date of the film',
        nullable: false,
      },
      characters: {
        type: SchemaType.ARRAY,
        items: { type: SchemaType.STRING },
        description: 'Notable characters in the film',
        nullable: false,
      },
      plot: {
        type: SchemaType.STRING,
        description: "Short summary of the film's plot",
        nullable: false,
      },
    },
    required: ['title', 'released', 'characters', 'plot'],
  },
};
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node
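Even with a schema, it can be worth parsing and checking the response before using it. The sketch below assumes the model's text response is a JSON array; `parseFilms` is a hypothetical helper, and the sample string is a hand-written stand-in rather than real model output.

```javascript
// Fields marked as required in the schema on the slide.
const requiredFields = ['title', 'released', 'characters', 'plot'];

// Parse the response text and check that every film has the required fields.
function parseFilms(jsonText) {
  const films = JSON.parse(jsonText);
  if (!Array.isArray(films)) {
    throw new TypeError('Expected a JSON array of films');
  }
  for (const film of films) {
    if (typeof film !== 'object' || film === null) {
      throw new TypeError('Each film must be an object');
    }
    const missing = requiredFields.filter((field) => !(field in film));
    if (missing.length > 0) {
      throw new Error(`Film is missing: ${missing.join(', ')}`);
    }
  }
  return films;
}

// Hand-written stand-in for a model response.
const sampleResponse = JSON.stringify([
  {
    title: 'Star Wars',
    released: '1977',
    characters: ['Luke Skywalker', 'Leia Organa'],
    plot: 'A farm boy joins the rebellion against the Empire.',
  },
]);

const films = parseFilms(sampleResponse);
console.log(films[0].title, films[0].released);
```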

Live
Information
Models are trained on vast
amounts of data, and they have a
training cut-off date. In other
words, information that is inside
a model may be outdated.
Also, models cannot retrieve
live information on their own.

Try it out
npm run weather [city]

Function
Calling
Allows AI to call predefined
functions to perform specific
actions. These actions can be for
data retrieval or anything else.
A function is just (in our case) a
JavaScript function.

Function
Calling
const model = genAI.getGenerativeModel({
  model: 'gemini-2.0-flash-exp',
  tools: [
    {
      functionDeclarations: [
        {
          name: 'getWeather',
          description:
            'gets the weather for a city and returns the forecast using the metric system.',
          parameters: {
            type: 'object',
            properties: {
              city: {
                type: 'string',
                description: 'the city for which the weather is requested',
              },
            },
            required: ['city'],
          },
        },
      ],
    },
  ],
  toolConfig: { functionCallingConfig: { mode: 'AUTO' } },
});
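The declaration only tells the model that `getWeather` exists; the application still has to run the matching JavaScript function when the model asks for it. A minimal dispatch sketch follows, assuming the model's request arrives as an object with `name` and `args`. The `exampleCall` object and the weather result are hand-written placeholders, not real API output.

```javascript
// Local implementations, keyed by the declared function name.
const localFunctions = {
  // Placeholder: a real version would call a weather API.
  getWeather: ({ city }) => ({ city, forecast: 'placeholder forecast', units: 'metric' }),
};

// Look up the requested function and call it with the model's arguments.
function dispatchFunctionCall(call) {
  const isKnown = Object.prototype.hasOwnProperty.call(localFunctions, call.name);
  if (!isKnown) {
    throw new Error(`Unknown function: ${call.name}`);
  }
  return localFunctions[call.name](call.args);
}

// Assumed shape of a function call requested by the model.
const exampleCall = { name: 'getWeather', args: { city: 'Sydney' } };

const functionResult = dispatchFunctionCall(exampleCall);
console.log(functionResult);
```

The result would then be sent back to the model so it can phrase the final answer in natural language.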

Recommendation
Q&A system
Knowledge-base
Systems can be built to provide
product recommendations.
We can also provide documents
to LLMs for data retrieval.

Vector
Vectorisation is the process of
turning words into vectors (lists
of numbers).
The attention mechanism uses
the context to assign meaning to
a word allowing the model to
understand and process words
accurately.

Vector
I deposited money at the bank.
We sat by the bank of the river.

Vector
Without vectorisation how do
you determine the real meaning
of the word bank?

Vector
Remember context matters.
Words like “deposit” and “money”
are a strong signal that bank is a
financial institution.
“Sat”, “river” and “by” suggest
bank means riverbank.

Embedding
An embedding is a way of
representing words as numerical
vectors.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]

Distance
The distance between the
vectors determines how
“close” they are to one another.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]
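One common way to measure that closeness is Euclidean distance, sketched here with the three example vectors from the slide. Other measures, such as cosine similarity, are also widely used; this is an illustration rather than what any particular vector database does.

```javascript
// Example embeddings from the slide.
const embeddings = {
  king: [0.8, 0.1, 0.6],
  queen: [0.8, 0.1, 0.8],
  dog: [0.3, 0.8, 0.2],
};

// Euclidean distance: square root of the summed squared differences.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

const kingToQueen = euclideanDistance(embeddings.king, embeddings.queen);
const kingToDog = euclideanDistance(embeddings.king, embeddings.dog);

console.log('king - queen:', kingToQueen.toFixed(2)); // 0.20
console.log('king - dog:  ', kingToDog.toFixed(2)); // 0.95
```

The smaller distance between “king” and “queen” reflects that the two words are closer in meaning than “king” and “dog”.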

Distance
Think about a product
recommendation. And let’s do
some super-simple maths.
Let’s say that Kiasu Karen
bought an apple 🍎 and a banana
🍌.
We take sweetness and texture
into consideration.

Distance
Product  Sweetness  Texture  Vector
🍎       6          1        [6,1]
🍌       7          2        [7,2]
AVG                          [6.5,1.5]
Product  Sweetness  Texture  Vector
🍋       1          1        [1,1]
🥭       8          2        [8,2]

Distance
Product  Sweetness  Texture  Vector     Distance
🍎       6          1        [6,1]
🍌       7          2        [7,2]
AVG                          [6.5,1.5]
🍋       1          1        [1,1]      5.52
🥭       8          2        [8,2]      1.58
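The numbers in the table can be reproduced with a short script: average the purchased products into one "taste" vector, then rank the candidate products by Euclidean distance from it. The variable and function names are chosen for this sketch only.

```javascript
// [sweetness, texture] vectors from the table.
const purchased = { apple: [6, 1], banana: [7, 2] };
const candidateProducts = { lemon: [1, 1], mango: [8, 2] };

// Component-wise average of a list of vectors.
function averageVector(vectors) {
  return vectors[0].map(
    (_, i) => vectors.reduce((sum, v) => sum + v[i], 0) / vectors.length,
  );
}

function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

const taste = averageVector(Object.values(purchased)); // [6.5, 1.5]

// Smallest distance first: the best recommendation comes out on top.
const ranked = Object.entries(candidateProducts)
  .map(([name, vector]) => ({ name, distance: euclideanDistance(taste, vector) }))
  .sort((a, b) => a.distance - b.distance);

ranked.forEach((r) => console.log(r.name, r.distance.toFixed(2)));
// mango 1.58
// lemon 5.52
```

The mango is much closer to the average of what was bought, so it is the better recommendation.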

Vector (sidenote)
Mother
Father
Daughter
Son

Vector
These numerical representations
can be up to 15000 dimensions.

Create
embeddings
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'text-embedding-004' });
async function run() {
  const result = await model.embedContent('king');
  console.log(result.embedding.values);
}
run();

Create
embeddings
Important!
Embedding models have to
match.
If `text-embedding-004` is
used to embed the query, the
vector database must also
contain embeddings created
with the same model.

Create
embeddings
On Google’s platform this can
also be done via BigQuery.
BigQuery supports storing
embeddings and searching
against vectors.

Vector
database
Stores embeddings (numerical
representation of words)

Embedding
Embeddings are important
because they allow for use-
cases that revolve around QA
systems, recommendation
systems and similarity searches.
Note searches can only be run
against Vector indices.

Embedding
1. Ingest data to BigQuery
2. Create a model
3. Create embeddings
4. Create index
5. Query

-- 2. Create a model
CREATE OR REPLACE MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`
REMOTE WITH CONNECTION `ai-workshop-448608.asia-southeast1.ai_workshop_connection`
OPTIONS(
  endpoint = 'text-embedding-005'
);
-- 3. Create embeddings
CREATE OR REPLACE TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings` AS (
  SELECT *
  FROM ML.GENERATE_EMBEDDING(
    MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
    (
      SELECT *, CONCAT(title, '', overview) AS content
      FROM `ai-workshop-448608.ai_workshop_films.films`
    )
  )
);
-- 4. Create index
CREATE OR REPLACE VECTOR INDEX `idx`
ON `ai-workshop-448608.ai_workshop_films.films_with_embeddings`
(ml_generate_embedding_result)
OPTIONS(
  distance_type='COSINE', index_type='IVF', ivf_options='{"num_lists": 10}'
);
https://www.youtube.com/watch?v=eztSNAZ0f_4

SELECT * FROM VECTOR_SEARCH(
TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings`,
'ml_generate_embedding_result',
(
SELECT ml_generate_embedding_result AS embedding_col FROM ML.GENERATE_EMBEDDING
(
MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
(SELECT "Star Wars" AS content),
STRUCT(TRUE AS flatten_json_output)
)
)
,top_k => 5
);

Try it out
npm run recommendation ""
npm run recommendation-advanced

Similarity
Search
1. Embed the content
2. Embed the question/query
3. Return results based on
similarity (distance) matches.
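These steps can be sketched in memory with cosine similarity, the same distance type used for the BigQuery index earlier in the deck. The tiny "store" reuses the toy vectors from the Embedding slide and assumes content and query were embedded with the same model; a real system would query a vector database instead.

```javascript
// Cosine similarity: 1 means same direction, values near 0 mean unrelated.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Score every stored item against the query and return the best matches.
function similaritySearch(queryVector, items, topK = 2) {
  return items
    .map((item) => ({
      label: item.label,
      score: cosineSimilarity(queryVector, item.vector),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Step 1: embedded content (toy vectors from the Embedding slide).
const store = [
  { label: 'dog', vector: [0.3, 0.8, 0.2] },
  { label: 'queen', vector: [0.8, 0.1, 0.8] },
];

// Step 2: embedded query (the toy vector for "king").
const queryVector = [0.8, 0.1, 0.6];

// Step 3: results ordered by similarity.
const results = similaritySearch(queryVector, store);
results.forEach((r) => console.log(r.label, r.score.toFixed(2)));
```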

Similarity
Search
BUT
Highest similarity score
doesn’t mean highest
relevance.

Similarity
Search
A user read and liked “Harry Potter
and the Prisoner of Azkaban”.
Book                                Similarity Score
Harry Potter and the Cursed Child   0.98
Percy Jackson & The Olympians       0.85
The Magicians by Lev Grossman       0.82
The last two results have lower similarity
scores but are more relevant to what the user
originally read.
https://www.youtube.com/watch?v=o5_t6Ai--ws

Fine Tuning
Process of adapting a pre-trained
model for specific tasks or
use-cases. Done by providing
examples to the model. (This is
also called supervised
fine-tuning.)
https://ai.google.dev/gemini-api/docs/model-tuning

Try it out
npm run fine-tuning

RAG
Retrieval-Augmented Generation
is a technique where information
is first retrieved from source(s)
specified by the user, and the
model then uses that information
to generate an answer in natural
language.
https://cloud.google.com/use-cases/retrieval-augmented-generation?hl=en
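The retrieval and augmentation steps can be sketched without any API calls. Everything here is illustrative: the documents are invented sample data, and simple word overlap stands in for the embedding-based similarity search a real pipeline would use.

```javascript
// Invented sample documents standing in for a user-specified source.
const documents = [
  'Returns are accepted within 30 days of purchase.',
  'Shipping to Australia takes 5 to 7 business days.',
];

function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Retrieval: rank documents by how many question words they contain.
function retrieve(question, docs, limit = 1) {
  const questionWords = new Set(tokenize(question));
  return docs
    .map((doc) => ({
      doc,
      score: tokenize(doc).filter((word) => questionWords.has(word)).length,
    }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.doc);
}

// Augmentation: the retrieved text becomes part of the prompt.
function buildPrompt(question, passages) {
  return [
    'Answer the question using only the context below.',
    `Context: ${passages.join(' ')}`,
    `Question: ${question}`,
  ].join('\n');
}

const question = 'How long does shipping take?';
const ragPrompt = buildPrompt(question, retrieve(question, documents));
console.log(ragPrompt);
```

The generation step would then send `ragPrompt` to the model, for example with `generateContent` as in the Transactional slides.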

RAG
Pipeline
https://miro.medium.com/v2/resize:fit:1200/1*J7vyY3EjY46AlduMvr9FbQ.png

Agents
An AI agent is a smart assistant
that can observe a situation,
think about what to do and act to
achieve the desired outcome/
goal.

Agents
There are many types of agents -
some are just calling functions,
some systems can also be multi-
agent.

Agents
Multi agent systems can pass
work to each other and
effectively communicate with
each other to react to a task.

Agentic
RAG
Agentic RAG combines AI agents
with RAG.
It initially retrieves answers from
specified data sources but can
autonomously expand its search
via function calling if no relevant
information is found.
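The fallback behaviour can be sketched as a simple rule. This is a deliberate simplification: in a real agent the model decides when to call a function, and `searchKnowledgeBase` and `searchWeb` are hypothetical stand-ins passed in by the caller.

```javascript
// Try the specified source first; widen the search only if nothing is found.
function findPassages(question, searchKnowledgeBase, searchWeb) {
  const fromKnowledgeBase = searchKnowledgeBase(question);
  if (fromKnowledgeBase.length > 0) {
    return { source: 'knowledge-base', passages: fromKnowledgeBase };
  }
  // Nothing relevant was found, so fall back to the wider search tool.
  return { source: 'fallback-tool', passages: searchWeb(question) };
}

// Stub implementations for the example.
const emptyKnowledgeBase = () => [];
const stubWebSearch = (q) => [`(placeholder web result for: ${q})`];

const outcome = findPassages('What is C2PA?', emptyKnowledgeBase, stubWebSearch);
console.log(outcome.source, outcome.passages);
```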

Media
Assets
While Generative AI can be used
to generate imagery, we’ll focus
on modifying existing media
assets.

Generative
Fill
Generative fill can expand an
image in any dimension.

Generative
Fill
Uses a method called inpainting,
where AI looks at the colours,
textures and patterns around the
image and generates new
content and blends that with the
rest of the image.

Generative
Replace
Replace objects

Generative
Remove
Remove objects from photos

Generative
Recolour
Change the colour of objects

Try it out
npm run ecommerce
https://videoapi.cloudinary.com/video-demo/video-smart-cropping

C2PA
C2PA (Coalition for Content
Provenance and Authenticity) is a
standard developed to provide a
framework for verifying
authenticity and provenance of
digital content.

Try it out
https://contentcredentials.org/verify?source=https://eric-cloudinary-res.cloudinary.com/image/upload/fl_c2pa,w_1000,q_auto/verify_tool_01_l1002253.jpg

Evals
Evals are a method for testing AI
systems to assess their
performance and reliability. They
are crucial in production to
measure how well a model
functions and responds to
different inputs.
https://cloud.google.com/vertex-ai/generative-ai/docs/models/determine-eval

Evals
A basic eval could be: “Did the AI
call the function when it had to?”
Score 1 - yes, Score 0 - no
Learn more
https://github.com/braintrustdata/autoevals
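That basic eval can be sketched as a scoring function run over a handful of test cases. The cases and recorded calls below are hand-written examples, and the `{ name, args }` shape of a function call is an assumption; libraries such as autoevals offer far richer scorers.

```javascript
// Score 1 if the expected function was called, otherwise 0.
function scoreFunctionCall(expectedFunction, functionCalls) {
  return functionCalls.some((call) => call.name === expectedFunction) ? 1 : 0;
}

// Hand-written test cases: the prompt, the function we expect, and the
// calls the model actually made (recorded from a run).
const evalCases = [
  {
    prompt: 'What is the weather in Sydney?',
    expected: 'getWeather',
    calls: [{ name: 'getWeather', args: { city: 'Sydney' } }],
  },
  {
    prompt: 'What is the weather in Rome?',
    expected: 'getWeather',
    calls: [], // the model answered without calling the function
  },
];

const scores = evalCases.map((c) => scoreFunctionCall(c.expected, c.calls));
const averageScore = scores.reduce((sum, s) => sum + s, 0) / scores.length;

console.log(scores, averageScore); // [ 1, 0 ] 0.5
```

Tracking the average over time shows whether a prompt or model change made function calling more or less reliable.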

References
• https://ai.google.dev/
• https://aistudio.google.com
• https://cloud.google.com/learn/training/machinelearning-ai
• https://ai.cloudinary.com/
• https://cloudinary.com/products/cloudinary_ai
