2
The Internet
Technical origin: ARPANET (late 1960s)
One of the earliest attempts to network heterogeneous,
geographically dispersed computers
Email first available on ARPANET in 1972 (and
quickly very popular!)
ARPANET access was limited to select
DoD-funded organizations
3.
3
The Internet
Open-access networks
Regional university networks (e.g., SURAnet)
CSNET for CS departments not on ARPANET
NSFNET (1985-1995)
Primary purpose: connect supercomputer centers
Secondary purpose: provide backbone to connect
regional networks
4.
4
The Internet
The 6 supercomputer centers connected by the early NSFNET backbone
5.
5
The Internet
Original NSFNET backbone speed: 56 kbit/s
Upgraded to 1.5 Mbit/s (T1) in 1988
Upgraded to 45 Mbit/s (T3) in 1991
In 1988, networks in Canada and France
connected to NSFNET
In 1990, ARPANET was decommissioned and
NSFNET became the center of the Internet
6.
6
The Internet
Internet: the network of networks
connected via the public backbone and
communicating using the TCP/IP communication
protocols
Backbone initially supplied by NSFNET,
privately funded (ISP fees) beginning in 1995
7.
7
Internet Protocols
Communication protocol: how computers talk
Cf. telephone “protocol”: how you answer and end
a call, what language you speak, etc.
Internet protocols developed as part of
ARPANET research
ARPANET began using TCP/IP in 1982
Designed for use both within local area
networks (LAN’s) and between networks
8.
8
Internet Protocol (IP)
IP is the fundamental protocol defining the
Internet (as the name implies!)
IP address:
32-bit number (in IPv4)
Associated with at most one device at a time
(although device may have more than one)
Written as four dot-separated bytes, e.g.
192.0.34.166
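As a minimal sketch of the two views of an IPv4 address described above (one 32-bit number, or four dot-separated bytes), the following uses Python's standard `ipaddress` module. The variable names are illustrative only.

```python
import ipaddress

# The dotted form is a readable spelling of a single 32-bit number.
addr = ipaddress.IPv4Address("192.0.34.166")
as_int = int(addr)        # the underlying 32-bit value
as_bytes = addr.packed    # the same value as four bytes

print(as_int)                          # 3221234342
print(list(as_bytes))                  # [192, 0, 34, 166]
print(ipaddress.IPv4Address(as_int))   # 192.0.34.166
```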
9.
9
IP
IP function: transfer data from source device to
destination device
IP source software creates a packet representing the
data
Header: source and destination IP addresses, length of data,
etc.
Data itself
If destination is on another LAN, packet is sent to a
gateway that connects to more than one network
11
Transmission Control Protocol
(TCP)
Limitations of IP:
No guarantee of packet delivery (packets can be
dropped)
Communication is one-way (source to destination)
TCP adds concept of a connection on top of
IP
Provides guarantee that packets are delivered
Provides two-way (full duplex) communication
12.
12
TCP
Establish connection:
Source: Can I talk to you?
Destination: OK. Can I talk to you?
Source: OK.
Send packet with acknowledgment:
Source: Here’s a packet.
Destination: Got it.
Resend packet if no (or delayed) acknowledgment:
Source: Here’s a packet.
Source: Here’s a resent packet.
Destination: Got it.
13.
13
TCP
TCP also adds concept of a port
TCP header contains port number representing an
application program on the destination computer
Some port numbers have standard meanings
Example: port 25 is normally used for email transmitted
using the Simple Mail Transfer Protocol (SMTP)
Other port numbers are available first-come,
first-served to any application
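The connection and port concepts can be tried out on a single machine. The sketch below is an illustration under stated assumptions, not a production server: it binds a listening socket to port 0 on the loopback address (asking the operating system for any free port), then connects to it and echoes one message back. The function names `echo_once` and `tcp_round_trip` are made up for this example.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and send back whatever arrives."""
    conn, _peer = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

def tcp_round_trip(message: bytes) -> bytes:
    # Port 0 asks the OS for any currently unused port number.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        # create_connection performs the TCP connection setup.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)      # two-way: send ...
            reply = client.recv(1024)    # ... and receive on the same connection
        worker.join()
    return reply
```

Calling `tcp_round_trip(b"hello")` should return the same bytes, since the server side simply echoes what it receives.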
15
User Datagram Protocol (UDP)
Like TCP in that:
Builds on IP
Provides port concept
Unlike TCP in that:
No connection concept
No transmission guarantee
Advantage of UDP vs. TCP:
Lightweight, so faster for one-time messages
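For comparison with TCP, here is a minimal UDP sketch on the loopback address. There is no connection setup and no acknowledgment: the sender just addresses a datagram to an IP address and port. The function name `udp_one_shot` is hypothetical, and the timeout is there because nothing in UDP itself guarantees that the datagram arrives.

```python
import socket

def udp_one_shot(message: bytes) -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(5)            # give up if the datagram is lost
        port = receiver.getsockname()[1]
        # No connect/accept step: one datagram, sent once.
        sender.sendto(message, ("127.0.0.1", port))
        data, _source = receiver.recvfrom(1024)
    return data
```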
16.
16
Domain Name System (DNS)
DNS is the “phone book” for the Internet
Map between host names and IP addresses
DNS often uses UDP for communication
Host names
Labels separated by dots, e.g., www.example.org
Final label is top-level domain
Generic: .com, .org, etc.
Country-code: .us, .il, etc.
17.
17
DNS
Domains are divided into second-level domains,
which can be further divided into subdomains, etc.
E.g., in www.example.com, example is a
second-level domain
A host name plus domain name information is
called the fully qualified domain name of the
computer
Above, www is the host name, www.example.com
is the FQDN
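The label structure of a fully qualified domain name can be sketched in a few lines of Python. The helper `split_fqdn` is a hypothetical name, and it assumes the simple case shown above, where the first label is the host name.

```python
def split_fqdn(fqdn: str) -> dict:
    """Split a fully qualified domain name into its dot-separated labels."""
    labels = fqdn.rstrip(".").split(".")
    return {
        "host_name": labels[0],
        "second_level_domain": labels[-2] if len(labels) >= 2 else None,
        "top_level_domain": labels[-1],
        "labels": labels,
    }

parts = split_fqdn("www.example.com")
print(parts["host_name"])             # www
print(parts["second_level_domain"])   # example
print(parts["top_level_domain"])      # com
```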
18.
18
DNS
The nslookup program provides command-line
access to DNS (on most systems)
Looking up a host name given an IP address is
known as a reverse lookup
Recall that a single host may have multiple IP
addresses.
Address returned is the canonical IP address
specified in the DNS system.
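The same forward and reverse lookups are available programmatically. The sketch below wraps two functions from Python's standard `socket` module; the wrapper names `forward_lookup` and `reverse_lookup` are illustrative, and the results depend on the resolver configuration of the machine running the code.

```python
import socket

def forward_lookup(host_name: str) -> str:
    """Host name -> one IPv4 address, as a dotted string."""
    return socket.gethostbyname(host_name)

def reverse_lookup(ip_address: str) -> str:
    """IP address -> primary host name registered for it (reverse lookup)."""
    primary_name, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return primary_name
```

For example, `forward_lookup("www.example.org")` asks the system resolver for an address, much as nslookup does from the command line.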
19.
19
DNS
ipconfig (on Windows) can be used to
find the IP address (addresses) of your
machine
ipconfig /displaydns displays the
contents of the DNS Resolver Cache
(ipconfig /flushdns to flush it)
20.
20
Analogy to Telephone Network
IP ~ the telephone network
TCP ~ calling someone who answers,
having a conversation, and hanging up
UDP ~ calling someone and leaving a
message
DNS ~ directory assistance
21.
21
Higher-level Protocols
Many protocols build on TCP
Telephone analogy: TCP specifies how we initiate
and terminate the phone call, but some other protocol
specifies how we carry on the actual conversation
Some examples:
SMTP (email) (25)
FTP (file transfer) (21)
HTTP (transfer of Web documents) (80)
22.
22
World Wide Web
Originally, one of several systems for
organizing Internet-based information
Competitors: WAIS, Gopher, ARCHIE
Distinctive feature of Web: support for
hypertext (text containing links)
Communication via Hypertext Transfer Protocol
(HTTP)
Document representation using Hypertext Markup
Language (HTML)
23.
23
World Wide Web
The Web is the collection of machines (Web
servers) on the Internet that provide information,
particularly HTML documents, via HTTP.
Machines that access information on the Web
are known as Web clients. A Web browser is
software used by an end user to access the Web.
24.
24
Hypertext Transfer Protocol
(HTTP)
HTTP is based on the request-response
communication model:
Client sends a request
Server sends a response
HTTP is a stateless protocol:
The protocol does not require the server to
remember anything about the client between
requests.
25.
25
HTTP
Normally implemented over a TCP connection (80 is
standard port number for HTTP)
Typical browser-server interaction:
User enters Web address in browser
Browser uses DNS to locate IP address
Browser opens TCP connection to server
Browser sends HTTP request over connection
Server sends HTTP response to browser over connection
Browser displays body of response in the client area of the
browser window
26.
26
HTTP
The information transmitted using HTTP is
often entirely text
Can use the Internet’s Telnet protocol to
simulate browser request and view server
response
27.
27
HTTP
Connect:
$ telnet www.example.org 80
Trying 192.0.34.166...
Connected to www.example.com
(192.0.34.166).
Escape character is '^]'.
Send request:
GET / HTTP/1.1
Host: www.example.org
Receive response:
HTTP/1.1 200 OK
Date: Thu, 09 Oct 2003 20:30:49 GMT
…
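The Telnet session above can be approximated in Python with a plain TCP socket. This is a sketch under assumptions: the function names `build_get_request` and `fetch` are made up, and the `Connection: close` header is added so the server ends the connection after one response, which lets the read loop terminate.

```python
import socket

def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the text a browser would send for a simple GET."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line marks the end of the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Send the request over TCP and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:        # server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `fetch("www.example.org")` requires network access; the first line of the returned bytes would be the status line of the response.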
30
HTTP Request
Start line
Example: GET / HTTP/1.1
Three space-separated parts:
HTTP request method
Request-URI (Uniform Resource Identifier)
HTTP version
We will cover 1.1, in which version part of start line
must be exactly as shown
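A minimal sketch of splitting a start line into its three space-separated parts. The function name `parse_start_line` is hypothetical, and the version check reflects only the statement above that this material covers HTTP/1.1.

```python
def parse_start_line(line: str) -> tuple:
    """Return (method, request_uri, version) from an HTTP request start line."""
    method, request_uri, version = line.strip().split(" ")
    if version != "HTTP/1.1":
        raise ValueError(f"unsupported version: {version}")
    return method, request_uri, version

print(parse_start_line("GET / HTTP/1.1"))   # ('GET', '/', 'HTTP/1.1')
```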
33.
33
HTTP Request
Uniform Resource Identifier (URI)
Syntax: scheme : scheme-dependent-part
Ex: In http://www.example.com/
the scheme is http
Request-URI is the portion of the requested URI
that follows the host name (which is supplied by the
required Host header field)
Ex: / is Request-URI portion of
http://www.example.com/
34.
34
URI
URIs are of two types:
Uniform Resource Name (URN)
Can be used to identify resources with unique names, such
as books (which have unique ISBN’s)
Scheme is urn
Uniform Resource Locator (URL)
Specifies location at which a resource can be found
In addition to http, some other URL schemes are
https, ftp, mailto, and file
36.
36
HTTP Request
Common request methods:
GET
Used if link is clicked or address typed in browser
No body in request with GET method
POST
Used when submit button is clicked on a form
Form information contained in body of request
HEAD
Requests that only header fields (no body) be returned in the
response
38
HTTP Request
Header field structure:
field name : field value
Syntax
Field name is not case sensitive
Field value may continue on multiple lines by
starting continuation lines with white space
Field values may contain MIME types, quality
values, and wildcard characters (*’s)
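The field syntax above can be sketched as a small parser. This is an illustration, not a complete HTTP parser: `parse_headers` is a made-up name, field names are lower-cased to reflect case insensitivity, and continuation lines are joined to the previous value with a single space. Note that later revisions of the HTTP specification (RFC 7230) deprecate continuation lines.

```python
def parse_headers(raw: str) -> dict:
    """Parse 'name: value' header lines into a dict keyed by lower-case name."""
    headers = {}
    last_name = None
    for line in raw.splitlines():
        if not line:
            break                          # blank line ends the header section
        if line[0] in " \t" and last_name is not None:
            # Continuation line: append to the previous field value.
            headers[last_name] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last_name = name.strip().lower()   # field names are not case sensitive
        headers[last_name] = value.strip()
    return headers

example = "Host: www.example.org\r\nACCEPT: text/html,\r\n  text/plain;q=0.8\r\n\r\n"
print(parse_headers(example))
```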
39.
39
Multipurpose Internet Mail
Extensions (MIME)
Convention for specifying content type of a
message
In HTTP, typically used to specify content type
of the body of the response
MIME content type syntax:
top-level type / subtype
Examples: text/html, image/jpeg
40.
40
HTTP Quality Values and
Wildcards
Example header field with quality values:
accept:
text/xml,text/html;q=0.9,
text/plain;q=0.8, image/jpeg,
image/gif;q=0.2,*/*;q=0.1
Quality value applies only to the item it follows;
an item with no quality value defaults to q=1
The higher the value, the higher the preference
Note use of wildcards to specify quality 0.1 for
any MIME type not specified earlier
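The example header can be ranked with a short parser. This sketch follows the HTTP/1.1 specification, in which each q parameter belongs to the single media range it is attached to and a missing q means 1. The function name `parse_accept` is hypothetical.

```python
def parse_accept(value: str) -> list:
    """Return (media_range, quality) pairs, most preferred first."""
    ranked = []
    for item in value.split(","):
        media_range, *params = [part.strip() for part in item.split(";")]
        quality = 1.0                      # default when no q is given
        for param in params:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                quality = float(val)
        ranked.append((media_range, quality))
    # sorted() is stable, so equal-quality items keep their header order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

header = ("text/xml,text/html;q=0.9, text/plain;q=0.8, image/jpeg, "
          "image/gif;q=0.2,*/*;q=0.1")
for media_range, quality in parse_accept(header):
    print(media_range, quality)
```

With this header, text/xml and image/jpeg rank first (q=1), and the wildcard */* ranks last (q=0.1).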
41.
41
HTTP Request
Common header fields:
Host: host name from URL (required)
User-Agent: type of browser sending request
Accept: MIME types of acceptable documents
Connection: value close tells server to close connection after
single request/response
Content-Type: MIME type of (POST) body, normally
application/x-www-form-urlencoded
Content-Length: bytes in body
Referer: URL of document containing link that supplied URI for
this HTTP request
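To show how Content-Type and Content-Length relate to a POST body, here is a sketch that encodes form data and builds the request text. The function name `build_post_request`, the path `/search`, and the form fields are assumptions made up for illustration.

```python
from urllib.parse import urlencode

def build_post_request(host: str, path: str, form: dict) -> bytes:
    """Build a POST request whose body is URL-encoded form data."""
    body = urlencode(form).encode("ascii")
    head = "\r\n".join([
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",   # number of bytes in the body
        "",
        "",
    ]).encode("ascii")
    return head + body

request = build_post_request("www.example.org", "/search",
                             {"t": "win", "s": "chess"})
print(request.decode("ascii"))
```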
44
HTTP Response
Status line
Example: HTTP/1.1 200 OK
Three space-separated parts:
HTTP version
status code
reason phrase (intended for human use)
45.
45
HTTP Response
Status code
Three-digit number
First digit is class of the status code:
1=Informational
2=Success
3=Redirection (alternate URL is supplied)
4=Client Error
5=Server Error
Other two digits provide additional information
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
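A minimal sketch of reading the class from the first digit of a status code. The names `STATUS_CLASSES` and `status_class` are illustrative.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code: int) -> str:
    """Map a three-digit HTTP status code to its class name."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]   # integer division keeps the first digit

print(status_class(200))   # Success
print(status_class(404))   # Client Error
```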
47
HTTP Response
Common header fields:
Connection, Content-Type, Content-Length
Date: date and time at which response was generated (required)
Location: alternate URI if status is redirection
Last-Modified: date and time the requested resource was last
modified on the server
Expires: date and time after which the client’s copy of the
resource will be out-of-date
ETag: a unique identifier for this version of the requested
resource (changes if resource changes)
48.
48
Client Caching
A cache is a local copy of information
obtained from some other source
Most web browsers use a cache to store
requested resources so that subsequent
requests for the same resource will not
necessarily require an HTTP request/response
Ex: icon appearing multiple times in a Web page
53
Client Caching
Cache advantages
(Much) faster than HTTP request/response
Less network traffic
Less load on server
Cache disadvantage
Cached copy of resource may be invalid
(inconsistent with remote version)
54.
54
Client Caching
Validating cached resource:
Send HTTP HEAD request and check
Last-Modified or ETag header in response
Compare current date/time with Expires header
sent in response containing resource
If no Expires header was sent, use heuristic
algorithm to estimate value for Expires
Ex: Expires = 0.01 * (Date – Last-Modified) + Date
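The example heuristic can be written out directly. This is a sketch of that one formula only, not of any particular browser's algorithm. The function names are hypothetical, and the Last-Modified value used below (100 days before Date) is invented for illustration.

```python
from datetime import datetime, timedelta, timezone

def estimate_expires(date: datetime, last_modified: datetime,
                     fraction: float = 0.01) -> datetime:
    """Expires = fraction * (Date - Last-Modified) + Date."""
    return date + fraction * (date - last_modified)

def is_fresh(now: datetime, expires: datetime) -> bool:
    """A cached copy is treated as valid until its Expires time."""
    return now < expires

date = datetime(2003, 10, 9, 20, 30, 49, tzinfo=timezone.utc)
last_modified = date - timedelta(days=100)     # assumed value
expires = estimate_expires(date, last_modified)
print(expires)   # about one day after Date
```

The longer a resource has gone unmodified, the longer the estimated lifetime of the cached copy.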
55.
55
Character Sets
Every document is represented by a string of integer
values (code points)
The mapping from code points to characters is defined
by a character set
Some header fields have character set values:
Accept-Charset: request header listing character sets that the
client can recognize
Ex: accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.5
Content-Type: can include character set used to represent the
body of the HTTP message
Ex: Content-Type: text/html; charset=UTF-8
56.
56
Character Sets
Technically, many “character sets” are
actually character encodings
An encoding represents code points using variable-
length byte strings
Most common examples are Unicode-based
encodings UTF-8 and UTF-16
IANA maintains complete list of Internet-
recognized character sets/encodings
57.
57
Character Sets
Typical US PC produces ASCII documents
US-ASCII character set can be used for such
documents, but is not recommended
UTF-8 and ISO-8859-1 are supersets of US-ASCII and
provide international compatibility
UTF-8 can represent all ASCII characters using a single byte
each and arbitrary Unicode characters using up to 4 bytes each
ISO-8859-1 is 1-byte code that has many characters common in
Western European languages, such as é
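The byte counts mentioned above can be checked with Python's built-in codecs. The helper `encoded_size` is a made-up name, and the sample characters are chosen for illustration: an ASCII letter, é (U+00E9), and a character outside the 16-bit range (U+1D11E, musical symbol G clef).

```python
def encoded_size(text: str, encoding: str):
    """Number of bytes needed for text in encoding, or None if unrepresentable."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None

e_acute = "\u00e9"       # é
g_clef = "\U0001D11E"    # a character beyond the 16-bit range

for ch in ("A", e_acute, g_clef):
    sizes = {enc: encoded_size(ch, enc)
             for enc in ("us-ascii", "iso-8859-1", "utf-8")}
    print(ascii(ch), sizes)
```

The letter A takes one byte in all three; é takes one byte in ISO-8859-1 and two in UTF-8 and cannot be represented in US-ASCII; the G clef takes four bytes in UTF-8 and cannot be represented in the other two.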
58.
58
Web Clients
Many possible web clients:
Text-only “browser” (lynx)
Mobile phones
Robots (software-only clients, e.g., search engine
“crawlers”)
etc.
We will focus on traditional web browsers
61
Web Browsers
Primary tasks:
Convert web addresses (URL’s) to HTTP
requests
Communicate with web servers via HTTP
Render (appropriately display) documents
returned by a server
62.
62
HTTP URL’s
Browser uses authority to connect via TCP
Request-URI included in start line (/ used for path
if none supplied)
Fragment identifier not sent to server (used to
scroll browser client area)
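http://www.example.org:56789/a/b/c.txt?t=win&s=chess#para5
host (FQDN): www.example.org
port: 56789
authority: www.example.org:56789
path: /a/b/c.txt
query: t=win&s=chess
fragment: para5
Request-URI: /a/b/c.txt?t=win&s=chess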
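The same decomposition can be done with `urlsplit` from Python's standard `urllib.parse` module. The way the Request-URI is rebuilt here (path plus query, with / used when the path is empty) is a sketch of the rule stated above, not a full implementation of what a browser does.

```python
from urllib.parse import urlsplit

url = "http://www.example.org:56789/a/b/c.txt?t=win&s=chess#para5"
parts = urlsplit(url)

# Authority (host and port) is used to open the TCP connection.
print(parts.hostname, parts.port)   # www.example.org 56789

# Request-URI goes in the start line; the fragment stays in the browser.
request_uri = parts.path or "/"
if parts.query:
    request_uri += "?" + parts.query
print(request_uri)                  # /a/b/c.txt?t=win&s=chess
print(parts.fragment)               # para5
```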
63.
63
Web Browsers
Standard features
Save web page to disk
Find string in page
Fill forms automatically (passwords, CC numbers, …)
Set preferences (language, character set, cache and HTTP
parameters)
Modify display style (e.g., increase font sizes)
Display raw HTML and HTTP header info (e.g., Last-Modified)
Choose browser themes (skins)
View history of web addresses visited
Bookmark favorite pages for easy return
64.
64
Web Browsers
Additional functionality:
Execution of scripts (e.g., drop-down menus)
Event handling (e.g., mouse clicks)
GUI for controls (e.g., buttons)
Secure communication with servers
Display of non-HTML documents (e.g., PDF)
via plug-ins
65.
65
Web Servers
Basic functionality:
Receive HTTP request via TCP
Map Host header to specific virtual host (one of many host
names sharing an IP address)
Map Request-URI to specific resource associated with the
virtual host
File: Return file in HTTP response
Program: Run program and return output in HTTP response
Map type of resource to appropriate MIME type and use to
set Content-Type header in HTTP response
Log information about the request and response
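One of the steps above, mapping a resource to a MIME type for the Content-Type header, can be sketched with Python's standard `mimetypes` module. The function name `content_type_for`, the default document name `index.html`, and the fallback type are assumptions made for this example; real servers make these configurable.

```python
import mimetypes

def content_type_for(request_uri: str,
                     default: str = "application/octet-stream") -> str:
    """Guess a Content-Type value from the path part of a Request-URI."""
    path = request_uri.split("?", 1)[0]    # drop any query string
    if path.endswith("/"):
        path += "index.html"               # assumed default document name
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default

print(content_type_for("/a/b/c.txt?t=win&s=chess"))   # text/plain
print(content_type_for("/"))                          # text/html
```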
66.
66
Web Servers
httpd: UIUC, primary Web server c. 1995
Apache: “A patchy” version of httpd, now the most
popular server (esp. on Linux platforms)
IIS: Microsoft Internet Information Server
Tomcat:
Java-based
Provides container (Catalina) for running Java servlets
(HTML-generating programs) as back-end to Apache or IIS
Can run stand-alone using Coyote HTTP front-end
67.
67
Web Servers
Some Coyote communication parameters:
Allowed/blocked IP addresses
Max. simultaneous active TCP connections
Max. queued TCP connection requests
“Keep-alive” time for inactive TCP connections
Modify parameters to tune server
performance
68.
68
Web Servers
Some Catalina container parameters:
Virtual host names and associated ports
Logging preferences
Mapping from Request-URI’s to server
resources
Password protection of resources
Use of server-side caching
69.
69
Tomcat Web Server
HTML-based server administration
Browse to
http://localhost:8080
and click on Server Administration link
localhost is a special host name that means
“this machine”
Guy-Vincent Jourdan :: CSI 3140 :: based on Jeffrey C. Jackson’s slides
71
Tomcat Web Server
Some Connector fields:
Port Number: port “owned” by this connector
Max Threads: max connections processed
simultaneously
Connection Timeout: keep-alive time
73.
Secure Servers
Since HTTP messages typically travel over a
public network, private information (such as credit
card numbers) should be encrypted to prevent
eavesdropping
https URL scheme tells browser to use encryption
Common encryption standards:
Secure Sockets Layer (SSL)
Transport Layer Security (TLS)
74.
Secure Servers
Browser: I’d like to talk securely to you (over port 443)
Web server: Here’s my certificate and encryption data
Browser: Here’s an encrypted HTTP request
Web server: Here’s an encrypted HTTP response
Browser: Here’s an encrypted HTTP request
Web server: Here’s an encrypted HTTP response
On both sides, HTTP requests and HTTP responses pass through a TLS/SSL layer
75.
Secure Servers
Man-in-the-Middle Attack
Browser to fake DNS server: What’s the IP address for www.example.org?
Fake DNS server to browser: 100.1.1.1
Browser to fake www.example.org (100.1.1.1): My credit card number is…
Real www.example.org is not the machine the browser is talking to
76.
Secure Servers
Preventing Man-in-the-Middle
Browser to fake DNS server: What’s the IP address for www.example.org?
Fake DNS server to browser: 100.1.1.1
Browser to fake www.example.org (100.1.1.1): Send me a certificate of identity
Real www.example.org is not the machine the browser is talking to
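In Python, the request for a certificate of identity corresponds to certificate and host name verification in the standard `ssl` module. The sketch below shows one reasonable client-side setup under stated assumptions; the function names are hypothetical. A server that cannot present a valid certificate for the requested host name causes the handshake to fail before any HTTP request is sent.

```python
import socket
import ssl

def make_verifying_context() -> ssl.SSLContext:
    """Client-side context that requires and checks the server certificate."""
    # The default context verifies the certificate chain and the host name.
    return ssl.create_default_context()

def open_secure_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and perform the TLS handshake over it."""
    context = make_verifying_context()
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # server_hostname is the name the certificate must match.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise

context = make_verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `open_secure_connection("www.example.org")` requires network access and raises an `ssl.SSLError` subclass if verification fails.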