…and other stuff
that makes the web work
Bits ‘bout Moi!
 Senor Bipin Upadhyay
    Developer, Directi Pvt. Ltd.
    Lead, NULL Open Security Group – Mumbai Chapter
    OWASP ESAPI-PHP Committer
    Part of IHP (Honeynet Project)
    Amateur Photographer
I know Kung-fu…
If Only it was true…
Think about the possibilities…
I know Kung-fu
Me too..
Me three..
Sigh! But it ain’t true, yet!
Agenda




http://icanhascheezburger.files.wordpress.com/2009/02/funny-pictures-cat-has-naps-on-his-agenda.jpg
Agenda
 Intro: What & Why???

 OSI model: Back to the basics

 10000 feet view: How the web works

 RFC 2616: Anatomy

 RFC 2965: Handling Statelessness
Bit of History
 Mar’89 – Tim Berners-Lee presents “Information Management: A Proposal”
 Aug’91 – Berners-Lee announces the WWW
 Mar’93 – Mosaic announced
 Mar’94 – Netscape founded
 Oct’94 – W3C founded by Tim Berners-Lee
Web 2.0, uh!




http://www.wagnerblog.com/images/AjaxDarkSide.jpg
HTTP: What is it?
        Part of the Application Layer of TCP/IP protocol suite
        A set of grammatical rules for a client and server to
            communicate




http://www.flickr.com/photos/joshfassbind/4584323789/
HTTP: What is it?
 Part of the Application Layer of TCP/IP protocol suite
 A set of grammatical rules for a client and server to
  communicate
 HTTP is what powers the WWW
…but




http://www.flickr.com/photos/quinnanya/4456123452/
Why should I bother?
        Because:
           web development sucks




http://www.flickr.com/photos/sneeu/1589152071/
Why should I bother?
 Because:
    web development sucks
    Even your grandmom knows, ‘tis all about fundamentals
Why should I bother?
 Also:
    facilitates debugging,
    improves understanding of security & performance
Why should I bother?
Agenda
 Intro: What & Why???

 OSI model: Back to the basics

 10000 feet view: How the web works

 RFC 2616: Anatomy

 RFC 2965: Handling Statelessness




                                     http://www.flickr.com/photos/stephenpoff/2312981944/
OSI & TCP/IP protocol suite
         OSI is a reference model




http://blog.uad.ac.id/imam_riadi/files/2009/01/osi-layer.jpg
OSI & TCP/IP protocol suite…
        TCP/IP protocol suite is what the Internet actually runs; its layers map only loosely onto OSI's seven




http://www.hill2dot0.com/wiki/index.php?title=Image:G0209_TCPIP_vs_OSI.jpg
OSI & TCP/IP protocol suite…
 Visual learning: Wireshark, baby
    http://www.wireshark.org/
Agenda
 Intro: What & Why???

 OSI model: Back to the basics

 10000 feet view: How the web works

 RFC 2616: Anatomy

 RFC 2965: Handling Statelessness
The Communication
        My favorite interview question:




http://www.flickr.com/photos/terryhart/2890904949/
The Communication
 My favorite interview question:
   What all happens between the time when we click on a hyperlink
   and the page is completely rendered in a browser?
                                   Web      DB
Browser   Proxy   Internetz   LB
                                  Server   Server
Client                            Server (null.co.in)


                                               Web            DB
Browser       Proxy   Internetz   LB
                                              Server         Server
Client                                       Server (null.co.in)


                                                                  Web            DB
    Browser           Proxy     Internetz            LB
                                                                 Server         Server




null.co.in




                              Browser cache/ hosts
                                file/ DNS server
Client                                             Server (null.co.in)


                                                                        Web            DB
    Browser           Proxy            Internetz           LB
                                                                       Server         Server




null.co.in
                              74.53.228.212




                                    Browser cache/ hosts
                                      file/ DNS server
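
The lookup order on this slide (browser cache, then hosts file, then DNS server) can be sketched roughly as below. This is a simplified illustration, not how any particular browser does it: `resolve`, the two dictionaries, and the `dns_lookup` callable are hypothetical stand-ins, and the address is the one shown on the slide.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve(
    name: str,
    browser_cache: Dict[str, str],
    hosts_file: Dict[str, str],
    dns_lookup: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (ip, source), trying the cheapest source first."""
    if name in browser_cache:
        return browser_cache[name], "browser cache"
    if name in hosts_file:
        return hosts_file[name], "hosts file"
    ip = dns_lookup(name)  # hypothetical stand-in for a real DNS query
    if ip is not None:
        browser_cache[name] = ip  # remember the answer for next time
        return ip, "DNS server"
    return None, "not found"


# Fake DNS server that only knows the slide's example host.
fake_dns = {"null.co.in": "74.53.228.212"}.get

cache: Dict[str, str] = {}
first = resolve("null.co.in", cache, {}, fake_dns)
second = resolve("null.co.in", cache, {}, fake_dns)
print(first)   # answered by the fake DNS server
print(second)  # answered from the cache this time
```

Only the first lookup has to leave the machine; that is why the cache and the hosts file are consulted first.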
Client                                       Server (null.co.in)


                                                          Web            DB
Browser       Proxy       Internetz          LB
                                                         Server         Server



                           SYN




                      TCP Connection: There, bro?
Client                                      Server (null.co.in)


                                                         Web            DB
Browser       Proxy       Internetz         LB
                                                        Server         Server



                           SYN

                         SYN-ACK




                      TCP Connection: Yo!
Client                                        Server (null.co.in)


                                                           Web            DB
Browser       Proxy       Internetz           LB
                                                          Server         Server



                           SYN

                         SYN-ACK

                           ACK




                      TCP Connection: Cool!
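
The SYN, SYN-ACK, ACK exchange is done by the operating system, so application code never sees the individual packets. A minimal sketch, assuming a loopback listener stands in for null.co.in (the function name `handshake_on_loopback` is made up for this example):

```python
import socket


def handshake_on_loopback() -> bool:
    """Open and accept one TCP connection on 127.0.0.1; True if it worked."""
    # Port 0 asks the OS for any free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        address = listener.getsockname()
        # connect() returns only after SYN, SYN-ACK and ACK have been exchanged.
        with socket.create_connection(address, timeout=5) as client:
            server_side, peer = listener.accept()
            with server_side:
                # The server sees the client's own address as its peer.
                return peer == client.getsockname()


print(handshake_on_loopback())
```

Run it with Wireshark capturing on the loopback interface to see the three packets go by.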
Client                                       Server (null.co.in)


                                                          Web            DB
Browser       Proxy       Internetz          LB
                                                         Server         Server



                            GET /




                      HTTP: Got this file?
Client                                        Server (null.co.in)


                                                           Web            DB
Browser       Proxy       Internetz           LB
                                                          Server         Server



                            GET /
                          200 OK
                          index.html




                      HTTP: Yup! Here ‘tis.
Client                                        Server (null.co.in)


                                                           Web            DB
Browser       Proxy       Internetz           LB
                                                          Server         Server



                            GET /
                          200 OK
                          index.html
                            GET /js.js
                            GET /pic.jpg




                      HTTP: Can I have these as well?
Client                                     Server (null.co.in)


                                                        Web            DB
Browser       Proxy       Internetz        LB
                                                       Server         Server



                            GET /
                          200 OK
                          index.html
                            GET /js.js
                            GET /pic.jpg
                          200 OK
                          more content…
                      HTTP: Sure!
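
The GET / 200 OK exchange from the last few slides can be replayed locally. This is only a sketch: a throwaway server from Python's standard library stands in for null.co.in, and the three resources in `PAGES` carry placeholder content, not the real site's files.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder resources; the bytes are made up for the demo.
PAGES = {
    "/": (b"<html>index</html>", "text/html"),
    "/js.js": (b"console.log('hi');", "application/javascript"),
    "/pic.jpg": (b"\xff\xd8\xff", "image/jpeg"),  # not a real image
}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        found = PAGES.get(self.path)
        body, ctype = found if found else (b"not found", "text/plain")
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(paths):
    """GET each path from a local server; return {path: (status, reason, body)}."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            results[path] = (response.status, response.reason, response.read())
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


results = fetch_all(["/", "/js.js", "/pic.jpg"])
for path, (status, reason, body) in results.items():
    print(path, status, reason, len(body), "bytes")
```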
Client                                      Server (null.co.in)


                                                         Web            DB
Browser       Proxy       Internetz         LB
                                                        Server         Server



                           FIN




                      TCP Connection: Arigato, am done.
Client                                       Server (null.co.in)


                                                          Web            DB
Browser       Proxy       Internetz         LB
                                                         Server         Server



                           FIN

                         FIN-ACK




                      TCP Connection: Sayonara!
The Communication
 …. or simply
The Communication
 Web 2.0 has blurred the distinction between client and server




 Conventionally, client sends an HTTP request
 Server responds with an HTTP response
The Communication: HTTP Request
 Request Line
    Request Method
    Requested Resource
    HTTP Version used


 Headers
   General Headers
   Request Headers
   Entity Headers


 Content (Optional)
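
Put together, the three parts of a request are just lines of text separated by CRLF. A minimal sketch of assembling one by hand; `build_request` is a name invented for this example, and real clients do a lot more validation:

```python
from typing import Dict, Optional

CRLF = "\r\n"


def build_request(
    method: str,
    resource: str,
    headers: Dict[str, str],
    content: Optional[bytes] = None,
    version: str = "HTTP/1.1",
) -> bytes:
    """Assemble request line + headers + optional content into raw bytes."""
    lines = [f"{method} {resource} {version}"]  # the request line
    if content is not None:
        # An entity header describing the content that follows.
        headers = {**headers, "Content-Length": str(len(content))}
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = CRLF.join(lines) + CRLF + CRLF  # a blank line ends the headers
    return head.encode("ascii") + (content or b"")


raw_request = build_request("GET", "/", {"Host": "null.co.in", "Connection": "close"})
print(raw_request.decode("ascii"))
```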
The Communication: HTTP Response
 Status Line
    HTTP version(s) understood by server
    Status code (3 digit numerical value)
    Status description


 Headers
   General Headers
   Response Headers
   Entity Headers


 Content (Optional)
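
Going the other way, a response can be taken apart into the same three pieces. A rough sketch, assuming the whole response is already in memory and ignoring chunked encoding and repeated headers; `parse_response` is a hypothetical helper, not a library function:

```python
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[str, int, str, Dict[str, str], bytes]:
    """Split raw bytes into (version, status code, description, headers, content)."""
    head, _, content = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    description = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, code, description, headers, content


raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
parsed = parse_response(raw_response)
print(parsed)
```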
Agenda
        Intro: What & Why???

        OSI model: Back to the basics

        10000 feet view: How the web works

        RFC 2616: Anatomy

        RFC 2965: Handling Statelessness




http://www.saynotocrack.com/wp-content/uploads/2007/06/flinstones-anatomy.jpg
Anatomy
 HTTP requests and responses are composed of various
 components:
   Request Methods
   Response Status Codes
   Request Headers
   Response Headers
   General Headers
   Entity Headers
   Content (MIME Media Types)
Anatomy: Request Methods
 Humans can convey emotions in several ways
 Why should HTTP clients lag!!!
 HTTP methods describe the type of communication




  GET          POST        HEAD        OPTIONS
  TRACE        PUT         DELETE      CONNECT
Anatomy: Response Status Codes
 Indicate the server’s mood corresponding to a request
 Combination of a numerical code, and a short
  description
 Can be grouped into 5 categories:
       1xx        --     Informational
       2xx        --     Successful
       3xx        --     Redirection
       4xx        --     Client Error
       5xx        --     Server Error
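
Since the category is simply the first digit of the code, client code can branch on it without knowing every individual status. A small sketch (`status_class` is an illustrative name):

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Map a 3-digit status code to its category via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (100, 200, 301, 404, 503):
    print(code, status_class(code))
```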
Anatomy: Request Headers
 Specific to an HTTP Request
 Carry information about the client, and the type of
  request
 Facilitate better understanding between client and
  server

  Host              Accept-Language       If-Modified-Since     Referer
  User-Agent        Authorization         If-None-Match         Expect
  Accept            Proxy-Authorization   If-Range              From
  Accept-Charset    Max-Forwards          If-Unmodified-Since   TE
  Accept-Encoding   If-Match              Range
Anatomy: Response Headers
 Specific to an HTTP Response
 Carry information about the server, and the type of
 response




  Accept-Ranges   ETag       Retry-After   WWW-Authenticate
  Age             Location   Server        Proxy-Authenticate
  Vary
Anatomy: General Headers
 Carry information about the HTTP transaction
 Can be a part of request, as well as response




  Cache-Control       Keep-Alive   Pragma    Via
  Connection          Upgrade      Trailer   Warning
  Transfer-Encoding   Date
Anatomy: Entity Headers
 Carry information about the content
 Mainly a part of HTTP response, but also sent with requests that carry content (e.g. POST, PUT)




  Allow              Content-Language   Content-Location   Content-Range
  Content-Encoding   Content-Length     Content-MD5        Content-Type
  Expires            Last-Modified
Anatomy: Content
 IANA maintains a list of valid content types
 It is specified by the Content-Type Entity header
 Categorized in 9 MIME Media types:




  application   audio        example      image
  message       model        multipart    text
  video
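
A Content-Type value has the shape type/subtype, optionally followed by parameters such as charset. A simplified sketch of pulling it apart and checking the top-level type against the nine on this slide; `split_content_type` is a made-up helper, and the naive split on `;` would mishandle quoted parameter values that themselves contain a semicolon:

```python
from typing import Dict, Tuple

# The nine top-level media types listed on the slide.
TOP_LEVEL_TYPES = {
    "application", "audio", "example", "image", "message",
    "model", "multipart", "text", "video",
}


def split_content_type(value: str) -> Tuple[str, str, Dict[str, str]]:
    """Return (top-level type, subtype, parameters) for a Content-Type value."""
    media, *params = [part.strip() for part in value.split(";")]
    top, _, sub = media.lower().partition("/")
    if top not in TOP_LEVEL_TYPES or not sub:
        raise ValueError(f"unrecognised media type: {value!r}")
    parameters = {}
    for param in params:
        if param:  # skip empty pieces such as a trailing ';'
            key, _, val = param.partition("=")
            parameters[key.strip().lower()] = val.strip().strip('"')
    return top, sub, parameters


print(split_content_type("text/html; charset=UTF-8"))
print(split_content_type("image/jpeg"))
```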
Agenda
 Intro: What & Why???

 OSI model: Back to the basics

 10000 feet view: How the web works

 RFC 2616: Anatomy

 RFC 2965: Handling Statelessness
Handling Statelessness
 HTTP is a stateless protocol
   i.e., server’s got a bad memory
Handling Statelessness
        Cookies to rescue




http://www.flickr.com/photos/lij/283869088/
Handling Statelessness
 Cookies:
    are small pieces of text (name=value pairs) stored by the client browser
    maintain session by storing information
    are non-executable
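
How a cookie gives a forgetful server its memory back can be sketched without any networking. Everything here is hypothetical: `handle_request` plays the server, `SESSIONS` is its in-memory store, and `sid` is an arbitrary cookie name.

```python
import secrets
from typing import Dict, Optional, Tuple

SESSIONS: Dict[str, int] = {}  # session id -> number of visits seen so far


def handle_request(cookie_sid: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (response body, Set-Cookie value or None)."""
    if cookie_sid in SESSIONS:
        # The client sent back an id we issued earlier: we "remember" it.
        SESSIONS[cookie_sid] += 1
        return f"Welcome back, visit #{SESSIONS[cookie_sid]}", None
    # No cookie, or one we never issued: start a fresh session.
    sid = secrets.token_hex(16)
    SESSIONS[sid] = 1
    return "Hello, stranger", f"sid={sid}"


# First request carries no cookie; the server hands one out.
body1, set_cookie = handle_request(None)
sid_value = set_cookie.split("=", 1)[1]

# The browser returns the cookie on the next request.
body2, _ = handle_request(sid_value)
print(body1)
print(body2)
```

The protocol itself stays stateless; the state lives in the server's store, and the cookie is only the key the client presents to find it again.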
Handling Statelessness
 Cookie attributes:
    name=value
    expires=value
    domain=value
    path=value
    Secure
    HttpOnly -- not a part of the RFC 2965 spec
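
These attributes end up side by side in a single Set-Cookie header. A minimal sketch using `http.cookies` from Python's standard library; the cookie name, value, and expiry date are invented for the example, and `make_set_cookie` is a hypothetical wrapper:

```python
from http.cookies import SimpleCookie


def make_set_cookie(
    name: str,
    value: str,
    *,
    expires: str,
    domain: str,
    path: str = "/",
    secure: bool = True,
    http_only: bool = True,
) -> str:
    """Build the value of a Set-Cookie header from the attributes on the slide."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["expires"] = expires
    morsel["domain"] = domain
    morsel["path"] = path
    morsel["secure"] = secure       # flag attribute: emitted without a value
    morsel["httponly"] = http_only  # flag attribute: emitted without a value
    return morsel.OutputString()


header_value = make_set_cookie(
    "sid",
    "abc123",
    expires="Wed, 01 Jan 2031 00:00:00 GMT",
    domain="null.co.in",
)
print("Set-Cookie:", header_value)
```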
Conclusion
   The single biggest problem in communication
      is the illusion… that it has taken place.
                               --George Bernard Shaw

 Think about it 
Q&A!!!
 Got queries? Raise your hands.
 Arigato! 


 Contact info:
    Om—At—[projectbee.org/null.co.in]
    http://projectbee.org/
    Twitter - @bipinu
    Flickr -- projectbee

"Http protocol and other stuff" by Bipin Upadhyay


Editor's Notes

  • #2 http://www.jellymuffin.com/generators/fordummies/
  • #11 http://i194.photobucket.com/albums/z202/CopyDat/copycat%20stuff/kung-fu.jpg