Multimedia Databases



Possible solutions to the R limitations
Topics Covered
 Why multimedia is a problem for databases
 Support for searching
 Types of Databases
 Relational Databases and Multimedia
   – Example - Access
 Object Relational Databases and Multimedia
   – Example - Oracle
 Object Oriented Databases and Multimedia
   – Example - Jasmine
The problem - recap
   Data increasingly means not just numbers and small strings
    but multimedia data as well – structured (text, images, video,
    audio, VR, etc.)
   Databases promise:
    –   well structured data organisation
    –   efficient storage of large amounts of data
    –   querying
    –   transactional support for concurrent users
   If you include multimedia data
    – multimedia is large and may swamp other data
    – multimedia data structures are completely different from standard
      database structures
    – multimedia data structures do not easily lend themselves to content-
      based searching
Data integration
 Databases already integrate various kinds of
  data, numbers, dates, small text strings.
 They do this by the use of domains
  – i.e. each atomic value in the database belongs to
    one of a small number of types
 Each type has two aspects:
  – a range of values which are acceptable
  – some operations which are available
Cont ….
 The schema indicates a domain for each part
  of the database and the DBMS enforces the
  domain constraint
  – e.g. in a Relational Database, each column is
    assigned a domain
 Therefore a DBMS must provide domain
  types for any kind of data that it is to
  house, and the overall structure will deal with
  the integration
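
The two aspects of a domain (a range of acceptable values, enforced by the DBMS) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the student table, its columns, and the CHECK rules are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each column is assigned a domain. The CHECK clauses narrow the range of
# acceptable values (hypothetical rules chosen for this sketch).
conn.execute("""
    CREATE TABLE student (
        matric TEXT PRIMARY KEY CHECK (length(matric) = 7),
        name   TEXT NOT NULL,
        year   INTEGER CHECK (year BETWEEN 1 AND 5)
    )
""")
conn.execute("INSERT INTO student VALUES ('0123456', 'A. Student', 2)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(("7654321", "B. Student", 3))  # inside both domains
rejected = try_insert(("12AB", "C. Student", 9))     # outside both domains
row_count = conn.execute("SELECT count(*) FROM student").fetchone()[0]
```

The point of the sketch is that the application never checks the values itself; the schema states the domain and the DBMS refuses anything outside it.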
Domain types of MM data
    DBMS typically provide three different kinds of
     domain for multimedia data:
1.   large object domains, sequences of data often of
     two kinds
          Binary Large Objects – BLOBs – which are an unstructured
           sequence of bytes
          Character Large Objects – CLOBs – which are an
           unstructured sequence of characters
2.   file references – instead of holding the data, a
     file reference contains a link to the data (OLE in
     Access)
3.   genuine multimedia data types – (Oracle and
     Jasmine)
Cont …
 There is an important difference between
 the last of these and the first two:
  – multimedia data types present the possibility of
    exploiting the structure of the data for querying
    and manipulation
  – large objects at best allow you to extract
    sections or to concatenate them
  – file references mean that the DBMS has no
    access to the data at all
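
The difference between a large object and a file reference can be sketched with Python's built-in sqlite3 module. The table layout, the stand-in bytes, and the file name are all hypothetical; SQLite is used only because it runs anywhere, not because the slides refer to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE picture (id INTEGER PRIMARY KEY, data BLOB, file_ref TEXT)"
)

payload = bytes(range(16))  # stand-in for real image bytes
conn.execute("INSERT INTO picture VALUES (1, ?, NULL)", (payload,))
# Hypothetical path: the row holds a link to the file, not the file itself.
conn.execute("INSERT INTO picture VALUES (2, NULL, 'images/guernica.jpg')")

# Large object: the DBMS can extract a section or report the length,
# but it knows nothing about what the bytes mean.
section = conn.execute(
    "SELECT substr(data, 1, 4) FROM picture WHERE id = 1"
).fetchone()[0]
size = conn.execute(
    "SELECT length(data) FROM picture WHERE id = 1"
).fetchone()[0]

# File reference: the DBMS sees only the link text, never the bytes behind it.
ref = conn.execute("SELECT file_ref FROM picture WHERE id = 2").fetchone()[0]
```

Neither row lets a query ask about the picture's content; that is what a genuine multimedia data type would add.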
Querying MM data
A DBMS permits a user to search the database by
  content, e.g. give the name of the student with
  matriculation number 0123456. We would like to
  do the same with multimedia data, e.g. give the
  pictures painted by Picasso or sound files with a
  female singer hitting top C.
 With standard data this is easy – numeric and
  string operators are well understood. With
  multimedia data this is more difficult and requires
  some method of identifying contents, of which
  there are two kinds:
Cont …
 automatic identification
  – an algorithm takes the data and returns a
    measure which can be compared – e.g. of
    blackness
 manual identification
  – a person examines the data and catalogues it –
    e.g. in a table of pictures, there is a column for
    the picture and another for the painter
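
Both kinds of identification can be sketched side by side using Python's built-in sqlite3 module. The blackness function below is one assumed definition (the fraction of pixels darker than mid-grey), and the titles, painters, and pixel values are invented sample data.

```python
import sqlite3


def blackness(pixels):
    """Fraction of greyscale pixels darker than mid-grey (0 = black, 255 = white)."""
    return sum(1 for p in pixels if p < 128) / len(pixels)


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE picture (
        title     TEXT,
        painter   TEXT,   -- manual identification: catalogued by a person
        blackness REAL,   -- automatic identification: computed by an algorithm
        data      BLOB
    )
""")

# Invented sample rows: (title, painter, greyscale pixel values).
samples = [
    ("Dark study", "Picasso", [10, 20, 30, 200]),
    ("Light study", "Matisse", [250, 240, 90, 230]),
]
for title, painter, pixels in samples:
    conn.execute(
        "INSERT INTO picture VALUES (?, ?, ?, ?)",
        (title, painter, blackness(pixels), bytes(pixels)),
    )

# Ordinary operators now work, because both queries use ordinary columns.
by_painter = [r[0] for r in conn.execute(
    "SELECT title FROM picture WHERE painter = 'Picasso'")]
mostly_dark = [r[0] for r in conn.execute(
    "SELECT title FROM picture WHERE blackness > 0.5")]
```

In both cases the query runs against a derived column of a standard type, not against the picture data itself.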
Types of Database
 There are three kinds of DBMS that might be used
  for housing multimedia data.
 Relational DBMS store everything as First
  Normal Form tables
   – all data items are atomic and are held in rectangular
     tables
   – data can only be related if they are in one or in two
     records connected by a common value (foreign key)
   – records are identified only by content
   – it is difficult (if not impossible) to extend the set of
     domains
Cont …
 Object-oriented DBMS store everything as
  classes of objects
  – all data is held as components of objects (like Java
    variables)
  – data is related by object reference (i.e. one class
    variable has a type which is another class and the
    values of that variable are instances of that class)
  – the set of classes is extensible and so you can freely
    create domains
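
Relating data by object reference can be sketched with plain Python classes. This is not Jasmine or any other OODBMS API; the Painter and Picture classes are hypothetical and exist only to show the idea of a variable whose type is another class.

```python
from dataclasses import dataclass


@dataclass
class Painter:
    name: str


@dataclass
class Picture:
    title: str
    painter: Painter  # object reference, not a foreign-key value
    pixels: bytes

    def size(self) -> int:
        """An operation defined on the new domain."""
        return len(self.pixels)


picasso = Painter("Picasso")
works = [
    Picture("Study A", picasso, b"\x00\x01"),
    Picture("Study B", picasso, b"\x02"),
]

# Both pictures refer to the same Painter object, so a change made through
# that object is seen from every picture that references it.
picasso.name = "Pablo Picasso"
names = {p.painter.name for p in works}
```

Defining Picture is the step that "creates a domain": it fixes both the values a picture can hold and the operations available on it.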
Cont …
 Object-relational DBMS are
 fundamentally relations but are not First
 Normal Form
  – the values in cells can be object references as
    well as atomic values
  – new types can be defined
How can we use these
          different types?
 In a relational database, you can have:
  – domain types for large objects
  – string types for file names
  – extra file types as in OLE in Access
 In an object-oriented database, you can have:
  – specially designed classes for multimedia
 In an object-relational database, you can have:
  – specially designed types for multimedia
Relational databases e.g. Access and
            OLE
   Object Linking and Embedding was Microsoft’s first
    architecture for integrating files of different types.
    Each file type in Windows is associated with an
    application. It is possible to place a file of one type inside
    another:
    – either by wholly embedding the data in which case it is rendered
      by a plug-in associated with the program
    – or by placing a link to the data in which case it is rendered by
      calling the original program
   Access works with this system by providing a domain type
    for OLE
   • There’s not much you can do with OLE fields since the
    data is in a format that Access does not understand
   • You can plug the foreign data into a report or a form and
    little else
R databases e.g. BFILEs in
           Oracle

 The BFILE datatype provides access to
 BLOB files of up to 4 gigabytes that are
 stored in file systems outside an Oracle
 database.
  – The BFILE datatype allows read-only support
   of large binary files; you cannot modify a file
   through Oracle. Oracle provides APIs to access
   file data.
Large Object Types in Oracle
          and SQL3
 Oracle and SQL3 support three large object types:
     – BLOB - The BLOB domain type stores unstructured
       binary data in the database. BLOBs can store up to four
       gigabytes of binary data.
     – CLOB – The CLOB domain type stores up to four
       gigabytes of single-byte character set data
     – NCLOB - The NCLOB domain type stores up to four
       gigabytes of fixed-width and varying width multi-byte
       national character set data
* SQL3 is a significant extension to standard SQL which turns it into a full object-based
    language
Cont …
 These types support
  – Concatenation – making up one LOB by putting two
      of them together
  –   Substring – extract a section of a LOB
  –   Overlay – replace a substring of one LOB with another
  –   Trim – removing particular characters (e.g.
      whitespace) from the beginning or end
  –   Length – returns the length of the LOB
  –   Position – returns the position of a substring in a LOB
  –   Upper and Lower – turns a CLOB or NCLOB into
      upper or lower case
  –   LOBs can only appear in a where clause using “=”,
      “<>” or “like” and not in group by or order by at all
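
What these operations do can be illustrated on ordinary Python bytes and str values standing in for a BLOB and a CLOB. This shows the semantics only; it is not Oracle or SQL3 syntax, and the sample values are made up.

```python
# A bytes value stands in for a BLOB, a str value for a CLOB.
head = b"RIFF"
tail = b"WAVEfmt "

blob = head + tail                        # Concatenation
sub = blob[4:8]                           # Substring: bytes 4..7
overlaid = blob[:4] + b"AVI " + blob[8:]  # Overlay: replace bytes 4..7
length = len(blob)                        # Length
pos = blob.find(b"fmt")                   # Position (0-based here; SQL counts from 1)

clob = "  Guernica, 1937  "
trimmed = clob.strip()                    # Trim whitespace from both ends
upper = trimmed.upper()                   # Upper
lower = trimmed.lower()                   # Lower
```

Every operation treats the value as a flat sequence of bytes or characters; none of them knows that the bytes might be audio or an image.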
Large Object Types in
           MySQL
MySQL has four BLOB and four CLOB (called
  TEXT in MySQL) domain types:
 TINYBLOB and TINYTEXT – store up to 255
  bytes
 BLOB and TEXT – store up to 64K bytes (65,535)
 MEDIUMBLOB and MEDIUMTEXT – store up
  to 16M bytes (16,777,215)
 LONGBLOB and LONGTEXT – store up to 4G
  bytes (4,294,967,295)
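
As a small sketch of how these limits might be used, the helper below picks the smallest MySQL BLOB type that can hold a given number of bytes. The limits are the maximum lengths MySQL documents for each type (2^8 - 1, 2^16 - 1, 2^24 - 1, 2^32 - 1); the function name is hypothetical.

```python
# Maximum length in bytes of each MySQL BLOB type, smallest first.
MYSQL_BLOB_LIMITS = [
    ("TINYBLOB", 2**8 - 1),
    ("BLOB", 2**16 - 1),
    ("MEDIUMBLOB", 2**24 - 1),
    ("LONGBLOB", 2**32 - 1),
]


def smallest_blob_type(n_bytes: int) -> str:
    """Return the smallest BLOB type able to store n_bytes bytes."""
    if n_bytes < 0:
        raise ValueError("size must be non-negative")
    for name, limit in MYSQL_BLOB_LIMITS:
        if n_bytes <= limit:
            return name
    raise ValueError("value is too large for any MySQL BLOB type")


thumbnail_type = smallest_blob_type(40_000)         # a 40 KB thumbnail
photo_type = smallest_blob_type(5 * 1024 * 1024)    # a 5 MB photograph
```

The same table applies to the TEXT types, with the limit counted in bytes rather than characters.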
Oracle interMedia Audio,
          Image, and Video
   Oracle interMedia supports multimedia storage, retrieval,
    and management of:
    – BLOBs stored locally in Oracle8i onwards and containing audio,
      image, or video data
    – BFILEs, stored locally in operating system-specific file systems
      and containing audio, image or video data
    – URLs containing audio, image, or video data stored on any HTTP
      server such as Oracle Application Server, Netscape Application
      Server, Microsoft Internet Information Server, Apache HTTPD
      server, and Spyglass servers
    – Streaming audio or video data stored on specialized media
      servers such as the Oracle Video Server
The Object Relational Multimedia
  Domain Types in interMedia
 interMedia provides the ORDAudio, ORDImage,
  and ORDVideo object types and methods for:
  – updateTime ORDSource attribute manipulation
  – manipulating multimedia data source attribute
    information
  – extracting attributes from multimedia data
  – getting and managing multimedia data from Oracle
    interMedia, Web servers, and other servers
  – performing a minimal set of manipulation operations on
    multimedia data (images only)
Cont …
 The properties available are:
 ORDImage – the height, width, data size of the
  on-disk image, file type, image type, compression
  type, and MIME type
 ORDAudio – the format, encoding, number of
  channels, sampling rate, sample size, compression
  type, and audio duration
 ORDVideo – the format, frame size, frame
  resolution, frame rate, video duration, number of
  frames, compression type, number of colours, and
  bit rate
Cont …
 Oracle also stores metadata including:
  – source type, location, and source name
  – MIME type and formatting information
  – characteristics such as height and width of an
    image, number of audio channels, video frame
    rate, play time, etc.
OO databases – e.g. Jasmine
 Jasmine is an Object-Oriented database; an
  application known as Studio is its development
  environment
 It comes with a number of built-in classes, including
  four multimedia classes:
   –   Picture -
   –   Image –
   –   Video –
   –   Audio -
 These come with manipulation and compression
  facilities. They have also been made to fit well
  with Java Media Framework
Conclusions
 At present you cannot do much with MM data;
   there are two reasons for this:
1. It is very large
   – indexing on multimedia data is not reasonable nor is
     storing a default value
   – other retrieval may be slowed down
   – transactions may be compromised
2. The properties are not well understood or
   implementable in reasonable time
   – what does it mean to say that one image is before
       another in order? Therefore there are few operators in the
       where clause that work
Cont ..
 At the moment, there is no reason for putting
 multimedia data into a relational database
  – it just slows everything down
  – and you can’t do very much
 You could use an object relational or object
 oriented database
  – now you can do more
  – but the products are immature
  – and everything will be slow
Cont …
 There are three main reasons for integrating
  multimedia data with a database:
 1. Cataloguing the data
   – a column for file names is good enough
 2.   Decorating Reports
   – The OLE approach works well here. Otherwise a file
       name column and a simple application for generating
       the reports would do
 3.   Web Applications
   – Again a file name column is good enough
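
The "file name column" approach for cataloguing and web applications can be sketched with Python's built-in sqlite3 module. The table, the sample rows, and the base URL (media.example.org) are hypothetical; the media files themselves are assumed to sit on a web server outside the database.

```python
import sqlite3
from urllib.parse import quote

MEDIA_BASE_URL = "https://media.example.org/"  # hypothetical web server

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media ("
    "id INTEGER PRIMARY KEY, title TEXT, kind TEXT, file_name TEXT)"
)
# Invented catalogue entries: the database holds only names, never the files.
conn.executemany(
    "INSERT INTO media (title, kind, file_name) VALUES (?, ?, ?)",
    [
        ("Lecture intro", "video", "intro clip.mp4"),
        ("Course logo", "image", "logo.png"),
    ],
)


def media_urls(kind):
    """Return (title, URL) pairs for one kind of media, ordered by title."""
    rows = conn.execute(
        "SELECT title, file_name FROM media WHERE kind = ? ORDER BY title",
        (kind,),
    )
    return [(title, MEDIA_BASE_URL + quote(name)) for title, name in rows]


video_links = media_urls("video")
```

The database does the cataloguing and querying it is good at, while the large files stay in the file system where they do not slow down other retrieval.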

Lecture 3 multimedia databases
