.. my history might well be your future ...

ted nelson

information retrieval


...



perspectives -- multimedia information retrieval


...



the artwork

  1. kata -- japanese martial arts picture.
  2. signs -- japanese coats of arms,  [Signs], p. 140, 141.
  3. photographs -- Jaap Stahlie, two early experiments (left, and right)

5

information retrieval

information retrieval is usually an afterthought


learning objectives

After reading this chapter you should be able to describe scenarios for information retrieval, to explain how content analysis for images can be done, to characterize similarity metrics, to define the notions of recall and precision, and to give an example of frequency tables, as used in text search.

Searching for information on the web is cumbersome. Given our experiences today, we may not even want to think about searching for multimedia information on the (multimedia) web.

Nevertheless, in this chapter we will briefly sketch one of the possible scenarios indicating the need for multimedia search. In fact, once we have the ability to search for multimedia information, many scenarios could be thought of.

As a start, we will look at two media types, images and documents. We will study search for images, because it teaches us important lessons about content analysis of media objects and what we may consider as being similar. Perhaps surprisingly, we will study text documents because, due to our familiarity with this media type, text documents allow us to determine what we may understand by effective search.

...



Amsterdam Drugport


Amsterdam is an international centre of traffic and trade. It is renowned for its culture and liberal attitude, and attracts tourists of all ages, including young tourists who are attracted by the availability of soft drugs. Soft drugs may be obtained at so-called coffeeshops, and the possession of limited amounts of soft drugs is tolerated by the authorities.

The European Community, however, has expressed its concern that Amsterdam is the centre of an international criminal drug operation. Combining national and international police units, a team is formed to start an exhaustive investigation, under the code name Amsterdam Drugport.

information

media types


retrieval


...



information retrieval


Information retrieval, according to  [IR], deals with the representation, storage, organisation of, and access to information items.

To see what is involved, imagine that we have a (user) query like:

find me the pages containing information on ...

information retrieval models


vector models


image query


content-based description


shape


property


example



  shape descriptor: XLB=10; XUB=60; YLB=3; YUB=50   (rectangle)  
  property descriptor: pixel(14,7): R=5; G=1; B=3 
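
The two descriptors in the example can be held in small records. The following is a minimal sketch in Python, not the notation of the original text: the class names `ShapeDescriptor` and `PropertyDescriptor` and the helper `contains` are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeDescriptor:
    """Bounding rectangle, using the XLB/XUB/YLB/YUB bounds of the example."""
    xlb: int
    xub: int
    ylb: int
    yub: int

    def contains(self, x: int, y: int) -> bool:
        # Hypothetical helper: does pixel (x, y) fall inside the rectangle?
        return self.xlb <= x <= self.xub and self.ylb <= y <= self.yub


@dataclass(frozen=True)
class PropertyDescriptor:
    """Colour values attached to a single pixel."""
    x: int
    y: int
    r: int
    g: int
    b: int


shape = ShapeDescriptor(xlb=10, xub=60, ylb=3, yub=50)
prop = PropertyDescriptor(x=14, y=7, r=5, g=1, b=3)
inside = shape.contains(prop.x, prop.y)  # the described pixel lies in the rectangle
```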
  

definitions


example



  property: (bwcolor,{b,w},bwalgo) 
  

...



similarity-based retrieval


How do we determine whether the content of a segment (of a segmented image) is similar to another image (or set of images)?

solutions

metric approach


a function d: X x X -> [0,1] is a distance measure if, for all x, y and z in X:


  d(x,y) = d(y,x)
  d(x,y) <= d(x,z) + d(z,y)
  d(x,x) = 0
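
The three conditions can be tested mechanically on a finite sample of points. The sketch below is only an illustration: the function names `normalized_l1` and `is_distance_measure` are assumptions, and passing the test on a sample does not prove that a function is a distance measure on all of X.

```python
from itertools import product


def normalized_l1(x, y):
    """Mean absolute difference of two equal-length vectors with components in [0, 1]."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def is_distance_measure(d, points, eps=1e-12):
    """Check d(x,x) = 0, symmetry, range [0,1] and the triangle inequality on a sample."""
    for x in points:
        if abs(d(x, x)) > eps:
            return False
    for x, y in product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > eps or not 0.0 <= d(x, y) <= 1.0:
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + eps:
            return False
    return True


sample = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.2, 0.9, 0.4)]
ok = is_distance_measure(normalized_l1, sample)


def squared(x, y):
    """Squared difference: symmetric and zero on the diagonal, but not a metric."""
    return (x[0] - y[0]) ** 2


bad = is_distance_measure(squared, [(0.0,), (0.5,), (1.0,)])
```

The squared difference fails because going from 0 to 1 directly costs 1, while the detour via 0.5 costs only 0.25 + 0.25.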
  

pixel properties


complexity


a set of points in k-dimensional space for k = n + 2

feature extraction


...



transformation approach


Given two objects o1 and o2, the level of dissimilarity is proportional to the (minimum) cost of transforming object o1 into object o2 or vice versa

transformation operators



    to_1,...,to_r  -- translation, rotation, scaling
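
As a rough illustration of the transformation approach, the sketch below assigns a cost to each of the three operators and adds them up. Everything specific here is an assumption: the representation of an object as a pose `(x, y, angle, size)`, the per-operator weights in `COST`, and the way each operator's amount is measured. With these particular cost terms both directions cost the same, so the minimum in `dissimilarity` only matters once asymmetric costs are chosen.

```python
import math

# Hypothetical cost weights for the three operators.
COST = {"translate": 1.0, "rotate": 0.5, "scale": 2.0}


def transformation_cost(o1, o2):
    """Cost of turning pose o1 into pose o2; a pose is (x, y, angle_in_degrees, size)."""
    x1, y1, a1, s1 = o1
    x2, y2, a2, s2 = o2
    shift = math.hypot(x2 - x1, y2 - y1)       # translation distance
    turn = abs(a2 - a1) % 360
    turn = min(turn, 360 - turn) / 180.0       # rotation as a fraction of a half turn
    resize = abs(math.log(s2 / s1))            # scaling, symmetric in s1 and s2
    return (COST["translate"] * shift
            + COST["rotate"] * turn
            + COST["scale"] * resize)


def dissimilarity(o1, o2):
    """Minimum cost of transforming o1 into o2 or vice versa."""
    return min(transformation_cost(o1, o2), transformation_cost(o2, o1))


d_example = dissimilarity((0, 0, 0, 1.0), (3, 4, 90, 1.0))
```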
  

cost


distance


advantages


operations



   rotate(image-id,dir,angle)
   segment(image-id, predicate)
   edit(image-id, edit-op)
  

...



image repository


mission


Our goal is to study aspects of the deployment and architecture of virtual environments as an interface to (intelligent) multimedia information systems ...

...



query


problems


...



effective search


precision and recall



  precision = ( returned and relevant ) / returned  
  recall = ( returned and relevant ) / relevant 
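
Both measures follow directly from set intersection. The sketch below assumes that the returned and the relevant documents are given as sets of identifiers; the function name `precision_recall` and the document identifiers are made up for illustration.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned and a set of relevant items."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant                      # returned and relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


p, r = precision_recall(returned={"d0", "d1", "d2", "d3"},
                        relevant={"d1", "d3", "d7"})
```

Here two of the four returned documents are relevant, and two of the three relevant documents are returned.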
  

anomalies


example


term/document   d0   d1   d2
snacks           1    0    0
drinks           1    0    3
rock-roll        0    1    1

complexity


compare term frequencies per document -- O(M*N)
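
A minimal sketch of such a comparison, using the frequencies of the example table: each document is a vector of term frequencies, and a query is scored against all N documents over all M terms. The use of cosine similarity as the comparison function is an assumption; the text does not prescribe a particular measure.

```python
import math

TERMS = ["snacks", "drinks", "rock-roll"]
# One frequency vector per document, in the order of TERMS.
FREQ = {
    "d0": [1, 1, 0],
    "d1": [0, 0, 1],
    "d2": [0, 3, 1],
}


def cosine(u, v):
    """Cosine of the angle between two frequency vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query):
    """Score every document against the query: M terms times N documents."""
    scores = {doc: cosine(query, vec) for doc, vec in FREQ.items()}
    return sorted(scores, key=scores.get, reverse=True)


order = rank([0, 1, 0])  # a query that only mentions "drinks"
```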

reduction


...



...



user-oriented measures


...


(a) context   (b) self-reflection

Aesthetics


...



5. information retrieval

concepts


technology


projects & further reading

As a project, you may implement simple image analysis algorithms that, for example, extract a color histogram, or detect the presence of a horizon-like edge.

You may further explore scenarios for information retrieval in the cultural heritage domain, and compare this with other applications of multimedia information retrieval, for example monitoring in hospitals.

For further reading I suggest that you familiarize yourself with common techniques in information retrieval as described in  [IR], and perhaps devote some time to studying image analysis,  [Image].

the artwork

  1. artworks -- ..., Miro, Dali, photographed from Kunstsammlung Nordrhein-Westfalen, see artwork 2.
  2. left Miro from  [Kunst], right: Karel Appel
  3. match of the day (1) -- Geert Mul
  4. match of the day (2) -- Geert Mul
  5. match of the day (3) -- Geert Mul
  6. mario ware -- taken from gammo/veronica.
  7. baten kaitos -- eternal ways and the lost ocean, taken from gammo/veronica.
  8. idem.
  9. PANORAMA -- screenshots from field test.
  10. signs -- people,  [Signs], p. 252, 253.

6

content annotation

video annotation requires a logical approach to story telling


learning objectives

After reading this chapter you should be able to explain the difference between content and meta information, to mention relevant content parameters for audio, to characterize the requirements for video libraries, to define an annotation logic for video, and to discuss feature extraction in samples of musical material.

Current technology does not allow us to extract information automatically from arbitrary media objects. In these cases, at least for the time being, we need to assist search by annotating content with what is commonly referred to as meta-information.

In this chapter, we will look at two more media types, in particular audio and video. Studying audio, we will learn how we may combine feature extraction and meta-information to define a data model that allows for search. Studying video, on the other hand, will indicate the complexity of devising a knowledge representation scheme that captures the content of video fragments.

Concluding this chapter, we will discuss an architecture for feature extraction for arbitrary media objects.

audio databases


  • audio signals -- compression, discrete representation
  • musical patterns -- similarity-based retrieval

audio data model


  • meta-data -- describing content
  • features -- using feature extraction

example



   singers -- (Opera,Role,Person)
   score -- ...
   transcript -- ...
  

signal-based content


  • audio data -- %F(x) over time x
  • wave -- period T, frequency f = 1/T
  • velocity -- v = w/T = w * f , with w wavelength
  • amplitude -- a

windowing


  • break signal up in small windows of time
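
Windowing can be sketched in a few lines. The function name `windows` and the parameters `size` and `hop` (the number of samples by which each window advances) are assumptions; overlapping windows are a common choice but not something the text prescribes.

```python
def windows(samples, size, hop):
    """Split a sequence of samples into windows of `size`, advancing by `hop` samples."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, hop)]


signal = list(range(10))             # stand-in for ten audio samples
frames = windows(signal, size=4, hop=2)
```

Features such as intensity can then be computed per window rather than over the whole signal.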

feature extraction


  • intensity -- watts/m^2
  • loudness -- in decibels
  • pitch -- from frequency and amplitude
  • brightness -- amount of distortion
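
The relations between period, frequency and velocity, and the step from intensity to decibels, can be written out as follows. This is a sketch under assumptions: the decibel value computed here is the sound intensity level relative to the conventional reference of 10^-12 watts/m^2, which is a physical stand-in for perceived loudness, and the function names are made up.

```python
import math

REFERENCE_INTENSITY = 1e-12  # watts/m^2, conventional threshold of hearing


def frequency(period):
    """f = 1/T."""
    return 1.0 / period


def velocity(wavelength, period):
    """v = w/T = w * f, with w the wavelength."""
    return wavelength * frequency(period)


def intensity_level_db(intensity):
    """Intensity in watts/m^2 expressed in decibels relative to the reference."""
    return 10.0 * math.log10(intensity / REFERENCE_INTENSITY)
```

For example, `intensity_level_db(1e-6)` evaluates to about 60 decibels, since the intensity is a factor of 10^6 above the reference.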

video annotation


  • what are the interesting aspects?
  • how do we represent this information?

video content



  video v, frame f 
  f has associated objects and activities 
  objects and activities have properties
  

property


  property: name = value 
  

object schema


   (fd,fi) -- frame-dependent and frame-independent properties 
  

object instance: (oid,os,ip)

example


frame   objects     frame-dependent properties
1       Jane        has(briefcase), at(path)
-       house       door(closed)
-       briefcase
2       Jane        has(briefcase), at(door)
-       Dennis      at(door)
-       house       door(open)
-       briefcase

frame-independent properties


object      frame-independent properties   value
Jane        age                            35
            height                         170cm
house       address                        ...
            color                          brown
briefcase   color                          black
            size                           40 x 31

activity

  • activity name -- id
  • statements -- role = v

example


   { giver : Person, receiver : Person, item : Object } 
   giver = Jane, receiver = Dennis, item = briefcase 
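
The annotation scheme can be mirrored in a small data model. The sketch below is an illustration only: the class names, the dictionary layout and the activity name `exchange` are assumptions, and only part of the Jane and Dennis example is encoded.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectInstance:
    oid: str
    frame_independent: dict = field(default_factory=dict)
    # frame number -> list of properties that hold in that frame
    frame_dependent: dict = field(default_factory=dict)


@dataclass
class Activity:
    name: str
    roles: dict  # role name -> value, as in "role = v"


jane = ObjectInstance(
    "Jane",
    {"age": 35, "height": "170cm"},
    {1: ["has(briefcase)", "at(path)"], 2: ["has(briefcase)", "at(door)"]},
)
dennis = ObjectInstance("Dennis", {}, {2: ["at(door)"]})
exchange = Activity("exchange",
                    {"giver": "Jane", "receiver": "Dennis", "item": "briefcase"})


def objects_in_frame(objects, frame):
    """Identifiers of all objects that have frame-dependent properties in `frame`."""
    return [o.oid for o in objects if frame in o.frame_dependent]
```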
  

video libraries



  which videos are in the library 
  what constitutes the content of each video
  what is the location of a particular video
  

query language for video libraries


  • segment retrievals -- exchange of briefcase
  • object retrievals -- all people in v:[s,e]
  • activity retrieval -- all activities in v:[s,e]
  • property-based -- find all videos with object oid

VideoSQL



  SELECT -- v:[s,e] 
  FROM -- video:<source><V> 
  WHERE -- term IN funcall 
  

example



  SELECT  vid:[s,e]
  FROM video:VidLib
  WHERE (vid,s,e) IN VideoWithObject(Dennis) AND
  	object IN ObjectsInVideo(vid,s,e) AND
  	object != Dennis AND
  	typeof(object) = Person
  

To improve library access, the Informedia Digital Video Library uses automatic processing to derive descriptors for video. A new extension to the video processing extracts geographic references from these descriptors.

The operational library interface shows the geographic entities addressed in a story, highlighting the regions discussed in the video through a map display synchronized with the video display.

The map can also serve as a query mechanism, allowing users to search the terabyte library for stories taking place in a selected area of interest.

questions


  • what -- content-related
  • when -- position on time-continuum
  • where -- geographic location

More recently, it has been recognized that the process of spatialization -- where a spatial map-like structure is applied to data where no inherent or obvious one exists -- can provide an interpretable structure to other types of data.

atlas of cyberspace


We present a wide range of spatializations that have employed a variety of graphical techniques and visual metaphors so as to provide striking and powerful images that extend from two-dimensional 'maps' to three-dimensional immersive landscapes.

feature grammar



  
  detector song; ## to get the filename
  detector lyrics; ## extracts lyrics
  detector melody; ## extracts melody
  detector check;  ## to walk the tree
  
  atom str name;
  atom str text;
  atom str note;  
  
  midi: song;
  
  song: file lyrics melody check;
  
  file: name;
  
  lyrics: text*;
  melody: note*;
  
  


  event('twinkle',2,time=384, note_on:[chan=2,pitch=72,vol=111]).
  event('twinkle',2,time=768, note_off:[chan=2,pitch=72,vol=100]).
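
Events of this kind can be paired up to recover notes with a start time and a duration. The sketch below assumes the facts have already been read into Python tuples; that tuple layout and the function name `notes` are assumptions, and the second field of each event is carried along without interpretation because the text does not say what it denotes.

```python
# (song, second field, time, kind, data), mirroring the two facts above.
EVENTS = [
    ("twinkle", 2, 384, "note_on", {"chan": 2, "pitch": 72, "vol": 111}),
    ("twinkle", 2, 768, "note_off", {"chan": 2, "pitch": 72, "vol": 100}),
]


def notes(events):
    """Pair each note_on with the next note_off on the same channel and pitch.

    Returns a list of (pitch, start time, duration) in order of completion.
    """
    open_notes, result = {}, []
    for _song, _field2, time, kind, data in sorted(events, key=lambda e: e[2]):
        key = (data["chan"], data["pitch"])
        if kind == "note_on":
            open_notes[key] = time
        elif kind == "note_off" and key in open_notes:
            start = open_notes.pop(key)
            result.append((data["pitch"], start, time - start))
    return result


melody = notes(EVENTS)
```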
  

melody detector



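  /* Detector for the melody node of the grammar: every solution of the
     query melody(X) is added to the token list as a "note" atom.
     _query, query_eval, query_result, putAtom and SUCCESS are provided
     by the surrounding framework and are not shown here. */
  int melodyDetector(tree *pt, list *tks) {
    char buf[1024];
    char *_result;
    void *q = _query;   /* handle to the query engine */
    int idq = 0;

    idq = query_eval(q, "X:melody(X)");
    /* fetch solutions one by one until none is left */
    while ((_result = query_result(q, idq))) {
      putAtom(tks, "note", _result);
    }
    return SUCCESS;
  }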
  

prediction techniques


  • social-based -- dependent on (group) rating of item(s)
  • information-based -- dependent on features of item(s)
  • hybrid methods -- combining predictors

definition(s)


  • rating -- a value representing a user's interest
  • recommendation -- item(s) that might be of interest to the user
  • regret -- a function to measure the accuracy of recommendations

guided tour(s)


  • automated (viewpoint) navigation in virtual space,
  • an animation explaining, for example, the construction of an artwork, or
  • the (narrative) presentation of a sequence of concept nodes.

projects & further reading

As a project, think of implementing musical similarity matching, or developing an application retrieving video fragments using a simple annotation logic.

You may further explore the construction of media repositories, and finding a balance between automatic indexing, content search and meta information.

For further reading I advise you to search for recent research on video analysis, and the online material on search engines.

the artwork

  1. works from  [Design]
  2. faces -- from www.alterfin.org, an interesting site with many surprising interactive toys in flash, javascript and html.
  3. mouth -- Annika Karlson Rixon, entitled A slight Acquaintance, taken from a theme article about the body in art and science, the Volkskrant, 24/03/05.
  4. story -- page from the comic book version of City of Glass,  [Glass], drawn in an almost traditional style.
  5. story -- frame from  [Glass].
  6. story -- frame from  [Glass].
  7. story -- frame from  [Glass].
  8. white on white -- typographical joke.
  9. modern art -- city of light (1968-69), Mario Merz, taken from  [Modern].
  10. modern art -- Marocco (1972), Krijn Griezen, taken from  [Modern].
  11. modern art -- Indestructible Object (1958), Man Ray, Blue, Green, Red I (1964-65), Ellsworth Kelly, Great American Nude (1960), T. Wesselmann, taken from  [Modern].
  12. signs -- sports,  [Signs], p. 272, 273.

effective retrieval requires visual interfaces

information system architecture

learning objectives

After reading this chapter you should be able to discuss the considerations that play a role in developing a multimedia information system, characterize an abstract multimedia data format, give examples of multimedia content queries, define the notion of virtual resources, and discuss the requirements for networked virtual environments.

From a system development perspective, a multimedia information system may be considered as a multimedia database, providing storage and retrieval facilities for media objects. Yet, rather than a solution this presents us with a problem, since there are many options to provide such storage facilities and equally many to support retrieval.

In this chapter, we will study the architectural issues involved in developing multimedia information systems, and we will introduce the notion of media abstraction to provide for a uniform approach to arbitrary media objects.

Finally, we will discuss the additional problems that networked multimedia confront us with.

issues


  • multimedia storage and retrieval -- homegrown, third-party and legacy sources
  • information architecture -- common format, native format, hybrid
  • media abstraction -- unified indexes, query relaxation

content organisation


  • autonomy -- index per media type
  • uniformity -- unified index
  • hybrid -- media indexes + unified index

Principle of Uniformity


... from a semantical point of view the content of a multimedia source is independent of the source itself, so we may use statements as meta data to provide a description of media objects.

  • from a semantical point of view the content of a multimedia source is independent of the source itself.
  • use statements as meta data
  • md(o) -- metadata associated with media object o

tradeoffs


  • metadata can be stored using standard relational and OO structures
  • manipulating metadata is easy
  • feature extraction is (!) straightforward

    is it?


software architecture


  • a database of media objects, supporting
  • operations on media objects, and offering
  • logical views on media objects

information retrieval cycle


  1. specification of the user's information need
  2. translation into query operations
  3. search and retrieval of media objects
  4. ranking according to likelihood or relevance
  5. presentation of results and user feedback
  6. resulting in a possibly modified query

  • despite high interactivity, access is difficult;
  • quick response is and will remain important!

media abstraction


  • state -- smallest chunk of media data
  • feature -- any object in a state
  • attributes -- characteristics of objects
  • feature extraction map -- to identify content
  • relations -- to capture state-dependent information
  • (inter)relations between 'states' or chunks

example -- image database



  states: { pic1.gif,...,picn.gif } 
  features: names of people 
  extraction: find people in pictures 
  relations: left-of, ... 
  

example -- video database



  states:  set of frames 
  features:  persons and objects
  extraction:  gives features per frame 
  relations:  frame-dependent and frame-independent information
  inter-state relation:  specifies sequences of frames
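
A media abstraction can be sketched as a record with one field per component. The class `MediaAbstraction`, its field layout and the two-picture example data are assumptions introduced for illustration; the feature extraction map is represented simply as a dictionary from states to the features found in them.

```python
from dataclasses import dataclass, field


@dataclass
class MediaAbstraction:
    states: set
    features: set
    extraction: dict                                # state -> features found in it
    # relation name -> set of (feature, feature, state) triples
    relations: dict = field(default_factory=dict)


images = MediaAbstraction(
    states={"pic1.gif", "pic2.gif"},
    features={"Jane", "Dennis"},
    extraction={"pic1.gif": {"Jane", "Dennis"}, "pic2.gif": {"Jane"}},
    relations={"left-of": {("Jane", "Dennis", "pic1.gif")}},
)


def states_with(abstraction, feature):
    """All states in which the extraction map finds `feature`."""
    return {s for s, found in abstraction.extraction.items() if feature in found}
```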
  

simple multimedia database


  • a finite set M of media abstractions

structured multimedia database


  • equivalence relations -- to deal with synonymy
  • partial ordering -- to deal with inheritance
  • query relaxation -- to please the user

SMDS -- functions



  Type: object  |->  type 
  ObjectWithFeatures:  f |-> { o |  object o contains  f }  
  ObjectWithFeaturesAndAttributes:  (f,a,v) |-> { o |  o contains f with  a=v }  
  FeaturesInObject:  o |-> { f | o  contains  f }  
  FeaturesAndAttributesInObject:  o |-> { (f,a,v) | o  contains  f  with  a=v }  
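
The five functions can be illustrated over a toy database. The dictionary `DB`, its contents and the Python function names are assumptions made for this sketch; it shows the mappings, not how an SMDS would be implemented.

```python
# Toy database: object -> (type, {feature: {attribute: value}})
DB = {
    "pic1.gif": ("Image", {"Jane": {"age": 35}, "Dennis": {}}),
    "clip1": ("Video", {"Dennis": {}}),
}


def type_of(o):
    """Type: object |-> type."""
    return DB[o][0]


def object_with_features(f):
    """All objects that contain feature f."""
    return {o for o, (_, feats) in DB.items() if f in feats}


def object_with_features_and_attributes(f, a, v):
    """All objects that contain feature f with attribute a equal to v."""
    return {o for o, (_, feats) in DB.items()
            if f in feats and a in feats[f] and feats[f][a] == v}


def features_in_object(o):
    """All features contained in object o."""
    return set(DB[o][1])


def features_and_attributes_in_object(o):
    """All (feature, attribute, value) triples contained in object o."""
    return {(f, a, v) for f, attrs in DB[o][1].items() for a, v in attrs.items()}
```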
  

SMDS-SQL



SELECT -- media entities
  • m -- if m is not a continuous media object
  • m:[i,j] -- m is continuous, i,j integers (segments)
  • m.a -- m is media entity, a is attribute

FROM

  • <media><source><M>

WHERE

  • term IN funcall

example



    SELECT M
    FROM   smds source1 M
    WHERE  Type(M) = Image AND
  	 M IN ObjectWithFeature("Dennis") AND
  	 M IN ObjectWithFeature("Jane") AND
  	 left("Jane","Dennis",M)
  

hybrid representations: HM-SQL


  • express queries in specialized language
  • perform operations (joins) between SMDS and non-SMDS data

differences


  • function calls are annotated with media source
  • queries to non-SMDS data may be embedded

example HM-SQL



   SELECT M
   FROM smds video1, videodb video2
   WHERE M IN smds:ObjectWithFeature("Dennis") AND
         M IN videodb:VideoWithObject("Dennis")
  

digital libraries


Digital libraries are constructed -- collected and organized -- by a community of users. Their functional capabilities support the information needs and uses of this community. Digital libraries are an extension, enhancement and integration of a variety of information institutions as physical places where resources are selected, collected, organized, preserved and accessed in support of a user community.

... federated structures that provide humans both intellectual and physical access to the huge and growing worldwide networks of information encoded in multimedia digital formats.

digital libraries (5S)


  • streams: (content) -- from text to multimedia content
  • structures: (data) -- from database to hypertext networks
  • spaces: (information) -- from vector space to virtual reality
  • scenarios: (procedures) -- from service to stories
  • societies: (stakeholders) -- from authors to libraries


   D-Lib Forum -- www.dlib.org
   Informedia -- www.informedia.cs.cmu.edu
  

networked multimedia


  • real-time transmission of continuous media information (audio, video)
  • substantial volumes of data (despite compression)
  • distribution-oriented -- e.g. audio/video broadcast

network criteria


  • throughput -- bitrates, burstiness
  • transmission delay -- including signal propagation time
  • delay variation -- jitter
  • error rate -- data alteration, loss

  • multicasting and broadcasting capabilities
  • document caching
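
Transmission delay and delay variation can be estimated from packet timestamps. In the sketch below the timestamps are invented, times are in milliseconds, and jitter is taken to be the standard deviation of the per-packet delays; that choice is an assumption, and other definitions of jitter are in use.

```python
from statistics import mean, pstdev


def delays(sent, received):
    """Per-packet transmission delay, given send and receive times."""
    return [r - s for s, r in zip(sent, received)]


def jitter(sent, received):
    """Delay variation, here the population standard deviation of the delays."""
    return pstdev(delays(sent, received))


sent = [0, 20, 40, 60]           # send times in ms (made-up values)
received = [50, 70, 100, 110]    # receive times in ms (made-up values)
average_delay = mean(delays(sent, received))
variation = jitter(sent, received)
```

A stream whose packets all take equally long has zero jitter, however large the delay itself is.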

Quality of Service


Quality of Service is a concept based on the statement that not all applications need the same performance from the network over which they run. Thus, applications may indicate their specific requirements to the network, before they actually start transmitting information data.

QoS requirements


  • hard requirements
  • guidance for optimizing internal resources
  • criteria for acceptance

virtual objects

  • VO = { (O_i,Q_i,C_i) | 1 <= i <= k }

where

  • C_1,...,C_k -- mutually exclusive conditions
  • Q_1,...,Q_k -- queries
  • O_1,...,O_k -- objects
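
One possible reading of this definition, sketched below, is that the condition that holds selects a triple, which then yields either a stored object or the result of its query. This interpretation is an assumption, as are the function name `resolve`, the use of `None` for an object that is not stored, and the bandwidth example.

```python
def resolve(virtual_object, context):
    """Pick the triple whose condition holds and return its object or query result."""
    matches = [(o, q) for o, q, c in virtual_object if c(context)]
    if len(matches) != 1:
        raise ValueError("exactly one condition should hold for a given context")
    obj, query = matches[0]
    # Assumption: fall back to the query when no object is stored.
    return obj if obj is not None else query(context)


# Hypothetical virtual object: full video on a fast link, a thumbnail otherwise.
VO = [
    ("video.mpg",
     lambda ctx: "video.mpg",
     lambda ctx: ctx["bandwidth_kbps"] >= 1000),
    (None,
     lambda ctx: "thumbnail of video.mpg",
     lambda ctx: ctx["bandwidth_kbps"] < 1000),
]

fast = resolve(VO, {"bandwidth_kbps": 2000})
slow = resolve(VO, {"bandwidth_kbps": 56})
```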

networked virtual environments


  • shared sense of space -- room, building, terrain
  • shared sense of presence -- avatar (body and motion)
  • shared sense of time -- real-time interaction and behavior

  • a way to communicate -- by gesture, voice or text
  • a way to share ... -- interaction through objects

challenges


  • network bandwidth -- limited resource
  • heterogeneity -- multiple platforms
  • distributed interaction -- network delays
  • resource management -- real-time interaction and shared objects
  • failure management -- stop, ..., degradation
  • scalability -- wrt. number of participants

manage dynamic shared state

  • the Java Media Framework, and
  • the DLP+X3D platform

Java Media Framework


The Java(TM) Media APIs meet the increasing demand for multimedia in the enterprise by providing a unified, non-proprietary, platform-neutral solution. This set of APIs supports the integration of audio and video clips, animated presentations, 2D fonts, graphics, and images, as well as speech input/output and 3D models. By providing standard players and integrating these supporting technologies, the Java Media APIs enable developers to produce and distribute compelling, media-rich content.

recommender economy


  • cross sale -- users who bought A also bought B
  • up sale -- if you buy A and B together ...

recommender model



  U = user
  I = item
  B = behavior
  R = recommendation
  F = feature
  
  • observations -- U x I x B
  • recommendations -- U x I

  B = [ time = 20sec, rating = r ]
  F = [ artist = rembrandt, topic = portrait ]
  R = [ artist(rembrandt) = r, topic(portrait) = r ]
  

  A = [  p_{1}, p_2 , ... ]
  where p_{k} = [ f_1 = v_1, f_2 = v_2, ... ]
  
with as an example

   A_{nightwatch} = [ artist=rembrandt, topic=group ]
   A_{guernica} = [ artist=picasso, topic=group ]
  

distance metric



       d(x,y) = d(y,x)
       d(x,y) <= d(x,z) + d(z,y)
       d(x,x) = 0
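
For artworks described by feature lists, one simple candidate is the fraction of features on which two items differ. This choice is an assumption, not something the text prescribes; when all items carry the same feature names it is a normalized Hamming distance, which satisfies the three conditions.

```python
def feature_distance(a, b):
    """Fraction of feature names on which two feature dictionaries disagree."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    differing = sum(1 for k in keys if a.get(k) != b.get(k))
    return differing / len(keys)


nightwatch = {"artist": "rembrandt", "topic": "group"}
guernica = {"artist": "picasso", "topic": "group"}
d_example = feature_distance(nightwatch, guernica)
```

The two example artworks share the topic but not the artist, so they differ on one of two features.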
  

dimension(s)


  • positive vs negative
  • individual vs community/collaborative
  • feature-based vs item-based

interpretation(s)


  • neutral interpretation -- use d(s_{n}, a_{k}) < d(s_{n}, s_{n+1} )
  • positive interpretation -- increase w(feature(a_{k}))
  • negative interpretation -- decrease w(feature(s_{n+1}))

projects & further reading

As a project, you may implement a multi-player game in which you may exchange pictures and videos, for example pictures and videos of celebrities.

Further, you may explore the development of a data format for text, images and video with appropriate presentation parameters, including positioning on the screen and intermediate transitions.

For further reading you may study information system architecture patterns, and explore the technical issues of constructing server-based advanced multimedia applications in  [Fundamentals].

the artwork

  1. examples of dutch design, from  [Flat].
  2. idem.
  3. screenshots -- from splinter cell: chaos theory, taken from Veronica/Gammo, a television program about games.
  4. screenshots -- respectively Sekken 5, Sims 2, and Super Monkey Ball, taken from insidegamer.nl.
  5. screenshots -- from Unreal Tournament, see section 7.3.
  6. idem.
  7. idem.
  8. resonance -- exhibition and performances, Montevideo, april 2005.
  9. CHIP -- property diagram connecting users.
  10. signs -- sports,  [Signs], p. 274, 275.