Event Title:

Sampling on the fly from massive data

Speaker:

Ravi Kannan, William K. Lanman Professor of Computer Science and Applied Mathematics

Affiliation:

Yale University, USA

Abstract/Brief Description:

Many modern problems involve massive amounts of data which may be stored
on disk, but will not fit into Random Access Memory (RAM).

A natural approach to these problems is to draw a random sample of the
data which can be stored in RAM and then have algorithms process the
sample. In the last decade, several such algorithms have been developed
for a variety of combinatorial and linear algebra problems in theoretical
computer science, with applications to many areas, including
information retrieval, principal component analysis, clustering, and
databases. The central task is to prove that the answers to
"random sub-problems" give us a good estimate of the answers to the whole
problem.

After touching upon the abstraction of these problems from applications
and the data processing model, the talk will discuss the broad
mathematical techniques involved in the algorithms.

Subject Area:

Mathematics

Date:

Monday, August 2, 2004

Time:

4:00pm

Venue:

Lecture Hall I, Mathematics Department, IISc, Bangalore-12