DATABASE RESEARCH GROUP

College of Computing
Georgia Institute of Technology
Atlanta, Georgia 30332-0280
(404) 894-8358

The Database Research Group in the College of Computing at Georgia Tech conducts research on all aspects of database problems that arise in the development of large-scale applications. The group is headed by Prof. Sham Navathe and includes Professors Ed Omiecinski and Leo Mark. It has its own laboratory, equipped with about a dozen Sun SPARC workstations (including a SPARC 10 and a SPARC 20), several Windows NT workstations, and some Macintosh machines. The database server hosts several relational and object-oriented database management systems, including ORACLE, SYBASE, INFORMIX Universal Server (to be received), Objectstore, Versant, and Ode. Graduate students are constantly developing experimental systems and tools. The group currently has about five doctoral students, one post-doctoral research associate (to arrive), and several master's students.

The following are some of the ongoing projects:

Mobile Intermittently Connected Databases: This project (in conjunction with Synchrologic, Inc. of Atlanta) is investigating data consistency, update propagation, transaction processing, and conflict resolution for a client-server architecture in which the clients are mobile and only intermittently connected to the server. Multicasting protocols are being evaluated for improving efficiency so that the architecture can scale up to thousands of clients in typical applications such as sales force automation.
Data Mining Algorithms: In this ongoing project we have developed efficient algorithms for discovering association rules, which capture interesting relationships in large existing volumes of raw transaction data. Preliminary work has also been done on detecting negative associations (the lack of a relationship) among certain types of data. The work is applicable to the large transaction volumes that occur in supermarkets, banks, insurance companies, telephone services, etc. The algorithms are currently being applied to mining associations and similarities among images.
Visualization and User Interface Construction for Large Document Databases: We have implemented a prototype system to improve the performance of users who wish to search a large document space efficiently without using any keywords. The interface uses techniques for informing the user visually about the relevance of highly ranked documents vis-à-vis the request. A thesaurus is employed to allow the user to select related and unrelated terms. Positive and negative feedback windows let the user refine the original request by selecting or rejecting documents or parts of documents, as well as thesaurus words. Extensive user studies were conducted to establish the usefulness of this approach; they have shown that visual interfaces improve user performance in document retrieval.
Mitochondrial Genome Database: As our contribution to the human genome initiative, we are building a genome database containing information specific to the mitochondrial chromosome, which has a ring-like structure with approximately 16,500 base pairs. In a joint effort with Prof. Doug Wallace and the Molecular Genetics department at Emory, we are creating a web site (http://www.gen.emory.edu/mitomap.html) where queries can be posed about genes, genetic defects related to disease, gene-gene interactions, and functional information about genes. The site will help scientists worldwide obtain and contribute human mitochondrial genome information in one place.
Database with a Time Dimension: If the transaction time of each update is recorded and each new transaction is appended to the database, we call the result a "transaction time" database. This project is examining a variety of issues for such databases, including efficient storage and query processing with incremental algorithms.
Security Modeling and Query Processing in Heterogeneous Databases: Many schemes for security enforcement in databases have been proposed, including Discretionary Access Control (as in SQL authorization schemes) and Mandatory Access Control (MAC), used in government, with security levels assigned to users. In this project we have defined a common representation of security models, both at the conceptual level (in an ER-like model) and at a formal level, that will help in accessing data across heterogeneous schemes. Methods for query processing and indexing of these databases are being investigated.
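
As an illustration of the association rules mentioned under Data Mining Algorithms, the following is a minimal sketch of the two standard measures, support and confidence, applied to rules between pairs of items. It is not the group's algorithm: the toy transactions, the function names, and the thresholds are all hypothetical, and the brute-force enumeration shown here would not scale to the transaction volumes described above.

```python
from itertools import combinations

# Hypothetical toy market-basket data (not taken from the project).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the fraction that
    also contain consequent."""
    base = count(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = count(set(antecedent) | set(consequent), transactions)
    return joint / base


def pair_rules(transactions, min_support, min_confidence):
    """Brute-force rules x -> y between single items (illustration only)."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Skip pairs that do not occur together often enough.
        if support({a, b}, transactions) < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for x, y in ((a, b), (b, a)):
            conf = confidence({x}, {y}, transactions)
            if conf >= min_confidence:
                rules.append((x, y, conf))
    return rules


if __name__ == "__main__":
    for x, y, conf in pair_rules(transactions, min_support=0.5, min_confidence=0.7):
        print(f"{x} -> {y} (confidence {conf:.2f})")
```

In this sketch a rule "x -> y" is reported only when the pair occurs in at least min_support of all transactions and y appears in at least min_confidence of the transactions that contain x. A negative association would correspond, roughly, to a pair whose joint support is much lower than the individual supports would suggest.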

Besides the above, some of the recently completed projects include:

HIPED - Heterogeneous Intelligent Processing for Engineering Design - in this project, supported by the DARPA Intelligent Integration of Information (I3) program, a front-end design assistant system called Interactive Kritik was integrated with back-end access to a variety of heterogeneous databases.

Parallel DB Reorganization - examined the problem of reorganizing the allocation of data to disks in a shared-nothing parallel database environment; also considered efficient index reorganization.

Index Construction and Utilization for Query Processing in Document Databases - considered the problem of constructing an efficient indexing scheme in a document database and its utilization in parallel query processing.

Organization and Performance Improvement of Video Server Databases - considered different partitioning and merging techniques for combining data from multiple video streams and evaluated them for relative performance improvement.

Hypermedia Modeling - considered the problem of browsing and query processing in databases that are equivalent to a network of interconnected hypermedia nodes. An algebra for the model and algorithms for efficient evaluation of constraints were developed. The work has application to modeling information on the web.

Automatic Metadata Management - considered the problem of maintaining a metadata repository of constraints when database instances are updated, so that subsequent query processing uses the updated constraints. Detailed data structures were proposed to capture constraints in a graph, along with algorithms that use this semantic constraint graph for query processing.

We are very keen to undertake collaborative projects with industry and government agencies in which we can apply our research strengths to meaningful real-life application problems.

For further information, contact:

1. Prof. Shamkant B. Navathe
Head, Database Research Group
sham@cc.gatech.edu

2. Gwen Baker
Administrative Assistant
gwen@cc.gatech.edu
phone: (404) 894-8358
fax: (404) 894-9442

http://www.cc.gatech.edu/computing/Database/database.html