Integrity-critical databases, such as financial information used in high-value decisions, are frequently published over the Internet. Publishers of such data must satisfy the integrity, authenticity, and non-repudiation requirements of end clients. Providing this protection over public data networks is costly, partly because building and running secure systems is hard. In practice, large systems cannot be verified to be secure and are frequently penetrated. The negative consequences of a system intrusion at the data publisher can be severe. The problem is further complicated by data and server replication to satisfy availability and scalability requirements.
We aim to reduce the trust required of the publisher of large, infrequently updated databases. To do this, we separate the roles of data owner and data publisher. With a few trusted digital signatures on the part of the owner, an untrusted publisher can use techniques based on Merkle hash trees to provide authenticity and non-repudiation of the answer to a database query. We do not require a key to be held in an online system, thus reducing the impact of system penetrations. By allowing untrusted publishers, our solution moves towards more scalable publication of large databases over the Internet. The general concept underlying the authentic data publication scheme is illustrated in the figure below.
Note that data represented in different data models require different "forms" of summary information and verification objects, each computed over a different data representation. Currently, our approach supports relational data and XML data for limited types of queries. A first prototype is under development for performance evaluation. A general data model for the approach has been developed; it provides a general framework for the efficient computation of verification objects under different settings.
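To make the terms "summary information" and "verification object" concrete, the following is a minimal Python sketch of the simplest case: proving that a single record belongs to a database summarized by a Merkle hash tree. It is an illustration under stated assumptions, not the scheme's actual implementation. The function names (leaf_hash, node_hash, build_levels, make_proof, verify), the choice of SHA-256, the prefix bytes, and the handling of odd-sized levels are all hypothetical. The owner's digital signature over the root digest is outside the sketch: the client is assumed to have already obtained the root and checked the owner's signature on it.

```python
import hashlib

def leaf_hash(record: bytes) -> bytes:
    # Prefix bytes keep leaf digests and inner-node digests distinct
    # (an assumption of this sketch, not taken from the source).
    return hashlib.sha256(b"\x00" + record).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def build_levels(records):
    """Owner side: build the tree bottom-up; levels[-1][0] is the root
    digest, i.e. the summary information the owner would sign."""
    if not records:
        raise ValueError("at least one record is required")
    levels = [[leaf_hash(r) for r in records]]
    while len(levels[-1]) > 1:
        cur, nxt = levels[-1], []
        for i in range(0, len(cur), 2):
            if i + 1 < len(cur):
                nxt.append(node_hash(cur[i], cur[i + 1]))
            else:
                nxt.append(cur[i])  # unpaired node is promoted unchanged
        levels.append(nxt)
    return levels

def make_proof(levels, index):
    """Untrusted publisher side: collect the sibling digests on the path
    from leaf `index` to the root (the verification object)."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            # Record whether the sibling sits to the left of the path node.
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(record, proof, trusted_root):
    """Client side: recompute the root from the record and the proof,
    then compare it with the root digest signed by the owner."""
    digest = leaf_hash(record)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            digest = node_hash(sibling, digest)
        else:
            digest = node_hash(digest, sibling)
    return digest == trusted_root

# Hypothetical example data.
records = [b"acct=1;bal=100", b"acct=2;bal=250", b"acct=3;bal=75",
           b"acct=4;bal=10", b"acct=5;bal=990"]
levels = build_levels(records)
root = levels[-1][0]           # the owner signs this value offline
proof = make_proof(levels, 4)  # the publisher answers a query for record 4
print(verify(records[4], proof, root))         # True
print(verify(b"acct=5;bal=991", proof, root))  # False
```

The publisher never holds a signing key in this sketch: it only stores digests and returns sibling paths, so a compromised publisher cannot forge a record that verifies against the owner's signed root. The sketch covers membership of a single record only; showing that the answer to a query is also complete, for example that no row in a range was omitted, requires the richer verification objects discussed above.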