In order to be ready for future increases in data demand, we are investigating alternative architectures for data distribution. In particular, we are looking at Usenet, the Internet news network, in which well over 22,000 sites participate. A typical large Usenet site receives over 14,000,000 articles in a day for a total of over 282 gigabytes of incoming traffic alone. Usenet is a massive, unmanaged, distributed, heterogeneous network, and yet most posted articles arrive at their destinations quickly. Binary messages constitute the majority of traffic volume.
Using network news software addresses several known limitations of the current LDM. News is organized into a virtually limitless number of hierarchically structured newsgroups, which can serve as data feed types, mitigating the LDM's current limit on feed types. News articles flow through the network via a "flooding algorithm". This approach transmits articles redundantly, which allows an article to reach a site by the fastest available route and removes the need to develop and maintain network topologies by hand. The algorithm is also robust in the face of site failure, so failover topologies need not be provided separately. And, with sufficient disk space, articles (data products) can be stored for as long as necessary to meet the needs of downstream sites.
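The flooding behavior described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular news server is implemented: the `flood` function, the `PEERS` topology, and the site names are all hypothetical, and each "hop" stands in for one article transfer between neighboring sites.

```python
from collections import deque

def flood(peers, origin, down=frozenset()):
    """Simulate flooding one article through a peer graph (hypothetical model).

    peers: dict mapping a site name to the list of sites it feeds.
    down:  sites that are unreachable and therefore never offered the article.
    Returns (arrived, offers): the hop count at which each site first received
    the article, and the total number of offers made, duplicates included.
    """
    arrived = {origin: 0}   # acts as the per-site record of articles already held
    queue = deque([origin])
    offers = 0
    while queue:
        site = queue.popleft()
        for neighbour in peers.get(site, []):
            if neighbour in down:
                continue            # failed site: skipped, other routes still apply
            offers += 1
            if neighbour in arrived:
                continue            # duplicate offer: neighbour already has the article
            arrived[neighbour] = arrived[site] + 1
            queue.append(neighbour)
    return arrived, offers

# Hypothetical five-site topology with two independent routes from A to D.
PEERS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

if __name__ == "__main__":
    print(flood(PEERS, "A"))
    print(flood(PEERS, "A", down={"B"}))
```

In this toy topology, taking site B down does not delay delivery to E, because the route through C is just as short. The cost of that robustness is the redundant offers that the receiving sites must refuse.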
Network news service is based upon the Network News Transfer Protocol (NNTP). Several open source implementations of this protocol exist. We are looking at Internet News (INN), an open source package provided by the Internet Software Consortium. This software has been evolving for over 20 years, benefiting from the expertise of many knowledgeable people. INN already provides nearly all of the functionality of the current LDM, plus several additional desirable features.
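As a rough illustration of how an NNTP peer avoids storing the same article twice, the sketch below models the two-stage IHAVE exchange in memory. The response codes (335, 435, 235, 437) are those defined for IHAVE in the NNTP specification; everything else is an assumption made for illustration: the `HistoryPeer` class, the `offer` helper, and the example Message-ID are hypothetical, no sockets are involved, and this is not INN's actual implementation.

```python
class HistoryPeer:
    """Hypothetical in-memory stand-in for a receiving news server."""

    def __init__(self):
        self.history = {}   # Message-ID -> article body already accepted

    def ihave(self, message_id):
        # First stage: 335 asks the sender to transmit, 435 declines a duplicate.
        return 435 if message_id in self.history else 335

    def receive(self, message_id, body):
        # Second stage: 235 confirms the transfer, 437 rejects it without retry.
        if message_id in self.history:
            return 437
        self.history[message_id] = body
        return 235


def offer(peer, message_id, body):
    """Offer one article to a peer and return the final response code."""
    code = peer.ihave(message_id)
    if code == 335:
        code = peer.receive(message_id, body)
    return code


if __name__ == "__main__":
    peer = HistoryPeer()
    mid = "<product-0001@example.invalid>"   # hypothetical Message-ID
    print(offer(peer, mid, b"data product bytes"))
    print(offer(peer, mid, b"data product bytes"))
```

Because the duplicate is refused on the Message-ID alone, a redundant offer costs one short exchange rather than a second transfer of the data product.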
This talk will describe the network news and INN features relevant to Internet data distribution and discuss the trade-offs of using a news server approach for this purpose. Experimental results will be presented. Finally, the talk will describe the role of data distribution in Unidata's effort to build Thematic Real-time Environmental Distributed Data Services (THREDDS).