2002 Annual

Thursday, 17 January 2002: 2:29 PM
Object-oriented handling of numerical data for scientific analysis and visualization -- basic idea and implementation for Ruby
Takeshi Horinouchi, Kyoto University, Kyoto, Japan; and N. Kawanabe
The kinds and volume of data that atmospheric scientists deal with have been increasing rapidly as global research cooperation and computer power grow. To manage this situation, scientists need software or a programming library with which different kinds of data can be treated efficiently in a consolidated way. This consolidation may be achieved by using an object-oriented approach that separates the data from the accessors to them. We propose a framework to realize this and implement it in the object-oriented language Ruby.

The numerical data of physical quantities that we handle are typically gridded, whether regularly or not. A first step toward handling the data concisely is to combine them with their grid values and other information, such as units, to form an "object". This is the way of organizing data that file formats such as NetCDF and HDF4 adopt to make the data self-descriptive. It then becomes possible to devise abstract operations on the data as physical quantities rather than just as numerical arrays. An example of such an operation is slicing multi-dimensional data in terms of physical coordinate values. Organizing data into an object also serves visualization, since axes and titles can be drawn automatically from the information stored in the object.
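
As a rough illustration of the idea in the preceding paragraph, the following Ruby sketch bundles a two-dimensional array with its axes and units and slices it by physical coordinate values. It is only a sketch under stated assumptions: the names Axis, GriddedField, cut, and title, as well as the sample values, are hypothetical and are not taken from the library described in this abstract.

```ruby
# Hypothetical sketch: these class and method names are illustrative
# assumptions, not the API of the library described in the abstract.

# One coordinate axis: a name, its units, and the (possibly irregular) grid points.
Axis = Struct.new(:name, :units, :coords) do
  # Indices of grid points whose coordinate value lies within the given range.
  def indices_within(range)
    coords.each_index.select { |i| range.cover?(coords[i]) }
  end
end

# A physical quantity on a 2-D grid, bundled with its axes and units.
class GriddedField
  attr_reader :name, :units, :axes, :data

  # data is an array of rows: data[i][j] sits at
  # (axes[0].coords[i], axes[1].coords[j]).
  def initialize(name, units, axes, data)
    @name, @units, @axes, @data = name, units, axes, data
  end

  # Slice by physical coordinate values, e.g. cut("lat" => 0..30).
  # Axes that are not mentioned are kept whole. Returns a new GriddedField.
  def cut(ranges)
    rows, cols = axes.map do |ax|
      ranges.key?(ax.name) ? ax.indices_within(ranges[ax.name]) : ax.coords.each_index.to_a
    end
    new_axes = axes.zip([rows, cols]).map do |ax, idx|
      Axis.new(ax.name, ax.units, ax.coords.values_at(*idx))
    end
    new_data = rows.map { |i| data[i].values_at(*cols) }
    GriddedField.new(name, units, new_axes, new_data)
  end

  # A plot title can be built from the stored metadata alone.
  def title
    "#{name} [#{units}]"
  end
end

# Made-up sample values, for illustration only.
lat = Axis.new("lat", "degrees_north", [-30.0, 0.0, 30.0, 60.0])
lon = Axis.new("lon", "degrees_east", [0.0, 90.0, 180.0])
temp = GriddedField.new("temperature", "K", [lat, lon],
                        [[290, 291, 292], [300, 301, 302],
                         [295, 296, 297], [280, 281, 282]])

# Select by latitude and longitude values rather than by array indices.
tropics = temp.cut("lat" => 0..30, "lon" => 90..180)
```

Here the caller selects by latitude and longitude values, and the sliced object still carries the axes and units needed to label a plot.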

By using an object-oriented language, we can hide the internal structure of a data object from users and have them access the data only through abstract operations. In many cases the accessors can be the same whether the actual data reside entirely in computer memory or are kept in a file (in which case the data object consists of file handlers). Self-descriptive file formats such as those mentioned above can easily be adapted to this framework, and even non-self-descriptive formats can be made to conform to it if the user provides the necessary ancillary information.
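
The point about identical accessors for in-memory and file-resident data can be sketched as follows. This is a minimal illustration, not the library's design: the class names MemoryStore, FileStore, and DataObject are hypothetical, and the one-value-per-line text file is an assumed stand-in for a real format such as NetCDF.

```ruby
require "tempfile"

# Hypothetical sketch: class names and the one-value-per-line text format
# are illustrative assumptions, not part of the library described above.

# Values held entirely in memory.
class MemoryStore
  def initialize(values)
    @values = values
  end

  def length
    @values.length
  end

  def [](index)
    @values[index]
  end
end

# Values kept in a file; only the path is held, and the file is read on access.
class FileStore
  def initialize(path)
    @path = path
  end

  def length
    File.foreach(@path).count
  end

  def [](index)
    found = File.foreach(@path).each_with_index.find { |_, i| i == index }
    found && Float(found.first.strip)
  end
end

# A data object that delegates to its store; callers never see which kind it is.
class DataObject
  attr_reader :units

  def initialize(store, units)
    @store, @units = store, units
  end

  def [](index)
    @store[index]
  end

  # Written only in terms of the abstract accessors, so it works for both stores.
  def mean
    n = @store.length
    (0...n).sum { |i| @store[i] } / n
  end
end

# The same three made-up values, once in memory and once in a temporary file.
file = Tempfile.new("field")
file.write("1.0\n2.0\n3.0\n")
file.flush

in_memory = DataObject.new(MemoryStore.new([1.0, 2.0, 3.0]), "K")
on_disk   = DataObject.new(FileStore.new(file.path), "K")
```

Both objects answer the same calls, so code written against DataObject does not depend on where the values live. A real implementation would read the file far more efficiently than this line-by-line scan.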

To realize such data handling, we have been developing a class library for the object-oriented language Ruby. Since a broad range of formats can be covered and the accessors to data will be consolidated as much as possible, users of the library would naturally develop applications that others can use easily. The library is therefore expected to become a basis on which data-handling applications are developed and shared in research communities.

Ruby is perhaps the best object-oriented scripting language to date and is freely available from http://www.ruby-lang.org. Since it can be used interactively, it is suitable for interactive data analysis, and interactive trial and error can be organized smoothly into a program if needed. Since Ruby offers strong network support, we envision extending our library to handle remotely stored data. The original distribution of Ruby, however, has neither the efficient multi-dimensional numerical arrays nor the scientific visualization tools needed by the library. The companion paper by Kawanabe et al. presents our development of such infrastructure.

Supplementary URL: http://www.gfd-dennou.org/arch/ruby/