SuperThes

In late 2000, a Memorandum of Understanding (MoU) was concluded between EKOLab, Umweltbundesamt Berlin, Umweltbundesamt Vienna and the Technisches Bureau Hermann Stallbaumer (TBHS). The MoU was recently renewed for the period 2004-2006.

General information about SuperThes

SuperThes is based firstly on the experience gained working with the software package THESmain/THESshow and secondly on the specifications of the partners. Major design goals for the new software are:
Description of the Thesaurus Maintenance Software

SuperThes is used for visualisation and maintenance of thesaurus data according to DIN 1462 / ISO 2788 and DIN / ISO 5964. The application has been developed in Delphi, a Pascal dialect by Borland/Inprise. The programming language offers true object-oriented programming support as well as the stability common to Pascal programs. SuperThes will run by default on all modern 32-bit Microsoft operating systems as a client/server application. All persistent data will be kept within the relational database system InterBase. The SuperThes application has an English user interface. Language codes follow
ISO standard 639-1.

Figure 1: Versatile configurable data windows in Unicode

Thesaurus Specification

The application provides the user interface and functionality to create and maintain monolingual and multilingual thesauri. The structural functionality, as well as the terminology used, conforms to ISO standards 2788 and 5964. The user-definable set of hierarchical rules is a superset of ISO standards 2788 and 5964. Other rules, such as the minimum requirements for field entry, are also user definable. Adherence to these rules is enforced automatically by the application itself. The main rules include the enforcement of reciprocity, the check for arithmetic loops, the check for duplicate entries and the enforcement of uniqueness of concept, to name just the more important ones.

Besides the main thesaurus structure, a SuperThes thesaurus may contain other user-definable tables, which may be used to describe the main thesaurus in more detail or to hold attached data. Examples are the themes table in EARTh or a table containing geographic names.

Thesaurus data are kept in standard database files, which allows thesauri to be exchanged, created and deleted on a simple file basis. SuperThes allows an unlimited number of thesauri to be kept on the same computer system, including different thesaurus systems with different data definitions. More than one instance of the program may be used simultaneously on the same computer, showing different data contents, e.g. EARTh together with the UDK-Thesaurus.

All thesaurus data may contain "translations" (linguistic equivalents) for up to 30 languages, using all character sets and glyphs available in the installation system. Input method editors are supported.
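The rule enforcement described in the Thesaurus Specification can be illustrated with a small sketch. It is written in Python for brevity and is not taken from SuperThes, which is a Delphi/InterBase application. The class `Thesaurus` and its methods are hypothetical names, only broader/narrower links are modelled, and the loop check is read here as "a term must never become its own ancestor", which is an assumption about what the rule covers.

```python
from collections import defaultdict


class Thesaurus:
    """Toy in-memory thesaurus (hypothetical); holds only BT/NT links."""

    def __init__(self):
        self.terms = set()
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)

    def add_term(self, term):
        # Duplicate check: reject a term that is already present.
        if term in self.terms:
            raise ValueError(f"duplicate entry: {term!r}")
        self.terms.add(term)

    def _reaches(self, start, target):
        # Depth-first walk up the BT links from `start`, looking for `target`.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.broader[node])
        return False

    def add_broader(self, term, broader_term):
        for name in (term, broader_term):
            if name not in self.terms:
                raise KeyError(f"unknown term: {name!r}")
        # Loop check (assumed meaning): the new link must not make
        # `term` an ancestor of itself.
        if self._reaches(broader_term, term):
            raise ValueError(f"loop: {term!r} is already above {broader_term!r}")
        # Reciprocity: the BT link and its inverse NT link are written together.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)
```

With the terms "pollution", "air pollution" and "smog" linked top-down, a later call `add_broader("pollution", "smog")` is rejected, because it would close a loop. Writing both directions in one method is what keeps the relations reciprocal without a separate repair pass.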
Figure 2: Multilingual Support, ISO 639-1 compliant

Data Types

In SuperThes, a thesaurus may be defined according to the required data model. Only three fields are required for any model: the first is the record neutral identifier, the second is a field for the status notation controlling the relations, and the third is the history field, where information on every modification is stored. All further fields are definable. Besides the standard data types, advanced data types such as images or formatted text (text forms) are available. SuperThes will contain predefined templates for the UDK-Thesaurus and the EARTh Thesaurus.
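A minimal sketch of such a data model, assuming a Python representation rather than the InterBase tables SuperThes actually uses. The class name, field names and the status value used in the example are hypothetical. Only the three required fields, the user-definable remainder and the limit of 30 languages come from the description above. The language check tests only the two-letter shape of a code, not membership in ISO 639-1.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class ThesaurusRecord:
    # The three fields required for any model.
    record_id: int    # record neutral identifier
    status: str       # status notation controlling the relations
    history: list[str] = field(default_factory=list)  # modification log
    # All further fields are user definable; a dict stands in for them here.
    custom: dict[str, Any] = field(default_factory=dict)
    # Linguistic equivalents keyed by two-letter language code.
    translations: dict[str, str] = field(default_factory=dict)

    MAX_LANGUAGES = 30  # class constant (not a field): limit quoted in the text

    def _log(self, message: str) -> None:
        # Every modification leaves a timestamped entry in the history field.
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.history.append(f"{stamp} {message}")

    def set_translation(self, lang: str, label: str) -> None:
        # Shape check only: two lowercase ASCII letters, as in ISO 639-1 codes.
        if not (len(lang) == 2 and lang.isascii()
                and lang.isalpha() and lang.islower()):
            raise ValueError(f"not a two-letter language code: {lang!r}")
        if (lang not in self.translations
                and len(self.translations) >= self.MAX_LANGUAGES):
            raise ValueError("language limit reached")
        self.translations[lang] = label
        self._log(f"translation[{lang}] set to {label!r}")


record = ThesaurusRecord(record_id=1, status="descriptor")  # status is made up
record.set_translation("en", "soil")
record.set_translation("it", "suolo")
```

Routing every change through a method that also appends to `history` is one simple way to guarantee that the history field is never bypassed.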
Figure 3: Build your own data structure. User definable tables and data fields
Figure 4: Various editors for text, forms, images etc.

User Interface

The program is designed to create and edit thesauri using a graphical user interface that follows the rules defined for Microsoft Windows. The program functions are hierarchically ordered. Each window provides context-sensitive help. State-of-the-art features, such as drag and drop and right-click context menus, will be used where useful. All functions are available via mouse control as well as via keyboard control. Similar functions operate in a similar way in all windows. The following key functions are implemented:
Figure 5: Server management utilities included

Data Exchange

Data exchange may be performed in several ways:
SuperThesVIZ
SuperThesVIZ is a web-based tool which allows access to the SuperThes databases via the Internet. The main goal of the software development is to match the convenience of the user interface of the MS Windows-based application. SuperThesVIZ is platform independent and based on Java servlet technology.

Requirements for the SuperThes visualisation module

Requirement that differs from a standard web application
Design objectives:
Applied technologies
Features
Configuration of thesaurus presentation
Technical Requirements

Client side:
Web-Server:
Database-Server:
Concord

Concord has been conceived and developed in the framework of Èulogos’ language engineering environment. From a terminological and linguistic point of view, Concord is language-independent and can be applied to any language written in Latin characters. Special characters are fully supported. Two interface languages are available: Italian and English. An easy-to-use interface has been developed from the terminologist’s point of view. The linguistic engine of Concord manages atoms, stop words and complex terms according to a self-learning logic, which allows the system to apply each learned element and structure to new terms. For each term, Concord proposes a pre-assigned representation of its elements. The user reviews the proposal and can modify each element. All terms and elements are browsable as concordances. The concordance method of Concord is derived from the IntraText Digital Library.

The working flow

A Concord project can be seen as divided into three elaboration steps:
The first two steps require direct intervention by the user; the last one is performed by the software. The Concord interface is divided into tabs corresponding to these steps. In the first phase (fig. 1) the user reviews each term, splitting it into its different parts corresponding to:
Figure 1 - The first tab – term analysis

A preliminary subdivision is proposed by the software, based on the results of earlier analyses. In this subdivision, the user can set:
A full text search tool makes it possible to locate any term in the term list by searching for the whole term or a part of it.

The second phase: analysis of atoms

The second phase allows the user to review the terms starting from the atoms in an atom-to-term interface. Reviewing atoms includes the definition of sub-atoms, i.e. prefixes and suffixes.
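The atom-to-term view and the full text search can be pictured with a short sketch. This is an illustration in Python, not Concord's own code. It assumes that the first phase has already produced, for each term, the list of atoms validated by the user. The function names and the sample terms are hypothetical.

```python
from collections import defaultdict


def build_atom_index(term_atoms):
    """Map each atom to the sorted list of terms that contain it.

    `term_atoms` maps a term to the atoms validated for it in phase one.
    """
    index = defaultdict(set)
    for term, atoms in term_atoms.items():
        for atom in atoms:
            index[atom].add(term)
    return {atom: sorted(terms) for atom, terms in sorted(index.items())}


def find_terms(term_atoms, query):
    """Return the terms that equal `query` or contain it as a part."""
    needle = query.lower()
    return sorted(term for term in term_atoms if needle in term.lower())


# Made-up sample data, standing in for the result of the first phase.
validated = {
    "air pollution": ["air", "pollution"],
    "water pollution": ["water", "pollution"],
    "air quality": ["air", "quality"],
}

atom_index = build_atom_index(validated)
matches = find_terms(validated, "qual")
```

Here `atom_index["pollution"]` lists both pollution terms, which is the kind of atom-to-term navigation the second tab offers, and `matches` contains only "air quality".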
Figure 2 - The second tab – atom analysis

In the left window, the complete list of the atoms validated during the first phase is presented. In the right window, all the elements that will be used for the validation of sub-atoms are presented. In the lower part of the window, the bar for the navigation and editing of the table of atoms is displayed. The navigation bar has 10 buttons. With this bar it is possible to navigate through the table, insert or delete records, edit, confirm and undo an operation, and update the table.
Figure 3 - The partial index

It is now possible, using the Concord symbolism, to make changes to the atom structure and divide it into sub-atoms. A sub-atom can be a prefix or a suffix.
Figure 4 - The result of the subdivision of “chlorobenzene” into sub-atoms

When a term contains not only a prefix and a suffix but also a final part that has to be ignored, as in “bioacidic”, the symbol “.” is used (“bio<>acid.ic”). The symbols “<” and “>” can also be used in combination, as in “agrifoodstuff”; the resulting string is “agri<>food<stuff”, where “agri” is a prefix, “food” is a suffix of “agri” and a prefix of “stuff”, while “stuff” is not considered.

The third phase: the index

Once phase 2 is completed, the next step is to generate the final index and export the results in text format. It is sufficient to click on “File, export” and choose the name of the file to be generated.

Results, their use and future development

Concord can easily be applied to thesauri, dictionaries and any other lexical/terminological content, since it works with standard databases and allows parametric configuration. Concord is also expected to be applied to the microthesauri on which EKOLab is currently working, such as the GIS and Remote Sensing micro-thesaurus and the SnowTerm project. Another point under development is the integration, at the level of tables, between Concord and other software such as SuperThes. Once implemented, this will make it possible to work on the same database, storing data and creating internal links.
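As an illustration of the sub-atom symbolism described for the second phase, the following Python sketch splits an annotated atom into its segments, the markers between them and the ignored final part. It is not Concord's parser: the function name is hypothetical, and the sketch only tokenizes the strings quoted above without assigning prefix or suffix roles, since the text does not define those roles for every marker combination.

```python
import re

# Markers quoted in the text: "<>", "<" and ">"; "." introduces the ignored tail.
_MARKERS = ("<>", "<", ">")
_SPLITTER = re.compile(r"(<>|<|>)")


def split_notation(annotated):
    """Return (segments, markers, ignored_tail) for an annotated atom."""
    # Everything after the first "." is the part to be ignored.
    head, _, tail = annotated.partition(".")
    # The capturing group makes re.split keep the markers in the result.
    parts = [p for p in _SPLITTER.split(head) if p]
    segments = [p for p in parts if p not in _MARKERS]
    markers = [p for p in parts if p in _MARKERS]
    return segments, markers, (tail or None)


def plain_form(annotated):
    """Rebuild the unannotated term by dropping all markers."""
    segments, _, tail = split_notation(annotated)
    return "".join(segments) + (tail or "")


bio = split_notation("bio<>acid.ic")
agri = split_notation("agri<>food<stuff")
```

Keeping the markers in a separate list, in order, leaves the interpretation of each marker to a later step, so the same tokenizer can serve whatever role assignment the tool actually applies.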