Glossary

ACRONYMS

AHSS: Arts, Humanities and Social Sciences

ARK: Archival Resource Key, a scheme for the persistent identification of information objects

CH: Cultural Heritage

CIDOC-CRM: CIDOC Conceptual Reference Model

CPU: Central Processing Unit

CSV: Comma-Separated Values

DC: Dublin Core

DC-Net: Digital Cultural Heritage Network, the ERA-NET project supported by the European Commission in the frame of the FP7 eInfrastructures programme

DCH: Digital Cultural Heritage

DOI: Digital Object Identifier, a system for identifying content objects in the digital environment

DoW: Description of work

DRM: Digital Rights Management

EC: European Commission

EDM: Europeana Data Model

EFG: European Film Gateway

ERA-NET: European Research Area Network, a type of project supported by the European Commission in the frame of FP7 to support joint programming with Member States in strategic research areas and to contribute to the establishment of the European Research Area.

ESE: Europeana Semantic Elements

FOAF: Friend of a Friend

FP7: The seventh Framework Programme for research and technological development of the European Commission

GEMET: General Multilingual Environmental Thesaurus

GLAM: Galleries, libraries, archives and museums

ICOM: International Council of Museums

ICT: Information and Communication Technologies

IPR: Intellectual Property Rights

ISO: International Organization for Standardization

JPI: Joint Programming Initiative

LIDO: Light Information Describing Objects

LOD: Linked Open Data

MIMO: Musical Instruments Museums Online

MSEG: Member States Experts Group

NBN: National Bibliography Number

NREN: National Research and Education Network

OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting

OWL: Web Ontology Language

PID: Persistent Identifier

PURL: Persistent Uniform Resource Locator

R&D: Research and Development

RDF: Resource Description Framework, a general language for conceptual description and modelling of information that is implemented within web resources

RDFS: RDF Schema

SKOS: Simple Knowledge Organization System

UGC: User Generated Content

UML: Unified Modeling Language

URI: Uniform Resource Identifier, a string of characters used to identify a name or a resource

URL: Uniform (or Universal) Resource Locator, a character string that constitutes a reference to an Internet resource

URN: Uniform Resource Name, a type of Uniform Resource Identifier (URI)

VIAF: Virtual International Authority File

WG: Working Group

WP: Work-package

GLOSSARY

- Bi/Multilingual GUI: Refers to the Graphical User Interface (GUI) on the front end. The user interface may be bilingual or multilingual even when the controlled vocabulary is monolingual or does not exist, so this is noted separately.

- Bi-directional: An issue specific to Semitic languages such as Arabic and Hebrew, which differ from other languages in being read from right to left. In most cases, lexicons that are bi-directional can be displayed in mirror image. The "search" button is an example: buttons that appear on the right when searching for an English term appear on the left for an Arabic or Hebrew term.

Catalogue: A concept related to the Metadata of one or more Collections of a library, independent of the digital or non‐digital nature of the related Content. The term is ambiguous: it may mean a set of Metadata (an information entity), or it may mean the system that manages the creation, editing and storage of that Metadata. In the library domain the latter is more correctly called the “Cataloguing System”, although it is also commonly called simply the “Catalogue”.

Collection: An intentionally‐defined set of Content, compiled under a specific policy. This is a common concept in the library domain, so it is used here with the same meaning as in that domain.

Content: The digital objects that can be accessed through Metadata. Content is typically held on Data Providers’ / Aggregators’ sites. Content is usually defined by its individuality and cultural, intellectual or artistic expression. Content has a reference to an individual object of the real world or is born digital. Examples: photographs, books, letters, films, paintings, television, etc. Note: In online delivery, Content excludes the peripheral packaging / platform.

Contextual Resources: Catch‐all term for resources which help to provide context for the Content and make it possible to enrich the services to be developed by the Service Providers (such as Europeana). Examples include linked data, ontologies, vocabularies, thesauri, classifications and taxonomies.

- Controlled vocabulary (1): A lexicon built in a linear format. This list is similar to subject headings and includes pre-coordinated terms. Searches are performed by choosing from a list (facets). Example: Library of Congress Subject Headings.

- Controlled vocabulary (2): A list of terms that have been explicitly enumerated. This list is controlled by, and is available from, a controlled vocabulary registration authority. All terms in a controlled vocabulary should have an unambiguous, non-redundant definition. This is a design goal that may not hold in practice; it depends on how strictly the registration authority controls the registration of terms. As a minimum, the following two rules should be enforced. First, if the same term is commonly used to mean different concepts in different contexts, its name is explicitly qualified to resolve the ambiguity. Second, if multiple terms are used to mean the same thing, one of them is identified as the preferred term in the controlled vocabulary and the others are listed as synonyms, aliases or non-preferred terms.
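The two minimum rules can be sketched in a few lines of Python. This is only an illustration: the vocabulary entries, the parenthetical qualifiers and the function names are hypothetical, and matching is assumed to be case-insensitive.

```python
# Hypothetical mini-vocabulary: each preferred term lists its non-preferred synonyms.
# "Mercury" is qualified in parentheses because it names two different concepts.
VOCABULARY = {
    "Photographs": ["Photos", "Photographic prints"],
    "Mercury (planet)": [],
    "Mercury (element)": ["Quicksilver"],
}


def build_lookup(vocabulary):
    """Map every term (preferred or not), lower-cased, to its preferred term."""
    lookup = {}
    for preferred, synonyms in vocabulary.items():
        for term in [preferred, *synonyms]:
            key = term.lower()
            if key in lookup:
                # The same name registered twice would break the unambiguity rule.
                raise ValueError(f"ambiguous term: {term!r}")
            lookup[key] = preferred
    return lookup


LOOKUP = build_lookup(VOCABULARY)


def preferred_term(term):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    return LOOKUP.get(term.strip().lower())
```

An unqualified "Mercury" is deliberately absent from the lookup, so a caller has to choose one of the qualified names.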

Data: Catch‐all term including Metadata, Images, Audio and Moving image previews. In the scope of this document, this concept also includes, by default, Full‐text Data.

Data Aggregator: Organisation that collects, formats and manages Data from Data Providers before making it available to Service Providers (such as Europeana).

Data Collection: The Data corresponding to a specific Collection.

Data Export Task: A task in which a Service Provider harvests a Data Collection from the TEL Aggregator.

Data Ingest Task: A task of harvesting a Data Collection from a Data Provider.

Data Provider: Organisation that makes Data available to a Data Aggregator (such as the TEL Aggregator) or a Service Provider (such as Europeana).

Data Provider Record: A generic concept naming all the structured information the TEL Aggregator maintains about a Data Provider. It comprises all the descriptive and contact information, as well as the information about all the Data that the Data Provider is willing to provide for Data Harvest Tasks.

Data Schema: A description of the structure of a specific set of Data.

Enriched Data: Data that has been subject to a process of Enrichment, Normalisation or Transformation.

- Enrichment: A process that generates Enriched Data from Raw Data. It can consist of adding new machine‐generated attributes to Records (such as links to authority files, geographic data, etc., making use of Contextual Resources); in this case the values assigned to the attributes can consist of data (such as a textual string or a date) or a URI to an external entity. In the particular case of this project, this also comprises the building of search indexes from the full‐text. Other kinds of Enrichment are Transformation and Normalisation.
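As a rough sketch of the first kind of Enrichment, the snippet below adds a machine-generated attribute holding a URI to an external entity. The authority table, the field names `place` and `place_uri`, and the example.org URIs are all hypothetical.

```python
# Hypothetical Contextual Resource: place names mapped to URIs of external entities.
PLACE_AUTHORITY = {
    "paris": "http://example.org/place/paris",
    "lisbon": "http://example.org/place/lisbon",
}


def enrich(record, authority):
    """Return a copy of the record with a 'place_uri' attribute added when a match exists."""
    enriched = dict(record)  # the Raw Data record itself is left untouched
    place = record.get("place", "")
    uri = authority.get(place.strip().lower())
    if uri is not None:
        enriched["place_uri"] = uri
    return enriched


raw_record = {"title": "Example title", "place": "Paris"}
enriched_record = enrich(raw_record, PLACE_AUTHORITY)
```

Returning a copy keeps the Raw Data available alongside the Enriched Data.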

- Free text: Almost all electronic databases allow free-text or keyword searching. In this type of search, the system usually looks for your search terms in every field of the record (not just in the subject heading or descriptor fields) and it looks for those terms to occur exactly as you type them, without mapping or translating them to controlled vocabulary terms.
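A minimal sketch of this behaviour, assuming records are plain dictionaries and that matching ignores letter case (an assumption; systems differ on this). The sample records are invented for illustration.

```python
def free_text_search(records, query):
    """Return records in which the query occurs, as typed, in any field."""
    needle = query.lower()
    return [
        record
        for record in records
        if any(needle in str(value).lower() for value in record.values())
    ]


# Hypothetical records: the query is matched against every field, not only 'subject'.
records = [
    {"title": "Views of the harbour", "subject": "Ports", "note": "Glass negative"},
    {"title": "Portrait of a sailor", "subject": "Seafarers", "note": "Taken at the harbour"},
    {"title": "Mountain pass", "subject": "Landscapes", "note": ""},
]
hits = free_text_search(records, "harbour")
```

Searching for "ships" returns nothing here, because the query is never translated to a controlled term such as "Ports" or "Seafarers".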

- Full‐text Data: Data in the form of text representing literal transcriptions of written or spoken words from the Content. This is a new class of Data to be considered, not covered (and so not to be confused) by the concepts of Contextual Resources or Metadata.

Mapping: An expression of rules to convert Data structured according to a source Data Schema into new Data structured according to a target Data Schema.
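In its simplest form, a Mapping can be written as a table of field correspondences. The sketch below assumes flat records; the source field names are hypothetical, and the target names merely borrow Dublin Core element names for illustration.

```python
# Hypothetical mapping rules: source field name -> target field name.
MAPPING = {
    "titel": "dc:title",
    "maker": "dc:creator",
    "jaar": "dc:date",
}


def apply_mapping(record, mapping):
    """Return a new record keyed by target fields; unmapped source fields are dropped."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


source_record = {
    "titel": "Example title",
    "maker": "Example Person",
    "jaar": "1998",
    "intern_id": "A-17",  # no rule for this field, so it is not carried over
}
target_record = apply_mapping(source_record, MAPPING)
```

Real mappings often need more than renaming (splitting, merging or conditional rules), which this sketch does not attempt.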

Metadata: Information about Content, describing its characteristics to aid in its identification, discovery, interpretation and management. Metadata is given to Europeana and drives discovery of Content held at the Data Provider’s/Aggregator’s site. Metadata is usually made up of facts or fact‐like information, containing little individual artistic/creative expression. Examples: bibliographic or filmographic data, temporal and spatial data, etc.

Normalisation: A kind of Enrichment in order to make the Data conformant with its declared Data Schema. This might comprise, for example, adding missing mandatory attributes or the normalisation of values (e.g. the normalisation of date values to ISO 8601 compliant strings).
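The date example can be sketched as follows. The list of accepted input formats is an assumption made for the illustration; a real Data Provider may supply dates in other forms.

```python
from datetime import datetime

# Assumed input formats, tried in order; extend as needed for real data.
KNOWN_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y"]


def normalise_date(value):
    """Return the date as an ISO 8601 string (YYYY-MM-DD), or None if unrecognised."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    return None
```

Returning `None` for unrecognised values, rather than guessing, lets the caller decide whether to reject the Record or keep the original string.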

- OWL: The Web Ontology Language is a family of knowledge representation languages for authoring ontologies. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL is endorsed by the World Wide Web Consortium and has attracted academic, medical and commercial interest. OWL is based on the RDF specification.

Preview: A reduced size or length audio and/or visual representation of Content, in the form of one or more images, text files, audio files and/or moving image files.

- RDF: The Resource Description Framework is a family of W3C specifications originally designed as a metadata data model. The RDF data model is based on making statements about resources (in particular web resources) in the form of triples, that is, subject-predicate-object expressions. The subject denotes the resource; the predicate denotes a trait or aspect of the resource and expresses a relationship between the subject and the object. RDF itself is an abstract data model; RDF/XML is one of the syntaxes used to serialize it.
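The triple model can be illustrated without any RDF library, using plain Python tuples. The example.org URIs and the literal values are placeholders; the Dublin Core and FOAF namespace URIs are the published ones.

```python
from typing import NamedTuple

DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the trait or relationship
    obj: str        # a literal value or the URI of another resource


# Placeholder statements about one object and its creator.
triples = [
    Triple("http://example.org/object/1", DC + "title", "Example title"),
    Triple("http://example.org/object/1", DC + "creator", "http://example.org/person/7"),
    Triple("http://example.org/person/7", FOAF + "name", "Example Person"),
]


def objects(triples, subject, predicate):
    """Return the objects of all statements with the given subject and predicate."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


creator_uri = objects(triples, "http://example.org/object/1", DC + "creator")[0]
creator_name = objects(triples, creator_uri, FOAF + "name")
```

Because the object of one statement can be the subject of another, following `creator` and then `name` walks from the object to the person's name.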

- Thesaurus: A networked collection of controlled vocabulary terms: a thesaurus uses associative relationships in addition to parent-child relationships. A thesaurus therefore has two kinds of links. The first is the broader/narrower term link, which is much like the generalization/specialization link of a taxonomy, although it may cover a variety of other hierarchical relations. The second is the associative link, which is typically not hierarchical, although it could be. Its expressiveness varies and can be as simple as "related term" (term A is related to term B); it may carry no explicit meaning other than that some relationship exists between the two terms.

- Transformation: A kind of Enrichment by applying a set of Mapping rules to Raw Data in order to produce new Enriched Data structured according to a target Data Schema. It is important to stress that a Transformation only uses the Raw Data ‘as it is’, which might imply the need for Normalisation to ensure that the Enriched Data is fully conformant with the target Data Schema.
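The two kinds of links can be sketched with a small dictionary. The terms and their relationships below are a hypothetical fragment, not taken from any published thesaurus.

```python
# Hypothetical thesaurus fragment: each term has at most one broader term
# and any number of associative ("related") links.
THESAURUS = {
    "Musical instruments": {"broader": None, "related": []},
    "String instruments": {"broader": "Musical instruments", "related": []},
    "Violins": {"broader": "String instruments", "related": ["Bows"]},
    "Bows": {"broader": None, "related": ["Violins"]},
}


def narrower(term):
    """Return the terms whose broader term is the given term."""
    return sorted(t for t, entry in THESAURUS.items() if entry["broader"] == term)


def broader_chain(term):
    """Follow broader links upwards and return the chain of ancestors."""
    chain = []
    current = THESAURUS[term]["broader"]
    while current is not None:
        chain.append(current)
        current = THESAURUS[current]["broader"]
    return chain
```

Here "Bows" is related to "Violins" without being above or below it in the hierarchy, which is what distinguishes the associative link from the broader/narrower link.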

UIM: Unified Ingestion Management tool, also called Ingestion Control Panel, the extensible framework used to manage the whole ingestion process.

URI: Uniform Resource Identifier. URLs (Uniform Resource Locators) are one kind of URI.

Raw Data: The Data the Aggregator collects from the Data Providers.

Record: The unit of Metadata concerning a single Content object.

- SKOS: SKOS is a format increasingly required by Web services. Europeana, for instance, has decided to format in SKOS all the metadata it harvests, for a homogeneous and effective exploitation of the resources, the data and their related descriptions. SKOS is based on the RDF specification and enables a migration towards OWL ontologies. It is not a formal knowledge representation language; rather, it is used for modelling controlled vocabularies such as thesauri or classifications, which are of a different nature than ontologies.

- Truly Bi/Multilingual: Bi/multilingual parallel cells. If the lexicon is truly bi/multilingual, the same number of results is found whichever language the term is searched in. The lexicon can also act as a translation tool: if the data were input in English, for example, the Arabic or Hebrew equivalent would fill in the parallel cell.

- XML (Extensible Markup Language): XML is a set of rules for encoding documents in machine-readable form. XML's design goals emphasize simplicity, generality, and usability over the internet. It is a textual data format, with strong support via Unicode for the languages and scripts of the world. Although XML's design focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.
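As a small illustration of XML carrying structured data, the sketch below parses a record with Python's standard `xml.etree.ElementTree` module. The element names and values are invented for the example; `xml:lang` is the standard XML attribute for marking the language of an element's content.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with the same title in two languages.
xml_text = """<record id="r1">
  <title xml:lang="en">Example title</title>
  <title xml:lang="fr">Titre d'exemple</title>
</record>"""

root = ET.fromstring(xml_text)

# ElementTree exposes the xml: prefix through its full namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Collect the titles keyed by their language code.
titles = {element.get(XML_LANG): element.text for element in root.findall("title")}
```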

 
