


“Be a yardstick of quality. Some people aren’t used to an environment where excellence is expected.” -Steve Jobs




The contents of digital libraries continue to grow, and the amount available through sources such as Open Access journals gives us a unique opportunity to explore, extend, learn from, and improve on new approaches to complex problems across a diversity of disciplines. However, this availability of digital content raises the question of how to find information that is valuable: the so-called fifth V of Big Data. Indeed, one of the most important open issues in the Digital Library community today is how to model data so that its value can be found and represented in this sea of digital content. The possibility of taking advantage of all the valuable knowledge behind digital libraries makes the challenge quite interesting. In our quest, we argue that the quality dimension must be our guiding force. Therefore, our initiative Qualimeta will approach this problem from the perspective of deep semantics to assess the value of different aspects of digital libraries. We argue that we need to go beyond classical bibliographic metadata if we aim at searching and indexing mechanisms that help us extract, and thus deliver, valuable results.

We share the idea that the value of the knowledge in scientific publications lies in the content discussed within the papers, but we argue that we need to find ways to ease metadata generation for the specific aspects where the real value of a paper lies. Indeed, finding the relevant entities in scientific papers, interlinking them, connecting them to external resources, and capturing the knowledge that authors assume the reader already has are some of the issues we will explore.

Project Goals

Modeling the knowledge of any particular domain is a challenge that goes beyond keyword-based querying. In particular, we believe that digital libraries need and deserve a better approach to current user needs. Our motivation, therefore, is to do a better job of identifying and connecting content in order to ease the understanding of a given topic within the setting of Digital Libraries. Qualimeta will explore, develop, test, and implement state-of-the-art algorithms to solve current challenges in Open Access journals. In our quest we will explore how to model the "deep semantics" of Digital Libraries, and we will look at the intersection of the Machine Learning, Natural Language Processing, and Information Retrieval fields. We are currently exploring highly specialized scientific digital libraries, trying to answer fundamental questions such as:

  • What does searching in scientific digital libraries look like?
  • What are the metadata challenges and quality issues when you look at the plethora of documents being produced every year?
  • How can we connect digital libraries to each other in related fields of study?
  • What about the objects hidden from current indexing mechanisms (gene sequences, formulae, theorems, chemical substances, etc.)? Can we model them to facilitate metadata annotation? Is there a "semantic" model waiting to be discovered that would ease the burden on librarians of going beyond classical bibliographic metadata?

Within this scenario, we are currently looking at:

  • Semantic modeling of relevant objects in Digital Libraries.
  • Linking of digital libraries across disciplines.
  • Synergy of bottom-up and top-down approaches for metadata enhancement.
  • Assessing the quality of collaborative tagging from the Digital Library perspective.



PhD Students


For more information about the project, please contact: M.Sc. José María González Pinto


We thank the German Research Foundation (DFG) for supporting this work.


Pinto, J. M. G., and W.-T. Balke, "Demystifying the Semantics of Relevant Objects in Scholarly Collections: A Probabilistic Approach", ACM/IEEE Joint Conference on Digital Libraries (JCDL), Knoxville, TN, USA, 06/2015.