The study is presented in two companion papers, each providing a different perspective on the analysis. The first paper describes the corpus and presents an overall analysis of the number of papers, authors, gender distributions, co-authorship, collaboration patterns and citation patterns.

Their experiments are based on a published dataset of annotated references from a corpus of publications on the historiography of Venice (books and journal articles in Italian, English, French, German, Spanish and Latin) published from the nineteenth century to 2014.

In the evaluation, the authors show the relative positive contribution of their character-level word embeddings.

The Research Topic on “NLP-enhanced Bibliometrics” aims to promote interdisciplinary research in bibliometrics, Natural Language Processing (NLP) and computational linguistics in order to enhance the ways bibliometrics can benefit from large-scale text analytics and sense mining of papers.

The objectives of such research are to provide insights into scientific writing and bring new perspectives to the understanding of both the nature of citations and the nature of scientific papers and their internal structures.

In the paper “The Termolator: Terminology Recognition Based on Chunking, Statistical and Search-Based Scores,” Meyers et al. propose an open-source, high-performing terminology extraction system called Termolator, which utilizes a combination of knowledge-based and statistical on a deep learning architecture for the detection, extraction and classification of references within the full text of scholarly publications. The authors explore word and character-level word embeddings, different prediction layers (Softmax and Conditional Random Fields) and multi-task over single-task learning components.

More than 36,000 papers in environmental sciences, retrieved from the ISTEX database, were processed to observe the trends in the GEM score over an 80-year period. The results show that abstracts tend to be more generous in recent publications, and there seems to be no correlation between the GEM score and the citation rate of the papers.

In the paper “Temporal Representations of Citations for Understanding the Changing Roles of Scientific Publications,” He and Chen propose an analysis of the temporal characteristics of citations in order to represent the dynamic role of scientific publications. For this purpose, they study and compare different types of citation contexts in order to identify articles that play an important role in the development of science.

The second paper investigates the research topics and their evolution over time, the key innovative topics and the authors who introduced them, and also the reuse of papers and plagiarism. Together, the two papers provide a survey of the literature in NLP and SLP and the data needed to understand the trends and the evolution of research in this community.

Further developments in this field of study require annotated corpora and shared evaluation protocols in order to enable comparison between different tools and methods. The development of such resources is an important step toward making scientific reproducibility possible.


