Author: Dr Eftim Zdravevski
STSM Period: 2019-03-11 – 2019-04-12
Hosting institution: New Bulgarian University, Sofia, Bulgaria
From ITC: Yes
With the increasing number of scientific publications, analyzing the trends and the state-of-the-art in a given scientific field is becoming a very time-consuming and tedious task. In response to urgent information needs, which the existing systematic review model does not serve well, several other review types have emerged, namely rapid reviews and scoping reviews.
Within this STSM we aim to implement some modules of an NLP-powered tool that automates most of the review process by automatically analyzing articles indexed in the IEEE Xplore, PubMed, and Springer digital libraries. We will demonstrate the applicability of the toolkit by analyzing articles related to solutions for healthy ageing at work, in accordance with the PRISMA surveying methodology. The goals of this case study are two-fold. First, we intend to identify the current trends and what the scientific communities are focusing on. Second, we want to validate that NLP-based software can ease and speed up the scoping process and reveal valuable insights from the surveyed articles even without manually reading most of them. To that end, we intend to implement the software so that it pinpoints the most relevant articles, i.e. those that contain more of the searched properties, and thereby significantly reduces the manual work. Additionally, since visualization is an important aspect, we intend to implement modules that automatically generate informative tables, charts and graphs.
Proposed contributions to the scientific objectives of the action
Given that we plan to perform a rapid review of solutions for healthy ageing at work, we will get an overview of the state-of-the-art techniques and approaches, including:
• Identification of architectures that embed sensors into furniture and clothing items, thus enhancing the working environments without being intrusive to end users
• Identification of approaches related to smart working environments
• Data management in these approaches
• Applied technologies, protocols and algorithms
• Social aspects of healthy ageing at work
• Application domain, such as accident detection, activity recognition, diet monitoring, exercise recommendations and feedback, etc.
Meanwhile, the COST action is focused on integrating ICT solutions into habitats, along with improved building design, aiming to allow patients to live at home and stay active and productive for longer despite cognitive or physical impediments.
Therefore, improving accessibility, functionality, and safety at home, at work and in society in general, is key for achieving this. However, it requires combining many disciplines together to develop solutions that integrate ICT, ergonomics, healthcare, building and community design.
I believe that a review of the current state-of-the-art in this area will help identify gaps in what is currently implemented and point out opportunities for new products or improvements of existing ones, both of which are in line with the main objectives of the COST action. This can lead to enhancing the working environment with suitable algorithms, approaches and architectures while considering the social aspects and psychology.
The methodology used for the selection and processing of the research articles would be based on “Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement”. The first part is gathering articles based on certain criteria, in our case using the search keywords. After the articles are collected, the duplicates are removed and some of the articles are discarded for various reasons, such as relevance, missing meta-data, invalid publication period, etc. Finally, from the selected subset of articles, a qualitative analysis is performed and from those articles, only a certain number is selected for more thorough screening. With the NLP toolkit that will be implemented, we plan to automate most of the steps in the PRISMA approach to significantly reduce the number of articles that need to be manually screened.
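The screening steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the toolkit's actual implementation: the `Article` fields, the `screen` function, the duplicate key (DOI, falling back to the title) and the substring-based relevance test are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Hypothetical minimal record; real records would carry more metadata.
    title: str
    abstract: str
    year: int
    doi: str = ""


def screen(articles, start_year, end_year, properties):
    """Apply PRISMA-style filtering and report how many articles each step removed."""
    counts = {"identified": len(articles), "duplicates": 0,
              "missing_metadata": 0, "out_of_period": 0, "not_relevant": 0}
    seen = set()
    kept = []
    for art in articles:
        # Step 1: duplicate removal (assumed key: DOI, else normalized title).
        key = art.doi.lower() or art.title.strip().lower()
        if key and key in seen:
            counts["duplicates"] += 1
            continue
        seen.add(key)
        # Step 2: discard records with missing metadata.
        if not art.title or not art.abstract:
            counts["missing_metadata"] += 1
            continue
        # Step 3: discard records outside the (inclusive) publication period.
        if not (start_year <= art.year <= end_year):
            counts["out_of_period"] += 1
            continue
        # Step 4: relevance, here simply "mentions at least one property".
        text = f"{art.title} {art.abstract}".lower()
        if not any(p.lower() in text for p in properties):
            counts["not_relevant"] += 1
            continue
        kept.append(art)
    counts["retained"] = len(kept)
    return kept, counts
```

The returned `counts` dictionary mirrors the numbers usually reported in a PRISMA flow diagram, so the same run that filters the articles also documents why each one was excluded.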
Search input taxonomy
The user input is defined by the following parameters, which are further enhanced by proposing synonyms for the search keywords and properties:
• keywords – search terms or phrases that are used to query a digital library (e.g. IoT in work environments, healthy aging, etc.)
• properties – words or phrases that are being searched in the title, abstract or keywords section of the identified articles.
• start year – the start year (inclusive) of the articles that we are interested in.
• end year – the end year (inclusive) of the articles that we are interested in.
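As an illustration, the four parameters above could be captured in a small data structure. The sketch below is an assumption about how such input might be represented, not the tool's actual interface: the class name `SearchInput`, the `covers` helper, and the choice to store properties per property group are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SearchInput:
    # Search terms or phrases sent to each digital library.
    keywords: List[str]
    # Assumed layout: property group name -> properties in that group.
    properties: Dict[str, List[str]]
    # Both bounds are inclusive, as in the parameter definitions.
    start_year: int
    end_year: int

    def __post_init__(self):
        if not self.keywords:
            raise ValueError("at least one search keyword is required")
        if self.start_year > self.end_year:
            raise ValueError("start year must not be after end year")

    def covers(self, year: int) -> bool:
        """Return True if a publication year falls inside the inclusive range."""
        return self.start_year <= year <= self.end_year
```

For example, `SearchInput(["healthy ageing"], {"technology": ["sensor"]}, 2010, 2019)` accepts articles from 2010 and 2019 alike; the years here are placeholder values chosen only for the example.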
Digital Libraries – Crawling, scraping and processing
We plan to analyze the following digital libraries (i.e. sources): IEEE Xplore, Springer and PubMed. For the results obtained from these sources, we plan to utilize WordNet to find synonyms of the searched properties.
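A minimal sketch of how synonym expansion and property matching might fit together is shown below. It deliberately does not call WordNet: `synonym_lookup` is a hypothetical pluggable function that, in the planned tool, would be backed by a WordNet query, and in the example is replaced by a hand-written toy table. The function names and the whole-word, case-insensitive matching rule are likewise assumptions.

```python
import re


def expand_properties(properties, synonym_lookup):
    """Map each property to the set of terms (itself plus synonyms) to search for."""
    expanded = {}
    for prop in properties:
        terms = {prop.lower()}
        terms.update(s.lower() for s in synonym_lookup(prop))
        expanded[prop] = terms
    return expanded


def matched_properties(text, expanded):
    """Return the properties for which any term occurs in the text as a whole word."""
    found = set()
    for prop, terms in expanded.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(prop)
                break
    return found


# Toy stand-in for a WordNet-backed lookup (assumption for illustration only).
TOY_SYNONYMS = {"sensor": ["detector"], "diet": ["nutrition"]}


def toy_lookup(prop):
    return TOY_SYNONYMS.get(prop, [])
```

With this toy table, `matched_properties("A wearable detector for nutrition tracking", expand_properties(["sensor", "diet", "exercise"], toy_lookup))` reports the properties `sensor` and `diet`, although neither word appears literally in the text. Note that whole-word matching does not handle inflected forms such as plurals; a real implementation would likely add lemmatization or stemming.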
The results of the processing and retained relevant articles will be aggregated by several criteria. The output will contain CSV files and charts for each of the following aggregate metrics:
• By source (digital library) and relevance selection criteria
• By publication year
• By source and year
• By search keyword and source
• By search keyword and year
• By property group and year
• By property and year, generating separate charts for each property group
• By number of countries, the number of distinct affiliations and authors, aiming to simplify the identification of multidisciplinary articles
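The aggregations listed above all follow the same pattern: count the retained articles per combination of fields and write the counts to a CSV file. The sketch below illustrates that pattern with the Python standard library only; the dictionary-based article records and the function names `aggregate` and `to_csv` are assumptions made for the example.

```python
import csv
import io
from collections import Counter


def aggregate(articles, *fields):
    """Count articles per combination of the given fields, e.g. ("source", "year")."""
    return Counter(tuple(article[f] for f in fields) for article in articles)


def to_csv(counts, header):
    """Render aggregated counts as CSV text, one row per field combination."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(header) + ["articles"])
    for key in sorted(counts):
        writer.writerow(list(key) + [counts[key]])
    return buffer.getvalue()
```

For instance, `to_csv(aggregate(articles, "source", "year"), ["source", "year"])` yields the "by source and year" table, and passing other field names produces the remaining aggregates without extra code. The same counts can then be handed to a charting library.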
In addition, we will visualize the results as a graph in which the nodes are the properties and each edge is weighted by the number of articles that contain both of the properties it connects.
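The edge weights of such a property graph can be computed by counting, for every pair of properties, the articles in which both occur. The following is a minimal sketch under the assumption that each article has already been reduced to the set of properties found in it; the function name `cooccurrence_edges` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(article_properties):
    """Return {(prop_a, prop_b): number of articles containing both properties}."""
    edges = Counter()
    for props in article_properties:
        # Sort so that each undirected edge has a single canonical key.
        for a, b in combinations(sorted(set(props)), 2):
            edges[(a, b)] += 1
    return edges
```

The resulting dictionary can be passed to any graph drawing library, using the counts as edge widths or labels.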
We aim to accomplish the goals of the proposed study by performing several steps:
1. Investigating the literature and defining relevant keywords and properties for articles on solutions for healthy ageing at work. The articles would be evaluated by searching for these properties and their synonyms in the articles’ abstracts. (2 days)
2. Defining a taxonomy of the analyzed properties and of the data collected from the crawled articles. (1 day)
3. Implementing crawling and scraping methods for IEEE Xplore, Springer and PubMed digital libraries. (10 days)
4. Implementation of generic visualization modules that generate charts, tables and graphs for illustration of aggregate results. (4 days)
5. Visualization and interpretation of results, discussions and brainstorming sessions for ideas of further improvement. (3 days)
6. Preparing a draft version of an article describing the performed work. (5 days)
7. Sharing experiences in the field of Enhanced Living Environments, Ambient Assisted Living and Smart Environments, based on the ongoing R&D of both entities: New Bulgarian University in Sofia, Bulgaria and the Institute of information systems, Faculty of computer science and engineering, University of Saints Cyril and Methodius in Skopje, North Macedonia. (in parallel to other activities)
As a result of this work, at least one article will be published at a conference, as a book chapter or in a journal, and it should be considered a deliverable of the action.