Cedefop is setting up a pan-European system for gathering and analysing information from online vacancies across all EU countries. To ensure the highest possible quality of the system, Cedefop is bringing together members of the ESSnet, representatives of Eurostat and the European Commission, and independent researchers to discuss and fine-tune the methodology, as well as the algorithms and rules for extracting vacancies and classifying the information they contain.
The workshop provided a platform for presenting and discussing the proposed methodology and for sharing the experience and knowledge of the participating experts, with the aim of fine-tuning the algorithms and arbitrarily set rules related to:
- data ingestion (web scraping, crawling, API downloads);
- extraction, de-duplication and expansion of vacancies;
- classification of occupations;
- extraction of skills.
Cedefop expert Vladimir Kvetan, who coordinates the project, stressed that the event represents an important milestone: the methodology is exposed to external validation and can benefit from critical review before data collection starts. ESSnet representative Nigel Swier highlighted that the network has been engaged in similar activities over the past three years, making small steps forward, and that this meeting represents a big step towards developing a concise system suitable for statistical purposes.
The meeting confirmed the project’s potential as well as the right direction taken in establishing the system and its infrastructure. Participants provided constructive feedback for further development.
During the concluding session, Eurostat’s Fernando Reis said that the basics are set for a robust infrastructure to collect information from online job vacancies. Mr Swier noted the need to take concrete action to expand collaboration between Cedefop, Eurostat and ESSnet in this area.
Agenda: real-time LMI and skill requirements
Emilio Colombo - Current state of the art and next steps
Mario Mezzanzanica - Overview of methodological and technical
Ettore Colombo, Matteo Fontana, Andrea Scrivanti - Data ingestion
Alessandro Vaccarino, Mauro Pelucchi - Data pre-processing
Alessandro Vaccarino, Mauro Pelucchi - Text processing and information extraction
Matteo Fontana - General framework for data presentation
Fabio Mercorio - Research Activity on LMI
Cedefop/CRISP/ESSnet - Classifying enterprises by economic activity
Alena Zukersteinova - Data presentation and analysis
Nigel Swier - A quality framework for on-line job vacancy (OJV) data
Jiri Branka - Vacancy market landscape of the EU