Overview - Methods
This page gives an overview of methodological approaches to the analysis of online job advertisements (OJA). Each method below is followed by the section of this guide that covers it.
Data Collection
Landscaping: Data Sources and OJA Landscaping
Web Scraping: Web Scraping
API Data Collection: API Data Collection / Data Providers
Data Formats: Job Posting Data Schema
Data Enrichment and Methods
Text Segmentation: Text Segmentation
Duplicate Identification: Identifying Duplicates
Natural Language Processing and Data Pre-Processing: Pre-Processing and Embeddings
Rule-Based Matching: Rule-Based Matching
Supervised Document Classification: Extraction Methods
Named-Entity Recognition: Extraction Methods
Disambiguation / Entity Linking: Semantic Similarity
Evaluation and Quality Control
Gold Standard Annotation: Gold Standard Annotation and Quality
Machine Learning Evaluation: Evaluation
Taxonomies and Ontologies
Taxonomy Development: Developing a Taxonomy
Taxonomy Evaluation: Data Standards for Taxonomies and Ontologies
Dataset Curation and Representativity Analysis
Sampling: Filtering Data
Deduplication: Deduplication
Representativity Analysis: Representativity Analysis