# Overview - Methods

## Data Collection

* Landscaping [#data-sources-and-oja-landscaping](https://www.oja-guide.de/steps/data-collection#data-sources-and-oja-landscaping "mention")
* Web Scraping [#web-scraping](https://www.oja-guide.de/steps/data-collection#web-scraping "mention")
* API Data Collection [#api-data-collection-data-providers](https://www.oja-guide.de/steps/data-collection#api-data-collection-data-providers "mention")
* Data Formats [#job-posting-data-schema](https://www.oja-guide.de/steps/data-collection#job-posting-data-schema "mention")

## Data Enrichment and Methods

* Text Segmentation [#text-segmentation](https://www.oja-guide.de/steps/data-enrichment#text-segmentation "mention")
* Duplicate Identification [#identifying-duplicates](https://www.oja-guide.de/steps/data-enrichment#identifying-duplicates "mention")
* Natural Language Processing and Data Pre-Processing [#pre-processing-and-embeddings](https://www.oja-guide.de/steps/extraction-methods#pre-processing-and-embeddings "mention")
* Rule-based Matching [#rule-based-matching](https://www.oja-guide.de/steps/extraction-methods#rule-based-matching "mention")
* Supervised Document Classification [#supervised-classification](https://www.oja-guide.de/steps/extraction-methods#supervised-classification "mention")
* Named-Entity Recognition [#statistical-named-entity-recognition-token-classification](https://www.oja-guide.de/steps/extraction-methods#statistical-named-entity-recognition-token-classification "mention")
* Disambiguation/Entity Linking [#semantic-similarity](https://www.oja-guide.de/steps/extraction-methods#semantic-similarity "mention")

## Evaluation and Quality Control

* Gold Standard Annotation [#gold-standard-annotation-and-quality](https://www.oja-guide.de/steps/evaluation-and-quality-control#gold-standard-annotation-and-quality "mention")
* Machine Learning Evaluation [#evaluation](https://www.oja-guide.de/steps/evaluation-and-quality-control#evaluation "mention")

## Taxonomies and Ontologies

* Taxonomy Development [#developing-a-taxonomy](https://www.oja-guide.de/steps/taxonomies-and-ontologies#developing-a-taxonomy "mention")
* Taxonomy Evaluation [#data-standards-for-taxonomies-and-ontologies](https://www.oja-guide.de/steps/taxonomies-and-ontologies#data-standards-for-taxonomies-and-ontologies "mention")

## Dataset Curation and Representativity Analysis

* Sampling [#filtering-data](https://www.oja-guide.de/steps/dataset-curation-and-representativity-analysis#filtering-data "mention")
* Deduplication [#deduplication](https://www.oja-guide.de/steps/dataset-curation-and-representativity-analysis#deduplication "mention")
* Representativity Analysis [#representativity-analysis](https://www.oja-guide.de/steps/dataset-curation-and-representativity-analysis#representativity-analysis "mention")
