Lifecycle
An introduction to the Online Job Ads Analysis Cycle.
The process of analyzing online job ads (OJAs) is not strictly sequential. However, at some point, implicitly or explicitly, every online job ad passes through the following stages, from data collection to data analysis:
Unstructured data: At the beginning of the analysis cycle, an online job ad is published on a job portal, on a company website, or in a digital news publication. At this point, the data consists of various web elements and is largely unstructured.
Structured data: After collecting the data through, e.g., web scraping or calling an API, the next step is to structure the data by extracting relevant information from the job ad and organizing it. This may involve creating a spreadsheet or database with columns for different information such as job title, salary, full text, text segments, etc.
Enriched data: Once the data is structured, it can be enriched by adding information or by extracting further information from the text. We are often interested in analyzing these enriched concepts: job titles, competencies, soft skills, education level, etc. Often these concepts need to be linked to a formalized taxonomy.
Filtered dataset: After the data has been structured and enriched, it can be used to create a sampling dataset. This step involves deduplication, representativity analysis, and other considerations.
Analysis: The final step is to analyze the sampling dataset to identify trends and patterns. Depending on the goals or the data structure, we can use tools such as inferential statistics, hypothesis testing or text mining.
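The stages listed above can be sketched as a small pipeline. This is a minimal illustration, not a recommended implementation: the sample ads, the `Title:`/`Salary:` layout, the `SKILL_TAXONOMY` mapping, and all function names are hypothetical and chosen only to make each stage visible. Real projects typically rely on scrapers, trained extraction models, and established taxonomies instead of the toy rules used here.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical raw ads, standing in for scraped or API-delivered content.
RAW_ADS = [
    "Title: Data Analyst\nSalary: 60000\nWe need SQL and Python skills. Teamwork matters.",
    "Title: Data Analyst\nSalary: 60000\nWe need SQL and Python skills. Teamwork matters.",
    "Title: Nurse\nSalary: 45000\nCommunication and teamwork are essential.",
]

# Hypothetical mini-taxonomy mapping surface terms to concept labels.
SKILL_TAXONOMY = {
    "sql": "skill/sql",
    "python": "skill/python",
    "teamwork": "soft_skill/teamwork",
    "communication": "soft_skill/communication",
}


@dataclass
class JobAd:
    title: str
    salary: Optional[int]
    full_text: str
    skills: Set[str] = field(default_factory=set)


def structure(raw: str) -> JobAd:
    """Structured data: extract fields from the raw text."""
    title = re.search(r"^Title:\s*(.+)$", raw, re.MULTILINE)
    salary = re.search(r"^Salary:\s*(\d+)$", raw, re.MULTILINE)
    return JobAd(
        title=title.group(1).strip() if title else "",
        salary=int(salary.group(1)) if salary else None,
        full_text=raw,
    )


def enrich(ad: JobAd) -> JobAd:
    """Enriched data: link terms found in the text to taxonomy concepts."""
    tokens = set(re.findall(r"[a-z]+", ad.full_text.lower()))
    ad.skills = {concept for term, concept in SKILL_TAXONOMY.items() if term in tokens}
    return ad


def deduplicate(ads: List[JobAd]) -> List[JobAd]:
    """Filtered dataset: keep the first ad for each (title, full text) pair."""
    seen = set()
    unique = []
    for ad in ads:
        key = (ad.title.lower(), ad.full_text)
        if key not in seen:
            seen.add(key)
            unique.append(ad)
    return unique


def analyze(ads: List[JobAd]) -> Counter:
    """Analysis: count in how many ads each concept occurs."""
    return Counter(skill for ad in ads for skill in ad.skills)


ads = deduplicate([enrich(structure(raw)) for raw in RAW_ADS])
skill_counts = analyze(ads)
print(len(ads), dict(skill_counts))
```

Keeping each stage as a separate function makes it easier to rerun or replace a single stage, for example when new ads arrive or when the dataset is already structured.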
These steps are almost never followed linearly. For example, the data preparation step may need to be automated or repeated when new job ads are added, and the early steps may be skipped entirely if you already have a structured dataset for your analysis. Even so, by the time you reach your final analysis, the data will have passed through all of these steps.
Making these steps more transparent and highlighting decisions and trade-offs along the way is the goal of this publication.
Each step of the OJA analysis process presents its own challenges, and multiple methods and approaches are available to address them. The visualization below illustrates the stages that an OJA dataset goes through, from being collected in an unstructured format to being transformed into a cleaned and enriched dataset ready for analysis, and highlights the corresponding steps of the OJA analysis process.