Resources
Below are a number of resources related to data readiness and NLP.
Data readiness
- Data Readiness for Natural Language Processing, pre-print by Olsson & Sahlgren (2020).
- We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing, pre-print by Olsson & Sahlgren (2021).
- draviz, a tool for visualizing data readiness in NLP projects.
- Data Readiness
- Data Readiness Levels, white paper by Neil Lawrence (2017)
- FAIR, guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets.
Project processes and structure
Here are links to some common data science/analytics project processes.
- Cross-industry standard process for data mining (CRISP-DM)
- Team Data Science Process (TDSP) by Microsoft.
- 5i Framework by QuantumBlack.
- Data Science Project Management
Text annotation tools
If the solution to the problem involves supervised learning, or if performance is assessed against human-labelled data, you need annotated data. Numerous software tools and services support a wide range of text annotation tasks. Below is a list of annotation services and tools, as well as a link to a Google spreadsheet with a feature comparison matrix of the tools.