Title: | Named Entity Recognition For Humans and Species With Domain-Specific and Domain-Adapted Transformer Models |
---|---|
Author: | Alejandro Vaca Serrano |
Year: | 2022 |
This work presents different solutions to the tasks proposed at the LivingNER challenge, held as part of the IberLEF 2022 conference, with a special focus on the NER task. To that end, a large general-domain model was adapted to the biomedical domain, showing that this adaptation improves subsequent fine-tuning on a majority of tasks. However, despite achieving similar results, the adapted model is not able to outperform two base-size models specific to the biomedical domain. A careful analysis of the reason for this performance gap is carried out, showing that the tokenizers' vocabulary has a great impact on the aggregation of predictions at both the word level and the word-group level. This highlights the effectiveness of using domain-specific models for tasks tied closely to a particular linguistic domain. Official test results show very good performance on the NER task, where all submissions are clearly above the average results. However, results for tasks 2 and 3 are very poor, which indicates that a deeper understanding of the underlying nature of those tasks is needed.
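To make the point about tokenizer vocabulary and word-level aggregation concrete, here is a minimal sketch. It is not the method used in the paper: the averaging strategy, the function name `aggregate_by_word`, the label names, and all scores are assumptions chosen purely for illustration. It shows how the same word can receive a different word-level label depending on how many subword pieces the tokenizer splits it into.

```python
from collections import defaultdict


def aggregate_by_word(word_ids, piece_scores):
    """Average per-piece label scores within each word, then pick the top label.

    word_ids: word index for each subword piece (None for special tokens).
    piece_scores: one {label: score} dict per piece, aligned with word_ids.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for wid, scores in zip(word_ids, piece_scores):
        if wid is None:  # special tokens do not belong to any word
            continue
        counts[wid] += 1
        for label, score in scores.items():
            sums[wid][label] += score
    result = {}
    for wid in sorted(sums):
        avg = {label: total / counts[wid] for label, total in sums[wid].items()}
        result[wid] = max(avg, key=avg.get)
    return result


# Hypothetical scores: a general-domain vocabulary splits one species name
# into three pieces, and the later pieces are scored less confidently.
general_ids = [None, 0, 0, 0, None]
general_scores = [
    {},
    {"SPECIES": 0.9, "O": 0.1},
    {"SPECIES": 0.3, "O": 0.7},
    {"SPECIES": 0.2, "O": 0.8},
    {},
]

# Hypothetical scores: a domain-specific vocabulary keeps the word whole.
domain_ids = [None, 0, None]
domain_scores = [{}, {"SPECIES": 0.9, "O": 0.1}, {}]

print(aggregate_by_word(general_ids, general_scores))  # {0: 'O'}
print(aggregate_by_word(domain_ids, domain_scores))    # {0: 'SPECIES'}
```

In this toy case the fragmented word is averaged down to `O`, while the single-piece word keeps its `SPECIES` label. Other aggregation choices (first piece only, maximum score) would behave differently, so the example only illustrates why vocabulary coverage can matter at the word level.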