**

*Postdoctoral researcher in Computational Diachronic Semantics*

*

Labex EFL (Empirical Foundations of Linguistics, Paris, 
https://en.labex-efl.fr/)

Strand 5: Computational Semantic Analysis



Research area: interpretable computational models for automatic 
detection and monitoring of semantic evolutions: combination of 
Contextual Embeddings and Pattern Mining approaches

Contract duration: 18 months

Location: Paris

Research Laboratory: Sorbonne Paris Nord University, LIPN UMR7030 CNRS

Application deadline: November 15, 2021

Audition period: November 15-30, 2021

Job Starting date: from January 1, 2022



Context, Issues and research axes

Languages ​​are constantly evolving, driven by the need to adapt to 
socio-cultural and technological developments and to make communication 
more efficient and expressive. In particular, new words are forged or 
borrowed from other languages, some words become obsolete, others 
acquire new meanings or lose existing meanings.

In NLP, the study of language dynamics, especially from the lexical 
point of view, has gained audience in recent years, complementing 
synchronic approaches. The field of research is structuring itself, with 
recent state of the art (Monteirol et al., 2021; Tahmasebi et al., 2021) 
and several scientific events (International Workshop on Computational 
Approaches to Historical Language Change 2019 and 2021, ACL 2019 and 
2020). Two initial evaluation tasks have been proposed (Unsupervised 
Lexical Semantic Change Detection Task, SemEval2020) and reference sets 
have been set up for four languages ​​(English, Latin, Swedish and German).

Lexical change detection systems have followed advances in NLP methods: 
after the first systems essentially based on frequency changes (for 
example Gulordova & Baroni, 2011), systems used word embeddings (Kim et 
al., 2014, Schletchweg et al., 2019) and more recently contextual 
embeddings (Hu et al., 2019; Martinc et al., 2019; Giulianelli et al., 
2020). These latter systems generally proceed by grouping the contextual 
vector representations of the different uses into clusters of meaning, 
then detect changes according to different metrics (Monteirol et al. 
2021). Current systems still face many limitations. Mainly, the opacity 
of neural models does not make it possible to characterize these 
evolutions, in particular it is difficult, if not impossible, to link 
the semantic changes to linguistic morphological, syntactic or 
lexico-syntactic features, or to categorize the types of changes 
(extension, restriction, metaphor, metonymy, etc.). To this end, one 
avenue would be to combine neural approaches with Pattern Mining(Béchet 
et al. 2015) or collocation extraction approaches from corpus 
linguistics (for example Gries, 2012) which make it possible to extract 
the most salient lexico-syntactic patterns of a given meaning from a 
corpus of occurrences and thus identify the evolution. It would also be 
interesting to use the contextual information of the occurrences (date, 
type of source, domain, diatopic and diastratic features, etc.) to 
characterize and follow the evolution of usages.


The job main objective is therefore to set up a system combining these 
approaches to allow an automatic characterization of semantic 
evolutions. The first step will consist in experimenting with 
state-of-the-art models for detecting changes. The second step will 
then  try to combine contextual embeddings and pattern mining approaches 
/ collocation extraction to highlight the linguistic characteristics of 
each of the meaning clusters and their evolution. The studied corpora 
will be mainly in English and French. The postdoctoral fellow will work 
in collaboration with computer scientists and linguists from the Labex 
who are currently building a reference corpus of semantic evolutions for 
French (following the Durel methodology: Schlechtweg et al., 2018).

Other issues may also be addressed by the recruited person, and in 
particular: current systems do not take into account the graduality of 
evolutions, generally being limited to comparing two synchronic language 
states; to get the vector representation of a lexis in a context, it is 
possible to use one of the hidden layers or a combination of them. There 
is currently no consensus on the most adequate layer to take into 
account to obtain the most adequate semantic representation.

The recruited person will join the strand 5 (“Computational Semantics”) 
of the Labex, specifically the research team working on the “Semantic 
Variation and Change” operation which aims to:

  *

    develop new models and methods for the automatic detection of
    lexical semantic changes, the typology of changes from intra- and
    extra-linguistic points of view;

  *

    develop a reference dataset of semantic evolutions in contemporary
    French, based on available diachronic corpora.

Candidate profile

- PhD in computer science specialised in Computational Linguistics and 
Machine Learning

- deep learning methods and language models attested training and experience

- working language: French and / or English

Application

Please send :

     • a cover letter

     • a description of the research project related to the research 
questions

     • a CV with a list of publications and 3 representative 
publications (pdf or link),

     • letters of recommendation or names of two referees.

to [log in to unmask] and 
[log in to unmask] before November 15, 2021. The 
auditions of the pre-selected candidates will take place at the end of 
November 2021.


Références

Béchet N., Cellier P., Charnois T. and Crémilleux B. (2015). “Sequence 
mining under multiple constraints”. In Proceedings of the 30th Annual 
ACM Symposium on Applied Computing (SAC 2015),ACM Press, Salamanca, 
Spain, pages. 908--914.

Giulianelli, M., Tredici, M.D., & Fernández, R. (2020). “Analysing 
Lexical Semantic Change with Contextualised Word Representations”. 
Proceedings of the 58th Annual Meeting of the Association for 
Computational Linguistics, pages 3960–3973 July 5 - 10, 2020. 
https://www.aclweb.org/anthology/2020.acl-main.365.pdf 
<https://www.aclweb.org/anthology/2020.acl-main.365.pdf>

Gries Stefan Th. (2012). "Behavioral Profiles: a fine-grained and 
quantitative approach in corpus-based lexical semantics". In Gonia 
Jarema, Gary Libben, Chris Westbury (eds.), Methodological and analytic 
frontiers in lexical research, 57-80. Amsterdam Philadelphia: John 
Benjamins.

Montariol, S. (2021). Models of diachronic semantic change using word 
embeddings. (Modèles diachroniques à base de plongements de mot pour 
l'analyse du changement sémantique). PhD Thesis, Paris-Saclay. 223 pages 
https://tel.archives-ouvertes.fr/tel-03199801/document 
<https://tel.archives-ouvertes.fr/tel-03199801/document>

Montariol S., Doucet A. and Allauzen A. (2021). “Etat de l’art du 
changement sémantique à partir de plongements contextualisés”. In Coria 
2021, http://coria.asso-aria.org/2021/articles/court_27/main.pdf 
<http://coria.asso-aria.org/2021/articles/court_27/main.pdf>

Montariol, S., Martinc, M., & Pivovarova, L. (2021). “Scalable and 
Interpretable Semantic Change Detection”. Proceedings of the 2021 
Conference of the North American Chapter of the Association for 
Computational Linguistics: Human Language Technologies, pages 4642–4652 
June 6–11, 2021. . 
https://www.aclweb.org/anthology/2021.naacl-main.369.pdf 
<https://www.aclweb.org/anthology/2021.naacl-main.369.pdf>

Schlechtweg, D., McGillivray, B., Hengchen, S., Dubossarsky, H., & 
Tahmasebi, N. (2020). “SemEval-2020 Task 1: Unsupervised Lexical 
Semantic Change Detection”. Proceedings of the 14th International 
Workshop on Semantic Evaluation, pages 1–23 Barcelona, Spain (Online), 
December 12, 2020. https://www.aclweb.org/anthology/2020.semeval-1.1.pdf 
<https://www.aclweb.org/anthology/2020.semeval-1.1.pdf>

Schlechtweg, D., & Walde, S.S. (2020). “Simulating Lexical Semantic 
Change from Sense-Annotated Data”. In Ravignani, A. and Barbieri, C. and 
Martins, M. and Flaherty, M. and Jadoul, Y. and Lattenkamp, E. and 
Little, H. and Mudd, K. and Verhoef, T. (Eds.): The Evolution of 
Language: Proceedings of the 13th International Conference 
(EvoLang13).http://brussels.evolang.org/proceedings/paper.html?nr=9 
<http://brussels.evolang.org/proceedings/paper.html?nr=9>

Tahmasebi, N., Borin, L., & Jatowt, A. (2018). “Survey of Computational 
Approaches to Lexical Semantic Change”. Computational Linguistics, vol. 
1, n°1, https://arxiv.org/pdf/1811.06278.pdf 
<https://arxiv.org/pdf/1811.06278.pdf>

Tahmasebi N., Borin L., Jatowt A., Xu Y. and Hengchen S. (éds, 2021). 
Computational approaches to semantic change, Language Science Press, 
396p. https://langsci-press.org/catalog/book/303 
<https://langsci-press.org/catalog/book/303>

Schlechtweg D., Schulte im Walde S. and Eckmann S. (2018). Diachronic 
usage relatedness (DURel): A framework for the annotation of lexical 
semantic change. In Proceedings of the 2018 Conference of the North 
American Chapter of the Association for Computational Linguistics: Human 
Language Technologies, Volume 2 (Short Papers), pages 169–174, New 
Orleans, Louisiana. Association for Computational Linguistics. 
https://www.aclweb.org/anthology/N18-2027.pdf 
<https://www.aclweb.org/anthology/N18-2027.pdf>




*

-- 
Emmanuel Cartier
Enseignant-chercheur en linguistique informatique
LIPN - RCLN UMR7030 CNRS / Pléiade EA 7338
Université Sorbonne Paris Nord
99 avenue Jean-Baptiste Clément
93430 Villetaneuse
+33 (0)6 46 79 12 86
[log in to unmask]


############################

Unsubscribe:

[log in to unmask]

If you don't already have a password for the LISTSERV.ACM.ORG server, we recommend
that you create one now. A LISTSERV password is linked to your email
address and can be used to access the web interface and all the lists to
which you are subscribed on the LISTSERV.ACM.ORG server.

To create a password, visit:

https://LISTSERV.ACM.ORG/SCRIPTS/WA-ACMLPX.CGI?GETPW1

Once you have created a password, you can log in and view or change your
subscription settings at:

https://LISTSERV.ACM.ORG/SCRIPTS/WA-ACMLPX.CGI?SUBED1=MM-INTEREST