
Avi Shmidman
Bar-Ilan University

Probing Hebrew linguistic phenomena encoded in raw BERT embeddings

One of the most exciting advances in Natural Language Processing in recent years has been the release of BERT ("Bidirectional Encoder Representations from Transformers"). To what extent does this model provide the means to understand the nuances of Hebrew text, and how can it help us with the specific challenges of Hebrew text processing? In order to evaluate these questions, we will examine the high-dimensional contextual embeddings produced by Hebrew BERT models, and we will explore the extent to which the embedding space captures distinctions between Hebrew homographs, grammatical anomalies, lexicographical categories, and more.

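As a rough illustration of the kind of probing described above, the following Python sketch extracts the contextual embedding of the unvocalized homograph ספר in two different sentences and compares them by cosine similarity. It uses the HuggingFace transformers library and the publicly available AlephBERT model (onlplab/alephbert-base) purely as an example; the talk itself does not specify which model or toolkit is used.

    import torch
    from transformers import AutoTokenizer, AutoModel

    MODEL_NAME = "onlplab/alephbert-base"  # illustrative choice of Hebrew BERT model
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    def token_embedding(sentence, target):
        """Return the contextual embedding of `target` within `sentence`."""
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # shape: (seq_len, hidden_dim)
        # Locate the target word; this simple lookup assumes the word
        # survives tokenization as a single wordpiece.
        target_id = tokenizer(target, add_special_tokens=False)["input_ids"][0]
        position = (enc["input_ids"][0] == target_id).nonzero()[0].item()
        return hidden[position]

    # The unvocalized form ספר can be read, among other things, as "book" (sefer)
    # or "counted" (safar); only the surrounding context disambiguates it.
    emb_book = token_embedding("הוא קרא ספר מעניין", "ספר")        # "he read an interesting book"
    emb_counted = token_embedding("הוא ספר את הכסף בזהירות", "ספר")  # "he carefully counted the money"

    similarity = torch.nn.functional.cosine_similarity(emb_book, emb_counted, dim=0)
    print(f"cosine similarity between the two occurrences: {similarity.item():.3f}")

A low similarity between the two occurrences would suggest that the model's embedding space separates the two readings of the homograph; probing classifiers trained on such embeddings can then quantify how reliably these and other Hebrew distinctions are encoded.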