Ladino
The dictionary
About Ladino
- Ladino
- Judeo-Espanyol
- Judeo-Spanish
- Judezmo
- Espanyol (our Spanish)
- Haketia (in Morocco)
Origin of Ladino
- Romance language based on Old Spanish and many other languages:
  - Castilian Spanish
  - Portuguese
  - Aragonese
  - Catalan
  - Italian
  - French
  - Turkish
  - Greek
  - Bulgarian
  - Serbo-Croatian
  - Arabic
  - Hebrew
  - ...
Expulsion of Sefardic Jews
- 1492 from Spain
- 1496 from Portugal as well
- Spread to many places in Western Europe, North Africa and the Ottoman Empire
Writing system
- Latin letters
- Rashi script
- Solitreo
- See Ladinotype
Who am I and how I learned Ladino
- Gabor Szabo
- Programmer (test automation, DevOps)
- Hungary => Israel
- Learning modern Spanish for 3.5 years (Duolingo, italki, eBooks, podcasts, YouTube videos).
- Learning Ladino for 6 months.
Goals
- Be able to communicate with other Ladino speakers.
- Help others with the same goal.
- Help preserve the language and the culture.
My site about Ladino
LibreLingo
LibreTranslate
- LibreTranslate
- Needs a huge corpus of text in two languages.
Ladino communities
Ladino academy
- Ladino academy (2018)
Existing Ladino dictionaries
The dictionary
Quick overview of the technology
The source of the dictionary
- Source: an Excel file created by Güler Orgun, Ricardo Portal and Antonio Ruiz Tinoco (2003-2009)
- Contributed to Ladinokomunita
- Converted to YAML files automatically, then cleaned up manually
YAML representation
- Different spellings of the same word
- Different words for the same thing (e.g. in different locations)
- Gender
- Singular/plural
- Conjugation of verbs (regular, irregular) (we have 600 verbs now)
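As a rough illustration of what a single entry might hold, here is a sketch of the Python structure that loading one YAML file could produce. All field names (`versions`, `ladino`, `alternative-spelling`, `plural`, and so on) are assumptions made for this example, not necessarily the schema the dictionary really uses.

```python
# Hypothetical shape of one dictionary entry after its YAML file is loaded.
# Every field name below is an assumption made for illustration.
entry = {
    "grammar": "noun",
    "gender": "masculine",
    "versions": [
        {
            "ladino": "dia",
            "alternative-spelling": ["diya"],  # different spellings of the same word
            "number": "singular",
            "plural": "dias",
            "english": ["day"],
        },
    ],
}


def all_spellings(entry):
    """Return every spelling under which the entry should be findable."""
    spellings = []
    for version in entry["versions"]:
        spellings.append(version["ladino"])
        spellings.extend(version.get("alternative-spelling", []))
    return spellings


print(all_spellings(entry))
```

Keeping the alternative spellings next to the main form lets a search for either spelling land on the same entry.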
config.yaml
- categories
- acceptable values of certain fields (origins, grammar, gender, etc.)
- lists of words
- list of extra pages (in markdown format)
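A minimal sketch of how lists of acceptable values could be used to check an entry. The `config` dictionary stands in for the loaded config.yaml; its keys and values, and the `validate` function, are hypothetical.

```python
# Hypothetical content of config.yaml after loading: field name -> acceptable values.
config = {
    "gender": ["masculine", "feminine"],
    "grammar": ["noun", "verb", "adjective"],
    "origin": ["Spanish", "Turkish", "Hebrew"],
}


def validate(entry, config):
    """Return a list of error messages; an empty list means the entry is valid."""
    errors = []
    for field, allowed in config.items():
        # Only check fields that the entry actually has.
        if field in entry and entry[field] not in allowed:
            errors.append(f"Invalid value {entry[field]!r} in field {field!r}")
    return errors


print(validate({"grammar": "noun", "gender": "masculine"}, config))
print(validate({"grammar": "noun", "gender": "neuter"}, config))
```

Collecting all the errors instead of stopping at the first one makes it possible to report every problem in one run.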
Processing in Python
- Read the config file and all the YAML files.
- Verify that the fields have correct values (from the list of valid values).
- Verify certain other things.
- Generate HTML files for each word.
- Generate JSON files to be used by the front-end.
- Generate some extra files.
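The generation steps above could look roughly like the following sketch. The function name `generate`, the output layout (`words/<word>.html`, `dictionary.json`) and the simplified entry fields are all assumptions; the sketch also assumes every word is safe to use as a file name.

```python
import html
import json
import tempfile
from pathlib import Path


def generate(entries, out_dir):
    """Write one HTML page per word and one JSON index for the front-end."""
    out = Path(out_dir)
    (out / "words").mkdir(parents=True, exist_ok=True)
    index = {}
    for entry in entries:
        word = entry["ladino"]
        index[word] = entry["english"]
        # Escape the text so that special characters cannot break the page.
        page = f"<h1>{html.escape(word)}</h1>\n<p>{html.escape(entry['english'])}</p>\n"
        (out / "words" / f"{word}.html").write_text(page, encoding="utf-8")
    # ensure_ascii=False keeps non-ASCII letters readable in the JSON file.
    text = json.dumps(index, ensure_ascii=False, sort_keys=True, indent=2)
    (out / "dictionary.json").write_text(text, encoding="utf-8")
    return index


# Hypothetical, simplified entries.
entries = [
    {"ladino": "dia", "english": "day"},
    {"ladino": "djueves", "english": "Thursday"},
]

with tempfile.TemporaryDirectory() as tmp:
    index = generate(entries, tmp)
    generated = sorted(p.name for p in Path(tmp).rglob("*") if p.is_file())

print(generated)
```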
CI/CD
- Pushing to any of the repositories runs some local tests.
- This triggers the action of the "generated" repo.
- GitHub Pages gets updated.
Front-end
- jQuery
- The main page only (and the Konfig and the Game)
Tests for the code
- Using a few real words.
- Using a few YAML files with specific issues.
- Comparing the results to the previous runs.
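Comparing the results to previous runs can be done with a small helper like the one sketched here. The function name `check_against_previous` and the `update` flag are hypothetical; this shows the general technique, not necessarily how the project's tests are written.

```python
import json
import tempfile
from pathlib import Path


def check_against_previous(result, expected_file, update=False):
    """Compare `result` with the stored output of a previous run.

    On the first run (or when `update` is true) the result is saved and
    accepted; afterwards any difference makes the check fail.
    """
    expected_path = Path(expected_file)
    current = json.dumps(result, ensure_ascii=False, sort_keys=True, indent=2)
    if update or not expected_path.exists():
        expected_path.write_text(current, encoding="utf-8")
        return True
    return expected_path.read_text(encoding="utf-8") == current


with tempfile.TemporaryDirectory() as tmp:
    expected_file = Path(tmp) / "expected.json"
    first = check_against_previous({"dia": "day"}, expected_file)       # saves the file
    same = check_against_previous({"dia": "day"}, expected_file)        # unchanged output
    changed = check_against_previous({"diya": "day"}, expected_file)    # output differs

print(first, same, changed)
```

When a change in the output is intentional, running once with `update=True` records the new expected result.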
Issues to handle
- Conjugations of verbs
- Multi-word expressions? "me ambezo"
- Connection between words (e.g. plural of ..., conjugation of ...)
- Dictionary is getting big (1 MB)
- Make it easy for people to suggest changes?!
Questions
- Which words to include? (today: enlase, atadijo, link)
- Which spelling(s) to include? (dia, diya) (djueves, djugeves, djugueves)
- Which one(s) to recognize and which ones to recommend?
- When is a word "part of the language"?
Q&A session
- Thank you - Questions?