Here are my slides, which build up the case for why the social data that users add to learning resources matters for discovering those resources in a multilingual context. This, of course, also has ramifications for the kinds of datasets I would like to see in the future!
(Oops, I don't seem to be able to embed content here!)
For example, to pursue this type of work, it would be important for the dataset to indicate:
- the country of the user and the languages he/she speaks (including mother tongue)
- the origin of the learning resource (which can be inferred from the provider) and its language(s)
- the id, etc. of each tag, but also the language of the tag
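
To make the wish list above a bit more concrete, here is a rough sketch of what such records could look like. Everything in it is my own assumption for illustration: the class names, the field names, the use of short language and country codes, and the helper `is_cross_language_tag` are hypothetical and do not come from any existing dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    user_id: str
    country: str                      # assumed: a short country code such as "FI"
    mother_tongue: str                # assumed: a short language code such as "fi"
    spoken_languages: List[str] = field(default_factory=list)


@dataclass
class Resource:
    resource_id: str
    provider: str                     # the origin can be inferred from the provider
    origin_country: str
    languages: List[str] = field(default_factory=list)


@dataclass
class Tag:
    tag_id: str
    text: str
    language: str                     # the language of the tag itself
    user_id: str
    resource_id: str


def is_cross_language_tag(tag: Tag, resource: Resource) -> bool:
    """Hypothetical helper: True when the tag is written in a language
    that is not one of the resource's own languages."""
    return tag.language not in resource.languages


# Made-up example records, only to show how the fields fit together.
user = User("u1", "FI", "fi", ["fi", "en"])
resource = Resource("r1", "hypothetical-provider", "HU", ["hu"])
tag = Tag("t1", "matematiikka", "fi", user.user_id, resource.resource_id)

print(is_cross_language_tag(tag, resource))
```

With fields like these in place, questions such as "which tags were added in a language other than that of the resource?" become a simple filter over the tag records.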