Interactions of cultures and top people of Wikipedia in 24 language editions

by Clément Sire - 5 February 2015

Wikipedia is a huge global repository of human knowledge that can be leveraged to investigate how cultures are intertwined. With this aim in mind, two scientists at the LPT (Young-Ho Eom, postdoc, and Dima Shepelyansky, Research Director at the CNRS) and their collaborators in Barcelona and Milano have applied two methods (Markov chains and the Google matrix) to the analysis of the hyperlink networks of 24 Wikipedia language editions, and ranked all their articles using Google’s PageRank algorithm as well as the 2DRank and CheiRank algorithms (since 2009, LPT scientists have been particularly involved in the development of the latter two).
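To make the ranking step concrete, here is a minimal sketch of PageRank and CheiRank computed by power iteration on the Google matrix of a small directed network. It is an illustration under stated assumptions, not the code used in the study: the toy link list, the function names, and the damping factor `alpha = 0.85` (a conventional choice) are all assumptions, and 2DRank, which combines the two rankings, is not implemented here. CheiRank is obtained as PageRank on the same network with every link direction inverted.

```python
def google_matrix_rank(links, n, alpha=0.85, tol=1e-12, max_iter=1000):
    """Stationary vector of the Google matrix built from directed links.

    links: iterable of (source, target) pairs over nodes 0..n-1.
    alpha: damping factor (0.85 is assumed here as a conventional value).
    """
    out = [[] for _ in range(n)]
    for src, dst in links:
        out[src].append(dst)

    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # Nodes without out-links spread their probability uniformly.
        dangling = sum(p[i] for i in range(n) if not out[i])
        new = [(1.0 - alpha) / n + alpha * dangling / n] * n
        for i in range(n):
            if out[i]:
                share = alpha * p[i] / len(out[i])
                for j in out[i]:
                    new[j] += share
        converged = sum(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return p


def pagerank(links, n, alpha=0.85):
    # High score: many (important) articles point to this node.
    return google_matrix_rank(links, n, alpha)


def cheirank(links, n, alpha=0.85):
    # Same computation on the network with inverted link directions.
    return google_matrix_rank([(dst, src) for src, dst in links], n, alpha)


# Hypothetical toy network of 4 articles.
links = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
pr = pagerank(links, 4)
cr = cheirank(links, 4)
pr_order = sorted(range(4), key=lambda i: -pr[i])
cr_order = sorted(range(4), key=lambda i: -cr[i])
```

Roughly speaking, PageRank favours articles with many incoming links, while CheiRank favours articles with many outgoing links.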

Using automatic extraction of people's names (together with their place and date of birth), they have obtained the top 100 historical figures for each edition and for each algorithm, and have investigated their spatial, temporal, and gender distributions as a function of their cultural origins. This study demonstrates not only a skew toward local figures, mainly recognized only in their own culture, but also the existence of global historical figures appearing in a large number of editions. The scientists have analyzed the evolution of such figures through 35 centuries of human history for each language, thus recovering the interactions and entanglement of cultures over time. They have also obtained the distributions of historical figures over the countries of the world, highlighting geographical aspects of cross-cultural links. Treating historical figures who appear in multiple editions as interactions between cultures, they have constructed an interaction network of cultures and identified the most influential cultures according to this network (and the ranking algorithms).
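One possible way to turn such top lists into a network of cultures can be sketched as follows. This is a hedged illustration only: the edition codes, the placeholder figure names, the `origin` mapping, and the choice of drawing a weighted link from an edition to the culture of each foreign figure in its top list are all assumptions made for this example, not details taken from the study.

```python
from collections import Counter

# Hypothetical toy input: the top-ranked figures of each edition, and the
# culture each figure is attributed to (for example via place of birth).
top_figures = {
    "EN": ["fig_a", "fig_b", "fig_c"],
    "FR": ["fig_b", "fig_d", "fig_a"],
    "DE": ["fig_e", "fig_b", "fig_d"],
}
origin = {
    "fig_a": "EN",
    "fig_b": "DE",
    "fig_c": "EN",
    "fig_d": "FR",
    "fig_e": "DE",
}


def culture_links(top_figures, origin):
    """Weighted links (edition, culture): how many top figures of an
    edition are attributed to another culture (assumed convention)."""
    weights = Counter()
    for edition, figures in top_figures.items():
        for fig in figures:
            culture = origin[fig]
            if culture != edition:  # local figures create no link
                weights[(edition, culture)] += 1
    return weights


def global_figures(top_figures, min_editions=2):
    """Figures that appear in the top list of several editions."""
    counts = Counter(f for figs in top_figures.values() for f in set(figs))
    return sorted(f for f, c in counts.items() if c >= min_editions)


weights = culture_links(top_figures, origin)
shared = global_figures(top_figures)
```

The resulting weighted, directed links could then be ranked with the same kind of algorithms that are applied to the article networks.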

This approach makes it possible to analyze interactions of cultures on purely mathematical and statistical grounds, thereby excluding any bias that could have resulted from the cultural preferences of the investigators. Note that the algorithms mentioned above are intimately and formally related (or “entangled”!) to the study of the quantum motion of electrons in a random medium. In 2009, this connection initially motivated the Quantum Coherence (Quantware) group at the LPT (led by one of the authors of the present study) to address problems in the field of (mostly human) complex networks.

The resulting global list of the top 100 influential people has a 43% overlap with Hart's list of historical figures. All top historical figures are listed on the Quantware webpage dedicated to this study, along with maps illustrating their density distribution over the globe (see also the original article).

Reference: Y.-H. Eom (postdoc LPT), P. Aragon (Barcelona), D. Laniado (Barcelona), A. Kaltenbrunner (Barcelona), S. Vigna (Milano), and D. L. Shepelyansky (LPT), Interactions of cultures and top people of Wikipedia from ranking of 24 language editions, PLoS ONE 10(3): e0114825 (2015) (link to the article in open access)

  • Dima Shepelyansky is the project leader of the European FET Open project New tools and algorithms for directed network analysis (NADINE), which partly supported this work.