Language is a medium of scientific communication, and the language distribution of scientific communication reflects the global distribution of scientific power. Based on scientific tweets, this study reveals the language distribution in informal scientific communication and compares it with the language distribution of the scientific literature, in order to help understand the function and influence of major languages in informal scientific communication in major countries around the world.
First, bibliographic data were collected from Scopus for scientific publications in all languages published in June 2015, including country, discipline, citation count, title and DOI. Second, a Python program was used to match these records against the Altmetric.com dataset (October 2011 to June 2016) by DOI and title. The scientific tweets mentioning these publications were then collected, including geographic coordinates and the full text of each tweet. Third, descriptive statistics were used to analyze the data and compare the distributions.
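The matching step can be illustrated with a minimal sketch. It assumes each record is a dictionary with "doi" and "title" keys and that a DOI match takes priority over a title match; the field names, the normalization rules and the function names are illustrative assumptions, not the program used in the study.

```python
import re


def normalize_doi(doi):
    """Lowercase a DOI and strip common resolver prefixes (assumed rule)."""
    if not doi:
        return None
    doi = doi.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    return doi or None


def normalize_title(title):
    """Lowercase a title, drop punctuation, collapse whitespace (assumed rule)."""
    if not title:
        return None
    title = re.sub(r"[^\w\s]", " ", title.lower())
    title = " ".join(title.split())
    return title or None


def match_records(scopus_records, altmetric_records):
    """Pair each Scopus record with an Altmetric record, by DOI first,
    falling back to the normalized title when no DOI match is found."""
    by_doi = {}
    by_title = {}
    for rec in altmetric_records:
        doi = normalize_doi(rec.get("doi"))
        title = normalize_title(rec.get("title"))
        if doi:
            by_doi.setdefault(doi, rec)      # keep the first record per DOI
        if title:
            by_title.setdefault(title, rec)  # keep the first record per title

    matches = []
    for pub in scopus_records:
        doi = normalize_doi(pub.get("doi"))
        hit = by_doi.get(doi) if doi else None
        if hit is None:
            title = normalize_title(pub.get("title"))
            hit = by_title.get(title) if title else None
        if hit is not None:
            matches.append((pub, hit))
    return matches
```

Exact title matching after normalization is the simplest fallback; a real pipeline might need fuzzy matching to handle translated or truncated titles.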
According to Scopus, 183 699 scientific publications in 25 languages were published in June 2015. These publications received 451 982 scientific tweets. 1) In terms of the overall distribution of scientific tweets, English has the highest share. Scientific tweets are concentrated in English (91%), Japanese (2.4%) and Spanish (1.7%). By comparison, scientific publications are concentrated in English (94.2%), Chinese (4.3%) and Turkish (0.4%), and the concentration has become more skewed than before. 2) At the disciplinary level, the language distribution of informal scientific communication differs across disciplines, reflecting the different levels of attention that speakers of different languages pay to different disciplines. By comparison, the language distribution of scientific publications at the disciplinary level reflects the advantage that certain countries hold in certain disciplines. 3) At the country level, English has undoubtedly become the lingua franca of informal scientific communication. In countries around the world, regardless of whether the native language is English, English scientific tweets are dominant and native-language scientific tweets rank second, with the exception of Saudi Arabia. In every country, the top three languages account for over 95% of scientific tweets. The distribution also shows some scientific and cultural differences; in Japan, for example, Japanese scientific tweets have a share similar to that of English scientific tweets. 4) In terms of the language of the publications themselves, scientific tweets are concentrated on publications in English, German, Japanese, French, Portuguese and Spanish, while Chinese and Turkish publications, despite their comparatively high share of the literature, receive no scientific tweets, indicating that publications in these two languages attract little attention in informal scientific communication.
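The percentage shares reported above are simple descriptive statistics. The sketch below shows one way such shares could be computed, assuming each tweet (or publication) already carries a language label. The abstract does not state how tweet language was identified, so the labels and the function name here are hypothetical.

```python
from collections import Counter


def language_shares(languages):
    """Return (language, percentage) pairs, largest share first.

    `languages` is an iterable of language labels, one per tweet or
    publication (an assumed input format).
    """
    counts = Counter(languages)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(lang, 100.0 * n / total) for lang, n in counts.most_common()]
```

Applying the same function to the subset of tweets from one country or one discipline gives the country-level and disciplinary-level distributions.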
Although scientific tweets are an important form of informal scientific communication on Twitter, they cannot represent the informal scientific communication of all scientists, and other forms of scientific communication also exist on Twitter. Because China restricts the use of Twitter, China is excluded from the country-level analysis of language distribution.
The study quantitatively measures the language distribution of informal scientific communication. Through detailed statistical analysis at the country, disciplinary and language levels, the author shows that English has also become the lingua franca of informal scientific communication. 3 figs. 3 tabs. 15 refs.