Research Support | The Coronavirus Corpus
2024-11-10 16:08

Introduction:

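The Coronavirus Corpus is intended to be a record of the social, cultural, and economic impact of the coronavirus (COVID-19) in 2020 and beyond. It shows what people are actually saying in online newspapers and magazines in 20 different English-speaking countries. The corpus (first released in May 2020) currently contains about 996 million words, and it continues to grow by 3-4 million words each day.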

 

The corpus lets you see the frequency of words and phrases in 10-day increments since January 2020, such as social distancing, flatten the curve, WORK * home, Zoom, Wuhan, hoard*, toilet paper, curbside, pandemic, reopen, defy, anti-mask*.

 

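You can also look at collocates (nearby words) to see what is being said about a topic, such as verbs near virus, or any word near ban, stockpile, disinfect, or remotely. You can also see the collocates of a word (for example, stockpile) in each 10-day period since January 2020.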

 

The corpus also lets you see the patterns in which a word occurs, as with stay-at-home, social, economic, or hoard*.

 

You can also compare different time periods to see how people's views have changed over time, and you can compare between the 20 countries in the corpus.

 

The Coronavirus Corpus is designed to be the definitive record of the social, cultural, and economic impact of the coronavirus (COVID-19) in 2020 and beyond.

Unlike resources like Google Trends (which just show what people are searching for), the corpus shows what people are actually saying in online newspapers and magazines in 20 different English-speaking countries.

The corpus (which was first released in May 2020) is currently about 1457 million words in size, and it continues to grow by 3-4 million words each day.

The Coronavirus Corpus allows you to see the frequency of words and phrases in 10-day increments since Jan 2020, such as social distancing, flatten the curve, WORK * home, Zoom, Wuhan, hoard*, toilet paper, curbside, pandemic, reopen, defy, anti-mask*.
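To make the idea of 10-day increments concrete, here is a minimal Python sketch of how a phrase could be counted per 10-day period. It is only an illustration, not how the corpus itself is implemented: the function names `ten_day_bin` and `phrase_frequency`, the choice of 1 January 2020 as the starting date, and the sample texts are all assumptions made for this example. It also uses plain substring matching and raw counts, not the corpus's own search syntax (wildcards such as hoard* or lemma searches such as WORK).

```python
from collections import Counter
from datetime import date


def ten_day_bin(d: date, start: date = date(2020, 1, 1)) -> int:
    """Return the index of the 10-day period containing d (0 = first period).

    The start date is an assumption for this sketch.
    """
    return (d - start).days // 10


def phrase_frequency(docs, phrase):
    """Count case-insensitive occurrences of phrase in each 10-day period.

    docs is an iterable of (publication_date, text) pairs.
    """
    counts = Counter()
    needle = phrase.lower()
    for published, text in docs:
        counts[ten_day_bin(published)] += text.lower().count(needle)
    return dict(sorted(counts.items()))


# Made-up sample texts, for illustration only.
docs = [
    (date(2020, 1, 5), "Officials in Wuhan reported new cases."),
    (date(2020, 3, 20), "Social distancing is now required. Social distancing works."),
    (date(2020, 3, 25), "Shoppers hoard toilet paper despite social distancing."),
]

freq = phrase_frequency(docs, "social distancing")
print(freq)
```

Each key in the result is a period index counted from the assumed start date, and each value is the raw number of matches in that period. A fuller version would likely normalize by the number of words published in each period so that periods of different size can be compared.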

You can also look at "collocates" (nearby words) to see what is being said about a certain topic, such as (verbs near) virus, or any word near ban (v), stockpile, disinfect*, or remotely. And you can even see the collocates of a word in each 10-day period since Jan 2020 (e.g. stockpile).
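As a rough illustration of what "collocates" means, the following Python sketch counts the words that appear within a few tokens of a given word. It is a simplified sketch under stated assumptions, not the corpus's actual method: the function name `collocates`, the `window` parameter, the simple regular-expression tokenizer, and the sample sentences are all hypothetical, and no part-of-speech filtering (such as "verbs near virus") is attempted.

```python
import re
from collections import Counter


def collocates(texts, node, window=4):
    """Count words occurring within `window` tokens of `node` in each text."""
    counts = Counter()
    for text in texts:
        # Very simple tokenizer (an assumption): lowercase runs of letters/apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token == node:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


# Made-up sample sentences, for illustration only.
texts = [
    "Shoppers stockpile toilet paper and food.",
    "Do not stockpile food, officials said.",
    "People stockpile food and masks.",
]

top = collocates(texts, "stockpile", window=2)
print(top.most_common(3))
```

Running the same function separately on the texts from each 10-day period would give a per-period view of a word's collocates, similar in spirit to the stockpile example above.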

The corpus also allows you to see the patterns in which a word occurs, as with stay-at-home, social, economic, or hoard*.

You can also compare between different time periods, to see how our view of things has changed over time. (And you can even compare between the 20 countries in the corpus.) Interesting comparisons over time might include phrases like social * or economic * that were more common in Jan-Feb than in Apr-May, or words near BAN or OBEY that were more common in Apr-May than in Jan-Feb.

Click on any of the links in the search form to the left for context-sensitive help, and to see the range of queries that the corpus offers (LIST discusses the search syntax). You might pay special attention to the virtual corpora, which allow you to create personalized collections of texts related to a particular area of interest.

Finally, the corpus is related to many other corpora of English that we have created. These corpora were formerly known as the "BYU Corpora", and they offer unparalleled insight into variation in English.

 

