Microsoft Deletes the World's Largest Facial Recognition Database
By Fortune
2019-06-20 15:00
Until April, Microsoft boasted of having the largest collection of faces that anyone could use to train facial-recognition algorithms. Since then, the once publicly available dataset has quietly disappeared.
As the Financial Times reports, Microsoft quietly deleted the dataset after the paper called attention to privacy and ethical issues, including use of the dataset by military researchers and Chinese surveillance firms.
Microsoft did not immediately respond to a request for comment from Fortune. But it told the Financial Times: “The site was intended for academic purposes. It was run by an employee that is no longer with Microsoft and has since been removed.”
The now-deleted dataset contained more than 10 million faces culled from websites like Flickr, which host photographs uploaded under a Creative Commons license—meaning many can be used free of copyright concerns.
The name of the Microsoft dataset, MS Celeb, was chosen because many of the images it contains depict famous people who live public lives. Many of the other faces in the set, however, belong to people who are not celebrities—including journalists and privacy researchers—and who were not aware their images had been included.
Microsoft is hardly the only company to assemble large datasets by scraping photos from the open Internet. In January, IBM announced it was sharing a collection of 1 million faces in the name of promoting more diversity in artificial intelligence. Meanwhile, a website called Megapixels identifies several other massive collections as part of a bid to halt what it describes as a “growing crisis of authoritarian biometric surveillance.”
While many of the facial recognition sets are culled from public websites like Flickr, that is not the only way companies obtain pictures of faces. As a recent Fortune investigation revealed, startups have been using photo collection apps to surreptitiously collect millions of faces, while other companies have been scanning public collections of mug shots.
Translator: 能貓