Facebook has announced improvements to the artificial intelligence (AI) technology it uses to generate descriptions of photos posted on the social network for visually impaired users.
The technology, called automatic alternative text (AAT), was built to improve the experience of visually impaired users. With AAT, visually impaired users have been able to hear descriptions such as "Image may contain: three people, smiling, outdoors."
Facebook first introduced AAT in 2016; before then, visually impaired users who came across an image while checking their Facebook News Feed would hear only the word "photo" and the name of the person who shared it.
With the latest iteration of AAT, the company said it has expanded the number of concepts the technology can detect and identify in a photo, and it now provides more detailed descriptions covering activities, landmarks, food types, and kinds of animals, for example "a selfie of two people, outdoors, the Leaning Tower of Pisa" instead of "an image of two people". The number of concepts the technology can recognise has grown from 100 to more than 1,200, which the company attributes to training the model with weakly supervised learning on samples it claims are both more accurate and culturally and demographically inclusive.
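For readers curious what a weakly supervised, multi-label setup of this kind looks like in practice, the sketch below shows the general pattern in PyTorch: each image carries a set of noisy concept labels (for instance, derived from hashtags), and the model scores every concept independently with a per-concept sigmoid. The backbone, dataset, label indices, and hyperparameters here are all hypothetical; Facebook has not published its training code, so this is illustrative only.

import torch
import torch.nn as nn
import torchvision.models as models

NUM_CONCEPTS = 1200  # AAT's expanded label space, per the announcement

# A standard image backbone with a multi-label head (hypothetical choice).
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_CONCEPTS)

# BCEWithLogitsLoss treats each concept as an independent yes/no decision,
# which is what "multi-label" means here: several concepts can be present
# in the same photo at once.
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One illustrative training step on a fake batch: images plus a binary
# vector marking which concepts the (noisy) labels say are present.
images = torch.randn(8, 3, 224, 224)
targets = torch.zeros(8, NUM_CONCEPTS)
targets[:, [3, 42]] = 1.0  # e.g. "people" and "outdoors" (made-up indices)

optimizer.zero_grad()
logits = model(images)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()

The key property of this setup is that the labels never have to be exhaustive or perfectly clean; the loss simply pushes each concept's score toward whatever evidence the weak labels provide, which is what makes training at the scale of billions of images feasible.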
The company further said that, in order to provide more information about position and count, it trained its two-stage object detector using Detectron2, an open-source object detection platform developed by Facebook AI Research (FAIR). "We trained the models to predict locations and semantic labels of the objects within an image. Multi-label/multi-dataset training techniques helped make our model more reliable with the larger label space," the company said.
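The sketch below shows how position and count information could be extracted from a two-stage detector using Detectron2's public model-zoo API with a standard Faster R-CNN configuration. Facebook's production model, label space, and confidence thresholds are not public, so the config choice, the input file name, and the 0.5 threshold are assumptions.

from collections import Counter

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor

# Load a stock two-stage detector (Faster R-CNN) from the model zoo.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # keep confident detections only

predictor = DefaultPredictor(cfg)
image = cv2.imread("photo.jpg")  # hypothetical input image
outputs = predictor(image)

# Each detection carries a bounding box (location) and a class id
# (semantic label), exactly the two predictions the quote describes.
instances = outputs["instances"].to("cpu")
classes = MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes
labels = [classes[i] for i in instances.pred_classes.tolist()]

# Counting detections per label yields the raw material for phrases
# like "two people" in a generated description.
for label, count in Counter(labels).items():
    print(f"{count} x {label}")

Because the detector returns a box for every instance rather than a single image-level tag, a description system can report both how many objects of each type appear and roughly where they are in the frame.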