Oxylabs employs AI to detect harmful images
A unique AI-driven web scraping tool created by Oxylabs helps the Communications Regulatory Authority of Lithuania (RRT) detect illegal online content involving child sexual abuse material or pornography. During the first months of its use, 19 websites were identified as violating national or EU laws, 11 complaints were registered with the Inspector of Journalist Ethics, 8 police reports were filed, and 2 pre-trial investigations were started.
Oxylabs, a leading provider of public web data gathering solutions, created the tool for the Lithuanian institution pro bono to support its mission of making the internet cleaner and safer. The dedicated tool automatically scans Lithuania’s IP address space in search of potentially harmful images. Suspicious content is then forwarded to the hotline for RRT specialists to review.
“While it’s physically impossible to monitor the whole country’s web space manually, web scraping technology makes it easier. The RRT specialists can then move straight to the analysis part and take prompt actions against violators. This is a perfect illustration of how web scraping, mostly used by businesses, can also benefit the public sector in their wider societal goals”, says Juras Juršėnas, COO at Oxylabs.
In the first two months of its use, the AI-driven web scraping tool scanned more than 288,000 Lithuanian websites. After careful investigation of the reported images, 19 websites were identified as violating national or EU laws, and specific punitive measures were taken.
The Communications Regulatory Authority of Lithuania (RRT) is a national institution regulating the electronic communications, postal, and rail markets under European Union directives and the laws of the Republic of Lithuania. One of RRT’s missions is safeguarding the internet from illegal or harmful content. To detect violators, RRT has long relied on a special internet hotline, “Clean internet”, where regular internet users voluntarily report illicit content they stumble upon while browsing.
The Oxylabs-created tool reports the detected harmful images to the same hotline, but it allows RRT to be more proactive in the process. The tool monitors the web in the background, so the reports are constant and depend far less on the changing habits of volunteers.
“Voluntary reports through the hotline are the most common way worldwide to collect complaints. This measure is extremely valuable for us too. However, we do not have to depend on such reports fully anymore and can take the front seat in the process. We do hope that the AI-based image recognition system, used together with the hash comparison method, will help make it easier to identify illegal content, especially in cases when illegal images are not yet included in known illegal content hash databases”, says Vaidotas Ramonas, Director of the Digital Services Department at RRT.
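For technically minded readers, the combination of hash comparison and image recognition described above can be illustrated with a minimal sketch. It is not the tool's actual implementation, which has not been published: the function names, the use of SHA-256, the score threshold, and the classifier interface are all assumptions made for illustration. An exact cryptographic hash only matches byte-identical files, which is one reason a recognition model is useful for images that are not yet in any hash database.

```python
import hashlib
from typing import Callable, Set


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the raw image bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def triage_image(
    data: bytes,
    known_hashes: Set[str],
    classifier_score: Callable[[bytes], float],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the source
) -> str:
    """Decide what to do with one image (illustrative logic only).

    1. If the hash is already in the known-content database, report a match.
    2. Otherwise ask a classifier (hypothetical, supplied by the caller) and
       flag the image for human review if its score reaches the threshold.
    3. Otherwise take no action.
    """
    if sha256_hex(data) in known_hashes:
        return "known_match"
    if classifier_score(data) >= threshold:
        return "flag_for_review"
    return "no_action"
```

In this sketch, anything flagged by the classifier still goes to human review, mirroring the way suspicious content is forwarded to RRT specialists rather than acted on automatically.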
While first tested in Lithuania, the AI-based tool could easily be replicated in other countries. RRT plans to share the experience of using it with their partner institutions in other countries.
Oxylabs entered into a pro bono partnership with RRT after winning a govtech hackathon, where RRT challenged the participants to create an automated tool to help it in its mission. The tool was developed in several weeks and then rigorously tested, trained, and continually improved. As of 2022, RRT has fully employed the tool in its daily operations.
Oxylabs sees employing web scraping technology for the greater good as part of its mission. The company has previously partnered with numerous universities on pandemic research. For information on pro bono partnership opportunities, please contact email@example.com.