Likes, shares and drug deals: WVU researchers create model that detects illicit drug trafficking on social media

An illustration of an orange pill bottle with an Instagram post on the label. The Instagram post shows a pile of blue and white pills.

WVU researchers have developed a model that detects drug trafficking on social media platforms. (WVU Illustration/Sheree Wentz)

Social media can be much more than political rants and snapshots of scrumptious meals or furry friends.

West Virginia University researchers have found that social networking platforms can serve as a direct-to-consumer marketing tool for drug dealers to sell illicit drugs.

Professor Xin Li, in the Lane Department of Computer Science and Electrical Engineering, and Chuanbo Hu, post-doctoral fellow, said that detection of online illicit drug trafficking has become critical to combat the drug trade on cyber platforms.

“Illicit drug trades have co-evolved with technology in the past decades,” Li said. “Social media has become a tool exploited by not only common people but also drug dealers.”

One popular service drug dealers have exploited is Instagram, which Li said replaced Twitter as the primary platform for illicit drug trafficking around 2019. Compared to other social media platforms, including TikTok, the algorithms associated with Instagram allow more personalized content to be aimed directly at people expressing interest in certain posts and hashtags.

When someone follows a dealer’s account or likes a dealer’s post, the Instagram algorithm fills that person’s feed with more posts advertising drugs.

To combat this, Li, Hu and their team conducted the first systematic study on fine-grained detection of illicit drug trafficking events (IDTEs) on Instagram. They proposed a deep multimodal multilabel learning approach to detect IDTEs and demonstrated its effectiveness on a newly constructed dataset called multimodal IDTE.

This model takes text and image data as the input and combines multimodal information to predict multiple labels of illicit drugs. Their research was presented at an international conference in late 2021.

“The multimodal-IDTE is the designed AI model to automatically detect illicit drug trafficking on Instagram,” Hu said. “Compared with previous work, this method fully considers the drug trafficking clues provided by images and text and realizes the fine-grained classification of drug trafficking events.”

According to Li and Hu, drug dealers generally attach hashtags to their Instagram posts to extend their reach and engage their audience. The dealers almost always link many drug-related hashtags to improve the visibility of their posts.

Accurate detection of IDTEs from social media has become increasingly challenging due to inconsistent drug legislation, the vastness of social media and the ambiguity in what drugs are being posted and for what reason.

“They promote illicit drug trades in two ways: by posting a message, like an ad for a certain drug, and by replying to an existing post,” Li said. “They use slang, street names of drugs, or other ways like misspelling, to evade being caught.”

“Some drug dealer accounts never post images, but only comment on some hot posts to improve the visibility of their ads,” Hu said.

Li said detecting IDTEs is like finding a needle in a haystack due to the enormous size of social media data.

“Drug dealers use various tricks to evade detection,” Li said. “The boundary between IDTE and regular events is not always clear. For example, someone’s grandmother might be using a certain opioid as the prescription drug for pain management.” 

Unlike existing works on drug dealer detection or drug use detection from aggregated information, Li and Hu focus on detecting activities related to suspect IDTEs. Their work also focuses on an approach that detects not only illicit drugs, but also their specific types in each suspect IDTE.

Specifically, Li and Hu’s model takes in text and image data associated with suspect IDTEs and combines the multimodal information to predict multiple labels of an illicit drug.
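To make the idea of combining text and image information into multiple labels concrete, here is a minimal sketch. It is not the researchers' DMML model: it assumes the text and image have already been turned into feature vectors, joins them by simple concatenation, and scores each label independently with a sigmoid so that several labels can be active at once. The label names, weights and the 0.5 threshold are hypothetical placeholders, since the article does not list the nine categories or the model's internals.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label names; the article mentions nine categories but does not list them.
LABELS = ["label_a", "label_b", "label_c"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse(text_features: Sequence[float], image_features: Sequence[float]) -> List[float]:
    """Combine the two modalities by concatenation (an assumed, simple fusion)."""
    return list(text_features) + list(image_features)


def predict_labels(
    fused: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, bool]:
    """Score every label independently, so more than one can be predicted."""
    result = {}
    for label, w, b in zip(LABELS, weights, biases):
        score = sum(wi * xi for wi, xi in zip(w, fused)) + b
        result[label] = sigmoid(score) >= threshold
    return result


# Toy example with made-up features and weights.
fused = fuse([1.0, 0.0], [0.0, 1.0])
weights = [[2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0], [-2.0, 0.0, 0.0, -2.0]]
biases = [0.0, 0.0, 0.0]
predictions = predict_labels(fused, weights, biases)
```

In this toy example the first label is driven by a text feature and the second by an image feature, and both are predicted for the same post. That is the multilabel behavior the article describes. A trained model would learn the features and weights from data rather than take them as fixed inputs.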

“We refined the classification of IDTEs into nine categories, where most of them contain the widely used illegal drugs,” Hu said. “Taking a post with image and caption or comment as an input, the proposed model can learn the drug features based on multimodal analysis to determine if it is illicit drug related and which category it belongs to.”

The researchers have manually constructed a large-scale multimodal IDTE (MM-IDTE) dataset for the purpose of illicit drug detection. The MM-IDTE dataset, containing nearly 4,000 posts and more than 6,000 comments, represents the largest multimodal (text and image) illicit drug detection dataset to date.

To construct such a large-scale dataset, the researchers have designed an automatic data crawling system for Instagram that jointly uses hashtag and image information to guide the data collection.

“Collecting Instagram data manually is an impossible task,” Li said. “My team has developed a data crawling system to automatically download all data (posted texts and images) from Instagram. It collects raw materials to support our data mining research.”

“The automatic data crawling system is designed to collect many training samples, which are very useful to help the model automatically learn discriminative representations for classification,” Hu said. “In other words, the larger and more diverse the data is, the more accurate and robust the model will be.

“This data crawling system automatically retrieves drug dealer posts based on the collected drug-related hashtags,” Hu said. “To improve retrieval efficiency, the system automatically filters out some irrelevant posts through the AI model.”
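The two-step retrieval Hu describes, hashtag matching followed by a relevance filter, can be illustrated with a short sketch. It is not the team's crawler and it does not contact Instagram: it works on captions already held in memory. The seed hashtag, the sample captions and the `is_relevant` function are hypothetical stand-ins, with `is_relevant` taking the place of the AI model that the article says removes irrelevant posts.

```python
import re
from typing import Callable, Iterable, List, Set

# Matches a '#' followed by word characters, capturing the tag without the '#'.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption: str) -> Set[str]:
    """Return the lowercased hashtags found in a caption."""
    return {tag.lower() for tag in HASHTAG_RE.findall(caption)}


def select_candidates(
    captions: Iterable[str],
    seed_hashtags: Iterable[str],
    is_relevant: Callable[[str], bool],
) -> List[str]:
    """Keep captions that share a seed hashtag and pass the relevance filter."""
    seeds = {tag.lower() for tag in seed_hashtags}
    return [
        caption
        for caption in captions
        if extract_hashtags(caption) & seeds and is_relevant(caption)
    ]


# Toy data; the hashtag and the filter rule are placeholders.
captions = [
    "New stock #ExampleTag dm me",
    "Sunset walk #travel",
    "#exampletag museum exhibit",
]
candidates = select_candidates(
    captions,
    seed_hashtags={"exampletag"},
    is_relevant=lambda caption: "museum" not in caption,
)
```

Here the hashtag match narrows three captions to two, and the stand-in filter drops the one that uses the hashtag in an unrelated context, leaving a single candidate.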

The newly-constructed MM-IDTE dataset will be made publicly available to support the research related to illicit drug trafficking activities.

Li and Hu propose a deep multimodal and multilabel learning (DMML) framework to detect illicit drug trafficking events because it enables fine-grained classification of IDTEs while accounting for differences in drug legalization.

“The proposed method can automatically learn discriminative features from multimodal data, and it can detect different drug trafficking patterns based on the proposed comment-based detection unit,” Hu said.

Li and Hu said that their method can successfully identify some challenging cases that are difficult for untrained eyes, such as special symbols and style changes intended to evade detection. The developed system could also help law enforcement disrupt the illicit drug trade.

Li and Hu believe that their research can help identify different types of drug trafficking events on social media platforms.

“Our approach is not limited to Instagram,” Hu said. “It is a general tool to detect illicit drug trafficking events by fusing multimodal data, such as images and text. The reason for this design is to flexibly expand to other platforms.”

“Further extensions into human, natural resources and virtual products trafficking are possible,” Li said.

Minglei Yin, Lane Department of Computer Science and Electrical Engineering doctoral student, joined Li and Hu on the study.

Citation: https://dl.acm.org/doi/pdf/10.1145/3459637.3481908

-WVU-

af/04/06/22

CONTACT: Paige Nesbit
Statler College of Engineering and Mineral Resources
304-293-4135; Paige.Nesbit@mail.wvu.edu
