Algorithm tool works to silence online chatroom sex predators
WEST LAFAYETTE, Ind. — An algorithm tool developed by Purdue Polytechnic Institute faculty will help law enforcement filter out and focus on sex offenders most likely to set up face-to-face meetings with child victims.
The Chat Analysis Triage Tool (CATT) was presented last week by principal investigator Kathryn Seigfried-Spellar, assistant professor of computer and information technology, at the International Association of Law Enforcement Intelligence Analysts Conference in Anaheim, California.
Seigfried-Spellar said law enforcement officers are inundated with cases involving the sexual solicitation of minors. Some offenders are interested in sexual fantasy chats, while others are intent on persuading an underage victim to meet face to face.
CATT allows officers to work through the volume of solicitations, using algorithms to examine a suspect's word usage and conversation patterns. Seigfried-Spellar said the data was taken from online conversations provided voluntarily by law enforcement around the country.
"We went through and tried to identify language-based differences and factors like self-disclosure," she said. Self-disclosure is a tactic in which the suspect tries to develop trust by sharing a personal story, which is usually negative, such as parental abuse.
"If we can identify language differences, then the tool can identify these differences in the chats in order to give a risk assessment and a probability that this person is going to attempt face-to-face contact with the victim," Seigfried-Spellar said. "That way, officers can begin to prioritize which cases they want to put resources toward to investigate more quickly."
Another standout characteristic of sexual predators grooming victims for a face-to-face meeting is that the chats will often go on for weeks or even months until a meeting is achieved. Those involved in sexual fantasy chatting, by contrast, move quickly from one youth to another.
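As a rough illustration of the prioritization idea described above, the sketch below turns two conversation features into a probability-like score and sorts cases by it. It is not CATT's actual model: the feature names, weights, bias and the logistic form are all hypothetical placeholders chosen for illustration, and a real tool would learn its parameters from case data.

```python
import math
from dataclasses import dataclass


@dataclass
class ChatFeatures:
    """Hypothetical per-case summary of a chat (not CATT's real feature set)."""
    case_id: str
    early_self_disclosure: int  # assumed: self-disclosure statements early in the chat
    duration_days: int          # assumed: how long the conversation has run


# Placeholder weights invented for this sketch; they carry no empirical meaning.
WEIGHTS = {"early_self_disclosure": 0.8, "duration_days": 0.05}
BIAS = -3.0


def contact_probability(features: ChatFeatures) -> float:
    """Map the features to a score between 0 and 1 with a logistic function."""
    z = (
        BIAS
        + WEIGHTS["early_self_disclosure"] * features.early_self_disclosure
        + WEIGHTS["duration_days"] * features.duration_days
    )
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(cases: list[ChatFeatures]) -> list[ChatFeatures]:
    """Return cases ordered from highest to lowest score."""
    return sorted(cases, key=contact_probability, reverse=True)


if __name__ == "__main__":
    cases = [
        ChatFeatures("A", early_self_disclosure=0, duration_days=2),
        ChatFeatures("B", early_self_disclosure=3, duration_days=60),
    ]
    for case in prioritize(cases):
        print(case.case_id, round(contact_probability(case), 3))
```

In this toy setup, a long-running chat with early self-disclosure is ranked ahead of a brief one, which mirrors the pattern the researchers describe without claiming to reproduce their findings.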
The project started as a result of a partnership with the Ventura County Sheriff's Department in California.
Seigfried-Spellar said the research discovered that tactics like self-disclosure are used early in a predator's talks with a potential victim.
"Meaning that we could potentially stop a sex offense from occurring because if law enforcement is notified of a suspicious chat quickly enough, CATT can analyze and offer the probability of a face-to-face," she said. "We could potentially prevent a child from being sexually assaulted."
Seigfried-Spellar developed CATT with two co-principal investigators: associate professor Julia Taylor Rayz, who specializes in machine learning and natural language processing, and computer and information technology department head Marcus Rogers, who has an extensive background in digital forensics tool development.
CATT's algorithms examine only the conversation factors and do not currently take the sex of either the suspect or the victim into consideration.
The project began with initial research done by Seigfried-Spellar and former Purdue professor Ming Ming Chiu. The exploratory study examined more than 4,300 messages in 107 online chat sessions involving arrested sex offenders, identifying different trends in word usage and self-disclosure by fantasy and contact sex offenders using statistical discourse analysis.
The trends determined through this research formed the basis for CATT. The research, "Detecting Contact vs. Fantasy Online Sexual Offenders in Chats with Minors: Statistical Discourse Analysis of Self-Disclosure and Emotion Words," has been accepted and will be published in the journal "Child Abuse and Neglect."
Initial plans are to turn the tool over to several law enforcement departments for a test run. Seigfried-Spellar said CATT could be handling data from active cases as early as the end of the year.
The conversation analysis provides the basis for future law enforcement tools as well, she said.
"What if there is a chat online and you don't know if you're chatting with an offender or someone who is 15 years old pretending to be 30," she said. "Maybe then, this tool can analyze the differences in an actual 13-year-old versus someone who is pretending to be 13 or an actual adult versus someone who is pretending to be an adult.
"So, you can then start trying to figure out, language wise, who this person is I'm chatting with."
At some point, she believes CATT could even teach officers to better portray a 10-year-old victim by perfecting constantly changing factors like language, emojis and acronyms.
"In these types of operations, our goal isn't to entrap people," she said. "In these, the offender is initiating, and as they do that, law enforcement is simply responding.
"If officers can respond in a way that speeds up the process, that gets the person off the street sooner compared to waiting eight months to allow a trust relationship to develop."
The CATT project was funded through a grant issued in 2017 from the Purdue Polytechnic Institute.
According to the National Center for Missing and Exploited Children (NCMEC, 2014), online solicitation of minors falls into three categories: 1) sexual, meaning requests to engage in unwanted sexual activities or sexual talk; 2) aggressive, meaning solicitations involving actual and/or attempted offline contact; and 3) distressing, meaning incidents after which youths stated they were afraid. Although there is a general decline in all forms of online solicitation of minors, with only 3% of youth in 2010 reporting an aggressive solicitation (NCMEC, 2014), the FBI estimates that 750,000 adults seek sex with youths daily (Rodas, 2014). In 2015 alone, Internet Crimes Against Children (ICAC) task forces arrested more than 60,000 internet sex offenders (see http://www.ojjdp.gov). Investigating crimes against children, specifically sexual solicitations, is complicated because not all offenders are contact-driven, meaning they want to meet the minor for sex in the physical world; some offenders are instead fantasy-driven, in that they are more interested in cybersex and less interested in meeting the minor in the physical world. The sheer volume of sexual solicitations online makes it difficult for law enforcement to determine whether an offender is contact-driven or fantasy-driven. To help law enforcement prioritize cases in which the offender is more likely to be contact-driven, Seigfried-Spellar and colleagues (2017) identified language-based differences in the online chats between minors and arrested contact-driven and fantasy-driven offenders. Based on these results, they developed a digital forensic tool for the automatic analysis of chats between offenders and minors. This digital forensic tool conducts a language-based analysis to determine the likelihood that the offender is contact-driven vs. fantasy-driven.
It is their hope that this tool will assist law enforcement authorities by enabling them to allocate their limited resources to cases at high risk for contact-driven offenses. This tool is available to law enforcement officers investigating cases involving internet crimes against children and will be presented at this LEIU/IALEIA workshop.
Brian L. Huchel