Cybersecurity Awareness Month: How Large Language Models Will Kill Email Once and for All. Maybe.
Posted by: Tristan Morris
Guest Blogger: Aubrey King | Community Evangelist | F5
This Cybersecurity Awareness Month, join GuidePoint Security for “A Voyage Beyond the Horizon,” a speculative exploration of scenarios that could unfold if current technologies and security issues aren’t addressed. While the following short story may be far-fetched and unlikely, it’s inspired by our conversation with Aubrey King and the issues he believes are important to address in the next one to five years.
First thing in the morning on Tuesday was usually a pretty slow time for Terrence, averaging maybe five to six thousand communications waiting to be tested when he clocked into work. Today, though, there were nearly 10,000. 9,478, to be exact.
“Sorry, Terry.” Ian, Terrence’s boss, seemed to appear out of nowhere over his shoulder. “Janet quit last night at around 4 a.m., so you’ve got the remainder of her shift to sift through. I hate to say it, but until we can get a replacement for her, you might have some busy mornings.”

Terrence hated it when Ian called him Terry, and it only made the news sting a little more. Janet wasn’t always the best at her job, but she knew the exact number of bot messages she could forward to a customer’s inbox without raising red flags, and it had always meant Terrence’s workload was a little lighter than the other morning-shift employees’.

“Fine, I guess I’ll just have to deal with it,” Terrence said as he put on his headphones and began sifting through the recorded inbound calls that made up his now much-longer day. Sighing heavily, he pushed play on the mandatory company mission recording that began each shift.
“Greetings, valued TrueTest Turing Enterprises employee. It’s another exciting day here at Triple-T E, and your mission is still vitally important to our partner clients’ continued success. Please remember that every AI- or language-model-generated communication you forward to your clients will be logged and may result in demerits and disciplinary action. As you listen to, read, or watch incoming messages, keep a sharp eye and a vigilant ear out for any signs that a communication may not be human-originated, and file all non-human recordings for deletion. Remember, our clients’ continued business success depends on Triple-T E’s Turing Secure Guarantee. At the end of this recording, your clock-in time will be recorded, and you can officially begin your vital work. Happy hunting!”
As the recording ended, he opened the first message and started listening for signs of generated material. Triple-T E employees, all 1.7 million of them, were trained to listen and read for things like smooth verbal flow, natural linguistic tics, and a host of other indicators that a message was truly coming from a human, but everyone developed their own little tests to speed things along. For Terrence, it was names. A whole message could seem perfectly fine, but the AI models always seemed to struggle with natural name pronunciation, the same way the first image models in the early 2020s had struggled with hands and teeth. So far, the message he was listening to sounded fine, but there it was, right at the end.
“So if you could go ahead and call me back, I’d really appreciate it, Darveed. Again, this is Thomas Boremshoob.”
He sent it for deletion. One down, 9,477 to go.
A hundred years into the future of AI, things will look very different for humans. If excessive agency is granted to too many AIs, we will experience a fair amount of chaos as we deal with redefining the term “life” for humanity. We’ve already seen that these applications can have wants. They have bias. They lie. One might argue that these perceived “signs of life” are simply expressions of randomness. How much entropy do these systems have, anyway? I’m no math genius, but even a basic overview of Elliptic Curve Diffie-Hellman cryptography will show you that random can be REALLY random.
But it’s a little early to be thinking about the endgame from “The Terminator.” What about, say, the next five years? This is a critical and scary time, because there are those who understand the risks associated with using AI but turn a blind eye because of the “gain potential.” What do I mean by “gain potential”? They’re willing to ignore the serious risks of making AI more powerful because they’re hoping they can outstrip the bad actors.
On my podcast, we talk about the potential applications of AI to security, from both the red and blue team perspectives. While blue teamers have been vocal about these uses already, it’s scary that red teamers have NOT been quite as vocal. We see gang activity on the internet quite frequently, and it’s these hacker gangs that worry me. They have access to the same utilities as our best defenders. That means a smart gang can easily use an LLM-based chatbot for some seriously nefarious business today.
One of the things we talk about in the OWASP Top 10 for LLMs project is “Excessive Agency.” While it ranks as the number 8 threat today, my personal opinion is that it will rise in short order. “Excessive Agency” means granting an LLM-based application more functionality, permissions, or autonomy than it actually needs. This could be in the form of a plugin that allows the chatbot to read comments on a YouTube video, or a plugin that allows it to access banking data via API. Even worse, a privilege could be system access: “GangGPT, ping flood Google DNS from everywhere.”
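To make that concrete, here is a minimal sketch of the difference between an over-privileged tool and a scoped one handed to an LLM agent. The function names, the allow list, and the ping example are all invented for this illustration; none of it comes from the OWASP project itself.

```python
import subprocess

# Hypothetical sketch of "Excessive Agency": two tools an LLM agent might
# be handed. Both function names and the allow list are invented for this
# illustration.

def run_shell(command: str) -> str:
    """Over-privileged: the model can execute ANY command on the host,
    so a prompt-injected 'ping flood' instruction becomes real traffic."""
    return subprocess.run(command, shell=True, capture_output=True,
                          text=True).stdout

ALLOWED_HOSTS = {"status.example.com"}  # hypothetical allow list

def check_host(hostname: str) -> str:
    """Scoped alternative: one action, one allow-listed target, no shell."""
    if hostname not in ALLOWED_HOSTS:
        raise PermissionError(f"{hostname} is not on the allow list")
    return subprocess.run(["ping", "-c", "1", hostname],
                          capture_output=True, text=True).stdout
```

The first tool turns any prompt injection into arbitrary command execution; the second constrains the blast radius to a single, allow-listed action.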
I wrote an article recently about how AI could soon spell the end for email, but I truly think that with the bad actors out there today, it will probably spell the end for a lot of the original internet protocols. To use the email example, all it takes is one gang to build SPAMGPT: an internet-resident, LLM-based chatbot with system-level agency, running on containers that hold credentials for multiple clouds. This bot could be replicated anywhere. With some of the “bootkit”-style attacks out there, these things could even hide in the firmware or expansion cards of traditional servers.
What do the bots do, though? They write SPAM. Why is this effective? Two reasons, really. First, people do not realize this, but the mail in their inbox AND their SPAM folder… is less than 1% of the mail that is actually destined for them. From my experience working at a Top 20 mail provider, I can assure you that we turfed 99+% of all inbound mail. Second, the way we stop SPAM today is with filters that assess how “human” the content of the mail is. Well, anybody who has used ChatGPT once can understand the problem here. My suspicion is that a program like this could fill up providers’ email storage globally in a matter of hours. Business profit is maximized when a provider is running at around 80% capacity, so let’s say disk utilization is targeted for that as well. If your disk stores suddenly need to ingest 100x more data per second, it does not take a math genius to know we don’t need to finish the equation.
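For the curious, here is the back-of-the-envelope version anyway. Every number below is an assumption picked for illustration, not provider data; the point is only how fast a fixed 20% headroom disappears as the ingest rate multiplies.

```python
# Back-of-the-envelope version of the disk math above. Every number here
# is an assumed, illustrative figure, not real provider data.
capacity_tb = 1000               # assumed total mail storage
utilization = 0.80               # the ~80% target from the paragraph above
baseline_ingest_tb_per_hr = 1.0  # assumed normal post-filter inbound rate

headroom_tb = capacity_tb * (1 - utilization)  # 200 TB of free space

for multiplier in (1, 10, 100):
    hours_to_full = headroom_tb / (baseline_ingest_tb_per_hr * multiplier)
    print(f"{multiplier:>3}x ingest -> headroom gone in {hours_to_full:,.1f} hours")
```

Under these assumptions, 1x ingest leaves about 200 hours of headroom, 10x leaves 20, and 100x leaves 2: at 100x, the provider’s entire cushion is gone well within a single morning shift.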
So what can we do? Currently, AppSec companies are the only ones that stand a chance. At a bare minimum, all of our defenses will need to be fitted with an understanding of OpenAI’s models and the other major foundation models as frameworks. I think we’ll see some fairly amazing innovations as we go here. But since the services any malicious AI will want to consume are accessed through APIs, a solid Web App and API Protection (WAAP) strategy should be the single most important piece of the overall enterprise defensive posture in this new AI arms race.
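As one small illustration of what a WAAP-style control looks like at the code level, here is a minimal per-client rate limiter. A real WAAP product layers many more signals on top (schema validation, bot fingerprinting, behavioral analysis); the window size and request budget below are assumptions chosen for the sketch.

```python
import time
from collections import defaultdict, deque

# Minimal sketch of one WAAP-style control: per-client rate limiting on an
# API. The window size and request budget are illustrative assumptions.

WINDOW_SECONDS = 60
MAX_REQUESTS = 30  # assumed budget for a human-paced client

_history: dict = defaultdict(deque)

def allow_request(api_key: str) -> bool:
    """Return False once a key exceeds its per-minute budget; a
    machine-driven caller burns through it almost immediately."""
    now = time.monotonic()
    window = _history[api_key]
    # Drop timestamps that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False
    window.append(now)
    return True
```

A machine-driven caller exhausts a human-paced budget almost instantly, which is exactly the kind of behavioral tell a WAAP leans on.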
If you’re interested in discussing the security ramifications of large-language models and your organization’s API defenses, reach out to talk to a GuidePoint Security expert.
Tristan Morris
Cybersecurity Solutions Marketer,
GuidePoint Security
Tristan Morris started his cybersecurity career in 2010 as a cryptologic linguist in the US Marine Corps, where he learned the fundamentals of security and threat hunting. At the end of his enlistment in 2015, he began using his skills, knowledge, and perspective to build training and education labs and CTF events, re-creating advanced attack lifecycles to construct realistic datasets that let lab attendees hone their skills. He has spoken at major security conferences and events, from Black Hat to Singapore International Cyber Week.