What I Have Learned - Part 5: Building What No One Wants

Anthony S. Clark

13 Feb 2023

Around 2007 I had an idea for a tool that could tell whether a file was malware. Instead of using static signatures, like most anti-virus products at the time, I would do the following:

Collect file format features from PE, ELF, MachO, PDF, etc. files.
Things like file size, type, if signed, entropy, API usages, file header values, etc.
Each feature would be given a score, and the scores would be added together. If the total exceeded a certain threshold, the file was likely to be malicious.
Machine learning would be used to improve the scoring, with something like a Neyman-Pearson classifier or a Support Vector Machine to help with the classification.
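To make the scoring idea concrete, here is a minimal sketch in Python. It is not the original prototype: the feature names, weights, threshold, and API watch list are all hypothetical values picked for illustration, and the hand-set weights stand in for what a trained classifier would learn.

```python
import math
from collections import Counter

# Hypothetical weights and threshold, chosen only for illustration.
# In the approach described above, ML would tune these rather than a human.
WEIGHTS = {"high_entropy": 3.0, "unsigned": 2.0, "suspicious_api": 1.5}
THRESHOLD = 5.0

# Hypothetical watch list of Windows API names often tied to process injection.
WATCHED_APIS = {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"}


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def extract_features(data: bytes, signed: bool, imported_apis: set) -> dict:
    """Turn raw file facts into numeric features (all assumed for this sketch)."""
    return {
        # Very high entropy often suggests packed or encrypted content.
        "high_entropy": 1.0 if shannon_entropy(data) > 7.0 else 0.0,
        "unsigned": 0.0 if signed else 1.0,
        # Count how many imported APIs appear on the watch list.
        "suspicious_api": float(len(imported_apis & WATCHED_APIS)),
    }


def score(features: dict) -> float:
    """Add up the weighted feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def is_likely_malicious(features: dict) -> bool:
    """Flag the file when the total score exceeds the threshold."""
    return score(features) > THRESHOLD


if __name__ == "__main__":
    packed = extract_features(
        bytes(range(256)) * 16,  # uniformly distributed bytes, maximal entropy
        signed=False,
        imported_apis={"VirtualAllocEx", "WriteProcessMemory"},
    )
    plain = extract_features(b"A" * 4096, signed=True, imported_apis={"printf"})
    print(score(packed), is_likely_malicious(packed))
    print(score(plain), is_likely_malicious(plain))
```

In a real system the features would come from parsing the PE, ELF, Mach-O, or PDF structure, and the weights and decision boundary would be learned from labeled samples rather than set by hand.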

I had some PhD mathematicians and CS majors help me validate the idea and write the paper, and I eventually built a prototype that worked pretty well. If I had left it at that, I probably would have had a pretty good product. In fact, companies like Cylance came along later and did very similar things to great effect and financial success. The difference was that they kept it simple and targeted the largest market possible.

[Image: Original Malware Classifier Paper Snippet]

However, it didn't feel like enough for me. I was a malware reverse engineer at the time and I cared about two things:

Speeding up my triage and RE work.
Being seen as cool and valued by the infosec / RE community.

So I started adding to it. In 2005 I had already created a website called Offensive Computing that automated a lot of the RE process, such as collecting file header values, scanning for packing / obfuscation, mass AV scanning, and pulling out strings, so I combined the two. (Think VirusTotal before it existed.)

With lots of help from other coders, I made a pretty cool web application that would do tons of the RE process for you, both static and dynamic, and was tailored to reverse engineers.

I then started trying to sell this through the infosec world, attending and speaking at conferences like Black Hat and DEF CON. There were several mistakes in this approach:

The reverse engineering community at that time was tiny, had no budget, and was full of people trying to create their own automation solutions. While they might have thought what I was doing was cool, there was no market there to buy it.
I focused my marketing and sales on the security industry, which at that time was smaller and consisted largely of people who were also trying to sell security solutions.

What I should have done was make the tool as simple as possible, strip it down to bare essentials such as file classification, and then attend trade shows for non-infosec industries that could have benefited from it, such as medical and manufacturing. Even large-scale networking would have been better.

This was a trap I saw many researchers in infosec fall into:

Making a tool that you think is cool, and that your peers appreciate, but that has no market at the time.
Trying to sell it exclusively to other people who are trying to do the same thing in a tiny market.

There were several other products I worked on that suffered similar fates, including an APT simulator, a post-exploitation tool factory, and a cloud process memory protector.

Another interesting question to think about is: Is your intended use case other people's use case? Sometimes people use your product in ways you didn't expect. One of the best examples of this that I can think of is Metasploit.

I first started contributing to Metasploit back in 2002 or 2003, when it was written in Perl. The original idea for Metasploit was to make an exploit development factory. By providing a bunch of libraries for communication protocols, shellcode, encoders, NOP sleds, etc., it would greatly reduce the time it took to write an exploit. (https://www.blackhat.com/html/bh-dc-09/train-bh-dc-09-te.html)

To facilitate testing, and because it was fun, interfaces were added to make it easy to launch the exploits. A year or two later Skape (Matt Miller) added Meterpreter, which gave Metasploit a payload and C2 capability.

Once Metasploit reached the mainstream, most people used it for running penetration tests rather than for building exploits, as was originally intended. It turns out it was pretty good for that use case.

The Takeaway

If you want to build a product to generate revenue, it's helpful to think about the following:

Is this just a tool you think is cool? Or is it something the masses would use?
Are you marketing to people who might actually buy it, or just to your own competitors and peers?
Don't be afraid to venture outside of your industry. You can learn a lot from non-infosec people.
Do you understand the widest use case? Do you need to collect more input from users so you are building towards what they want versus what you are interested in?

Thanks for listening,
