Hand-writing regex is a huge waste of time when you need scalability and things like Elasticsearch (essentially a distributed layer on top of Apache Lucene) exist off the shelf. I heard "hand-written regex in Perl" and wondered if this was 1995.
TBH hardware is just going to be a band-aid for what sounds like a kludge. If the dude isn't even familiar with OO programming, which was the latest and greatest in the early '90s, there's no way he's fluent in modern scale-out tech like the various NoSQL solutions, Elasticsearch, etc. Even just porting the Perl script to Node.js could well speed it up, depending on the workload, and it would be a very easy translation. Even more so if you port it to Python and use one of the many C-interop routes (ctypes, cffi, Cython) to drop into C for the performance-sensitive portions, and even down to assembly, since C lets you do that.
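To show what "dropping into C" from Python can look like, here's a minimal sketch using ctypes from the standard library. It assumes a POSIX system (Linux/macOS) where libc can be loaded, and `c_strlen` is just a made-up wrapper name for illustration. A real hot loop would be your own compiled C code rather than `strlen`.

```python
import ctypes
import ctypes.util

# Load the C standard library. Assumption: POSIX system. If find_library
# returns None, CDLL(None) still exposes libc symbols on Linux/macOS.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: length of a byte string as computed by C's strlen."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

Same idea applies to your own shared library: compile the slow part to a .so, load it with `ctypes.CDLL`, declare the argument and return types, and call it like a normal function.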
Being able to hand-write stuff is great and all, but these days Facebook, Twitter, Google, Amazon, etc. have all written massively scalable, battle-tested libraries for various common high-performance tasks, and they pay a small army of very, very smart people to write things like that. I'll take the open source cream of the crop over handwritten Perl any day.
Textual analysis is usually pretty easy to parallelize, at least in part, provided it's not a single contiguous thing. Good ol' fashioned chop / analyze / recombine, very basic. But I don't know the workload, so it could be more complicated than that.
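Rough sketch of the chop / analyze / recombine idea in Python, assuming the input splits cleanly on line boundaries and that each chunk can be analyzed on its own. The `WORD` pattern and the function names are made up for illustration, since I don't know what the real regexes look like.

```python
import re
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

# Hypothetical pattern: stands in for whatever the real regex is.
WORD = re.compile(r"[a-z']+")


def chop(lines, n_chunks):
    """Split a list of lines into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def analyze(chunk):
    """Count pattern matches in one chunk; needs nothing from other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(WORD.findall(line.lower()))
    return counts


def recombine(partials):
    """Merge the per-chunk counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_words(lines, workers=4):
    """Chop, analyze each chunk in a separate process, then recombine."""
    chunks = chop(lines, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(analyze, chunks))
    return recombine(partials)


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    print(count_words(sample, workers=2))
```

Chopping on line boundaries is what keeps a match from straddling two chunks. If the text really is one contiguous thing, you'd need overlapping chunks or a smarter split, which is where it gets more complicated.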
IMO buying faster hardware worked in the '90s and early '00s, but the paradigm has changed and software is expected to be scalable now. As you can see, you can barely buy faster single-threaded performance no matter how much money you have to throw at it. There is no end-game with buying faster hardware anymore; there is only buying more hardware and scaling across it. For example, if you got your text search working via Elasticsearch, you could scale that out to as many machines as you need and get both speed and redundancy. Change your MySQL to an ES-Hadoop stack and you now have a highly redundant, highly scalable platform.
An OC'd 6700K would certainly help some in the meantime while you figure out how to get to your 30-minute target.