My experience of these things is mighty expense, tricky programmability, and most of the performance coming from fixed functions that can be called from the custom bits. I did an image compression algorithm on an FPGA a decade ago now, and it worked well enough after a lot of source code tweaking, but I have only ever considered it useful in very specialist cases.