Performance problems using named pipes (embedded C)

Markey

Junior Member
Sep 28, 2010
20
0
0
Hello,

I'm trying to write an application that uses named pipes to transmit messages from one thread to another, but I'm running into performance issues on the platform I'm using. Here's what I have so far:

The initialization thread creates the pipes and registers them with the OS.
The thread that needs to send a message knows the name of the pipe, but doesn't have a pointer to the pipe.
The OS provides a function call to return the pointer to the pipe, but it takes 1.2ms on average to look this up, (largely due to my application having around 80 pipes and the OS has to do string compares to find the pipe) and that is prohibitively long for my application.

I dug into the OS source code, and it's using a doubly linked list to keep track of all the registered pipes. When you look up a pipe by name, it starts at the head and does a string compare on the name you pass in to what's in the pipe control block. If it matches, it returns a pointer to the pipe. If it doesn't match, it follows the next pointer and tries again until you find the correct pipe.

I was thinking about creating a new module in the OS that holds an array of structs; each struct holding the name of a pipe and the pointer to the pipe. The OS would keep the array ordered such that the names are alphabetical and then I could write an accessor function that does a binary search on the names, one char at a time until the correct pointer was found. But If I do this, I'm not sure I'll realize much of a speedup since I'm still basically doing a string compare.

I also was thinking that maybe I should use some kind of integer handle to the pipes, but I'm not sure what this would look like because there's no way for the application to know the pipe's handle unless he did some kind of lookup with the name of the pipe.

Performance is top priority in this problem as this is an embedded application. Any suggestions?
 

Markey

Junior Member
Sep 28, 2010
20
0
0
Can you just do the lookup once and cache the pointer?

I wish I could, but due to the complexity of the program, it's unknown which pipe I'm going to connect to when I start a worker thread. The only thing that's known (as of current implementation) is the name of the pipe. For the system to function correctly, the worker thread needs to start and stop within 20ms, and there's a lot of work to do in between. A typical worker thread needs to make 7 pipe connections, so that eats up on average 8.4ms, almost half of my work time.
 

Schmide

Diamond Member
Mar 7, 2002
5,726
1,013
126
I was thinking about creating a new module in the OS that holds an array of structs; each struct holding the name of a pipe and the pointer to the pipe. The OS would keep the array ordered such that the names are alphabetical and then I could write an accessor function that does a binary search on the names, one char at a time until the correct pointer was found. But If I do this, I'm not sure I'll realize much of a speedup since I'm still basically doing a string compare.

Any suggestions?

If you only have a name or string maybe implement a trie like structure to hash the names quickly. In terms of dictionary searches I don't think anything beats a trie.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
If you only have a name or string maybe implement a trie like structure to hash the names quickly. In terms of dictionary searches I don't think anything beats a trie.

Depends on the number of named pipes.

I would suggest:
1. Hash all pipe names. Compare hashes instead of strings. If the hash matches, then do the strcmp() to make sure.

2. Make hashtable instead of an array -- especially if memory overheads are manageable (i.e., if you can make the hashtable large relative to the number of named pipes).
 

Markey

Junior Member
Sep 28, 2010
20
0
0
Both are excellent ideas! I don't know why I didn't think of using a hash table. I have never heard of a trei data structure before, so I looked it up online. It looks like it's basically a tree of linked lists.

It would be interesting to try both implementations and see which one gives better performance. I have a feeling the hash table would win. However, I'm not good at implementing hash functions, and I don't want to steal any code, so I'm leaning towards the trei implementation.

I'll give it a shot tomorrow, and if I get enough time to do the whole implementation, I'll update with benchmarks. :) This is going to be fun.
 

iCyborg

Golden Member
Aug 8, 2008
1,344
61
91
Both are excellent ideas! I don't know why I didn't think of using a hash table. I have never heard of a trei data structure before, so I looked it up online. It looks like it's basically a tree of linked lists.
A trie? It's a normal tree, just the nodes are special - they don't store keys directly, the path from the root is the key.
Tries are the way to go with very large dictionaries, IMO you'd be better with an array based hash.
 

Schmide

Diamond Member
Mar 7, 2002
5,726
1,013
126
A trie is basically a finite state machine. It stores all possible pathways/combinations in a tree structure. As far as I know every spell checker out there is some sort of trie.

A radix tree/trie, which is a space compressed trie, is a little harder to implement but has some serious cache and memory optimizations for large strings.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
A trie is basically a finite state machine. It stores all possible pathways/combinations in a tree structure. [snip]

This is an important distinction. Fast tries aren't made with pointers like a general graph-based data structure. They're often flattened into matrices, and they're blazing fast if you have the (cache) memory for them.

That said, I would still start with the per-string hashing, then try a non-linked-list data structure if the search is still too slow.
 

Markey

Junior Member
Sep 28, 2010
20
0
0
So, I've been working on this and tweaking my implementation, and I've been able to get the average lookup time to be about 60% faster by using a trie. YAY.

The implementation of the thread still doesn't meet the timing requirements, but I'm looking at optimizations in other areas of the code (such as the actual processing of the data.) since it's right on the edge of working (currently at around 22ms, and I need to get it under 20.)

Thanks for your help guys!
 

Schmide

Diamond Member
Mar 7, 2002
5,726
1,013
126
If you need a quick speed bump you could make a quick hash table on the first character of the string and have each hash bucket point to an independent trie. Basically replace the null node with a hash table.

One of the big optimizations for tries is alphabet reduction*. If you're dealing with only characters, you can easily reduce it to 64 or 32 bits, then place a bitwise field within the nodes to quickly do include/exclude checks, and if your child nodes are in order, immediately proceed to the right node without checking each node.

Other optimizations involve variable node sizes, balance, order and concatenation, but one could write a dissertation on that.

* some of the best performing tries are 1 bit in place implementations.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Given that this is embedded C, do what makes sense for your platform. If you have lots of memory, make a big hash. Use intrinsics if they're available. Etc.