blogWhen using expanded classes is not always the best performance option

finnianr's picture
Contents

Expanded class V 64 bit integer

While working on my current project I discovered an interesting fact that for certain kinds of data, expanded classes don't provide the best performance. My project involves processing a lot of sub-strings and requires a representation of a sequential list of integer intervals. The obvious choice is to use an instance of ARRAYED_LIST [INTEGER_INTERVAL], but as performance is critical this would not be fast enough for this project. So instead I devised a class to represent integer intervals as 64 bit integers and use "bit twiddling" to get and set the upper and lower bounds. The design was intended to minimise garbage collection and array indexing.

 

Although I was satisfied with it's performance, it occurred to me that I might be able to simplify the code without sacrificing anything by using an expanded class to represent the integer interval as follows:

 

My expectation was that this would give similar performance to the ARRAYED_LIST [INTEGER_64] implementation, but I was completely wrong as the following benchmark shows. The first test takes almost double the time using the expanded class implementation.

Benchmark INTEGER_64 implementation

 

Benchmark expanded class implementation

 

Each test is the average time of 100 repetitions in the finalised executable compiled with GCC. Inlining_size is set to 2. A full garbage collect is forced during each test iteration

64 bit integer source listing

 

Expanded class source listing

 

Comments

This is interesting - can you

schoelle's picture

This is interesting - can you also tell us what the benchmarks do, and what the numbers mean?

Also, did you try to store it as just two play ARRAY[INTEGER] ? Just as I prefer to avoid bit-shifting.

Last but not least, one might inspect the C code to understand where the speed differences come from - did you have a look?

Comparison of C output

finnianr's picture

Briefly the benchmarks compare the performance and memory efficiency of STRING_32 and a new string type: alias ZSTRING. ZSTRING uses a hybrid of latin and unicode encodings to give improved memory efficiency and performance in non-Asian language environments. Think of it as a compressed STRING_32.

The EL_SEQUENTIAL_INTERVALS class is used as an intermediate singleton buffer to record the encoding overflows, viz character substrings which could not be encoded with a latin character set. The buffer is then used to allocate a SPECIAL [NATURAL] array within ZSTRING to store any unencodeable characters.

 

As the unencoded_intervals argument is a singleton which is constantly being emptied and refilled, both INTEGER_INTERVAL and ARRAY [INTEGER] will cause a lot of unnecessary garbage collection. ARRAY doubly so as each one comprises two objects, Current and area. So no, I haven't tried it.

The test labeled "concatenate chinese characters" will cause the most calls to unencoded_intervals, as non of the characters can be encoded with a latin character set. The expanded class version takes twice as long as the INTEGER_64 version

If you want to compare C output, {EL_SEQUENTIAL_INTERVALS}.item_has might offer a clue to the performance difference.

The Eiffel code   correlates with the C code  

and the Eiffel code   correlates with the C code  

It might also be useful to

finnianr's picture

It might also be useful to compare the C output for this routine  

INTEGER_64 version

 

Expanded class version

 

Syndicate content