IDictionary options - performance test - SortedList vs. SortedDictionary vs. Dictionary vs. Hashtable

Here is a simple performance test to compare four different implementations of the IDictionary interface. I have measured insertion time, search time and memory allocated by each one. The test included the following code:

    private static readonly int searchIndex = 88888;
    private static readonly int numberRepetition = 500000;
    private static readonly SortedDictionary<string, string> sortedDictionary = 
          new SortedDictionary<string, string>();
    private static readonly SortedList<string, string> sortedList = 
          new SortedList<string, string>();
    private static readonly Dictionary<string, string> dictionary = 
          new Dictionary<string, string>();
    private static readonly Hashtable hashtable = 
          new Hashtable();

    private static void Insert_Test(string name, IDictionary dict){
      Console.WriteLine("--------Insert {0}--------", name);
      string[] letters = {"A","B","C","D","E","F","G","H","I","J"};
      long memoryStart = System.GC.GetTotalMemory(true);
      Stopwatch watch = new Stopwatch();
      watch.Start();
      Random rand = new Random( );
      for(int i = 0; i < numberRepetition; i++){
        string key = GetRandomLetter(letters, rand, i)+"_key"+i;
        string value = "value"+i;
        
        dict.Add(key, value);
      }
      long memoryEnd = System.GC.GetTotalMemory(true);
      watch.Stop();
      Console.WriteLine("Memory Allocated by {0} is: {1}bytes", 
                        name, memoryEnd - memoryStart);
      PrintResults(watch);
    }
    private static void Search_Test(string name, IDictionary dict){
      Stopwatch watch = new Stopwatch();
      Console.WriteLine("--------Search {0}--------", name);
      watch.Start();
      Console.WriteLine("Found:{0}", dict["A_key"+searchIndex]);
      watch.Stop();
      PrintResults(watch);
    }
    private static void PrintResults(Stopwatch watch){
      Console.WriteLine("Elapsed: {0}",watch.Elapsed);
      Console.WriteLine("In milliseconds: {0}",watch.ElapsedMilliseconds);
      Console.WriteLine("In timer ticks: {0}",watch.ElapsedTicks);
    }
    private static string GetRandomLetter(string[] letters, Random rand, int i){
      if(i == searchIndex){
        return "A";
      }
      return letters[rand.Next(0, 10)];
    }

Then for each particular test I recompile the program and execute it as

    Insert_Test("SortedDictionary", sortedDictionary);
    Search_Test("SortedDictionary", sortedDictionary);

or

    Insert_Test("SortedList", sortedList);
    Search_Test("SortedList", sortedList);

or

    Insert_Test("Dictionary", dictionary);
    Search_Test("Dictionary", dictionary);

or

    Insert_Test("Hashtable", hashtable);
    Search_Test("Hashtable", hashtable);

I insert 500,000 strings and I use GetRandomLetter method in order to ensure that the new coming strings are not in alphabetic order and the ordered dictionaries would not benefit from that.

Then I perform 5 tests in different order. I don't expect that order would play any role but just in case there is a pattern. Test 1 in order: SortedDictionary, SortedList,Dictionary, Hashtable. Test 2 in order: SortedList, Hashtable, SortedDictionary, Dictionary. Test 3 in order: Hashtable, Dictionary, SortedList, SortedDictionary. Test 4 in order: Dictionary, SortedDictionary, Hashtable, SortedList. Test 5 in order: Dictionary, Hashtable,SortedList,SortedDictionary.

From statistical point of view 5 tests are not enough to consider the results significant but they are enough to see the major trends.

As you can see SortedList allocates about 15-20% less memory than the other IDictionary implementors but it comes at a cost of 185 times slower inserts.

Here you can see a common view of the memory consumption. Area A is during the insertion phase when more and more memory has been allocating. Area B is from the end of the insertion until the garbage collection of the object.

And the distribution of the allocated memory is shown in the chart below.

And you can see the enormous insert time difference.

Dictionary and Hashtable perform much better then SortedDictionary.

We can see also difference in the key search but these values are within the statistical error and I do not think they can be considered as significant. It would be nice to see much higher number of tests in order to compare the search performance of these collections. Here you can see my results:

Here you can see the raw results:

  Test 1 Test 2 Test 3 Test 4 Test 5 Mean
SortedDictionary - Memory Allocated 53,921,060 53,921,060 53,921,060 53,921,060 53,921,060 53,921,060
SortedDictionary - Insert Milliseconds 3,940 3,930 3,883 3,913 3,997 3,933
SortedDictionary - Insert Ticks 56,418,516 56,281,158 55,604,986 56,040,332 57,243,429 56,317,684
SortedDictionary - Search Ticks 3,001 3,832 3,359 3,097 3,270 3,312
SortedList - Memory Allocated 44,115,516 44,115,516 44,115,516 44,115,516 44,115,516 44,115,516
SortedList - Insert Milliseconds 223,366 223,198 223,309 223,676 223,566 223,423
SortedList - Insert Ticks 3,198,205,105 3,195,802,915 3,197,387,653 3,202,634,644 3,201,065,572 3,199,019,178
SortedList - Search Ticks 3,229 3,391 3,486 3,468 4,127 3,540
Dictionary - Memory Allocated 53,376,620 53,376,620 53,376,620 53,376,620 53,376,620 53,376,620
Dictionary - Insert Milliseconds 1,174 1,183 1,257 1,174 1,234 1,204
Dictionary - Insert Ticks 16,822,259 16,949,806 18,002,619 16,820,756 17,671,199 17,253,328
Dictionary - Search Ticks 3,227 3,586 4,622 3,240 2,889 3,513
Hashtable - Memory Allocated 49,608,792 49,608,792 49,608,792 49,608,792 49,608,792 49,608,792
Hashtable - Insert Milliseconds 1,549 1,584 1,566 1,555 1,552 1,561
Hashtable - Insert Ticks 22,180,188 22,689,084 22,427,621 22,270,058 22,235,823 22,360,555
Hashtable - Search Ticks 3,086 3,049 3,208 3,058 2,979 3,076

Feedback

Posted on 1/3/2008 11:45:36 AM

Thanks for these interesting measurements!
I was exactly looking for this kind of information.
People considering which collection to use, might also be interested in the MSDN page "SortedList and SortedDictionary Collection Types" (http://msdn2.microsoft.com/en-us/library/5z658b67.aspx), which tells us the upper bounds of some operations on these collections in big O notation.

Posted on 4/25/2008 1:17:46 AM

Thanks for the above test!
I was also looking for this option.
I guess i will go with HashTable, as it looks the fastest of the lot

Posted on 6/11/2008 5:18:08 AM

Thanks! I just started to think about what to use and then found your artikle. It's good!

Posted on 6/11/2008 5:55:46 AM

You forgot to add measure units (bytes) for memory allocation chart.

Posted on 6/11/2008 8:15:29 AM

Thanks Flavius, I've added that now.

Posted on 8/12/2008 8:37:12 PM

I found your results useful. However, it may be important to point out that the hashtable does not support generics.

Posted on 8/25/2008 6:23:02 AM

Thanks for an article with immediate practical use. It helped in finalizing on which data structure to use for my app.
As Gary has pointed out above, the generic Dictionary has the advantage of type safety so it would seem that we should also factor in the unboxing overhead for the Hashtable's value object.

In my test for retrieval time using the hashtable with unboxing vs. generic dictionary the hashtable still performed three times as fast as the generic dictionary.

--Time in ticks--
Time taken to fetch all elements of Dictionary<> was 10559
Time taken to fetch all elements of hashtable without unboxing was 1431
Time taken to fetch all elements of hashtable WITH unboxing was 2846

Please post your comments:

Name:  
Email (optional): Your email address will not be posted.
URL (optional):
Comments: HTML will be ignored, URLs will be converted to hyperlinks
Use [code] if(true) alert("OK"); [/code] for source code
 
Enter the text you see in the box: