43: Collections: Hash Table.

Take Up Code - A podcast by Take Up Code: build your own computer games, apps, and robotics with podcasts and live classes

Categories:

You’ll need to be able to work with groups or collections of items. A game that only has one character with one quest and with one shop that sells one item isn’t going to be very fun. You have choices and this episode continues more than a week long exploration of collection types available. You’ll learn when to use each type of collection and why. Up today is the hash table. I’ll explain what hash tables are and then give you some guidance on when to use them. Hash tables require a method that knows how to turn some information into an index that we can use to place the item in an array. The array size is fixed and we usually end up placing items not directly in the array but in a linked list. There’s a different linked list for each location in the array. It works like this assuming we have an item with a property called name that we’ll use to generate a hash value: item.name => hash method => index array[maximum index] -> list array[…] -> list array[index + 1] -> list array[index] -> list -> item array[index – 1] -> list array[…] -> list array[0] -> list You can see here that the hash method is used to generate a hash value that will be used for the index. then each index in the array starting from zero all the way up to the maximum indeed value all point to linked lists. The item we are interested in gets added to the list at its index value. Listen to the full episode or you can also read the full transcript below. Transcript Sometimes all you need is to be able to add and remove items in a collection and then find them later. If you’re not concerned about order, then a hash table is a good choice. Let’s say that you know exactly where you want to add an item based on a number. Each item has its own number. You could allocate an array big enough to hold all the items and then put each one in its own spot. Think of a parking lot where everybody has a reserved spot. You don’t need to search for a spot. You just drive your car directly to your reserved space. If you’re not using the parking lot, then your space remains empty. This is actually a very good solution if all the spaces are normally full and you can control where everything goes. But what do you do if your parking lot can hold up to two thousand cars but there’s usually only a hundred or so at any given time? You’re just wasting all that space. If this was a computer program, you’d be wasting memory. A better solution would be to reserve less space and calculate the position an object will occupy based on some property of the object. Taking the parking lot example, maybe we could add up all the numbers and letter values on the license plate, then divide that by the total number of parking spaces. The division probably won’t be exact but even if it is, that doesn’t matter because what we’re really interested in is the remainder. If all the parking spaces are numbered, then the remainder will identify the making space to be used. If you forget where you parked your car, just go through the process again to calculate the remainder and that’s where your car will be. The actual name for the final spot in the collection is the hash. And a hash table is sort of like this parking lot. Whatever method is used to calculate the location or the hash, it should have some important properties. What happens if two cars arrive at the parking lot with different license plate numbers but they just happen to arrive at the same hash? This is called a collision and that’s exactly what happens if you try to park two cars in the same spot. The first property is that the method should produce hash results as random as possible. This randomness should result in a large difference in the hash even with a small difference in the properties. Just think of two cars arriving with only a single number different between their two license pla