I would like to properly understand hashes in Perl. I've had to use Perl intermittently for quite some time, and whenever I do, it's mostly for text processing.
And every time I have to deal with hashes, things get messed up. I find the syntax for hashes very cryptic.
A good explanation of hashes and hash references, their differences, when they are required etc. would be much appreciated.
A simple hash is close to an array. Their initializations even look similar. First the array:
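As a minimal sketch, assuming first and last names are stored as alternating elements (the Ward/Cleaver entry and the exact layout are illustrative assumptions), such an array could look like this:

```perl
use strict;
use warnings;

# Assumed layout: first name, last name, first name, last name, ...
my @last_name = (
    "Ward",   "Cleaver",
    "Fred",   "Flintstone",
    "Archie", "Bunker",
);

print scalar(@last_name), " elements\n";   # prints "6 elements"
```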
Now let's represent the same information with a hash (aka associative array):
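A sketch of the hash version, assuming the same illustrative names: the list on the right is identical to the array's, but Perl pairs it up into keys and values.

```perl
use strict;
use warnings;

# Same list as the array sketch; each odd element becomes a key,
# each even element becomes that key's value.
my %last_name = (
    "Ward",   "Cleaver",
    "Fred",   "Flintstone",
    "Archie", "Bunker",
);

print scalar(keys %last_name), " keys\n";   # prints "3 keys"
```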
Although they have the same name, the array @last_name and the hash %last_name are completely independent.

With the array, if we want to know Archie's last name, we have to perform a linear search:
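One way the linear search might be written, assuming the alternating first-name/last-name layout (the data here is illustrative):

```perl
use strict;
use warnings;

my @last_name = ("Ward", "Cleaver", "Fred", "Flintstone", "Archie", "Bunker");

# Step through the array two elements at a time: the even index holds a
# first name, the following odd index holds the matching last name.
my $lname;
for (my $i = 0; $i < @last_name; $i += 2) {
    if ($last_name[$i] eq "Archie") {
        $lname = $last_name[$i + 1];
        last;
    }
}

print "Archie $lname\n";   # prints "Archie Bunker"
```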
With the hash, it's much more direct syntactically:
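A sketch of the hash lookup, again with illustrative data. Note the sigil: a single element of %last_name is a scalar, so it is written with $ and curly braces.

```perl
use strict;
use warnings;

my %last_name = (Ward => "Cleaver", Fred => "Flintstone", Archie => "Bunker");

# One subscript replaces the whole loop.
print "Archie $last_name{Archie}\n";   # prints "Archie Bunker"

# exists tests for a key without creating it.
my $has_barney = exists $last_name{Barney} ? "yes" : "no";
print "Barney present: $has_barney\n";   # prints "Barney present: no"
```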
Say we want to represent information with only slightly richer structure:
Before references came along, flat key-value hashes were about the best we could do, but references allow a hash to hold other hashes as its values.
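A sketch of what such a nested structure could look like. The key names FIRST and SPOUSE, and every entry other than Fred, Wilma, and Archie, are assumptions made for illustration:

```perl
use strict;
use warnings;

# Each value is an anonymous hash built with { }, so what %personal_info
# actually stores is a reference to that inner hash.
my %personal_info = (
    Cleaver    => { FIRST => "Ward",   SPOUSE => "June"  },
    Flintstone => { FIRST => "Fred",   SPOUSE => "Wilma" },
    Bunker     => { FIRST => "Archie", SPOUSE => "Edith" },
);

for my $family (sort keys %personal_info) {
    my $info = $personal_info{$family};
    print "$family: $info->{FIRST} and $info->{SPOUSE}\n";
}
```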
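Internally, the keys and values of %personal_info are all scalars, but the values are a special kind of scalar: hash references, created with {}. The references allow us to simulate "multi-dimensional" hashes. For example, we can get to Wilma by subscripting %personal_info with her last name, following the arrow (->), and subscripting the inner hash with the key for the spouse.

Note that Perl allows us to omit arrows between subscripts, so the same lookup can also be written with the two subscripts placed directly next to each other.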
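Both spellings side by side, as a small sketch (the FIRST and SPOUSE keys are assumed names):

```perl
use strict;
use warnings;

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

# Explicit arrow between the two subscripts ...
my $with_arrow = $personal_info{Flintstone}->{SPOUSE};

# ... and the equivalent form with the arrow omitted.
my $without_arrow = $personal_info{Flintstone}{SPOUSE};

print "$with_arrow / $without_arrow\n";   # prints "Wilma / Wilma"

# ref reports what kind of thing the stored scalar points at.
my $kind = ref $personal_info{Flintstone};
print "$kind\n";                          # prints "HASH"
```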
That's a lot of typing if you want to know more about Fred, so you might grab a reference as sort of a cursor:
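A sketch of the cursor idea, assuming the same illustrative structure. The PET key is a made-up addition to show that $fred points at the original inner hash rather than a copy of it:

```perl
use strict;
use warnings;

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

# $fred holds the same reference that is stored in %personal_info.
my $fred = $personal_info{Flintstone};
print "Fred's wife is $fred->{SPOUSE}\n";   # prints "Fred's wife is Wilma"

# Writing through the cursor changes the original structure.
$fred->{PET} = "Dino";                      # hypothetical extra field
print "$personal_info{Flintstone}{PET}\n";  # prints "Dino"
```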
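Because $fred is a hashref, the arrow is necessary. If you leave it out but wisely enabled use strict to help you catch these sorts of errors, the compiler will complain that the global symbol %fred requires an explicit package name: without the arrow, the subscript refers to a separate hash named %fred, not to the scalar $fred.

Perl references are similar to pointers in C and C++, but a reference can never be null: it always points at something (although a scalar that is supposed to hold a reference can be undef). Pointers in C and C++ require dereferencing and so do references in Perl.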
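A sketch of the common ways to dereference a hashref, plus a way to see the strict complaint without aborting the script. The data is illustrative, and compiling the broken line with a string eval is just a demonstration device, not something you would do in real code:

```perl
use strict;
use warnings;

my %flintstone = (FIRST => "Fred", SPOUSE => "Wilma");
my $fred = \%flintstone;              # backslash takes a reference

my $via_arrow = $fred->{SPOUSE};      # arrow dereference
my $via_block = ${$fred}{SPOUSE};     # same element, block syntax
my @keys      = sort keys %$fred;     # dereference the whole hash

print "$via_arrow $via_block (@keys)\n";

# The arrow-less line is compiled separately so that its compile-time
# error can be captured in $@ instead of stopping this script.
my $snippet = <<'END';
use strict;
my $fred = { SPOUSE => "Wilma" };
my $spouse = $fred{SPOUSE};
END

my $ok = eval "$snippet; 1";
print "strict caught it: $@" unless $ok;
```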
C and C++ function parameters have pass-by-value semantics: they're just copies, so modifications don't get back to the caller. If you want to see the changes, you have to pass a pointer. You can get this effect with references in Perl:
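A sketch of how add_barney could be written. The parameter name, the FIRST and SPOUSE keys, and the Rubble entry are assumptions for illustration:

```perl
use strict;
use warnings;

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

sub add_barney {
    my ($info) = @_;   # $info is a reference to the caller's hash

    # Writing through the reference modifies the caller's data.
    $info->{Rubble} = {
        FIRST  => "Barney",
        SPOUSE => "Betty",
    };
    return;
}

add_barney(\%personal_info);   # backslash passes a reference, not a copy

my $families = join ", ", sort keys %personal_info;
print "$families\n";           # prints "Flintstone, Rubble"
```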
Without the backslash, add_barney would have gotten a copy that's thrown away as soon as the sub returns.

Note also the use of the "fat comma" (=>) above. It autoquotes the string on its left and makes hash initializations less syntactically noisy.
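To make both points concrete, here is a sketch with a hypothetical add_barney_to_copy that takes the hash itself instead of a reference, followed by the same initialization written with plain commas and with fat commas. All names and keys are illustrative assumptions:

```perl
use strict;
use warnings;

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

# Hypothetical variant: the hash is flattened into a key/value list
# and copied into a brand-new hash inside the sub.
sub add_barney_to_copy {
    my (%copy) = @_;
    $copy{Rubble} = { FIRST => "Barney", SPOUSE => "Betty" };
    return;   # %copy, including the new key, is discarded here
}

add_barney_to_copy(%personal_info);
my $status = exists $personal_info{Rubble} ? "Rubble added" : "caller unchanged";
print "$status\n";   # prints "caller unchanged"

# Plain commas need explicit quotes; the fat comma quotes the
# bareword on its left for us.
my %with_commas = ("FIRST", "Barney", "SPOUSE", "Betty");
my %with_arrows = (FIRST => "Barney", SPOUSE => "Betty");

# Helper to compare the two hashes as sorted key=value strings.
sub flatten {
    my ($h) = @_;
    return join ",", map { "$_=$h->{$_}" } sort keys %$h;
}

my $same = flatten(\%with_commas) eq flatten(\%with_arrows) ? "same" : "different";
print "$same\n";     # prints "same"
```

The copy is shallow: only the top-level keys and values are copied, so the inner hashrefs in %copy still point at the caller's inner hashes.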