Data Types

Scalars

Scalar variables have single values and names prefixed with $.

my ($a, $b, $c, $d, $e);  # declares scalars $a .. $e, initially undefined
$a = 5;           # decimal integer
$b = 0x5a;        # hex integer
$c = 2.7e-3;      # float
$d = 'string\n';   # single-quoted string (non-interpolating)
$e = "string\n";  # double-quoted string (interpolating)

The type of a scalar variable is weakly constrained. In principle a single variable, at different points in a program, could be assigned an integer, float, string or reference (pointer) value, though it's not generally a good idea to play fast and loose.

Arithmetic is done in double-precision floating point by default, so dividing two integers does not truncate the result:

my ($a, $b, $c);
$a = 2;
$b = 3;
$c = $a/$b;  # 0.666666666666667

Numeric string values are converted to numbers if used in arithmetic expressions:

my ($a, $b, $c);
$a = '2';
$b = "3";
$c = $a/$b;  # 0.666666666666667

This makes it easier to extract numeric data from an ASCII file.
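
As a minimal sketch of that conversion at work, the snippet below sums the fields of one line of text. The line itself is a hypothetical stand-in for something read from a data file.

```perl
use strict;
use warnings;

# Hypothetical line of text, as it might be read from an ASCII data file
my $line = "12 3.5 4e2";

# split() returns strings; using them in arithmetic converts them to numbers
my @fields = split(' ', $line);
my $sum = 0;
foreach my $field (@fields) {
    $sum += $field;
}
print("$sum\n");  # 415.5
```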

There are different types of string literals. Double-quoted strings are subject to variable and backslash interpolation, while single-quoted strings are left alone:

my ($a, $b, $c);
$a = 'spherical tokamak';
$b = "MAST is a $a\n";  # 'MAST is a spherical tokamak' followed by a newline
$b = 'MAST is a $a\n';  # 'MAST is a $a\n'

'Here' document syntax for multi-line string literals is the same as that provided by Unix shells:

my ($a, $b);
$a = 3;
$b = <<EOT;
Line 1
Line 2
Line $a
EOT
print($b);  # prints 'Line 1
            #         Line 2
            #         Line 3'

The label (EOT here) is arbitrary. You can quote it to control interpolation: single quotes suppress it, while double quotes interpolate. An unquoted label behaves like a double-quoted one, so the default behaviour is to interpolate.
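
A short sketch of the two quoting styles side by side. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $n = 3;

# Single-quoted label: the body is taken literally
my $literal = <<'EOT';
Line $n
EOT

# Double-quoted label: same as the unquoted default, so $n is interpolated
my $interpolated = <<"EOT";
Line $n
EOT

print($literal);       # Line $n
print($interpolated);  # Line 3
```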

Arrays

Array variables have list values and names prefixed with @. In contrast, array elements are scalars and have names prefixed with $. Again, typing is weak, and a single array can contain mixed element types if appropriate.

my (@a);  # declares empty array @a
@a = (1, 2.1, 'four');
print($a[0]);  # 1
print($a[1]);  # 2.1 
print($a[2]);  # four 
$a[0] = 'one';  # @a is now ('one', 2.1, 'four')

push() and pop() treat an array as a stack:

my (@a);
push(@a, 1, 2.1, 'four');

This appends the list to the end of the array. pop() does the reverse for a single element: it removes the last element and returns it.
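
A small sketch of the two calls used together. The array name @stack is chosen here for illustration.

```perl
use strict;
use warnings;

my @stack;
push(@stack, 1, 2.1, 'four');   # @stack is now (1, 2.1, 'four')
my $top = pop(@stack);          # removes and returns the last element
print("$top\n");                # four
print(scalar(@stack), "\n");    # 2 elements remain
```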

Note that scalars, arrays and hashes each live in a different namespace. $a[0] is the first element of array @a, but $a (without square brackets) is an unrelated variable. Normally you should avoid this in variable naming, but it can occasionally be useful:

my ($a, @a);  # declares undefined scalar $a and empty array @a
@a = ('one', 'two', 'three');
$a = join(', ', @a);  # 'one, two, three'

Hashes

The hash is the jewel in Perl's data type crown. A hash is an associative array, indexed by a string instead of a numeric index as with normal arrays. As the name suggests, Perl implements them using hash tables for fast lookup. Hash names have a % prefix. Again, individual hash values are scalars, with names prefixed with $. Hash indices use curly braces to distinguish them from ordinary array indices.

my ($b, $k, %a);  # declares undefined scalars $b, $k and empty hash %a
$a{'this'} = 1;
$a{'that'} = 'two';
$b = 'this';
print($a{$b});  # 1;
$b = 'that';
print($a{$b});  # two;
$b = 'them';
$a{$b} = 7.93e-5;
$k = join(', ', keys(%a));  # e.g. them, that, this (order is arbitrary)

Note that hashes are unordered. If you walk them you see the key/value pairs in apparently random order. If the order matters there are workarounds, such as sorting the keys before you use them.
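
One possible approach, sketched here with the same hash as above, is to walk the keys in sorted order:

```perl
use strict;
use warnings;

my %a;
$a{'this'} = 1;
$a{'that'} = 'two';
$a{'them'} = 7.93e-5;

# sort() with no comparison block orders strings alphabetically
my @ordered = sort(keys(%a));
foreach my $key (@ordered) {
    print("$key = $a{$key}\n");  # keys appear as that, them, this
}
```

This fixes the order of the walk only; the hash itself stays unordered.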

You can write hash literals in other ways. A hash can be initialised from an ordinary list of alternating keys and values, so you can do:

my ($b, %a);
%a = ('this', 1, 'that', 'two', 'them', 7.93e-5);
$b = join(', ', values(%a));
print($b);  # e.g. 7.93e-05, two, 1 (order is arbitrary)

But you might prefer this:

%a = (this => 1, 
      that => 'two', 
      them => 7.93e-5);

Note that quotes around hash keys are optional when the key is a simple identifier, both to the left of => and inside curly braces.

References

Perl references play the role that pointers do in C. They make compound data structures possible because they are scalars, and so arrays and hashes can have them as values.

my ($ar, @a, @b);
@a = (1, 2, 3, 4);
$ar = \@a;  # $ar is now a reference to @a
@b = @$ar;  # copies @a to @b
print($$ar[0]);  # 1

I have to admit this syntax takes some getting used to, though in lots of cases it's transparent because of other syntactic wizardry.

You can also create a reference to an anonymous array using square brackets, or to an anonymous hash using curlies.

my ($ar, $br);
$ar = [1, 2, 3, 4];
print($$ar[0]);  # 1
$br = {this => 1, that => 'two', them => 7.93e-5};
print($$br{that});  # two
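
Perl also offers an arrow operator for dereferencing, which many people find easier to read than the $$ form. A short sketch, reusing the references from above:

```perl
use strict;
use warnings;

my $ar = [1, 2, 3, 4];
my $br = {this => 1, that => 'two'};

print($ar->[0], "\n");     # 1, same element as $$ar[0]
print($br->{that}, "\n");  # two, same value as $$br{that}

# ref() reports what kind of thing a reference points to
print(ref($ar), "\n");     # ARRAY
print(ref($br), "\n");     # HASH
```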

Data Structures

How can you implement a C struct in Perl?

struct s {
   int a;
   float b;
   char *c;
};
struct s t = {1, 2.5, "string"};

With a hash:

my (%t);
$t{a} = 1;
$t{b} = 2.5;
$t{c} = 'string';

What about an array of C structs?

struct s t[100];
t[0].a = 2;

With an array of hashes:

my (@t);
$t[0]{a} = 2;
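
A slightly longer sketch of the same idea. Each array element is really a reference to a hash, so a whole record can also be assigned in one go with an anonymous hash. The second record and the loop variable $rec are illustrative additions.

```perl
use strict;
use warnings;

my @t;
$t[0]{a} = 2;                                # element 0 becomes a hash reference automatically
$t[1] = {a => 5, b => 2.5, c => 'string'};   # a whole record at once

foreach my $rec (@t) {
    print($$rec{a}, "\n");   # 2, then 5
}
print(scalar(@t), "\n");     # 2 records
```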

How about a hash of arrays? Later you'll see a dictionary (a hash keyed by word) where the values are the positions in a text at which each word is found. As each word may appear more than once, an array of positions is required:

my (%d);
$d{dog}[0] = 34;

Alternatively you can operate explicitly on one of the sub-arrays:

push(@{$d{dog}}, 34);
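
To make the dictionary idea concrete, here is a minimal sketch that records word positions for a short sample string. The text and the choice of counting positions as word indices from 0 are assumptions made for illustration, not the dictionary code that appears later.

```perl
use strict;
use warnings;

# Hypothetical sample text; positions are word indices starting at 0
my $text = 'the dog saw the other dog';
my @words = split(' ', $text);

my %d;
for my $i (0 .. $#words) {   # $#words is the last index of @words
    # The sub-array for each word springs into existence on first use
    push(@{$d{$words[$i]}}, $i);
}

print(join(', ', @{$d{dog}}), "\n");  # 1, 5
print(join(', ', @{$d{the}}), "\n");  # 0, 3
```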

Compound data structures like these, and more complicated ones, use references to achieve their effect. Perl dereferencing syntax can be a little daunting, but it's worth fighting through. There are more examples in the code that follows.