Advanced Perl
Perl References
Uses cases for references
- Creating multi-dimensional and other deep data structures (arrays of arrays, hashes of arrays, etc.)
- Passing complex data structures to functions (e.g. arrays, hashes)
References
\$x # reference to scalar
\@y # reference to array
\%z # reference to hash
\&f # reference to function
Array References
- @array
- ARRAY
To solve the problem we will use references. Prefixing the array with a back-slash \ creates a reference to it.
my $names_ref = \@names;
print $names_ref; # ARRAY(0x703dcf2)
@$names_ref
but it will be probably more readable to write
@{$names_ref}
or even
@{ $names_ref }
Once we know all this we can pass a reference to a function, within the function we can dereference the array and we get back the original array.
#!/usr/bin/env perl
use strict;
use warnings;
my @names = qw(Foo Bar Baz);
my $names_ref = \@names;
print "$names_ref\n"; # ARRAY(0x703dcf2)
print "@$names_ref\n"; # Foo Bar Baz
print "@{ $names_ref }\n"; # Foo Bar Baz
print "$names[0]\n"; # Foo
print "$names_ref->[0]\n"; # Foo
Hash References
- %hash
- HASH
Similarly to the Array references one can create references to Hashes as well.
my $phones_ref = \%phones;
Hash Hash Reference
%phones %{ $phones_ref }
$phones{Foo} ${ $phones_ref }{Foo}
$phones_ref->{Foo}
keys %phones keys %{ $phones_ref }
Using the print function to print out the content of an Array was quite OK. Printing a Hash was a disaster. The same happens once you dereference a reference to an array or hash.
#!/usr/bin/env perl
use strict;
use warnings;
my %phones = (
Barney => 123,
Fred => 456,
Wilma => 789,
);
my $phones_ref = \%phones;
print $phones_ref, "\n"; # HASH(0x703fed1)
my @names = sort keys %$phones_ref};
print "@names\n"; # Barney Fred Wilma
print "$phones{Fred}\n"; # 456
print "$phones_ref->{Fred}\n"; # 456
Scalar references
- scalar references
Similar to array references and hash references one can also create references to scalars.
#!/usr/bin/env perl
use strict;
use warnings;
my $name = "Foo";
my $name_ref = \$name;
print "$name_ref\n"; # SCALAR(0x562516e5f140)
print "$name\n"; # Foo
print "$$name_ref\n"; # Foo
Subroutine references
- subroutine references
- function references
#!/usr/bin/env perl
use strict;
use warnings;
sub add {
my ($x, $y) = @_;
return $x+$y;
}
my $add_ref = \&add;
print "$add_ref\n"; # CODE(0x564e85424148)
print &{ $add_ref }(2, 3), "\n"; # 5
print $add_ref->(2, 3), "\n"; # 5
Passing values to a function
Let's see a simple example of passing values to a function.
#!/usr/bin/bin perl
use strict;
use warnings;
sub add {
my ($x, $y) = @_;
return $x+$y;
}
print add(2, 3), "\n";
Add two (or more) arrays
- @_
Let's extend it so it will be able to take two vectors (arrays) and add
them pair-wise. (2, 3) + (7, 8, 5) = (9, 11, 5)
#!/usr/bin/env perl
use strict;
use warnings;
my @left = (2, 3);
my @right = (7, 8, 5);
add(@left, @right);
sub add {
my ($first, $second) = @_;
print "$first\n";
print "$second\n";
}
# 2
# 3
addx(@left, @right);
sub addx {
my (@first, @second) = @_;
print "First: @first\n";
print "Second: @second\n";
}
# First: 2 3 7 8 5
# Second:
The problem is, @_ in the add function will get (2, 3, 7, 8, 5); We cannot know where does the first array end and where the second begins.
(I am sure one can come up with a complex way of prefixing the real data by meta information that describes the data, but why work so hard if Perl already has the tools you need?)
Add two arrays
#!/usr/bin/env perl
use strict;
use warnings;
my @first = (2, 3);
my @second = (7, 8, 5);
my @res = add(\@first, \@second); # passing two references
print "@res\n";
sub add {
my ($one_ref, $two_ref) = @_;
my @one = @{ $one_ref }; # dereferencing and copying each array
my @two = @{ $two_ref };
my @result;
foreach my $i (0..@one-1) {
if (defined $two[$i]) {
push @result, $one[$i] + $two[$i];
}
else {
push @result, $one[$i];
}
}
foreach my $i (@one..@two-1) {
push @result, $two[$i];
}
return @result;
}
Passing two arrays to a function
Array References
Of course if the arrays are big, copying them is waste of time and memory. Let's see how can we access the individual elements of an array when using a reference to that array. Then we won't have to copy them within the function.
my $names_ref = \@names;
Array Array Reference
@names @{ $names_ref }
$names[0] ${ $names_ref }[0]
$names_ref->[0]
$#names $#$names_ref
You really don't want to use this $#$names_ref
among people...
Process arrays without copying
This is similar to the previous solution but without copying the arrays
#!/usr/bin/env perl
use strict;
use warnings;
my @first = (2, 3);
my @second = (7, 8, 5);
my @res = add(\@first, \@second);
print "@res\n";
sub add {
my ($one_ref, $two_ref) = @_;
my @result;
foreach my $i (0..@{ $one_ref }-1) {
if (defined $two_ref->[$i]) {
push @result, $one_ref->[$i] + $two_ref->[$i];
}
else {
push @result, $one_ref->[$i];
}
}
foreach my $i (@{ $one_ref }..@{ $two_ref }-1) {
push @result, $two_ref->[$i];
}
return @result;
}
Scope of variables
- scope
Let's see an example where we create an array and a reference to it in some arbitrary {} scope. The array is defined within the scope while the variable holding the reference is defined outside the scope.
#!/usr/bin/env perl
use strict;
use warnings;
my $names_ref;
{
my @names = qw(Foo Bar);
$names_ref = \@names;
}
print "$names_ref->[0]\n"; # Foo
# print "@names\n"; # Global symbol "@names" requires explicit package name
#!/usr/bin/env perl
use strict;
use warnings;
use Devel::Refcount qw(refcount);
my $names_ref;
{
my @names = qw(Foo Bar);
$names_ref = \@names;
}
print "$names_ref->[0]\n"; # Foo
print(refcount($names_ref), "\n"); # 1
After the closing } @names went out of scope already but $names_ref still lets us access the values.
As $names_ref still holds the reference to the location of the original array in the memory perl keeps the content of the array intact.
Reference counting
- reference counting}
As you already know perl automatically allocates memory when you add a new element to an array or hash. It also frees up memory (although only internally) when you remove the element from the array or hash.
Similarly when you create an array or a hash, perl sets up a counter (called reference counter) and sets it to 1. When the array or hash goes out of scope the counter is reduced by 1. If the counter reaches 0 perl frees the memory allocated to that array. Without references this is always the case.
When you create a reference to an array and store it in a variable perl increases the reference counter of the original array by 1 (to 2). Now if the array goes out of scope and the ref count is reduced by 1 it still does not reach 0 so perl does not free up the location of the array.
Only if the reference stops referring to this memory location will the counter reach 0 thereby freeing up the memory allocated to the array.
More Reference Counting
my @names = qw(Foo Bar Baz); # cnt = 1
my $names_ref = \@names; # cnt = 2
my $other_ref = \@names; # cnt = 3
my $x_ref = $names_ref; # cnt = 4
$other_ref = undef; # cnt = 3
Process arrays without copying even the return values
In the previous solution we passed the references to the function but returned a full array, thereby copying all the values. If we care about memory and speed we might eliminate this copying by returning the reference to the resulting array.
The cost is a small (?) inconvenience as now we have to dereference the resulting array reference in the calling code.
#!/usr/bin/env perl
use strict;
use warnings;
my @first = (2, 3);
my @second = (7, 8, 5);
my $res_ref = add(\@first, \@second);
print "@{ $res_ref }\n";
sub add {
my ($one_ref, $two_ref) = @_;
my @result;
foreach my $i (0..@{ $one_ref }-1) {
if (defined $two_ref->[$i]) {
push @result, $one_ref->[$i] + $two_ref->[$i];
}
else {
push @result, $one_ref->[$i];
}
}
foreach my $i (@{ $one_ref }..@{ $two_ref }-1) {
push @result, $two_ref->[$i];
}
my $ret = \@result;
return $ret;
}
Debugging (pretty printing)
- Data::Dumper
But once we have those references we have a better tool to print out their content.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @names = qw(Foo Bar Baz);
my $names_ref = \@names;
my %phones = (
Foo => 123,
Bar => 456,
Baz => 789,
);
my $phones_ref = \%phones;
print Dumper $names_ref, $phones_ref;
$VAR1 = [
'Foo',
'Bar',
'Baz'
];
$VAR2 = {
'Bar' => 456,
'Baz' => 789,
'Foo' => 123
};
Actually you can use the Dumper on the references themself without putting them in scalar variables.
print Dumper \@names, \%phones;
Change values in a reference
If you create a copy of an array then the two arrays are separated. Change to any of the arrays is not reflected in the other array.
#!/usr/bin/env perl
use strict;
use warnings;
my @names = qw(Foo Bar);
my @copy_names = @names;
$copy_names[0] = 'Baz';
print "$names[0]\n"; # Foo
print "$copy_names[0]\n"; # Baz
When you create a reference to an array, then the referenced array has the same memory location, hence change in either one of them is a change in both of them.
#!/usr/bin/env perl
use strict;
use warnings;
my @names = qw(Foo Bar);
my $names_ref = \@names;
$names_ref->[0] = 'Baz';
print "$names[0]\n"; # Baz
print "$names_ref->[0]\n"; # Baz
That means you can pass to a function an array reference, then from within the function it is easy to change the content of the original array.
Exercise: double numbers
Create a function that gets an array reference and multiplies each value in it by 2;
my @numbers = (2, 4, 7);
multiply_by_two(\@numbers);
print "@numbers\n"; # 4 8 14
Exercise: Add many arrays
Pick up the examples/references/add_arrays.pl script that can add two arrays and change it to accept any number of array references.
Extra exercise: add parameters that will control to stop the addition at the shortest array or the longest array.
my @a = (2, 3);
my @b = (4, 5, 6);
add('shortest', \@a, \@b); # returns (6, 8)
add('longest', \@a, \@b); # returns (6, 8, 6)
Exercise: Function to compare two hashes
Create a function that given two hashes, returns a report showing missing keys or keys with different values.
#!/usr/bin/env perl
use strict;
use warnings;
my %x = (
foo => 1,
bar => 2,
baz => 3,
zoo => 6,
foobar => undef,
moose => undef,
);
my %y = (
foo => 1,
bar => 4,
moo => 5,
zoo => undef,
foobar => 9,
moose => undef,
);
my @report = compare_hashes(\%x, \%y);
print join "\n", @report;
print "\n";
The value of 'bar' differs: '2' and '4'
The key 'baz' does not exist in second hash
The value of 'zoo' is '6' in the fist has and undef in the second hash
The value of 'foobar' is '9' in the second has and undef in the first hash
The key 'moo' does not exist in first hash
Solution: Double numbers
#!/usr/bin/env perl
use strict;
use warnings;
my @numbers = (2, 4, 7);
multiply_by_two(\@numbers);
print "@numbers\n"; # 4 8 14
sub multiply_by_two {
my ($ref) = @_;
foreach my $number (@$ref) {
$number *= 2;
}
}
Solution: Add many arrays
#!/usr/bin/env perl
use strict;
use warnings;
my @first = (2, 3);
my @second = (7, 8, 5);
my @third = (9, 10, 2, 4);
my $res = add(\@first, \@second, \@third);
print "@$res\n"; # 18 21 7 4
sub add {
my @args = @_;
return [] if @args == 0;
my $first = shift @args;
if (@args) {
my $rest = add(@args);
return add2($first, $rest);
} else {
return [@$first];
}
}
# is the same as was the add() in add_arrays_nocopy_return.pl
sub add2 {
my ($one_ref, $two_ref) = @_;
my @result;
foreach my $i (0..@{ $one_ref }-1) {
if (defined $two_ref->[$i]) {
push @result, $one_ref->[$i] + $two_ref->[$i];
}
else {
push @result, $one_ref->[$i];
}
}
foreach my $i (@{ $one_ref }..@{ $two_ref }-1) {
push @result, $two_ref->[$i];
}
return \@result;
}
Solution: Add many arrays
#!/usr/bin/perl
use strict;
use warnings;
my @first = (2, 3);
my @second = (7, 8, 5);
my @third = (9, 10, 2, 4);
my $res = add(\@first, \@second, \@third);
print "@$res\n"; # 18 21 7 4
sub add {
my @args = @_;
return [] if @args == 0;
my $longest = 0;
foreach my $r (@args) {
if ($longest < @$r) {
$longest = @$r;
}
}
my @result;
foreach my $i (0..$longest-1) {
foreach my $r (@args) {
$result[$i] += (defined $r->[$i] ? $r->[$i] : 0);
}
}
return \@result;
}
Solution: Function to compare two hashes
#!/usr/bin/perl
use strict;
use warnings;
my %x = (
foo => 1,
bar => 2,
baz => 3,
zoo => 6,
foobar => undef,
moose => undef,
);
my %y = (
foo => 1,
bar => 4,
moo => 5,
zoo => undef,
foobar => 9,
moose => undef,
);
my @report = compare_hashes(\%x, \%y);
print join "\n", @report;
print "\n";
sub compare_hashes {
my ($first, $second) = @_;
my @report;
foreach my $k (keys %{ $first }) {
if (not exists $second->{$k}) {
push @report, "The key '$k' does not exist in second hash";
} elsif (not defined $first->{$k} and not defined $second->{$k}) {
# ok, neither is defined
} elsif (defined $first->{$k} and not defined $second->{$k}) {
push @report,
"The value of '$k' is '$first->{$k}' in the first hash"
. " and undef in the second hash";
} elsif (not defined $first->{$k} and defined $second->{$k}) {
push @report,
"The value of '$k' is '$second->{$k}' in the second hash"
. " and undef in the first hash";
} elsif ($first->{$k} ne $second->{$k}) {
push @report,
"The value of '$k' differs: '$first->{$k}' and '$second->{$k}'";
}
}
foreach my $k (keys %{ $second }) {
if (not exists $first->{$k}) {
push @report, "The key '$k' does not exist in first hash";
}
}
return @report;
}
Anonymous Arrays
- anonymous array
- []
Occasionally we are not interested in the array @names, just in a reference to it. We can use the scoping trick we saw earlier to force the array out of scope immediately once it was used.
my $names_ref;
{
my @names = ("Foo", "Bar", "Baz");
$names_ref = \@names;
}
# @names is not in scope here
# $names_ref is still is
We can reach the same result without the temporary array and the whole scoping issue by creating what we call an "anonymous array". That is an array that does not have a real name, (@name) just a reference to it.
my $other_names_ref = ["Foo", "Bar", "Baz"];
my $yet_another_names_ref = [ qw(Foo Bar Baz) ];
We got used to seeing [] as an indicator of an array element. The same visual clue will help us remember that the above creates an array reference.
Array of Arrays
- AoA
#!/usr/bin/perl
use strict;
use warnings;
my $first_names_ref = [ qw(Foo Bar Baz) ];
my $family_names_ref = [ qw(Moo Zorg) ];
my @names = ($first_names_ref, $family_names_ref);
print "$first_names_ref\n"; # ARRAY(0x703dcf2)
print "@{ $first_names_ref }\n"; # Foo Bar Baz
print "$names[0]\n"; # ARRAY(0x703dcf2)
print "@{ $names[0] }\n"; # Foo Bar Baz
print "@{ $names[1] }\n"; # Moo Zorg
print "$first_names_ref->[0]\n"; # Foo
print "$names[0]->[0]\n"; # Foo
print "$names[0][0]\n"; # Foo
That already looks like a two dimensional array. It is not, but it can be used like one.
Array of Arrays (AoA)
Instead of naming the internal array references we can use them within the creation of the larger array.
#!/usr/bin/env perl
use strict;
use warnings;
my @names = (
[ qw(Foo Bar Baz) ],
[ qw(Moo Zorg) ],
);
print "$names[0]\n"; # ARRAY(0x703dcf2)
print "@{ $names[0] }\n"; # Foo Bar Baz
print "@{ $names[1] }\n"; # Moo Zorg
print "$names[0]->[0]\n"; # Foo
print "$names[0][0]\n"; # Foo
Many dimensional arrays
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @dim;
$dim[0] = "zero";
$dim[1][0] = "one-zero";
$dim[1][1][1] = "one-one-one";
$dim[2][0][0] = "two-zero-zero";
print Dumper \@dim;
$VAR1 = [
'zero',
[
'one-zero',
[
undef,
'one-one-one'
]
],
[
[
'two-zero-zero'
]
]
];
Anonymous hashes
- anonymous hash
- {}}
- HoH}
#!/usr/bin/perl
use strict;
use warnings;
my $phones_ref = {
Foo => 123,
Bar => 456,
Moo => 789,
};
print "$phones_ref->{Foo}\n"; # 123
Hash of Hashes (HoH)
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %distance;
$distance{'New York'}{'London'} = 5_027;
$distance{'New York'}{'Bejin'} = 10_100;
$distance{'Paris'}{'London'} = 350;
print Dumper \%distance;
$VAR1 = {
'Paris' => {
'London' => 350
},
'New York' => {
'Bejin' => 10100,
'London' => 5027
}
};
More complex data structures
$grade{Name}{Subject}[index] = Exam-Grade;
$grade{Name}{Subject} = Final-Grade;
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %grades;
$grades{"Foo Bar"}{Mathematics} = 97;
$grades{"Foo Bar"}{Literature} = 67;
$grades{"Peti Bar"}{Literature} = 88;
$grades{"Peti Bar"}{Mathematics} = 82;
$grades{"Peti Bar"}{Art} = 99;
$grades{"Foo Bar"}{Chemistry}[0] = 30;
$grades{"Foo Bar"}{Chemistry}[1] = 48;
$grades{"Foo Bar"}{Chemistry}[2] = 72;
$grades{"Foo Bar"}{Chemistry}[3] = 80;
print Dumper \%grades;
$VAR1 = {
'Peti Bar' => {
'Art' => 99,
'Literature' => 88,
'Mathematics' => 82
},
'Foo Bar' => {
'Chemistry' => [
30,
48,
72,
80
],
'Literature' => 67,
'Mathematics' => 97
}
};
Memory leak with cross references
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
# memory leak
# try running with 5,000,000
make_match('Foo', 'Bara', 'dump');
my $n = $ARGV[0] || 0;
for (1..$n) {
make_match('Foo', 'Bara');
}
if (@ARGV) {
print "Please press ENTER to finish...";
<STDIN>;
}
sub make_match {
my ($man_name, $woman_name, $dump) = @_;
my $man = {
name => $man_name,
};
my $wife = {
name => $woman_name,
};
$man->{wife} = $wife;
$wife->{man} = $man;
if ($dump) {
print Dumper $man, $wife;
}
}
$VAR1 = {
'wife' => {
'man' => $VAR1,
'name' => 'Bara'
},
'name' => 'Foo'
};
$VAR2 = $VAR1->{'wife'};
Run the script and when it displays the prompt, check the memory usage. Passing 100,000 on the command line made it use 39 Mb memory.
Memory leak with cross references - weaken
- weaken
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(weaken);
# memory leak
# try running with 5,000,000
make_match('Foo', 'Bara', 'dump');
my $n = $ARGV[0] || 0;
for (1..$n) {
make_match('Foo', 'Bara');
}
if (@ARGV) {
print "Please press ENTER to finish...";
<STDIN>;
}
sub make_match {
my ($man_name, $woman_name, $dump) = @_;
my $man = {
name => $man_name,
};
my $wife = {
name => $woman_name,
};
$man->{wife} = $wife;
$wife->{man} = $man;
weaken $wife->{man};
if ($dump) {
print Dumper $man, $wife;
}
}
Read CSV file
- CSV
{% embed include file="src/examples/references/data.csv)
We would like to read in that file and be able to access the fname of row 5
as $data[3]{fname}
# the fname on line 5 is in index 3 because:
# line 1 is the header
# line 2 is element 0 in the array
# ...
# line 5 is element 3 in the array
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $filename = shift
or die "Usage: $0 FILENAME ( examples/references/data.csv )\n";
my @data;
open my $fh, '<', $filename or die;
my $header = <$fh>;
chomp $header;
my @header = split /,/, $header;
while (my $line = <$fh>) {
chomp $line;
my %row;
my @values = split /,/, $line;
foreach my $i (0..@header-1) {
my $field = $header[$i];
$row{$field} = $values[$i];
}
# using hash slices:
# @row{@header} = split /,/, $line;
push @data, \%row;
}
print Dumper \@data;
$VAR1 = [
{
'lname' => 'Bar',
'fname' => 'Boo',
'phone' => '123'
},
{
'lname' => 'Baz',
'fname' => 'Foo',
'phone' => '456'
},
{
'lname' => 'Zorg',
'fname' => 'Moo',
'phone' => '789'
}
];
Exercise: read in an ini file
Read in an ini file and create a two dimensional hash.
First dimension - section name. Second dimension - key name.
print $data{Colors}{blue}; # dark
{% embed include file="src/examples/references/data.ini)
Exercise: improve the csv reader
Instead of an array create a hash in which the key is
the fname and the value is the hash reference.
To get the phone of 'Foo' we could write:
$data{Foo}{phone}
- First you can assume fname values are unique.
- Then improve it by checking if the fname is indeed unique and warn if not.
- Further improve it by removing the fname from the internal hashes.
- Further improve by passing the name of the key field (currently fname) as a parameter.
- Bonus: try to deal with the case when there are duplicate values for the key field without just giving an error message.
Solution: Read ini file
Read in an ini-file and create a two dimensional hash.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $filename = shift
or die "Usage: $0 FILENAME ( examples/references/data.ini )\n";
my %ini;
my $section = '';
open my $fh, '<', $filename or die;
while (my $line = <$fh>) {
chomp $line;
next if $line =~ /^\s*(#.*)?$/;
if ($line =~ /^\[(.*)\]/) {
$section = $1;
next;
}
if ($line =~ /^(\w+)\s*=\s*(.*)$/) {
$ini{$section}{$1} = $2;
next;
}
}
print Dumper \%ini;
Solution: improve the csv reader
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $filename = shift
or die "Usage: $0 FILENAME ( examples/references/data.csv )\n";
my %data;
open my $fh, '<', $filename or die;
my $header = <$fh>;
chomp $header;
my @header = split /,/, $header;
while (my $line = <$fh>) {
chomp $line;
my %row;
@row{@header} = split /,/, $line;
my $key = $row{fname};
$data{$key} = \%row;
}
print Dumper \%data;
autovivification
- autovivification
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $data;
$data->{Foo}{Age} = 23;
print Dumper $data;
if ($data->{Bar}{Age} > 18) {
print "Your are too old\n";
}
print Dumper $data;
delete $data->{Moo}{Age};
print Dumper $data;
Output:
$VAR1 = {
'Foo' => {
'Age' => 23
}
};
$VAR1 = {
'Bar' => {},
'Foo' => {
'Age' => 23
}
};
$VAR1 = {
'Bar' => {},
'Moo' => {},
'Foo' => {
'Age' => 23
}
};
Scalar references in Getopt::Long
- Getopt::Long
It is used less often than the other references but one of the prominent uses is the GetOptions function of Getopt::Long.
use Getopt::Long;
my $file = "default.txt";
my $debug;
GetOptions(
"file=s" => \$file,
"debug" => \$debug,
) or die;
Anonymous subroutines
#!/usr/bin/perl
use strict;
use warnings;
my $add_ref = sub {
my ($x, $y) = @_;
return $x+$y;
};
print &{ $add_ref }(2, 3), "\n";
print $add_ref->(2, 3), "\n";
Uses of Subroutine references
The find() function of File::Find.
Exercise: DNA Sequence Analyzer
Create a function for researching DNA sequences. Below, you can see the "source" file. There is all kind of extra information, that is currently not interesting to us. Implement the read_file function that goes over the given file extracting and returning the list of all the DNA sequences. The user of your function can now, create a foreach-loop going over the list of DNA sequences and call her own function that will analyze the sequence and print the results. See the input file, the skeleton of the solution in which I already added the analyze function the end user is going to write. The expected output can be also found. (Of course this read_file should work regardless of what the analyze function does.)
Some header text
-----
name: Name of the source
date: 1999.01.23
DNA: GAGATTC
name: Name of the source
date: 1999.01.24
DNA: GAGATCCTGC
name: Name of the source
date: 2007.01.24
DNA: CGTGAGAATCTGC
name: Name of the source
date: 2008.01.24
DNA: CGTGATCTGC
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file($file);
foreach my $seq (@dna_sequences) {
analyze_dna($seq);
}
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
}
}
sub read_file {
...
}
GAGATTC has double T
GAGATCCTGC has double C
CGTGAGAATCTGC has double A
Solution: DNA Sequence Analyzer
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file($file);
foreach my $seq (@dna_sequences) {
analyze_dna($seq);
}
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
}
}
sub read_file {
my ($filename) = @_;
open my $fh, '<', $filename or die;
my @results;
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /^DNA:\s*([CGTA]+)/) {
push @results, $1;
}
}
return @results;
}
Exercise: DNA Sequence Analyzer with callback
The above can work, but if the file is huge, we might not be able to hold all the list in memory. Change the read_file function that to allow the user to supply the analyze function (or rather the reference to the analyze function) as a parameter. See the skeleton below. The output should be the same as above.
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file(\&analyze_dna, $file);
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
}
}
sub read_file {
...
return;
}
Solution: DNA Sequence Analyzer with callback
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file(\&analyze_dna, $file);
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
}
}
sub read_file {
my ($sub, $filename) = @_;
open my $fh, '<', $filename or die;
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /^DNA:\s*([CGTA]+)/) {
$sub->($1);
}
}
return;
}
Exercise: DNA Sequence Analyzer with shortcut
What if you would like to provide the user the ability to stop the processing any time she wants even if the file has not finished yet? For example when she found the first match. This can be controlled by the return value of the analyze function. If it is true, stop processing. See the skeleton and the expected output:
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file(\&analyze_dna, $file);
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
return 1
}
return 0;
}
sub read_file {
...
return;
}
GAGATTC has double T
Solution: DNA Sequence Analyzer with shortcut
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'dna.txt';
my @dna_sequences = read_file(\&analyze_dna, $file);
sub analyze_dna {
my ($dna) = @_;
if ($dna =~ /(.)\1/) {
print "$dna has double $1\n";
return 1
}
return 0;
}
sub read_file {
my ($sub, $filename) = @_;
open my $fh, '<', $filename or die;
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /^DNA:\s*([CGTA]+)/) {
last if $sub->($1);
}
}
return;
}
Dispatch Table
#!/usr/bin/perl
use strict;
use warnings;
sub add {
my ($x, $y) = @_;
return $x+$y;
}
sub multiply {
my ($x, $y) = @_;
return $x*$y;
}
my %dispatch = (
'+' => \&add,
'*' => \&multiply,
);
my $op = '+';
print $dispatch{$op}->(2, 3), "\n";
Dispatch Table using symbolic references
#!/usr/bin/perl
use strict;
use warnings;
sub add {
my ($x, $y) = @_;
return $x+$y;
}
sub multiply {
my ($x, $y) = @_;
return $x*$y;
}
my %dispatch = (
'+' => 'add',
'*' => 'multiply',
);
my $op = '+';
# Generally not recommended to use symbolic references!!
# unless you really know what are you doing
{
no strict 'refs';
print $dispatch{$op}->(2, 3), "\n";
}
The ref() function
- ref
my $name = 'Foo';
ref($name); # undef
ref(\$name); # SCALAR
my $array_ref = [];
ref($array_ref); # ARRAY
my $hash_ref = {};
ref($hash_ref); # HASH
my $sub_ref = sub {};
ref($sub_ref); # CODE
Copy a data structure
Copy an array or hash referred to by a reference.
#!/usr/bin/perl
use strict;
use warnings;
my $names_ref = [qw(Foo Zorg)];
# copy the ARRAY reference, the content is the same location
my $names_other_ref = $names_ref;
$names_other_ref->[0] = 'Bar';
print "$names_other_ref->[0]\n"; # Bar
print "$names_ref->[0]\n"; # Bar
print "$names_ref\n";
print "$names_other_ref\n";
# copy the content of the ARRAY reference
my $names_yet_other_ref = [ @{$names_ref} ];
$names_yet_other_ref->[0] = 'Moo';
print "$names_yet_other_ref->[0]\n"; # Moo
print "$names_ref->[0]\n"; # Bar
print "$names_other_ref->[0]\n"; # Bar
print "$names_yet_other_ref\n";
Copy HASH reference
$other_phones_ref = { %{$phones_ref} };
Deep copy
- deep copy
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my $ini = {
'name' => {
Foo => 123,
Bar => 456,
}
};
my $other_ini = { %{ $ini } };
print "$ini->{name}{Foo}\n"; # 123
print "$other_ini->{name}{Foo}\n"; # 123
$ini->{phone}{Baz} = 678;
print "$ini->{phone}{Baz}\n"; # 678
print "$other_ini->{phone}{Baz}\n"; # undef as expected
print Dumper $ini, $other_ini;
$ini->{name}{Foo} = 999;
print "$ini->{name}{Foo}\n"; # 999
print "$other_ini->{name}{Foo}\n"; # 999 !!!!
print Dumper $ini, $other_ini;
123
123
678
$VAR1 = {
'name' => {
'Bar' => 456,
'Foo' => 123
},
'phone' => {
'Baz' => 678
}
};
$VAR2 = {
'name' => $VAR1->{'name'},
'phone' => {}
};
999
999
$VAR1 = {
'name' => {
'Bar' => 456,
'Foo' => 999
},
'phone' => {
'Baz' => 678
}
};
$VAR2 = {
'name' => $VAR1->{'name'},
'phone' => {}
};
Deep copy - Storable dclone
- dclone
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(dclone);
my $ini = {
'name' => {
Foo => 123,
Bar => 456,
}
};
my $other_ini = dclone($ini);
print "$ini->{name}{Foo}\n";
print "$other_ini->{name}{Foo}\n";
$ini->{phone}{Baz} = 678;
print "$ini->{phone}{Baz}\n";
print "$other_ini->{phone}{Baz}\n"; #undef
$ini->{name}{Foo} = 999;
print "$ini->{name}{Foo}\n"; # 999
print "$other_ini->{name}{Foo}\n"; # 123
Serialization
-
Data::Dumper
-
Storable
-
YAML
-
JSON
-
XML::Dumper
Data::Dumper
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper ();
my $data_structure = {
phones => {
Foo => 123,
Bar => 345,
Baz => 678,
}
};
open my $fh, '>', 'dumped.data' or die;
print $fh Data::Dumper->Dump([$data_structure], ['data_structure']);
close $fh;
$data_structure = {
'phones' => {
'Bar' => 345,
'Baz' => 678,
'Foo' => 123
}
};
#!/usr/bin/perl
use strict;
use warnings;
my $data_structure;
my $filename = 'dumped.data';
if (open my $fh, '<', $filename) {
local $/ = undef;
my $dump = <$fh>;
eval $dump;
} else {
die "Could not open '$filename' $!";
}
print "Foo: $data_structure->{phones}{Foo}\n"; # Foo: 123
print "Bar: $data_structure->{phones}{Bar}\n"; # Bar: 345
print "Baz: $data_structure->{phones}{Baz}\n"; # Baz: 678
Data::Dumper both dump and restore
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper ();
my $data_structure;
my $filename = 'dumped.data';
if (-e $filename and open my $fh, '<', $filename) {
local $/ = undef;
my $dump = <$fh>;
eval $dump;
} else {
# initialize
$data_structure = {
phones => {
Foo => 0,
Bar => 0,
Baz => 0,
}
}
};
# update data
$data_structure->{phones}{Foo} += 1 * int rand 5;
$data_structure->{phones}{Bar} += 3 * int rand 5;
$data_structure->{phones}{Baz} += 9 * int rand 5;
print "Foo: $data_structure->{phones}{Foo}\n";
print "Bar: $data_structure->{phones}{Bar}\n";
print "Baz: $data_structure->{phones}{Baz}\n";
open my $fh, '>', $filename or die;
print $fh Data::Dumper->Dump([$data_structure], ['data_structure']);
close $fh;
Storable
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store);
my $data_structure = {
phones => {
Foo => 123,
Bar => 345,
Baz => 678,
}
};
store $data_structure, 'frozen.data' or die;
creates a binary file
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(retrieve);
my $data = retrieve 'frozen.data' or die;
print "Foo: $data->{phones}{Foo}\n"; # Foo: 123
print "Bar: $data->{phones}{Bar}\n"; # Bar: 345
print "Baz: $data->{phones}{Baz}\n"; # Baz: 678
Storable in memory freezing
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(freeze thaw);
my $data_structure = {
phones => {
Foo => 123,
Bar => 345,
Baz => 678,
}
};
my $frozen = freeze $data_structure;
# here send it over the network
# or save it in a database
my $new_data_structure = thaw $frozen;
print "$new_data_structure->{phones}{Foo}\n"; # 123
print "$new_data_structure->{phones}{Bar}\n"; # 345
print "$data_structure\n"; # HASH(0x4e8144)
print "$new_data_structure\n"; # HASH(0x27b4a3c)
print "$data_structure->{phones}\n"; # HASH(0x4e7fe4)
print "$new_data_structure->{phones}\n"; # HASH(0x1ffe294)
YAML
---
people:
Bar:
phones:
- 345
- 678
title: CEO
Baz: NA
Foo:
phone: 123
#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(DumpFile);
my $data_structure = {
people => {
Foo => {
phone => '123',
},
Bar => {
phones => [
'345',
'678',
],
title => 'CEO',
},
Baz => 'NA',
}
};
DumpFile 'data.yml', $data_structure;
#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(LoadFile);
my $data = LoadFile('data.yml');
print "$data->{people}->{Foo}->{phone}\n"; # 123
print "$data->{people}->{Bar}->{phones}->[0]\n"; # 345
print "$data->{people}->{Baz}\n"; # NA
YAML in one file
#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(DumpFile LoadFile);
my $data;
my $yml_file = 'data.yml';
if (-e $yml_file) {
$data = LoadFile($yml_file);
} else {
$data = {
people => {
Foo => {
phone => '123',
},
Bar => {
phones => [
'345',
'678',
],
title => 'CEO',
},
Baz => 'NA',
}
};
}
print "Foo: $data->{people}{Foo}{phone}\n"; # Foo: 123
print "Bar: $data->{people}{Bar}{phones}->[0]\n"; # Bar: 345
print "Baz: $data->{people}{Baz}\n"; # Baz: NA
DumpFile $yml_file, $data;
JSON in one file
{
"phones" : {
"Bar" : "21",
"Baz" : "72",
"Foo" : "12"
}
}
#!/usr/bin/perl
use strict;
use warnings;
use JSON qw(from_json to_json);
my $data;
my $json_file = 'data.json';
if (-e $json_file) {
open my $fh, '<', $json_file or die;
local $/ = undef;
my $json = <$fh>;
$data = from_json($json);
} else {
# initialize
$data = {
phones => {
Foo => 0,
Bar => 0,
Baz => 0,
}
};
}
# update data
$data->{phones}{Foo} += 1 * int rand 5;
$data->{phones}{Bar} += 3 * int rand 5;
$data->{phones}{Baz} += 9 * int rand 5;
print "Foo: $data->{phones}{Foo}\n";
print "Bar: $data->{phones}{Bar}\n";
print "Baz: $data->{phones}{Baz}\n";
open my $out, '>', $json_file or die;
print $out to_json($data, { pretty => 1, utf8 => 1, });
Main uses of references
- Creating complex data structure (2 or more dimensional arrays, hashes and more complex structures)
- Passing arrays, hashes and more complex data structure to and from functions
Exercise: save ini and csv as YAML
Take the ini-file reader and the csv reader and save the resulting data structures in YAML format.
Exercise: Create a cache for NetSlow
You are using a module call NetSlow.pm and its sole function compute();
This function can get two numbers and after a remote procedure call
(that takes a lot of time) returns a single value. Create a local cache
of the given input values and results so you won't need to access the
remote machine for identical calls. (Make sure your cache is persistent
between execution of your script.
examples/references/NetSlow.pm
Here is how we can use the module:
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/references';
use NetSlow qw(compute);
die "Need 2 numbers\n" if @ARGV != 2;
print compute(@ARGV), "\n";
Later on you find out that the results are changing over time.
You don't want to drop the caching and you decide you can live with
certain lack of accuracy. You decide you fetch the value again if
the last call to that specific set of input values was computed more
than 5 seconds ago. That is you expire the cache after 5 seconds.
Exercise: create a function that generates numbers multipliers
Create a function that given a number returns a subroutine reference. Calling that subroutine with an array reference will then multiply each value by the originally given number.
my $duplicate = multiplier_generator(2);
my @numbers = (2, 4, 7);
$duplicate->(\@numbers);
print "@numbers\n"; # 4 8 14
Solution: save ini and csv as YAML
Just add the following code:
use YAML qw(DumFile);
DumpFile 'ini.yml', \@data; # in read_csv_file_hash.pl
DumpFile 'ini.yml', \%ini; # in read_ini_file.pl
DumpFile 'csv.yml', \%data; # in read_csv_file_hash.pl
Solution: Create a cache for NetSlow
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/references';
use NetSlow qw(compute);
use YAML qw(DumpFile LoadFile);
die "Need 2 numbers\n" if @ARGV != 2;
my $cache_file = "netslow_cache.yml";
my $cache;
if (-e $cache_file) {
$cache = LoadFile($cache_file);
}
my ($x, $y) = @ARGV;
my $result;
if (not defined $cache->{$x}{$y}) {
$cache->{$x}{$y} = compute($x, $y);
}
DumpFile($cache_file, $cache);
print $cache->{$x}{$y}, "\n";
Solution: NetSlow cache with timeout
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/references';
use NetSlow qw(compute);
use YAML qw(DumpFile LoadFile);
die "Need 2 numbers\n" if @ARGV != 2;
my $cache_file = "netslow_cache_timeout.yml";
my $cache;
if (-e $cache_file) {
$cache = LoadFile($cache_file);
}
my ($x, $y) = @ARGV;
my $result;
my $TIMEOUT = 5;
if (not defined $cache->{$x}{$y} or
$cache->{$x}{$y}{timestamp} < time() - $TIMEOUT ) {
$cache->{$x}{$y}{value} = compute($x, $y);
$cache->{$x}{$y}{timestamp} = time;
}
DumpFile($cache_file, $cache);
print $cache->{$x}{$y}{value}, "\n";
Resources
Perl documentation:
- perlreftut - Mark Jason Dominus's very short tutorial about references
- perllol - Arrays of Arrays
- perldsc - Data Structure Cookbook
- perlref - Perl references and nested data structures
- perldata - Perl data types
Books:
- Perl Objects, References & Modules (Intermediate Perl)
- Advanced Perl Programming 1st edition !
- Object Oriented Perl
Processes and signals
Signals and the kill function
- $SIG
- %SIG
- kill
List of signals on Linux: man -S 7 signal
To send a signal use kill SIG, LIST
(process IDs)
For example: kill 9, $pid;
use strict;
use warnings;
my ($sig, @process) = @ARGV;
die "Usage: $0 SIGNAL PROCESS-ID\n" if not @process;
kill $sig, @process;
Catch signals
#!/usr/bin/env perl
use strict;
use warnings;
$SIG{KILL} = sub {
print "KILL received. kill -9 We cannot catch it.\n";
};
$SIG{INT} = sub {
print "INT received. kill -2 or Ctrl-C\n";
};
$SIG{TERM} = sub {
print "TERM received. kill -15\n";
};
$SIG{TSTP} = sub {
print "TSTP received. kill -20 or Ctrl-Z\n";
};
print "Please press Ctrl-C or Ctrl-Z\n";
print "(or type kill -15 $$ on the command line)\n";
for my $i (1..100) {
sleep 1;
print "$i\n";
}
print "done\n";
$SIG{$name} = sub {}; # do something when signal received
$SIG{$name} = 'IGNORE'; # ignore it
$SIG{$name} = 'DEFAULT'; # reset to the default behavior
#!/usr/bin/env perl
use strict;
use warnings;
my $ctrl_c = 0;
$SIG{INT} = sub {
if ($ctrl_c) {
print "\nCtrl-C received twice\n";
exit();
}
$ctrl_c++;
print "\nPlease, press Ctr-C again if you really mean it\n";
};
print "Please press Ctrl-C to terminate this program:\n";
my $i = 0;
while (1) {
$i++;
print "$i\n";
sleep 1;
}
print "done\n";
Exercise: Catch ctrl-c and ask continue or terminate?
Take the examples/signals/catch_ctr_c.pl
and change it so when the user presses Ctrl-C the counting
stops and the user is asked if she really wants
to terminate the program. (y/n).
If yes is given then quit. If no is given continue.
If Ctrl-c is pressed again later then ask again.
Make sure you do as little as possible in the actual signal handle.
Solution: Catch ctrl-c and ask
#!/usr/bin/env perl
use strict;
use warnings;
my $ctrl_c;
$SIG{INT} = sub {
$ctrl_c = 1;
};
print "Please press Ctrl-C to terminate this program:\n";
for my $i (1..100) {
confirm_exit();
sleep 1;
print "$i\n";
}
print "done\n";
sub confirm_exit {
return if not $ctrl_c;
local $SIG{INT} = 'IGNORE';
$ctrl_c = 0;
print "Do you really want to terminate the application? (y/n) [n]";
chomp(my $input = <STDIN>);
exit if $input eq 'y' or $input eq 'Y';
}
See what happens if you don't set the SIG handler to IGNORE?
See what happens if you remove the word local?
Graceful termination of process
This process will catch both the INT and the TERM signals and in both cases it will flip a flag and based on that flag it will
stop he program. INT is received when the user presses Ctrl-C or if using a another terminal the user sends the kill -2
signal.
TERM is kill -15
.
When we run the program it will print out its process ID and instructions how to stop it.
kill -15 810344 or press Ctrl-C to stop
So Ctrl-C instead of just killing the process in mid-operation it will tell the process to "please stop" and the loop will finish whatever it was doing and exit the program.
#!/usr/bin/env perl
use strict;
use warnings;
print "kill -15 $$ or press Ctrl-C to stop\n";
my $continue = 1;
$SIG{TERM} = sub {
print "TERM received\n";
$continue = 0;
};
$SIG{INT} = sub {
print "INT received\n";
$continue = 0;
};
while ($continue) {
if (open my $fh, '>>', 'process.log') {
print $fh scalar localtime();
print $fh "\n";
}
sleep 1;
}
Advanced slides
Always use strict and warnings
- strict
- warnings
Always start your Perl code with:
use strict;
use warnings;
Example code that generates warning:
Add code everywhere to avoid warnings:
#!/usr/bin/perl
use strict;
use warnings;
my $total;
add();
print "$total\n";
sub add {
$total = $total + rand();
}
# triggers:
# Use of uninitialized value in addition (+)
# at examples/advanced/code_with_warnings.pl line 14.
# or in 5.10
# Use of uninitialized value $total in addition (+)
# at examples/advanced/code_with_warnings.pl line 14.
Avoid warnings
#!/usr/bin/perl
use strict;
use warnings;
my $total;
add();
print "$total\n";
sub add {
if (not defined $total) {
$total = 0;
}
$total = $total + rand();
}
A lot of work. Cannot be done easily to an existing application.
Turn off warnings selectively and in a small scope
- no warnings
#!/usr/bin/perl
use strict;
use warnings;
my $total;
add();
print "$total\n";
sub add {
no warnings 'uninitialized';
$total = $total + rand();
}
See
perldoc warnings for the use of warnings
perldoc perllexwarn for the warning categories
perldoc perldiag for a list of warnings and errors
Catch and log warnings
- $SIG
- WARN
#!/usr/bin/perl
use strict;
use warnings;
local $SIG{__WARN__} = sub {
my $message = shift;
logger('warning', $message);
};
my $total;
add();
print "$total\n";
sub add {
$total = $total + rand();
}
# Use of uninitialized value in addition (+)
# at examples/advanced/code_with_warnings.pl line 14.
sub logger {
my ($level, $msg) = @_;
if (open my $out, '>>', 'log.txt') {
chomp $msg;
print $out "$level - $msg\n";
}
}
%SIG holds all the signals perl can deal with with the two special signal handles WARN and DIE sub {} is an anonymous subroutine we will discuss later
#!/usr/bin/perl
use strict;
use warnings;
my %WARNS;
local $SIG{__WARN__} = sub {
my $message = shift;
return if $WARNS{$message}++;
logger('warning', $message);
}
my $counter;
count();
print "$counter\n";
$counter = undef;
count();
sub count {
$counter = $counter + 42;
}
# Use of uninitialized value in addition (+)
# at examples/advanced/code_with_warnings.pl line 14.
sub logger {
my ($level, $msg) = @_;
if (open my $out, '>>', 'log.txt') {
chomp $msg;
print $out "$level - $msg\n";
}
}
splain and use diagnostics
- splain
- diagnostics
perl examples/advanced-perl/code_with_warnings.pl 2> err.txt
splain err.txt
The output looks like this:
Use of uninitialized value $total in addition (+) at
examples/advanced-perl/use_diagnostics.pl line 15 (#1)
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
To help you figure out what was undefined, perl will try to tell you the
name of the variable (if any) that was undefined. In some cases it cannot
do this, so it also tells you what operation you used the undefined value
in. Note, however, that perl optimizes your program and the operation
displayed in the warning may not necessarily appear literally in your
program. For example, "that $foo" is usually optimized into "that "
. $foo, and the warning will refer to the concatenation (.) operator,
even though there is no . in your program.
Alternatively you could also insert the following in your code:
use diagnostics;
to get the explanations for every warning.
See also perldoc perldiag
for a detailed explanation of each warning and error.
Fatal warnings
- FATAL
use warnings FATAL => 'all';
Logging Exceptions
- DIE
$SIG{__DIE__} = sub { logger('error', $_[0]) };
#!/usr/bin/perl
use strict;
use warnings;
$SIG{__DIE__} = sub { logger('error', $_[0]); };
print "Before\n";
die 'Something bad';
print "After\n";
sub logger {
my ($level, $msg) = @_;
if (open my $out, '>>', 'log.txt') {
chomp $msg;
print $out "$level - $msg\n";
}
}
Always open files in the new way
- open
Old way:
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "data.txt";
open(FH, ">$filename") or die;
print FH "data";
close FH;
open(FH, $filename) or die;
my $line = <FH>;
close FH;
Recommended way:
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "data.txt";
open(my $wfh, '>', $filename) or die;
print $wfh "data";
close $wfh;
open(my $rfh, '<', $filename) or die;
my $line = <$rfh>;
close $rfh;
Security problems.
Being global, difficult to pass as parameter to functions.
Array slices
#!/usr/bin/perl
use strict;
use warnings;
my @letters = qw(a b c d e f g h);
print "@letters\n"; # a b c d e f g h
print "$letters[3]\n"; # d the element with index 3
print "@letters[3]\n"; # d (a one element array !)
# generates a warning:
# Scalar value @letters[3] better written as $letters[3]
# at examples/advanced/array_slices.pl line 9.
my @data1 = ($letters[3], $letters[2]); # d c
print "@data1\n";
my @data2 = @letters[3, 2]; # d c
print "@data2\n"; # d c
print "@letters[3..5]\n"; # d e f
print "@letters[3,3,2]\n"; # d d c
A few more examples
my @i = (3, 5, 7);
print "@i\n"; # 3 5 7
print "@letters[@i]\n"; # d f h
print "@letters[split ' ', '3 5 7']\n"; # d f h
Array slices on the fly
#!/usr/bin/perl
use strict;
use warnings;
my $line = "named:x:44:66:Nameserver Daemon:/var/named:/bin/bash";
my @fields = split ":", $line; # fetch 3 values like this
my $uid = $fields[2];
my $gid = $fields[3];
my $home = $fields[5];
($uid, $gid, $home) = @fields[2,3,5];
($uid, $gid, $home) = (split ":", $line)[2,3,5]; # or like this
print "uid: $uid\n";
print "gid: $gid\n";
print "home: $home\n";
Hash slices
#!/usr/bin/perl
use strict;
use warnings;
my %phone_of = (
Foo => '123-foo',
Bar => '123-bar',
Baz => '123-baz',
Moo => '123-moo',
);
print "$phone_of{Foo}\n";
my @phones = ($phone_of{Foo}, $phone_of{Bar});
@phones = @phone_of{'Bar', 'Baz'};
@phones = @phone_of{ qw(Bar Baz) };
my @selected_people = qw(Bar Baz);
@phones = @phone_of{ @selected_people };
print "@phones\n";
Hash slices in assignment
#!/usr/bin/perl
use strict;
use warnings;
my @fields = qw(fname lname phone);
my @values = qw(Peti Bar 12345);
my %h;
@h{@fields} = @values;
print "$h{fname}\n"; # Peti
print "$h{lname}\n"; # Bar
print "$h{phone}\n"; # 12345
splice
#!/usr/bin/perl
use strict;
use warnings;
my @names = qw(Foo Bar Baz Moo Qux Barney Hoppy Bammbamm Dino);
my @more_names = qw(Fred Wilma);
my @sublist = splice(@names, 2, 3);
print "@sublist\n"; # Baz Moo Qux
print "@names\n"; # Foo Bar Barney Hoppy Bammbamm Dino
my @zlist = splice(@names, 2, 3, @more_names);
print "@zlist\n"; # Barney Hoppy Bammbamm
print "@names\n"; # Foo Bar Fred Wilma Dino
Splice to slice and dice arrays in Perl
LIST and SCALAR context
- LIST
- SCALAR
- context
See the behavior of localtime()
.
#!/usr/bin/perl
use strict;
use warnings;
my $t = time;
my $lt = localtime($t);
print "$lt\n"; # Fri May 20 11:26:23 2011
my @time = localtime($t);
print "@time\n"; # 23 26 11 20 4 111 5 139 1
How could we implement something similar?
wantarray
When called within a function it will return undef, true or false depending on how was the function called.
undef if it was called in a void context like f();
true if it was called in a list context like @x = f(); or print f();
false if it was called in scalar context like $x = f(); or if($f()) {...}
#!/usr/bin/perl
use strict;
use warnings;
sub f {
if (not defined wantarray()) {
print "called in VOID context\n";
} elsif (wantarray()) {
print "called in LIST context\n";
} else {
print "called in SCALAR context\n";
}
}
f(); # called in VOID context
my @y = f(); # called in LIST context
my $x = f(); # called in SCALAR context
wantarray example
#!/usr/bin/perl
use strict;
use warnings;
sub count {
if (not defined wantarray) {
print "ERROR - function called in void context\n";
return;
}
my $count = @_;
if (not wantarray) {
#print "SCALAR\n";
return $count;
}
my $sum=0;
while (my $v = shift @_) {
$sum += $v;
}
#print "ARRAY\n";
return ($count, $sum);
}
count(); # call in void context, ERROR message
my (@x) = count(2,3,5); # LIST context
print "@x\n"; # 3 10
my $z = count(2,3,5); # SCALAR context
print "$z\n"; # 3
See also Want
and Contextual::Return
for even more options.
Slow sorting
The problem: bad performance
#!/usr/bin/perl
use strict;
use warnings;
my @files = glob "*.xml";
my @sorted_files = sort { -s $a <=> -s $b } @files;
print "@sorted_files\n";
Speed up sorting
Reduce the number of slow calls from n**2
to n
.
#!/usr/bin/perl
use strict;
use warnings;
my @files = glob "*.xml";
my @unsorted_pairs = map { [$_, -s $_] } @files;
my @sorted_pairs = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;
my @quickly_sorted_files = map { $_->[0] } @sorted_pairs;
print "@quickly_sorted_files\n";
Schwartzian transformation
#!/usr/bin/perl
use strict;
use warnings;
my @files = glob "*.xml";
my @quickly_sorted_files =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, -s $_] }
@files;
print "@quickly_sorted_files\n";
Compilation phases: BEGIN, CHECK, INIT, END
- BEGIN
- CHECK
- INIT
- END
BEGIN {
# do something here
}
* You can have more than one of each one
BEGIN - execute as soon as possible (compiled)
CHECK - execute after the whole script was compiled
INIT - execute before running starts
END - execute as late as possible (after exit() or die())
When running perl -c script.pl both the BEGIN and CHECK blocks are executed.
perldoc perlmod
#!/usr/bin/perl
use strict;
use warnings;
END {
print "END\n";
}
print "BODY\n";
my $x = 0;
print 1/$x; # error the program stops working
print "AFTER ERROR\n";
BEGIN {
print "BEGIN\n";
}
print "After BEGIN block\n";
BEGIN
BODY
Illegal division by zero at examples/other/begin_end.pl line 11.
END
#!/usr/bin/perl
use strict;
use warnings;
BEGIN {
print "BEGIN\n";
}
CHECK {
print "CHECK\n";
}
INIT {
print "INIT\n";
}
END {
print "END\n";
}
print "BODY\n";
BEGIN
CHECK
INIT
BODY
END
AUTOLOAD
- AUTOLOAD
- $AUTOLOAD
#!/usr/bin/perl
use strict;
use warnings;
f("hello", "world");
Output:
Undefined subroutine &main::f called
If you call a function that does not exist when the call is made, Perl will raise an Undefined subroutine called
exception. If the exception is
not handled it will stop your program.
In the exception it will also give you the full name of the function that was missing. In the above case it will be &main::f
.
With all the other magic, Perl also provides you a tool that will help you deal with such situations. You can include a block called AUTOLOAD in your code and if AUTOLOAD {} is defined then it will be executed instead of every missing function. From that point your imagination is the only limiting factor on how to handle the situation.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
f("hello", "world");
AUTOLOAD {
our $AUTOLOAD;
print "$AUTOLOAD\n";
print Dumper \@_;
}
Output:
main::f
$VAR1 = [
'hello',
'world'
];
Static variable
#!/usr/bin/perl
use strict;
use warnings;
{
my $counter = 0;
sub inc {
$counter++;
return $counter;
}
}
print inc(), "\n";
print inc(), "\n";
print inc(), "\n";
print inc(), "\n";
Exercise: create counter
Create a script with a function called count() which is capable of maintaining several counters.
#!/usr/bin/perl
use strict;
use warnings;
print counter('a'), "\n"; # 1
print counter('b'), "\n"; # 1
print counter('a'), "\n"; # 2
print counter('c'), "\n"; # 1
print counter('a'), "\n"; # 3
print counter('a'), "\n"; # 4
print counter('b'), "\n"; # 2
Solution: create counter
#!/usr/bin/perl
use strict;
use warnings;
print counter('a'), "\n"; # 1
print counter('b'), "\n"; # 1
print counter('a'), "\n"; # 2
print counter('c'), "\n"; # 1
print counter('a'), "\n"; # 3
print counter('a'), "\n"; # 4
print counter('b'), "\n"; # 2
{
my %counter;
sub counter {
my $id = shift;
return ++$counter{$id};
}
}
Saved variable: local
- local
Slurp mode used local $/ = undef;
#!/usr/bin/perl
use strict;
use warnings;
our $global = 100;
our $my = 200;
our $local = 300;
print_variables();
# $global 100
# $my 200
# $local 300
set();
# $global 1
# $my 2
# $local 3
# inside print_variables():
# $global 1
# $my 200
# $local 3
print_variables();
# $global 1
# $my 200
# $local 300
sub set {
$global = 1;
my $my = 2;
local $local = 3;
print "$global\n";
print "$my\n";
print "$local\n";
print "---\n";
print_variables();
}
sub print_variables {
print "$global\n";
print "$my\n";
print "$local\n";
print "---\n";
}
100
200
300
---
1
2
3
---
1
200
3
---
1
200
300
---
use strict;
use warnings;
use Data::Dumper;
my %data = (
foo => 1,
bar => 1,
moo => 1,
);
_dump();
_local_sorted();
_dump();
_sorted();
_dump();
sub _local_sorted {
local $Data::Dumper::Sortkeys = 1;
_dump();
}
sub _sorted {
$Data::Dumper::Sortkeys = 1;
_dump();
}
sub _dump {
print Dumper \%data;
}
$VAR1 = {
'bar' => 1,
'moo' => 1,
'foo' => 1
};
$VAR1 = {
'bar' => 1,
'foo' => 1,
'moo' => 1
};
$VAR1 = {
'bar' => 1,
'moo' => 1,
'foo' => 1
};
$VAR1 = {
'bar' => 1,
'foo' => 1,
'moo' => 1
};
$VAR1 = {
'bar' => 1,
'foo' => 1,
'moo' => 1
};
autodie
- autodie
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
open my $fh, '<', 'data.txt';
print "ok\n";
Can't open 'data.txt' for reading: 'No such file or directory' at examples/advanced-perl/autodie.pl line 7
Modern::Perl
- Modern::Perl
use Modern::Perl;
Loads strict, warnings, the features of 5.10 and the c3 Method Resolution Order
Check out the Modern Perl book
Perl::Critic
- Perl::Critic
- perlcritic
Originally based on the Perl Best Practices by Damian Conway. Written by Jeffrey Thalhammer. Try Perl::Critic online.
perlcritic -h
perlcritic -5 file_name.pl
Perl::Tidy
- Perl::Tidy
- perltidy
perltidy
caller
- caller
#!/usr/bin/perl
use strict;
use warnings;
fib(4);
sub fib {
my $n = shift;
my ($package, $filename, $line) = caller(0);
print "$package $filename $line\n";
if ($n == 1 or $n == 2) {
return 1;
}
return fib($n-1) + fib($n-2);
}
Log::Dispatch
- Log::Dispatch
#!/usr/bin/perl
use strict;
use warnings;
use Log::Dispatch;
use Log::Dispatch::Screen;
my $logger = Log::Dispatch->new;
$logger->add( Log::Dispatch::Screen->new(
name => 'screen',
#min_level => 'warning',
min_level => 'debug',
));
fib(4);
fib(0);
sub fib {
my $n = shift;
$logger->debug("fib($n)");
if ($n < 1) {
$logger->error("fib($n) is invalid");
return;
}
if ($n == 1 or $n == 2) {
return 1;
}
return fib($n-1) + fib($n-2);
}
Log::Log4perl easy
- Log::Log4perl
use strict;
use warnings;
use 5.010;
use My::EasyApp;
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init($WARN);
FATAL "This is", " fatal";
ERROR "This is error";
WARN "This is warn";
INFO "This is info";
DEBUG "This is debug";
TRACE "This is trace";
my $app = My::EasyApp->new;
$app->run;
package My::EasyApp;
use strict;
use warnings;
use Log::Log4perl; #qw(get_logger);
sub new {
bless {}, shift;
}
sub run {
my $logger = Log::Log4perl->get_logger();
$logger->fatal("FATAL from EasyApp");
$logger->debug("DEBUG from EasyApp");
}
1;
2014/08/24 08:02:52 This is fatal
2014/08/24 08:02:52 This is error
2014/08/24 08:02:52 This is warn
2014/08/24 08:02:52 FATAL from EasyApp
Exercise: Log::Dispatch
Take the examples/advanced-perl/log_dispatch.pl and change it so only warnings are printed to the screen. debug messages are printed to a log file. The name of the file should be in the format YYYY-MM-DD.log The log line should include the name file name and the line number where it was called.
Check out the strftime of the POSIX module and see how Log::Dispatch allows you to provide a callback function.
Solution Log::Dispatch
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Log::Dispatch;
use Log::Dispatch::Screen;
use Log::Dispatch::File;
use POSIX ();
my $logger = Log::Dispatch->new;
$logger->add( Log::Dispatch::Screen->new(
name => 'screen',
min_level => 'warning',
));
$logger->add( Log::Dispatch::File->new(
name => 'file',
min_level => 'debug',
filename => POSIX::strftime("%Y-%m-%d", localtime) . '.log',
callbacks => \&callback,
));
fib(4);
fib(0);
sub fib {
my $n = shift;
$logger->debug("fib($n)");
if ($n < 1) {
$logger->error("fib($n) is invalid");
$logger->log(level => 'error', message => 'another');
return;
}
if ($n == 1 or $n == 2) {
return 1;
}
return fib($n-1) + fib($n-2);
}
sub callback {
my %arg = @_;
my ($package, $filename, $line) = caller(6);
return "$arg{message} in $filename line $line\n";
}
Functional Programming in Perl
Programming Paradigms
- Programming paradigm
- procedural
- object oriented
- declarative (SQL)
- functional
Functional programming
- Immutability (variables don't change)
- Separation of data and functions.
- Pure functions (no side-effects)
- First-class functions (you can assign function to another name and you can pass function to other functions and return them as well. We can also manipulate functions)
- Higher order functions: a functions that either takes a function as a parameter or returns a function as a parameter.
grep
- grep
- filter
The grep
keyword in Perl is a generalization of the Unix/Linux grep tool. Given a condition and a list of values it will return a, usually shorter, list
of elements that will return true if used in the expression. In other language the similar tool is called filter
.
In this example we have an array of numbers and an expression comparing $_
which holds the current value as grep iterates over the elements of the array. If the current value
is greater or equal than 5 then it will be passed to the left hand side, if it is less than 5 then it will be filtered out.
Note, there is no comma after the curly braces.
ARRAY = grep BLOCK LIST
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = qw(8 2 5 3 1 7);
my @big_numbers = grep {$_ >= 5} @numbers;
print "@big_numbers\n"; # (8, 5, 7)
grep to filter files based on modification date
- grep
#!/usr/bin/perl
use strict;
use warnings;
my @files = glob "*.xml";
my @old_files = grep { -M $_ > 365 } @files;
print join "\n", @old_files;
Imitating the Unix/Linux grep command in Perl
- grep
#!/usr/bin/perl
use strict;
use warnings;
my $regex = shift;
print grep { $_ =~ /$regex/ } <>;
map
- map
map
will transform the content of a list.
Given a list of values that can come from an array or from calling the keys
function on a hash or in any other way, we can apply a transformation
on each element and then collect the transformed values on the left hand side of the assignment. e.g. in an array.
On each iteration the current element is placed in the $_
variable, the code in the block is executed, and the result is passed to the left-hand-side
that collects all the responses.
ARRAY = map BLOCK LIST
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (1, 4, 7, 3);
my @doubles = map {$_ * 2} @numbers;
print "@doubles\n"; # 2 8 14 6
Use map to filter values
I am not sure why would you do this instead of using grep
, but you can do this and this can bring use to another,
more usefule example.
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (1, 4, 17, 3, 28, 4);
my @big_numbers = map {$_ >= 10 ? $_ : ()} @numbers;
print "@big_numbers\n"; # 17 28
Map to add more elements
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (1, 4, 17);
my @big_numbers = map {($_, $_)} @numbers;
print "@big_numbers\n"; # 1 1 4 4 17 17
Use map to filter and enrich
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (1..5);
my @riches = map { $_ > 3 ? ($_+1, $_+2) : () } @numbers;
print "@riches\n"; # 5 6 6 7
Create a hash from an array using map
-
map
-
We have a list of values. We would like to build a fast look-up table to check for existence.
The time it takes to check if a value can be found in an array is proportional to the length of the array. The complexity is O(n).
If you need to do it a lot of times you might be better off building a hash where the keys are the items coming from the array. The values don't matter as we will check the existance of a key. (Alternatively you can set the values of the hash to be 1 and then you can check if the the value is there.) The time it takes to look up a key in a hash does not depend on the size of the hash. It is O(1). So once we have the hash the look-up will be much faster. Building the hash is proportional to the number of items in the array.
So if we need to look up a very small number of elements or if the original array is small then probably it is better to just use the array.
If we need a lot of look-ups and there are many elements in the original array then building a temporary look-up hash might be a good idea.
We use more memory but we can gain speed.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @words = qw(Foo Bar Baz);
my %copy_paste = (Foo => 1, Bar => 1, Baz => 1);
my %lookup = map {$_ => 1} @words;
print Dumper \%copy_paste;
print Dumper \%lookup;
# $VAR1 = {
# 'Bar' => 1,
# 'Baz' => 1,
# 'Foo' => 1
# };
Unique values
- unique
You have a list of values with duplications, how can you create a unique list of the values?
#!/usr/bin/env perl
use strict;
use warnings;
my @data = qw(Earth Mars Earth Venus Earth Mars);
my @unique;
my %seen;
foreach my $value (@data) {
if (! $seen{$value}) {
push @unique, $value;
$seen{$value}++;
}
}
print "@unique\n"; # Earth Mars Venus
Unique values - improved
But actually we don't need to do it in two steps:
#!/usr/bin/env perl
use strict;
use warnings;
my @data = qw(Earth Mars Earth Venus Earth Mars);
my @unique;
my %seen;
foreach my $value (@data) {
if (! $seen{$value}++ ) {
push @unique, $value;
}
}
print "@unique\n"; # Earth Mars Venus
Unique values using grep
Of course there is an even shorter way to write it:
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
my @data = qw(Earth Mars Earth Venus Earth Mars);
#my %seen;
#my @unique = grep { !$seen{$_}++ } @data;
my @unique = do { my %seen; grep { !$seen{$_}++ } @data };
print "@unique\n"; # Earth Mars Venus
In this version you can even assign the values back to the original array writing:
@data = grep { !$seen{$_}++} @data;
Uniq with List::MoreUtils
- uniq
- distinct
- List::MoreUtils
There are several ways to implement this without using an external module, but why would we want to reinvent the wheel when there is already a good solution in the List::MoreUtils module?
The only problem, but we see it all over the programming world is that this function called "uniq" would return a list of distinct elements instead of the ones that were unique in the original list.
#!/usr/bin/env perl
use strict;
use warnings;
use List::MoreUtils qw(uniq);
my @data = qw(Earth Mars Earth Venus Earth Mars);
my @unique = uniq @data;
print "@unique\n"; # Earth Mars Venus
Closures
-
closure
A subroutine that generates subroutines based on some input values.
#!/usr/bin/perl
use strict;
use warnings;
sub incrementer {
my $old = $_[0];
$_[0] += $_[1];
return $old;
}
my $x = 23;
my $old = incrementer($x, 19);
print "$old $x\n"; # 23 42
sub incrementer_23 {
my $inc = 23;
my $old = $_[0];
$_[0] += $inc;
return $old;
}
#!/usr/bin/perl
use strict;
use warnings;
sub incrementer_generator {
my ($inc) = @_;
return sub {
my $old = $_[0];
$_[0] += $inc;
return $old;
}
}
my $inc19 = incrementer_generator(19);
my $x = 23;
my $old_value = $inc19->($x);
print "$x\n";
print "old after inc19: $old_value\n";
my $inc5 = incrementer_generator(5);
print "old after inc5: " . $inc5->($x) . "\n";
print "$x\n";
my $prev_value = $inc19->($x);
print "old after inc19: $prev_value\n";
print "$x\n";
Output:
42
old after inc19: 23
old after inc5: 42
47
old after inc19: 47
66
package Transformers;
use strict;
use warnings;
use Time::HiRes qw(time);
use Exporter qw(import);
our @EXPORT_OK = qw(show_elapsed_time);
sub show_elapsed_time {
my (@subs) = @_;
my $caller = caller();
for my $sub (@subs) {
my $name = "${caller}::$sub"; # fully qualified name
no strict 'refs';
my $subref = \&$name;
my $new = sub {
my $start = time;
my (@results, $result);
if (wantarray) {
@results = $subref->(@_);
} elsif (defined wantarray) {
$result = $subref->(@_);
} else {
$subref->(@_);
}
my $end = time;
my $elapsed = $end - $start;
print "Elapsed time of $name: $elapsed\n";
return wantarray ? @results : $result;
};
no warnings 'redefine';
*{$name} = $new;
}
return;
}
1;
use strict;
use warnings;
sub count {
my ($limit) = @_;
print "count till $limit\n";
my $counter = 0;
$counter++ while $counter < $limit;
}
sub show_elapsed {
my ($sub) = @_;
use Time::HiRes qw(time);
return sub {
my $start = time;
$sub->(@_);
my $end = time;
my $elapsed = $end - $start;
print "Elapsed time: $elapsed\n";
}
}
my $newcount = show_elapsed(\&count);
$newcount->(10000000);
#*count = $newcount;
count(10000000);
use strict;
use warnings;
use lib '.';
use Transformers qw(show_elapsed_time);
show_elapsed_time('count', 'add', 'calc');
sub count {
my ($limit) = @_;
print "count till $limit\n";
my $counter = 0;
$counter++ while $counter < $limit;
}
sub add {
my ($x, $y) = @_;
print "Add $x + $y\n";
return $x+$y;
}
sub calc {
my ($x, $y) = @_;
print "Calc $x $y\n";
return $x+$y, $x*$y;
}
count(10000000);
my $res = add(2, 3);
print "Res: $res\n";
my @res = calc(2, 3);
print "Res: @res\n";
Perl Libraries and Modules
Lack of code reuse
- common functions
- copy-paste
- Perl4 solution: create libraries
Perl library (perl4 style)
$base = 10;
sub add {
validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub multiply {
}
sub validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
#!/usr/bin/perl
require "examples/modules/library.pl";
print add(19, 23);
print "\n";
print "$base\n";
The 1; at the end of the library is needed in order to make sure the compilation of library.pl returns a true value.
Otherwise one could get an error such as this one:
examples/modules/library.pl
did not return a true value
Problems with Libraries
- All or nothing
- Potential collision (redefine)
- Global variables
While libraries can help in code reuse they still lack many features.
Prefix everything with unique name
$calc_base = 10;
sub calc_add {
calc_validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub calc_multiply {
}
sub calc_validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
But we always have to use this prefix. Even within the library.
Namespaces
Perl has a solution to this problem using namespaces, also called packages.
#!/usr/bin/perl
use strict;
use warnings;
package Calc;
use strict;
use warnings;
sub add {
my $total = 0;
$total += $_ for (@_);
return $total;
}
package main;
print Calc::add(3, 4), "\n";
# add(3, 4);
# would die with:
# Undefined subroutine &main::add called
# at examples\modules\namespaces.pl line 20.
Solution with namespace
#!/usr/bin/perl
use strict;
use warnings;
require "examples/modules/namespace_lib.pl";
print Calc::add(3, 4), "\n";
package Calc;
use strict;
use warnings;
my $base = 10;
sub add {
validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub multiply {
}
sub validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
Here, within the Calc namespace you don't have to use the full name, only when you are using it outside.
Modules
We could have placed the package keyword and the code in the main script or we can put several packages in the same external file but the best approach is to put every package in a separate file having the same name as the package itself (case sensitive) and .pm as file extension.
Then we call it a Perl Module.
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/modules';
require Calc;
print Calc::add(3, 4), "\n";
package Calc;
use strict;
use warnings;
my $base = 10;
sub add {
validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub multiply {
}
sub validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
- How did perl find the file Calc.pm ?
- How could we use add() without the Calc:: ?
- Why did we write "require" instead of "use"?
Packages, @INC and Namespace hierarchy
perl -V
@INC:
C:/strawberry/test-perl/lib
C:/strawberry/test-perl/site/lib
C:\strawberry\perl\vendor\lib
.
require Calc;
- Calc.pm somewhere in @INC
require Math::Calc;
- Math/Calc.pm somewhere in @INC
require Math::Calc::Clever;
- Math/Calc/Clever.pm somewhere in @INC
use, require and import
require Math::Calc;
use Math::Calc qw(add);
BEGIN {
require Math::Calc;
Math::Calc->import( qw(add) );
}
use
is executed at compile time, just as aBEGIN
block.require is executed at run time so if we don't enclose it in a
BEGIN` block it will happen later.
if ($holiday) {
use Vaction::Mode;
}
The above does not make much sense as the use
will load the module
at compile time regardless of day.
if ($holiday) {
require Vacation::Mode;
Vacation::Mode->import;
}
And we don't even need to call import() if we don't care about that.
Export - Import
In order to eliminate the need to use the full name of a subroutine e.g. Calc::add() we can export it from the module and the user can import it into the namespace of her own code.
(we call it A::Calc as we already had a Calc.pm in the previous slide)
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/modules';
use A::Calc;
print add(2, 3), "\n";
package A::Calc;
use strict;
use warnings;
use Exporter qw(import);
our @EXPORT = qw(add multiply);
my $base = 10;
sub add {
validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub multiply {
}
sub validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
Exporter is a standard module that provides the 'import' function called by 'use Module' automatically.
This imports automatically every function (and variable) mentioned in the @EXPORT array. This is nice as the user (the script writer) does not have to think much.
On the other hand it might be bad as the user gets many functions she does not need, just because the module author thinks these functions are useful.
Restrict the import
The user (the script writer) can restrict the list of imported functions but the unsuspecting script write will still get all the functions.
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/modules';
use A::Calc qw(add);
print add(2, 3), "\n";
On demand Import
package B::Calc;
use strict;
use warnings;
use Exporter qw(import);
our @EXPORT = qw(add);
our @EXPORT_OK = qw(multiply);
my $base = 10;
sub add {
validate_parameters(@_);
my $total = 0;
$total += $_ for (@_);
return $total;
}
sub multiply {
}
sub validate_parameters {
die 'Not all of them are numbers'
if grep {/\D/} @_;
return 1;
}
1;
#!/usr/bin/perl
use strict;
use warnings;
use lib 'examples/modules';
use B::Calc qw(add);
print add(2, 3), "\n";
Importing
use Module;
Every function (and variable) listed in the @EXPORT array is imported automatically.
use Module ();
Nothing is imported.
use Module qw(foo bar);
Functions foo() and bar() are imported, nothing else. Any function from the @EXPORT and @EXPORT_OK arrays can be requested to be imported. There is also an %EXPORT_TAGS that can be used to define groups of functions to be imported.
Modules - behind the scene
- The functionality of Cwd is implemented in a file called Cwd.pm
- @INC the list of libraries
- perl -V
- perldoc Cwd documentation
- perldoc -l Cwd location /usr/lib/perl5/5.8.6/i386-linux-thread-multi/Cwd.pm
- perldoc -m Cwd module content
Examples of modules that export functions. Look at their source code
- Cwd
- File::Basename
- File::Spec
- File::Spec::Functions
Tools for packaging and distribution
Packaging modules
The semi-standard directory structure of CPAN modules can be also very useful for any Perl application:
dir/
Makefile.PL
Build.PL
dist.ini
README
CHANGES
MANIFEST
MANIFEST.SKIP
META.yml
META.json
lib/
Application/Name.pm
Application/Name/...
script/
application.pl
t/
xt/
sample/
share/
templates/
views/
Makefile.PL of ExtUtils::MakeMaker
use strict;
use warnings;
use 5.008;
use ExtUtils::MakeMaker;
WriteMakefile
(
NAME => 'App',
VERSION_FROM => 'lib/App.pm',
PREREQ_PM => {
'File::Basename' => '0',
'Moose' => '0.24',
},
BUILD_REQUIRES => {
'Test::More' => '0.47'
},
EXE_FILES' => [
'script/app.pl'
],
);
Makefile.PL of Module::Install
use strict;
use warnings;
use 5.008;
use inc::Module::Install 1.00;
name 'App';
license 'perl';
author 'Gabor Szabo';
all_from 'lib/App.pm';
requires 'perl' => '5.008';
requires 'File::Basename'; # no version number
requires 'Win32' => '0.31' if $^O =~ /win32/i; # conditional requirement
test_requires 'Test::More' => '0.47';
homepage 'http://padre.perlide.org/';
bugtracker 'http://padre.perlide.org/trac/';
repository 'http://svn.perlide.org/padre/trunk/Padre/';
install_script 'script/app.pl';
# install_share; # will install the share/ directory from the distribution
# to be found by File::ShareDir::dist_dir('App') late on
WriteAll;
Build.PL of Module::Build
use strict;
use warnings;
use Module::Build;
use 5.008;
my $builder = Module::Build->new(
module_name => 'App',
license => 'perl',
dist_author => 'Gabor Szabo <gabor@szabgab.com>',
dist_abstract => 'Framework to do something',
create_makefile_pl => 0, #'traditional',
script_files => 'script/app.pl',
create_readme => 0,
requires => {
'File::Basename' => '0',
'Moose' => '0.24',
},
build_requires => {
'Test::More' => '0.47',
},
);
$builder->create_build_script();
Changes and README
v0.02 2007.11.23
Added feature for doing something
Fixed strange bug causing trouble (#7)
v0.01 2007.10.12
Releasing first version of Application
Application for bla
See the documentation of the module
by typing now perldoc lib/App/Module.pm
or after installation typing
perldoc App::Module
MANIFEST and MANIFEST.SKIP
Build.PL
CHANGES
Makefile.PL
MANIFEST
META.yml
README
script/app.pl
lib/App.pm
t/01-load.t
# Avoid version control files.
\bRCS\b
\bCVS\b
,v$
\B\.svn\b
\B\.cvsignore$
# Avoid Makemaker generated and utility files.
\bMakefile$
\bblib
\bMakeMaker-\d
\bpm_to_blib$
\bblibdirs$
^MANIFEST\.SKIP$
# Avoid Module::Build generated and utility files.
\bBuild$
\bBuild.bat$
\b_build
# Avoid Devel::Cover generated files
\bcover_db
# Avoid temp and backup files.
~$
\.tmp$
\.old$
\.bak$
\#$
\.#
\.rej$
# Avoid OS-specific files/dirs
# Mac OSX metadata
\B\.DS_Store
# Mac OSX SMB mount metadata files
\B\._
# Avoid archives of this distribution
\bApp-[\d\.\_]+
^MYMETA.yml$
A script
#!/usr/bin/perl
use strict;
use warnings;
use App qw(run);
run();
# If there is a script in the application
# or if the whole application is one or more scripts
# then this is an example of such a script.
A module
package App;
use strict;
use warnings;
use 5.008;
our $VERSION = '0.01';
sub add {
my ($x, $y) = @_;
return $x+$y;
}
=head1 NAME
App - application
=head1 SYNOPSIS
A quick example for the really inpatient.
=head1 DESCRIPTION
=head2 Methods
=over 4
=item method_a
=item method_b
=back
=head1 BUGS
Probably plenty but nothing I know of. Please report them to the author.
=head1 Development
Instructions to those who wish to participate in the development efforts.
E.g. where is the version control system, where is the development mailing
list or forum (if you have one).
=head1 Thanks
Potential thanks to people who helped you.
=head1 AUTHOR
Gabor Szabo <gabor@szabgab.com>
=head1 COPYRIGHT
Copyright 2006 by Gabor Szabo <gabor@szabgab.com>.
This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html
=cut
1;
Packaging with Makefile.PL
perl Makefile.PL
make
make test
make manifest
make dist
When installing you'll have to type make install
instead of the make manifest
and the make dist
.
Packaging with Build.PL
perl Build.PL
perl Build
perl Build test
perl Build manifest
perl Build dist
When installing you'll have to type perl Build install
instead of the perl Build manifest
and the perl Build dist
.
A test file
use strict;
use warnings;
use Test::More;
plan tests => 3;
use_ok('App');
ok(App::add(1, 1) == 2, "1+1 = 2");
is(App::add(2, 3), 5, "2+3 = 5");
Writing Tests
ok($condition, $name);
is($actual, $expected, $name);
like($actual, qr/$expected/, $name);
use_ok('Module::Name');
TODO: {
local $TODO = "Some excuse";
is(add(1,1,1), 3, "adding 3 numbers");
}
diag "Some message";
diag explain @data_structure;
Test::Most
Test::NoWarnings or Test::FailWarnings
Test::Exception or the newer Test::Fatal
Test::Warn
Test::Deep
Devel::Cover
Exercises: Multiply numbers
- Write a script with a function to multiply numbers sub multiply.
- Once it is ready move the multiply subroutine to the library.pl file that already has the add subroutine and write a script that can use it.
- Create a package called Exercise::Calc that will export the multiply function. Write a script that will use this module and the given functions.
- Write a few test cases for both the add and multiply functions. Including test cases where you try to add 3 or more numbers. As the tests are failing, mark them as TODO.
- Fix the add() function so it passes its new tests with 3 or more parameters.
- Add short user documentation that can be read by perldoc.
- Create the files necessary for distributions and make sure you can generate the tar.gz file.
Solution: Multiply numbers
sub sum {
my $sum = 0;
$sum += $_ for (@_);
return $sum;
}
sub multiply {
return 0 if not @_;
my $res = shift;
$res *= $_ for (@_);
return $res;
}
1;
Overview
- Code reuse
- Perl4 style was putting functions in library files and "require"-ing the files
- Danger of redefining other functions
- Perl Modules are namespaces
- Main name space is called main, separation is done using :: so variable $x in the main script are actually called $main::x
- Within a namespace you can use any function
- This does not redefine any function in the other namespaces
How to create a Perl Module for code reuse?
Error Handling
Returning error status or throwing exception?
foreach my $filename (@names) {
my $fh = open_file($filename);
my $data = read_file($fh);
}
What if open fails and the $fh is undef? Who will notice this problem? Only in read_file we will notice it.
- Developer must work in order to ignore the exception
- Propagating exception upwards is easy: don't do anything
eval block
- eval
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars ) ;
my $mode = shift or die "Usage: $0 good|bad|ugly\n";
my $answer;
eval {
$answer = code($mode);
print "Answer received\n";
};
if ($EVAL_ERROR) { # $@
chomp $EVAL_ERROR;
if ($EVAL_ERROR eq 'bad code') {
warn "exception '$EVAL_ERROR' received\n";
} else {
warn "Unexpected exception '$EVAL_ERROR' received\n";
die $EVAL_ERROR;
}
} else {
print "The answer is $answer\n";
}
sub code {
my $mode = shift;
print "code: $mode\n";
if ($mode eq "bad") {
die "bad code\n";
} elsif ($mode eq 'ugly') {
die 'Some other error';
} else {
return 42;
}
}
- Simple
- built in the language
- but not powerful
- cannot pass context of the error, just a string
Exception handling with Try::Tiny
- Try::Tiny
- try
- catch
#!/usr/bin/perl
use strict;
use warnings;
use Try::Tiny;
my $mode = shift or die "Usage: $0 good|bad|ugly\n";
my $answer;
try {
$answer = code($mode);
print "Answer received\n";
} catch {
chomp $_; # not $@
if ($_ eq 'bad code') {
warn "exception '$_' received\n";
} else {
warn "Unexpected exception '$_' received\n";
die $_;
}
};
if (defined $answer) {
print "The answer is $answer\n";
}
sub code {
my $mode = shift;
print "code: $mode\n";
if ($mode eq 'bad') {
die "bad code\n";
} elsif ($mode eq 'ugly') {
die 'Some other error';
} else {
return 42;
}
}
Carp
- Carp
- carp
- croak
It is better to use croak and carp instead of die and warn.
#!/usr/bin/perl
use strict;
use warnings;
use Module;
Module::f("x"); # Parameter needs to be a number at Module.pm line 10.
Module::g("x"); # Parameter needs to be a number at carp.pl line 8
Module::h("x");
# Parameter needs to be a number at Module.pm line 24
# Module::h('x') called at carp.pl line 9
package Module;
use strict;
use warnings;
use Carp qw(carp cluck);
sub f {
my ($x) = @_;
warn "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
sub g {
my ($x) = @_;
carp "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
sub h {
my ($x) = @_;
cluck "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
1;
Carp::Clan
- Carp::Clan
#!/usr/bin/perl
use strict;
use warnings;
use App::Module;
App::Module::f("x");
# Parameter needs to be a number at App/Code.pm line 11.
App::Module::g("x");
# Parameter needs to be a number at App/Module.pm line 15
App::Module::h("x");
# App::Module::h(): Parameter needs to be a number at carp_clan.pl line 11
package App::Code;
use strict;
use warnings;
use Carp::Clan qw(^App::);
use Carp ();
sub f {
my ($x) = @_;
warn "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
sub g {
my ($x) = @_;
Carp::carp "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
sub h {
my ($x) = @_;
carp "Parameter needs to be a number" if $x =~ /\D/;
return $x;
}
1;
package App::Module;
use strict;
use warnings;
use Carp qw(carp cluck);
use App::Code;
sub f {
my ($x) = @_;
App::Code::f($x);
}
sub g {
my ($x) = @_;
App::Code::g($x);
}
sub h {
my ($x) = @_;
App::Code::h($x);
}
1;
The problem with die and croak
In both cases the error string defines the type of the error.
If we want to change the text, then in every place where we catch those
errors we have to update some regular expression.
Cannot easily propagate the context of the error as it is just a string.
Exception::Class
- Exception::Class
- throw
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars ) ;
use Carp qw(croak carp);
use Lottery qw(add_lottery_numbers);
# Catching exceptions
eval {
add_lottery_numbers(@ARGV);
};
if ($EVAL_ERROR) {
if (Number::Missing->caught) {
print "$EVAL_ERROR\n";
print "Usage: $0 NUMBERs (at least 2 numbers are needed)\n";
exit;
}
if (Number::Small->caught() or Number::Big->caught) {
print "$EVAL_ERROR\n";
print "Numbers must be between 1-90\n";
exit;
}
# error that I don't know how to handle:
croak $EVAL_ERROR;
}
print "Number was ok\n";
package Lottery;
use strict;
use warnings;
use Lottery::Exceptions;
use Exporter qw(import);
our @EXPORT = ('add_lottery_numbers');
# Throwing exceptions
sub check_number {
my ($num) = @_;
if ($num < 1) {
Number::Small->throw(number => $num);
}
if ($num > 90) {
Number::Big->throw(number => $num);
}
}
sub add_lottery_numbers {
my (@numbers) = @_;
if (@numbers < 2) {
Number::Missing->throw();
}
foreach my $n (@numbers) {
check_number($n);
}
submit_numbers(@numbers);
}
sub submit_numbers {
}
1;
use Exception::Class (
'Number::Small' => {
fields => [ 'number' ],
},
'Number::Big' => {
fields => [ 'number' ],
},
'Number::Missing' => {},
);
sub Number::Small::full_message {
my ($self) = @_;
return "number " . $self->number . " is too small";
}
sub Number::Big::full_message {
my ($self) = @_;
return "number " . $self->number . " is too big";
}
sub Number::Missing::full_message {
my ($self) = @_;
return "a number is missing";
}
1;
Processing exceptions
Make sure the exceptions are checked in growing order of generality. First check for the most specific exceptions then for the more generic ones.
Exercise
Take the examples/error/exception.pl file and add an exception class that checks if the given value was a number or not.
Add an exception class called Number::Bad and make Number::Small and Number::Big subclasses of that class.
Solution
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars ) ;
use Carp qw(croak carp);
use Exception::Class (
'Number::Small' => {
fields => [ 'number' ],
},
'Number::Big' => {
fields => [ 'number' ],
},
'Number::Missing' => {},
'Number::Invalid' => {
fields => [ 'input' ],
},
);
sub Number::Invalid::full_message {
my ($self) = @_;
return
sprintf("We received '%s' while the input must be a number.",
$self->input);
}
sub Number::Small::full_message {
my ($self) = @_;
return "number " . $self->number . " is too small";
}
sub Number::Big::full_message {
my ($self) = @_;
return "number " . $self->number . " is too big";
}
sub Number::Missing::full_message {
my ($self) = @_;
return "number is missing";
}
# Throwing exceptions
sub check_number {
my ($num) = @_;
if (not defined $num) {
Number::Missing->throw();
}
if ($num !~ /^\d+$/) {
Number::Invalid->throw(input => $num);
}
if ($num < 0) {
Number::Small->throw(number => $num);
}
if ($num > 100) {
Number::Big->throw(number => $num);
}
}
# Catching exceptions
eval {
check_number(@ARGV);
};
if ($EVAL_ERROR) {
if (Number::Missing->caught) {
print "$EVAL_ERROR\n";
print "Usage: $0 NUMBER\n";
exit;
}
if (Number::Small->caught() or Number::Big->caught) {
print "$EVAL_ERROR\n";
print "Number must be between 0-100\n";
exit;
}
# error that I don't know how to handle:
croak $EVAL_ERROR;
}
print "Number was ok\n";
Solution
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars ) ;
use Carp qw(croak carp);
use Exception::Class (
'Number::Bad' => {
fields => [ 'number' ],
}
'Number::Small' => {
isa => 'Number::Bad',
},
'Number::Big' => {
isa => 'Number::Bad',
},
'Number::Missing' => {},
'Number::Invalid' => {
fields => [ 'input' ],
},
);
sub Number::Invalid::full_message {
my ($self) = @_;
return
sprintf("We received '%s' while the input must be a number.",
$self->input);
}
sub Number::Bad::full_message {
my ($self) = @_;
return "number " . $self->number . " is not correct";
}
sub Number::Small::full_message {
my ($self) = @_;
return "number " . $self->number . " is too small";
}
sub Number::Big::full_message {
my ($self) = @_;
return "number " . $self->number . " is too big";
}
sub Number::Missing::full_message {
my ($self) = @_;
return "number is missing";
}
# Throwing exceptions
sub check_number {
my ($num) = @_;
if (not defined $num) {
Number::Missing->throw();
}
if ($num !~ /^\d+$/) {
Number::Invalid->throw(input => $num);
}
if ($num < 0) {
Number::Small->throw(number => $num);
}
if ($num > 100) {
Number::Big->throw(number => $num);
}
}
# Catching exceptions
eval {
check_number(@ARGV);
};
if ($EVAL_ERROR) {
if (Number::Missing->caught) {
print "$EVAL_ERROR\n";
print "Usage: $0 NUMBER\n";
exit;
}
if (Number::Small->caught() or Number::Big->caught) {
print "$EVAL_ERROR\n";
print "Number must be between 0-100\n";
exit;
}
# error that I don't know how to handle:
croak $EVAL_ERROR;
}
print "Number was ok\n";
Introduction to Moose
Moose Constructor
use strict;
use warnings;
use v5.10;
use Person;
my $teacher = Person->new;
package Person;
use Moose;
1;
Test Moose Constructor
use strict;
use warnings;
use v5.10;
use Test::More tests => 1;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
Moose Attribute
- attribute
use strict;
use warnings;
use v5.10;
use Person;
my $teacher = Person->new;
$teacher->name('Foo');
say $teacher->name;
package Person;
use Moose;
has 'name' => (is => 'rw');
1;
use strict;
use warnings;
use v5.10;
use Person;
my $teacher = Person->new( name => 'Foo' );
say $teacher->name;
Test Moose Attribute
use strict;
use warnings;
use v5.10;
use Test::More tests => 5;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
is($p->name('Foo'), 'Foo', 'setter');
is($p->name, 'Foo', 'getter');
my $o = Person->new( name => 'Bar' );
isa_ok($o, 'Person');
is($o->name, 'Bar');
Moose Attribute Type
- type
use strict;
use warnings;
use v5.10;
use Person;
my $student = Person->new( name => 'Foo' );
$student->year(1988);
say $student->year;
$student->year('23 years ago');
package Person;
use Moose;
has 'name' => (is => 'rw');
has 'year' => (isa => 'Int', is => 'rw');
1;
Attribute (year) does not pass the type constraint because:
Validation failed for 'Int' with value "23 years ago"
at accessor Person::year (defined at lib/Person.pm line 5) line 4
Person::year('Person=HASH(0x19a4120)', '23 years ago')
called at script/person.pl line 13
Test Moose Attribute Type
use strict;
use warnings;
use v5.10;
use Test::More tests => 6;
use Test::Exception;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
is($p->name('Foo'), 'Foo', 'setter');
is($p->name, 'Foo', 'getter');
is($p->year(2000), 2000);
is($p->year, 2000);
my $def_err = qr{Attribute \(year\) does not pass the type constraint because:};
my $home_err = qr{Validation failed for 'Int' with value 23 years ago};
throws_ok { $p->year('23 years ago') } qr{$def_err $home_err}, 'exception';
Moose Attribute Type class
- type constraint
use strict;
use warnings;
use v5.10;
use Person;
use DateTime;
my $student = Person->new( name => 'Foo' );
$student->birthday( DateTime->new( year => 1988, month => 4, day => 17) );
say $student->birthday;
$student->birthday(1988);
package Person;
use Moose;
has 'name' => (is => 'rw');
has 'birthday' => (isa => 'DateTime', is => 'rw');
1;
Attribute (birthday) does not pass the type constraint because:
Validation failed for 'DateTime' with value 1988
at accessor Person::birthday (defined at lib/Person.pm line 5) line 4
Person::birthday('Person=HASH(0x2143928)', 1988)
called at script/person.pl line 14
Test Moose Attribute Type class
use strict;
use warnings;
use v5.10;
use DateTime;
use Test::More tests => 7;
use Test::Exception;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
is($p->name('Foo'), 'Foo', 'setter');
is($p->name, 'Foo', 'getter');
isa_ok(
$p->birthday( DateTime->new( year => 1988, month => 4, day => 17) ),
'DateTime');
my $d = $p->birthday;
isa_ok($d, 'DateTime');
is($d->year, 1988, 'year is correct');
my $default_err =
qr{Attribute \(birthday\) does not pass the type constraint because:};
my $homemade_err =
qr{Validation failed for 'DateTime' with value 1988};
throws_ok { $p->birthday( 1988 ) }
qr{$default_err $homemade_err}, 'Year as birthday';
Moose Attribute - create your own subtype
- subtype
use strict;
use warnings;
use v5.10;
use Person;
use DateTime;
my $student = Person->new( name => 'Foo' );
$student->sex('m');
say $student->sex;
$student->sex('male');
package Person;
use Moose;
use Moose::Util::TypeConstraints;
subtype 'Person::Type::Sex'
=> as 'Str'
=> where { $_ eq 'f' or $_ eq 'm' }
=> message { "($_) is not a valid sex. Valid values are 'f' and 'm'." };
has 'name' => (is => 'rw');
has 'birthday' => (isa => 'DateTime', is => 'rw');
has 'sex' => (isa => 'Person::Type::Sex', is => 'rw');
1;
Attribute (sex) does not pass the type constraint because:
(male) is not a valid sex.
Valid values are 'f' and 'm'. at script\person.pl line 13
Test Moose Attribute - create your own subtype
use strict;
use warnings;
use v5.10;
use Test::More tests => 8;
use Test::Exception;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
is($p->name('Foo'), 'Foo', 'setter');
is($p->name, 'Foo', 'getter');
is($p->sex('m'), 'm', 'set m');
is($p->sex('f'), 'f', 'set f');
throws_ok { $p->sex('male') }
qr{Attribute \(sex\) does not pass the type constraint because:};
throws_ok { $p->sex('M') }
qr{Attribute \(sex\) does not pass the type constraint because:};
throws_ok { $p->sex('other') }
qr{Attribute \(sex\) does not pass the type constraint because:};
Moose Attribute - coercion
- coercion
use strict;
use warnings;
use v5.10;
use Person;
use DateTime;
my $student = Person->new( name => 'Foo' );
$student->sex('m'); # should be accepted as 'm'
say $student->sex;
$student->sex('female'); # should be accepted as 'f'
say $student->sex;
$student->sex('other'); # should not be accepted
package Person;
use Moose;
use Moose::Util::TypeConstraints;
subtype 'Person::Type::Sex'
=> as 'Str'
=> where { $_ eq 'f' or $_ eq 'm' }
=> message { "The sex you provided ($_) is not valid. " .
"Valid values are 'f' and 'm'." };
coerce 'Person::Type::Sex'
=> from 'Str'
=> via { lc substr($_, 0, 1) };
has 'name' => (is => 'rw');
has 'birthday' => (isa => 'DateTime', is => 'rw');
has 'sex' => (isa => 'Person::Type::Sex', is => 'rw', coerce => 1);
1;
Attribute (sex) does not pass the type constraint because:
The sex you provided (o) is not valid. Valid values are 'f' and 'm'.
at accessor Person::sex (defined at lib/Person.pm line 16) line 8
Person::sex('Person=HASH(0x24ff918)', 'other')
called at script/person.pl line 16
Test Moose Attribute - coercion
use strict;
use warnings;
use v5.10;
use Test::More tests => 12;
use Test::Exception;
use Person;
my $p = Person->new;
isa_ok($p, 'Person');
is($p->name('Foo'), 'Foo', 'setter');
is($p->name, 'Foo', 'getter');
is($p->sex('m'), 'm', 'set m');
is($p->sex, 'm', 'get m');
is($p->sex('f'), 'f', 'set f');
is($p->sex, 'f', 'get f');
is($p->sex('M'), 'm', 'set M');
is($p->sex, 'm', 'get m');
is($p->sex('male'), 'm', 'set male');
is($p->sex, 'm', 'get m');
throws_ok { $p->sex('other') }
qr{Attribute \(sex\) does not pass the type constraint because:};
Moose Enumeration
package Person;
use Moose;
use Moose::Util::TypeConstraints;
enum 'Person::Type::Sex' => [ qw(f m) ];
coerce 'Person::Type::Sex'
=> from 'Str'
=> via { lc substr($_, 0, 1) };
has 'name' => (is => 'rw');
has 'birthday' => (isa => 'DateTime', is => 'rw');
has 'sex' => (isa => 'Person::Type::Sex', is => 'rw', coerce => 1);
1;
Moose Attributes Overview
# accessors, mutators, getters, setters
has 'x' => (is => 'rw');
has 'x' => (is => 'ro');
has 'x' => (is => 'bare');
# types
has 'x' => (is => 'rw', isa => 'Int');
has 'y' => (is => 'rw', isa => 'Str');
# required
has 'z' => (
is => 'ro',
isa => 'Str',
required => 1,
);
# default
has 'x' => (
is => 'rw',
isa => 'Int',
default => 42,
);
has 'names' => (
is => 'rw',
isa => 'HashRef',
default => sub { {} },
);
has 'names' => (
is => 'rw',
isa => 'HashRef',
builder => '_build_name',
);
sub _build_name {
my $self = shift;
# ...
return {};
}
Inheritance
Inheritance is declared in Moose using the "extends" keyword. Multiple inheritance is allowed.
use strict;
use warnings;
use v5.10;
use Person;
use Employee;
my $student = Person->new( name => 'Foo' );
say $student->name; # Foo
my $programmer = Employee->new( name => 'Bar', employer => 'Perl Corp' );
say $programmer->name; # Bar
say $programmer->employer; # Perl Corp
my $unemployed = Person->new( name => 'Foo', employer => 'Java Corp' );
say $unemployed->name; # Foo
say $unemployed->employer;
# Exception:
# Can't locate object method "employer" via package "Person"
package Employee;
use Moose;
extends 'Person';
has 'employer' => (is => 'rw');
1;
Testing Inheritance
use strict;
use warnings;
use v5.10;
use Test::More tests => 12+2+2;
use Test::Exception;
use Employee;
my $e = Employee->new;
isa_ok($e, 'Employee');
is($e->name('Foo'), 'Foo', 'setter');
is($e->name, 'Foo', 'getter');
is($e->sex('m'), 'm', 'set m');
is($e->sex, 'm', 'get m');
is($e->sex('f'), 'f', 'set f');
is($e->sex, 'f', 'get f');
is($e->sex('M'), 'm', 'set M');
is($e->sex, 'm', 'get m');
is($e->sex('male'), 'm', 'set male');
is($e->sex, 'm', 'get m');
throws_ok { $e->sex('other') }
qr{Attribute \(sex\) does not pass the type constraint because:},
'sex is f or m';
is($e->employer('Acme Corporation'), 'Acme Corporation', 'set employer');
is($e->employer, 'Acme Corporation', 'get employer');
use Person;
my $p = Person->new(employee => 'Acme Corp');
isa_ok($p, 'Person');
throws_ok { $p->employee }
qr{Can't locate object method "employee" via package "Person"},
'no employee method';
MooseX::StrictConstructor
use strict;
use warnings;
use v5.10;
use Person;
use Employee;
my $student = Person->new( name => 'Foo' );
say $student->name; # Foo
my $programmer = Employee->new( name => 'Bar', employer => 'Perl Corp' );
say $programmer->name; # Bar
say $programmer->employer; # Perl Corp
my $unemployed = Person->new( name => 'Foo', employer => 'Java Corp' );
# Exception:
# Found unknown attribute(s) init_arg passed to the constructor: employer ...
package Person;
use Moose;
use Moose::Util::TypeConstraints;
use MooseX::StrictConstructor;
enum 'Person::Type::Sex' => [ qw(f m) ];
coerce 'Person::Type::Sex'
=> from 'Str'
=> via { lc substr($_, 0, 1) };
has 'name' => (is => 'rw');
has 'birthday' => (isa => 'DateTime', is => 'rw');
has 'sex' => (isa => 'Person::Type::Sex', is => 'rw', coerce => 1);
1;
Testing Strict attributes
use strict;
use warnings;
use v5.10;
use Test::More tests => 12+2+2;
use Test::Exception;
use Employee;
my $e = Employee->new;
isa_ok($e, 'Employee');
is($e->name('Foo'), 'Foo', 'setter');
is($e->name, 'Foo', 'getter');
is($e->sex('m'), 'm', 'set m');
is($e->sex, 'm', 'get m');
is($e->sex('f'), 'f', 'set f');
is($e->sex, 'f', 'get f');
is($e->sex('M'), 'm', 'set M');
is($e->sex, 'm', 'get m');
is($e->sex('male'), 'm', 'set male');
is($e->sex, 'm', 'get m');
throws_ok { $e->sex('other') }
qr{Attribute \(sex\) does not pass the type constraint because:},
'sex is f or m';
is($e->employer('Acme Corporation'), 'Acme Corporation', 'set employer');
is($e->employer, 'Acme Corporation', 'get employer');
use Person;
throws_ok { Person->new(employee => 'Acme Corp') }
qr{Found unknown attribute},
'MooseX::StrictConstructor';
#isa_ok($p, 'Person');
#throws_ok { $p->employee }
# qr{Can't locate object method "employee" via package "Person"},
# 'no employee method';
throws_ok { Employee->new(employee => 'Acme Corp') }
qr{Found unknown attribute},
'MooseX::StrictConstructor';
Encapsulation
There is no real enforced encapsulation in Moose either. The user still receives a hash reference and nothing stops her from poking around the details. Nothing except her good will.
Class data
Variables declared within the class using my will remain within the class. They won't be associated with any of the objects. Methods of the class will be able to access them.
Special actions during object construction
# called after the constructor has been called
sub BUILD {
my $self = shift;
};
Singleton in Moose
MooseX::Singleton;
A game - the main class or the board is a singleton
Configuration
Database access
use strict;
use warnings;
use Test::More;
plan tests => 2;
use Games::Spacefight;
my $game = Games::Spacefight->new;
isa_ok($game, 'Games::Spacefight');
my $game2 = Games::Spacefight->new;
is $game2, $game, 'Singleton';
package Games::Spacefight;
use Moose;
use MooseX::Singleton;
1;
Destructor in Moose
Usually you don't need to implement a destructor in Perl. but in case you do DESTROY is the name in standard perl and DEMOLISH in Moose.
Object Oriented Perl using Moose
Running subprocess
Possibilities
- system
- qx, backticks
- open "...|" and open "|...."
- IPC::Open2
- IPC::Open3
- IPC::Run
- IPC::Run3
- Expect
See the documentation of IPC::Run3 for a comparison
Prima
Prima Hello World
use strict;
use warnings;
use Prima qw(Application Buttons);
Prima::MainWindow->new(
text => 'Hello world!',
size => [ 200, 200],
)-> insert( Button =>
centered => 1,
text => 'Hello world!',
onClick => sub { $::application-> close },
);
run Prima;
wxPerl - wxWidgets for Perl
wxPerl - Hello World
{% embed include file="src/examples/wxperl/hello_world.pl)
See more examples in https://github.com/PadreIDE/Padre/tree/master/share/examples/wx
Microsoft Windows
Win32 examples
use 5.010;
use Modern::Perl;
use Win32::EventLog;
my $h = Win32::EventLog->new('System', $ENV{ComputerName}) or die;
my $recs;
my $base;
$h->GetNumber($recs);
say $recs;
$h->GetOldest($base);
say $base;
use 5.010;
use Modern::Perl;
use Win32::GUI;
use 5.010;
use Modern::Perl;
use Win32::IPConfig;
#my $host = '127.0.0.1';
my $host = 'dwarf';
my $ipconfig = Win32::IPConfig->new($host) or die;
say 'hostname=', $ipconfig->get_hostname;
say 'domain=', $ipconfig->get_domain;
use 5.010;
use Modern::Perl;
use Win32;
# Win32::MsgBox('hello');
my $resp = Win32::MsgBox('hello', 3, 'Some title');
Win32::MsgBox($resp);
use 5.010;
use Modern::Perl;
use Email::Send::SMTP::Gmail;
my $mail=Email::Send::SMTP::Gmail->new(
-smtp =>'gmail.com',
-login =>'szabgab@gmail.com',
-pass =>'',
# -debug => 0,
);
$mail->send(-to=>'szabgab@gmail.com',
-subject =>'Hello! 2',
-verbose => 0,
-body =>'Just testing it',
# -attachments=>'full_path_to_file',
);
$mail->bye;
say 'DONE';
use 5.010;
use Modern::Perl;
use File::Find::Rule;
my $rule = File::Find::Rule->file()
->name( '*.log' )
->maxdepth(1)
->size( '>7K' )
->start( 'C:\Windows' );
while (my $file = $rule->match) {
say $file, ' ', -s $file;
}
say 'DONE';
# list all the files ordered by date
# ordered by the number of 'Error' strings in each file
PSGI
PSGI Cookbook
Parallel processing
Types of problems
- CPU intensive application - use more of the cores to reduce the wallclock time.
- IO intensive applications - don't waste the CPU and wallclock time while waiting for the IO process.
- Interactive applications - make sure they are responsive during long operations.
Types of solutions
- Number of processes (forking on Unix or spawning)
- Number of threads (Single threaded vs Multi-threaded)
- Asynchronous, non-blocking or synchronous vs blocking (aka "normal") Cooperative Multitasking
Tasks
- count.pl - count numbers CPU intensive
- process_csv.pl - mostly CPU intensive but also reading files
- httpget.pl - Download web pages and extract title - mostly IO intensive
Measurements on 32 core
On the 32 core CPU-Optimized server on Digital Ocean
perl count.pl 0 30 10000000
Elapsed time 7.92
perl count.pl 30 30 10000000
Elapsed time 0.45
perl process_csv.pl 0 30
Elapsed time 38.01
perl process_csv.pl 30 30
Elapsed time 3.02
perl httpget.pl wikipedia.txt 0 80
Elapsed time 12.93
perl httpget.pl wikipedia.txt 20 80
Elapsed time 4.06
perl httpget.pl wikipedia.txt 80 80
Elapsed time 1.11
Measurements on 4 core
perl count.pl 0 12 40000000
Elapsed time 13.57
perl count.pl -1 12 40000000
Elapsed time 5.78
perl count.pl 2 12 40000000
Elapsed time 6.85
perl count.pl 3 12 40000000
Elapsed time 6.42
perl count.pl 12 12 40000000
Elapsed time 5.79
perl process_csv.pl 0 6
Elapsed time 14.91
perl process_csv.pl 2 6
Elapsed time 10.25
perl process_csv.pl 3 6
Elapsed time 8.30
perl process_csv.pl 6 6
Elapsed time 8.71
perl httpget.pl wikipedia.txt 0 20
Elapsed time 18.72
perl httpget.pl wikipedia.txt 10 20
Elapsed time 4.17
The Hardware
lscpu
cat /proc/cpuinfo
htop
Installations
cpanm Parallel::ForkManager
cpanm LWP::Simple
cpanm HTML::TreeBuilder::XPath
cpanm Win32::Getppid
Ubuntu system perl:
libwww-perl
libparallel-forkmanager-perl
libhtml-treebuilder-xpath-perl
Fork
- fork
use strict;
use warnings;
main();
sub main {
my $pid = fork();
die "Could not fork" if not defined $pid;
if ($pid) {
parent_process();
} else {
child_process();
}
}
sub child_process {
print "In child process\n";
exit;
}
sub parent_process {
print "In parent process\n";
my $finished = wait();
}
Fork with functions details
- fork
use strict;
use warnings;
my $historical = 42;
main();
sub main {
print "PID before fork $$\n";
my $pid = fork();
die "Could not fork" if not defined $pid;
if ($pid) {
print "PID of child: $pid\n";
parent_process();
} else {
print "PID received in child: $pid\n";
child_process();
}
}
sub child_process {
#sleep 1;
my $ppid = getppid();
print "In child process $$ of parent $ppid\n";
print "value in child: $historical\n";
$historical = 23;
#my $x = 0;
#my $y = 3 / $x;
#die "bad thing";
exit 42;
}
sub parent_process {
#sleep 1;
print "value in parent $historical\n";
my $ppid = getppid();
print "In parent process $$ of parent $ppid\n";
my $finished = wait();
my $exit_code = $? / 256;
print "finished: $finished\n";
print "exit code: $exit_code\n";
print "value in parent $historical\n";
}
Fork random and seed
- rand
- srand
use strict;
use warnings;
srand 42;
print rand(), "\n";
print rand(), "\n";
print rand(), "\n";
#exit;
main();
sub main {
my $pid = fork();
die "Could not fork" if not defined $pid;
if ($pid) {
parent_process();
} else {
child_process();
}
}
sub child_process {
#srand;
print "In child process\n";
print "child ", rand(), "\n";
print "child ", rand(), "\n";
print "child ", rand(), "\n";
exit;
}
sub parent_process {
print "In parent process\n";
print "parent ", rand(), "\n";
print "parent ", rand(), "\n";
print "parent ", rand(), "\n";
my $finished = wait();
}
Fork without functions
- fork
use strict;
use warnings;
main();
sub main {
my $pid = fork();
die "Could not fork" if not defined $pid;
if (not $pid) {
print "In child process\n";
exit;
}
print "In parent process\n";
my $finished = wait();
}
Fork details
-
fork
use strict;
use warnings;
BEGIN {
if ($^O eq "MSWin32") {
print "Running on Windows\n";
require Win32::Getppid;
import Win32::Getppid qw(getppid);
}
}
main();
sub main {
my $shared = 42;
my $pid = fork();
die "Could not fork" if not defined $pid;
if (not $pid) {
sleep 1;
my $parent = getppid();
print "In Child: PID: $$ from parnet: $parent\n";;
print "Shared value: $shared\n";
$shared = $$;
exit;
}
sleep 1;
print "In Parent: PID: $$ Child: $pid\n";
my $finished = wait();
my $exit_code = $? / 256;
print "Child finised: $finished Exit code: $exit_code\n";
print "Shared: $shared\n";
}
Fork many
use strict;
use warnings;
BEGIN {
if ($^O eq "MSWin32") {
print "Running on Windows\n";
require Win32::Getppid;
import Win32::Getppid qw(getppid);
}
}
main();
sub main {
my ($forks) = @ARGV;
die "Usage: $0 FORKS\n" if not $forks;
my $shared = 42;
for my $id (1 .. $forks) {
my $pid = fork();
die "Could not fork" if not defined $pid;
if (not $pid) {
sleep 1;
my $parent = getppid();
print "Fork number $id, In Child: PID: $$ from parnet: $parent\n";;
print "Shared value: $shared\n";
$shared = $$;
exit;
}
print "In Parent: PID: $$ Child created: $pid\n";
}
sleep 1;
print "In Parent: PID: $$\n";
print "Shared value: $shared\n";
for (1 .. $forks) {
my $finished = wait();
my $exit_code = $? / 256;
print "Child finised: $finished with exit code $exit_code\n";
}
}
Parent process ID
use strict;
use warnings;
BEGIN {
if ($^O eq "MSWin32") {
print "Running on Windows\n";
require Win32::Getppid;
import Win32::Getppid qw(getppid);
}
}
main();
sub main {
my $parent = getppid();
print "PID: $$ from parnet: $parent\n";
}
pstree
pstree -p -a
htop
htop
- H
- F4 to filter
Active non-blocking waiting with waitpid
- waitpid
- POSIX
- WNOHANG
Up till now we were usig the wait
function to wait for a child process to terminate. It is a blocking call that will wait till any of the child processes terminates.
There are other options as well. Using the waitpid
function you could wait for a specific child to terminate using its PID or you can have a non-blocking way to check
if there is any child-processes that has already terminated. The non-blocking wait mode allows the parent process to do other things while waiting for the child processes
to do their job.
use strict;
use warnings;
use Time::HiRes qw(sleep);
use POSIX ':sys_wait_h';
main();
sub main {
my ($sleep, $exit) = @ARGV;
die "Usage: $0 SLEEP EXIT\n" if not defined $exit;
my $pid = fork();
die 'Failed to fork' if not defined $pid;
if ($pid == 0) {
print "Child process PID $$\n";
sleep $sleep;
exit $exit;
}
while (1) {
my $pid = waitpid(-1, WNOHANG);
if ($pid > 0) {
my $exit_code = $?/256;
print "Child process $pid finished with exit code $exit_code\n";
last;
}
print "Parent could do something else, but now sleeping\n";
sleep 0.1;
}
}
perl active_waiting.pl 0 0
perl active_waiting.pl 1 0
perl active_waiting.pl 1 1
Non-blocking waiting with waitpid - multiple forks
In this example we create multiple child processes and wait for them with a non-blocking waitpid. Each process will sleep for a random number of seconds imitating the randomness of the time it takes each one of them to finish doing their job. They also generate a random exit code to imitate that some of them might have failed.
use strict;
use warnings;
use Time::HiRes qw(sleep);
use POSIX ':sys_wait_h';
main();
sub main {
my ($workers) = @ARGV;
die "Usage: $0 WORKERS\n" if not defined $workers;
$| = 1; # disable output buffering on STDOUT
my %process;
for my $worker_id (1 .. $workers) {
my $pid = fork();
die 'Failed to fork' if not defined $pid;
if ($pid == 0) {
my $exit_code = int rand(5);
my $sleep_time = rand(5);
print "Child process worker ID: $worker_id PID $$ will sleep for $sleep_time and then exit with code $exit_code\n";
sleep $sleep_time;
exit $exit_code;
}
$process{$pid} = $worker_id;
next;
}
while (1) {
my $pid = waitpid(-1, WNOHANG);
if ($pid > 0) {
my $exit_code = $?/256;
my $worker_id = delete $process{$pid};
print "Child process $pid worker id $worker_id finished with exit code $exit_code\n";
next;
}
print '.';
sleep 0.1;
last if not %process;
}
}
perl active_waiting_loop.pl 1
perl active_waiting_loop.pl 5
Non-blocking waiting, rerun on failure
In this example we have a list of tasks we need to do. The user can supply the number of child processes that will deal with the tasks. Each child process generates a random number to wait to imitatet the work time and a random number as the exit code.
The parent monitors the child processes. If one of them exits with a non-zero error code the parent re-runs that job with another child process until all the tasks are done.
use strict;
use warnings;
use Time::HiRes qw(sleep);
use POSIX ':sys_wait_h';
main();
sub main {
my ($workers) = @ARGV;
die "Usage: $0 WORKERS\n" if not defined $workers;
#my @tasks = 'a' .. 'z';
my @tasks = 'a' .. 'd';
my %process;
while (1) {
if (@tasks and scalar(keys %process) < $workers) {
my $task = shift @tasks;
my $pid = fork();
die 'Failed to fork' if not defined $pid;
if ($pid == 0) {
my $exit_code = int rand(3);
my $sleep_time = rand(5);
print "Child process Task: $task PID $$ will sleep for $sleep_time and then exit with code $exit_code\n";
sleep $sleep_time;
exit $exit_code;
}
$process{$pid} = $task;
next;
}
my $pid = waitpid(-1, WNOHANG);
if ($pid > 0) {
my $exit_code = $?/256;
my $task = delete $process{$pid};
print "Child process $pid task $task finished with exit code $exit_code\n";
if ($exit_code > 0) {
unshift @tasks, $task;
}
next;
}
sleep 0.1;
last if not %process;
}
}
Child process Task: a PID 128387 will sleep for 3.71995377981634 and then exit with code 2
Child process Task: b PID 128388 will sleep for 0.137503658640838 and then exit with code 0
Child process Task: c PID 128389 will sleep for 3.57264931009681 and then exit with code 1
Child process 128388 task b finished with exit code 0
Child process Task: d PID 128390 will sleep for 0.940422063447244 and then exit with code 0
Child process 128390 task d finished with exit code 0
Child process 128387 task a finished with exit code 2
Child process 128389 task c finished with exit code 1
Child process Task: a PID 128391 will sleep for 3.09917882712156 and then exit with code 2
Child process Task: c PID 128392 will sleep for 1.52811677938857 and then exit with code 2
Child process 128392 task c finished with exit code 2
Child process Task: c PID 128393 will sleep for 1.21268558593997 and then exit with code 0
Child process 128393 task c finished with exit code 0
Child process 128391 task a finished with exit code 2
Child process Task: a PID 128394 will sleep for 2.05004244389542 and then exit with code 2
Child process 128394 task a finished with exit code 2
Child process Task: a PID 128395 will sleep for 4.95541832222202 and then exit with code 0
Child process 128395 task a finished with exit code 0
Functions to be speed up
package Task;
use strict;
use warnings;
use Time::HiRes qw(time);
use LWP::Simple qw(get);
use HTML::TreeBuilder::XPath;
sub count {
my ($max) = @_;
my $counter = 0;
while ($counter < $max) {
$counter++;
}
}
sub process_file {
my ($file) = @_;
my $total = 0;
open my $fh, '<', $file or die;
while (my $line = <$fh>) {
chomp $line;
my @fields = split /,/, $line;
$total += $fields[2];
}
return $total;
}
sub get_title {
my ($url) = @_;
my $content = get $url;
my $tree= HTML::TreeBuilder::XPath->new_from_content($content);
my $nb = $tree->findvalue( '/html/head/title' );
return $nb;
}
use lib '.';
use Transformers qw(show_elapsed_time);
show_elapsed_time('count', 'process_file', 'get_title');
1;
Counter process
use strict;
use warnings;
use Time::HiRes qw(time);
use lib '.';
use Task;
use ForkedCounter;
main();
sub main {
my ($fork_count, $task_count, $max) = @ARGV;
die "Usage: $0 FORK_COUNT HOW_MANY_TIMES TILL_WHICH_NUMBER\n" if not $max;
my $start = time;
if ($fork_count == 0) {
for my $i (1..$task_count) {
Task::count($max);
}
} elsif ($fork_count == -1) {
ForkedCounter::counter_one_by_one($task_count, $max);
} else {
ForkedCounter::counter($fork_count, $task_count, $max);
}
my $end = time;
my $elapsed = $end-$start;
printf "Elapsed time %.2f\n", $elapsed;
}
perl count.pl 0 12 40000000
Forked counter process
package ForkedCounter;
use strict;
use warnings;
use Task;
sub counter_one_by_one {
my ($task_count, $max) = @_;
for my $fork_id (1 .. $task_count) {
my $pid = fork();
die "Could not fork" if not defined $pid;
if (not $pid) {
Task::count($max);
exit;
}
}
for my $fork_id (1 .. $task_count) {
wait();
}
}
sub counter {
my ($fork_count, $task_count, $max) = @_;
my @tasks;
push @tasks, 1+int($task_count/$fork_count) for 1 .. ($task_count % $fork_count);
push @tasks, int($task_count/$fork_count) for 1 .. ($fork_count - ($task_count % $fork_count));
print "@tasks\n";
for my $fork_id (1 .. $fork_count) {
my $pid = fork();
die "Could not fork" if not defined $pid;
if (not $pid) {
#print "PID $pid $tasks[$fork_id-1]\n";
for my $i (1..$tasks[$fork_id-1]) {
Task::count($max);
}
exit;
}
}
for my $fork_id (1 .. $fork_count) {
wait();
}
}
1;
Prepare CSV files
Generate CSV data files to be used by the CSV processing task. Each file has rows with the 3rd column being increasing numbers. The number in the last row of each file contains the seriel number of the file. That will make the sum of the numbers different and we'll be able to verify if the results come from the different files.
use strict;
use warnings;
my ($n, $size) = @ARGV;
die "Usage: $0 NUMBER_OF_FILES SIZE_OF_FILES\n" if not defined $size;
my $content = "";
my $cnt = 0;
while ($cnt < $size) {
$cnt++;
$content .= "a,b,$cnt,c,d\n";
}
for my $ix (1..$n) {
my $filename = sprintf "data_%02s.csv", $ix;
#print "$filename\n";
open my $fh, '>', $filename or die "Could not open $filename\n";
print $fh $content;
print $fh "a,b,$ix,c,d\n";
}
perl prepare_files.pl 12 2000000
Process CSV files
We expect two parameters on the command line. The number of parallel processes to run and how many files to process. For parallel 0 means not to use the forking mechanizm at all. We use the number of files instead of accepting the list of files on the command line, becasue it is easier to select a subset of the files this way.
use strict;
use warnings;
use Time::HiRes qw(time);
use lib '.';
use Task;
use ForkedProcessCSV;
main();
sub main {
my ($parallels, $file_count) = @ARGV;
die "Usage $0 PARALLELS FILE_COUNT\n" if not defined $file_count;
my %results;
my @files = sort glob "data_*.csv";
die "Not enough files\n" if $file_count > @files;
@files = @files[0 .. $file_count-1];
#print "@files";
my $start = time;
if ($parallels == 0) {
for my $file (@files) {
my $total = Task::process_file($file);
$results{$file} = $total;
}
} else {
%results = ForkedProcessCSV::process_csv($parallels, @files);
}
for my $file (@files) {
print "$file $results{$file}\n";
}
my $end = time;
my $elapsed = $end-$start;
printf "Elapsed time %.2f\n", $elapsed;
}
$ perl process_csv.pl 0 1
Elapsed time 1.51
$ perl process_csv.pl 0 4
Elapsed time 5.92
$ perl process_csv.pl 2 4
Elapsed time 4.02
$ perl process_csv.pl 4 4
Elapsed time 4.01
$ perl process_csv.pl 0 10
Elapsed time 15.18
$ perl process_csv.pl 4 10
Elapsed time 9.05
Use Parallel::ForkManager
use strict;
use warnings;
use Parallel::ForkManager;
use Data::Dumper qw(Dumper);
main();
sub main {
my ($parallels) = @ARGV;
die "Usage: $0 PARALLELS\n" if not defined $parallels;
my $shared = 42;
my $pm = Parallel::ForkManager->new($parallels);
foreach my $input (2, 3, 5, 11) {
my $pid = $pm->start and next;
print "PID $$ input: $input shared: $shared\n";
$shared = $$;
$pm->finish();
}
$pm->wait_all_children;
print "Shared: $shared\n";
}
Return values using Parallel::ForkManager
- fork
use strict;
use warnings;
use Parallel::ForkManager;
use Data::Dumper qw(Dumper);
main();
sub main {
my ($parallels) = @ARGV;
die "Usage: $0 PARALLELS\n" if not defined $parallels;
my %results;
my $pm = Parallel::ForkManager->new($parallels);
$pm->run_on_finish( sub {
my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_structure_reference) = @_;
my $input = $data_structure_reference->{input};
$results{$input} = $data_structure_reference->{result};
#print "Finished PID $pid and exit code: $exit_code\n";
});
foreach my $input (2, 3, 5, 11) {
my $pid = $pm->start and next;
print "PID $$\n";
my $result = $input * 2;
$pm->finish(0, {input => $input, result => $result});
}
$pm->wait_all_children;
print Dumper \%results;
}
Forked process CSV files
package ForkedProcessCSV;
use strict;
use warnings;
use Parallel::ForkManager;
use lib '.';
use Task;
sub process_csv {
my ($parallels, @files) = @_;
my %results;
my $pm = Parallel::ForkManager->new($parallels);
$pm->run_on_finish( sub {
my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_structure_reference) = @_;
my $filename = $data_structure_reference->{filename};
$results{$filename} = $data_structure_reference->{total};
#print "Finished PID $pid and exit code: $exit_code\n";
});
foreach my $file (@files) {
my $pid = $pm->start and next;
print "PID $$\n";
my $total = Task::process_file($file);
$pm->finish(0, {filename => $file, total => $total});
}
$pm->wait_all_children;
return %results;
}
1;
HTTP GET
use strict;
use warnings;
use Time::HiRes qw(time);
use lib '.';
use Task;
use ForkedHTTP;
binmode(STDOUT, ":utf8");
main();
sub main {
my ($filename, $parallels, $limit) = @ARGV;
die "Usage: $0 FILENAME PARALLEL LIMIT\n" if not defined $limit;
open my $fh, '<', $filename or die;
my @urls = <$fh>;
chomp @urls;
if ($limit and $limit < scalar @urls) {
@urls = @urls[0..$limit-1];
}
#print scalar @urls;
my %results;
my $start = time;
if ($parallels == 0) {
for my $url (@urls) {
$results{$url} = Task::get_title($url);
}
} else {
%results = ForkedHTTP::get_titles($parallels, @urls);
}
for my $url (@urls) {
print "$url $results{$url}\n";
}
my $end = time;
my $elapsed = $end-$start;
printf "Elapsed time %.2f\n", $elapsed;
}
Forked HTTP requests
package ForkedHTTP;
use strict;
use warnings;
use Parallel::ForkManager;
use lib '.';
use Task;
sub get_titles {
my ($parallels, @urls) = @_;
my %results;
my $pm = Parallel::ForkManager->new($parallels);
$pm->run_on_finish( sub {
my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_structure_reference) = @_;
my $url = $data_structure_reference->{url};
$results{$url} = $data_structure_reference->{title};
});
foreach my $url (@urls) {
my $pid = $pm->start and next;
print "PID $$\n";
my $title = Task::get_title($url);
$pm->finish(0, {url => $url, title => $title});
}
$pm->wait_all_children;
return %results;
}
1;
Exercise: Process Excel files
- Create N excel files, similar to the CSV files we had in the example and then process them. Both in a serial manner and in parallel.
Exercise: Web crawler
- Build a crawler that, given a single URL and a number N will visit N pages linked from that page. (Maybe need to also get to the links of the links etc.)
- You can use HTML::TreeBuilder::XPath to extract the links from the html document.
- It is probably better to allow only the main process to fork.
MCE - Many-Core Engine
-
MCE
-
fork
-
On Debian/Ubuntu it is called
libmce-perl
use strict;
use warnings;
use MCE;
main();
sub main {
my ($workers) = @ARGV;
die "Usage: $0 WORKERS\n" if not defined $workers;
my $mce = MCE->new(
max_workers => $workers,
user_func => sub {
my ($mce) = @_;
$mce->say("Hello from PID $$ WID " . $mce->wid);
}
);
$mce->run;
}
$ perl use_mce.pl 5
Hello from PID 210834 WID 2
Hello from PID 210833 WID 1
Hello from PID 210835 WID 3
Hello from PID 210836 WID 4
Hello from PID 210837 WID 5
MCE - map running in parallel
-
MCE
-
fork
-
map
The MCE package provides a map-like function that automatically runs the different tasks in separate processes then collects the results in the correct order.
By default it creates 4 child processes, but you can control that and a few other things by calling the init method.
use strict;
use warnings;
use MCE::Map;
main();
sub main {
print "main PID: $$\n";
my @results = mce_map { work($_) } 1..10;
print "Results: @results\n";
}
sub work {
my ($param) = @_;
print "Param $param PID: $$\n";
return $param * $param;
}
main PID: 150164
Param 1 PID: 150168
Param 2 PID: 150168
Param 7 PID: 150167
Param 8 PID: 150167
Param 3 PID: 150165
Param 5 PID: 150166
Param 4 PID: 150165
Param 6 PID: 150166
Param 9 PID: 150168
Param 10 PID: 150168
Results: 1 4 9 16 25 36 49 64 81 100
MCE - map with init
use strict;
use warnings;
use MCE::Map;
main();
sub main {
MCE::Map->init(
max_workers => 3, # defaults to 4
chunk_size => 2, # defaults to 1
);
print "main PID: $$\n";
my @results = mce_map { work($_) } 1..10;
print "Results: @results\n";
}
sub work {
my ($param) = @_;
print "Param $param PID: $$\n";
#exit 23 if $param == 2;
return $param * $param;
}
main PID: 210966
Param 3 PID: 210969
Param 1 PID: 210967
Param 5 PID: 210968
Param 4 PID: 210969
Param 2 PID: 210967
Param 6 PID: 210968
Param 7 PID: 210968
Param 9 PID: 210969
Param 8 PID: 210968
Param 10 PID: 210969
Results: 1 4 9 16 25 36 49 64 81 100
other modules
- Parallel::Forker another module that could be used.
- Proc::Queue has not been updated since 2008
Database access using Perl DBI
Architecture of a DBI Application
- dbi
- dbd
|<- Scope of DBI ->|
.-. .--------------. .-------------.
.-------. | |---| XYZ Driver |---| XYZ Engine |
| Perl | | | `--------------' `-------------'
| script| |A| |D| .--------------. .-------------.
| using |--|P|--|B|---|Oracle Driver |---|Oracle Engine|
| DBI | |I| |I| `--------------' `-------------'
| API | | |...
|methods| | |... Other drivers
`-------' | |...
`-'
Taken from the DBI documentation.
Any number of databases at the same time.
Probably every RDBMS can be accessed with its own DBD.
While in many areas Perl has plenty of ways to achieve things in the area of database access a single package has evolved as the de-facto standard way of the low level access. The DBI - Database Independent Interface of Perl - is the module that most of the perl applications in the world use as their core.
Between this module and between virtually and relational database we have a Database Driver -
a module in the DBD::*
namespace. For example there are modules such as DBD::Oracle and DBD::MySQL
or DBD::Pg for PostgreSQL.
Above DBI there are hundreds of higher level abstractions. Most of them live in the DBIx::*
namespace on CPAN.
Create Sample Database
- sqlite
For our examples we are going to use an SQLite database. It is simple to install as it comes within DBD::SQLite, its own database driver and it provides everything we need for our examples.
Actually SQLite is a very good database useful in many applications that don't require concurrent write access to the database frequently.
In order to create the sample database run examples/dbi/create_sample.pl
{% embed include file="src/examples/dbi/sample.sql)
Connect to database
- connect
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $user = "";
my $password = "";
my $dbh = DBI->connect($dsn, $user, $password);
$dbh->disconnect;
Connecting to other databases:
my $dsn = "DBI:mysql:database=$database;host=$hostname;port=$port";
my $dbh = DBI->connect($dsn, $user, $password);
SELECT with one result
- SELECT
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
my $sth = $dbh->prepare('SELECT COUNT(*) FROM people');
$sth->execute;
my ($count) = $sth->fetchrow_array();
print "There are $count number of rows.\n";
SELECT with more results
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
my $sth = $dbh->prepare('SELECT fname, lname FROM people');
$sth->execute;
while (my @row = $sth->fetchrow_array()) {
print "$row[0] $row[1]\n";
}
SELECT, prepare with placeholders
- placeholder
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
my $id = shift;
$id = 1 if not defined $id;
my $sth = $dbh->prepare('SELECT fname, lname FROM people WHERE id <= ?');
$sth->execute($id);
while (my @row = $sth->fetchrow_array()) {
print "$row[0] $row[1]\n";
}
SELECT, using hashref
- NAME_lc
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
my $id = shift;
$id = 1 if not defined $id;
my $sth = $dbh->prepare('SELECT fname, lname FROM people WHERE id <= ?');
$sth->execute($id);
while (my $h = $sth->fetchrow_hashref('NAME_lc')) {
print "$h->{fname} $h->{lname}\n";
}
INSERT
- INSERT
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
my ($fname, $lname, $email, $pw) = qw(Moose Foobar moose@foobar.com really?);
$dbh->do('INSERT INTO people (fname, lname, email, pw) VALUES (?, ?, ?, ?)',
undef,
$fname, $lname, $email, $pw);
We might need to insert some data.
UPDATE
- UPDATE
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $dbh = DBI->connect($dsn);
show_table("before");
my ($email, $id) = ('new@address.com', 1);
$dbh->do("UPDATE people SET email = ? WHERE id = ?", undef, $email, $id);
show_table("after");
sub show_table {
print "----- $_[0]\n";
my $sth = $dbh->prepare("SELECT id, email FROM people");
$sth->execute();
while (my $h = $sth->fetchrow_hashref('NAME_lc')) {
print "$h->{id} $h->{email}\n";
}
}
----- before
1 foo@bar.com
2 peti@bar.com
----- after
1 new@address.com
2 peti@bar.com
We might need to update some data.
The same can be also done using a prepared a statement. It is better to use prepared statements if you would like to execute the same SQL query several times with different values.
Simple Database access using Perl DBI and SQL
Advanced Database access using Perl DBI
Data integrity
When you are working in a place where data integrity is important you have to use transactions when executing multiple queries that should either all succeed or all fail.
#!/usr/bin/perl
use strict;
use warnings;
system "$^X examples/dbi/create_sample.pl";
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $dbh = DBI->connect($dsn);
system "$^X examples/dbi/show_accounts.pl before";
debit(1, 100);
system "$^X examples/dbi/show_accounts.pl middle";
#exit; # process killed
credit(2, 100);
system "$^X examples/dbi/show_accounts.pl account";
sub debit {
credit($_[0], -1 * $_[1]);
}
sub credit {
my ($id, $amount) = @_;
my $sth = $dbh->prepare("SELECT amount FROM accounts WHERE id = ?");
$sth->execute($id);
my ($current) = $sth->fetchrow_array();
$sth->finish;
$dbh->do("UPDATE accounts SET amount = ? WHERE id = ?",
undef, $current + $amount, $id);
}
Try to see what happens when you enable the exit() function
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $dbh = DBI->connect($dsn);
if ($ARGV[0]) {
print "----- $ARGV[0]\n";
}
my $sth = $dbh->prepare("SELECT id, amount FROM accounts");
$sth->execute();
while (my $h = $sth->fetchrow_hashref('NAME_lc')) {
print "$h->{id} $h->{amount}\n";
}
Transactions
- transactions
- begin_work
- commit
- rollback
$dbh->begin_work
$dbh->commit
$dbh->rollback
#!/usr/bin/perl
use strict;
use warnings;
system "$^X examples/dbi/create_sample.pl";
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $dbh = DBI->connect($dsn);
system "$^X examples/dbi/show_accounts.pl before";
$dbh->begin_work;
debit(1, 100);
system "$^X examples/dbi/show_accounts.pl middle";
exit; # process killed
credit(2, 100);
$dbh->commit;
system "$^X examples/dbi/show_accounts.pl account";
sub debit {
credit($_[0], -1 * $_[1]);
}
sub credit {
my ($id, $amount) = @_;
my $sth = $dbh->prepare("SELECT amount FROM accounts WHERE id = ?");
$sth->execute($id);
my ($current) = $sth->fetchrow_array();
$sth->finish;
$dbh->do("UPDATE accounts SET amount = ? WHERE id = ?",
undef, $current + $amount, $id);
}
Sample database creation
#!/usr/bin/perl
use strict;
use warnings;
use FindBin qw($Bin);
use DBI;
my $verbose = shift;
my $dbfile = "sample.db";
unlink $dbfile;
print "Creating sample SQLite database at $dbfile\n" if $verbose;
my $dbh = DBI->connect("dbi:SQLite:dbname=sample.db");
my $schema;
{
open my $fh, '<', "$Bin/sample.sql" or die;
local $/ = undef;
$schema = <$fh>;
}
foreach my $sql (split /;/, $schema) {
next if $sql !~ /\S/; # skip empty entries
$dbh->do($sql);
}
$dbh->disconnect;
Disconnect from the database
- disconnect
$dbh->disconnect
Attributes
- attributes|dbi
- AutoCommit
- PrintError
- RaiseError
- NAME_lc
- FetchHashKeyName
- TraceLevel
In addition to the connection string, the username and the password one can and should configure the database with the DBI specific configuration options (or attributes). These options all go in a HASH reference in the 4th parameter of the DBI connect method.
As the databases and the DBD - Database drivers might all decide arbitrarily on their mode regarding transactions one should always explicitly declare how she wants to operate.
The AutoCommit option.
my %attributes = (
PrintError => 1,
RaiseError => 1,
AutoCommit => 1,
FetchHashKeyName => 'NAME_lc', # NAME_uc
TraceLevel => 1, # see Debugging and Trace levels later
);
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $dbh = DBI->connect($dsn, $username, $password, \%attributes);
Error handling
- Set the attributes PrintError and RaiseError
- Check for returned undef (or empty lists)
- Check $h->err and $h->errstr (aka. $DBI::err and $DBI::errstr)
- err - Native DB engine error code
- errstr - Native DB engine error string
$sth = $dbh->prepare($statement) or die $dbh->errstr;
$rv = $sth->execute(@bind_values) or die $sth->errstr;
fetchrow_array (and others) return undef when no more row or if they encounter an error. Use RaiseError or check $sth->err
Debugging (Trace levels)
$dbh->trace();
$sth->trace();
DBI->trace();
TraceLevel attribute
Can be used like this:
DBI->trace(1);
DBI->trace(1, '/tmp/dbitrace.log');
The trace level can be 0 (off) .. 15 (usually 1-4 is more than enough)
In CGI scripts add the following:
BEGIN { $ENV{DBI_TRACE}='1=/tmp/dbitrace.log'; }
Calculating bank balance, take two: DBD::CSV
Database access using DBIx::Class
DBIx::Class
Use SQL Database without writing SQL
Create Sample Database
For our examples we are going to use an SQLite database. It is simple to install as it comes within DBD::SQLite, its own database driver and it provide everything we need for our examples.
Actually SQLite is a very good database useful in many applications that don't require concurrent write access to the database frequently.
In order to create the sample database run examples/dbi/create_sample.pl
package My::DB;
use strict;
use warnings;
our $VERSION = '0.01';
use base qw/DBIx::Class::Schema/;
__PACKAGE__->load_classes();
1;
Internet Explorer
Win32::IE::Mechanize
- Written by Abe Timmerman
- Currently at version 0.003
- It is like WWW::Mechanize but using IE as the user agent
- Connecting to IE using Win32::OLE automation
- Win32::IE::Link is just like WWW::Mechanize::Link
- Win32::IE::Form is just like HTML::Form
Accessing Google
use strict;
use warnings;
use Win32::IE::Mechanize;
my $w = Win32::IE::Mechanize->new(visible => 1);
$w->get("http://www.google.com/ncr");
$w->follow_link(text_regex => qr/advanced/i);
=pod
foreach my $f ($w->forms()) {
printf("Name: %s\n", $f->name);
foreach my $in ($f->inputs) {
printf(" Input: %s %s %s\n",
$in->name, $in->type, $in->value);
}
}
=cut
$w->form_name("f");
$w->select("num", 20);
$w->submit_form(
# form_name => 'f',
fields => {
as_q => 'perl training',
},
);
# But specifically with Google, you should use their API and not their web site
Examples
We have already seen examples using WWW::Mechanize
They all work here too.
X-Windows
X11
- X11
- xcalc
X11 X-Windows is the mostly used windowing system for unix. There are several window managers running on X11 which makes the capturing part difficult. On unix machines we have a nice graphical calculator called xcalc.
X11 Tools
- X11::GUITest
- X11::Protocol
- LDTP - Linux Desktop Testing Project
X11::GUITest
-
X11::GUITest
-
Available since 2003 by Dennis K. Paulsen.
LDTP - Linux Desktop Testing Project
-
LDTP
-
LDTP Cross Platform GUI Test Automation tool
Manual steps
Launch the application manually and look at its title
xcalc
X11: Launch the application
- StartApp
- WaitWindowViewable
- GetInputFocus
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
perl examples/X/xcalc01.pl
As you can see we only open the xcalc window and do not close it yet. So now if you run this for a second time (without manually closing the first xcalc window) you will see the error message and you'll see that the Main windows id is the same as previously as the first windows was found first but the Focus is on another window, the one we have just opened.
Let's see the coordinates
- GetWindowPos
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
perl examples/X/xcalc02.pl
Nice, but where exactly are these coordinates ? Let's put the mouse to the top left corner.
Where is this top left corner ?
- MoveMouseAbs
MoveMouseAbs($x, $y);
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
MoveMouseAbs($x, $y);
perl examples/X/xcalc03.pl
And let's see the rest of the coordinates
MoveMouseAbs($x, $y);
sleep(2);
MoveMouseAbs($x+$width, $y);
sleep(2);
MoveMouseAbs($x+$width, $y+$height);
sleep(2);
MoveMouseAbs($x, $y+$height);
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
MoveMouseAbs($x, $y);
sleep(2);
MoveMouseAbs($x+$width, $y);
sleep(2);
MoveMouseAbs($x+$width, $y+$height);
sleep(2);
MoveMouseAbs($x, $y+$height);
perl examples/X/xcalc04.pl
The sleep() commands were added only so that we can see the mouse moving around.
Smooth on the edges
MoveMouseAbs($x, $y);
foreach my $w (0..$width) {
MoveMouseAbs($x+$w, $y);
}
foreach my $h (0..$height) {
MoveMouseAbs($x+$width, $y+$h);
}
foreach my $w (0..$width) {
MoveMouseAbs($x+$width-$w, $y+$height);
}
foreach my $h (0..$height) {
MoveMouseAbs($x, $y+$height-$h);
}
perl examples/X/xcalc05.pl
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
MoveMouseAbs($x, $y);
foreach my $w (0..$width) {
MoveMouseAbs($x+$w, $y);
}
foreach my $h (0..$height) {
MoveMouseAbs($x+$width, $y+$h);
}
foreach my $w (0..$width) {
MoveMouseAbs($x+$width-$w, $y+$height);
}
foreach my $h (0..$height) {
MoveMouseAbs($x, $y+$height-$h);
}
Closing the window
- ClickMouseButton
- SendKeys
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
We could close the window by typing Alt-F4 but it does not work on my notebook as I have set Alt-F4 to be something else.
So we can either click on the X in the top-right corner, or click on the X in the top left corner, and then press C. We chose the latter.
The above code will probably work on most of the possible screen resolutions but generally moving a fixed number of pixels might not be the best solution.
perl examples/X/xcalc06.pl
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
MoveMouseAbs($x+2, $y-2);
sleep(2);
ClickMouseButton M_LEFT;
sleep(2);
SendKeys('c');
Placing the cursor on one of the buttons and clicking on it
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
my $button_width = $width/5;
my $button_height = $height*0.8/8;
MoveMouseAbs($x + 2.5 * $button_width, $y+$height * 0.2 + 5.5 * $button_height);
ClickMouseButton M_LEFT;
Observing the xcalc application we see that it has 5 buttons in a row and 8 in a column. At the top of the application there is another window which is about 20% of the total height, full width.
So first we calculate the size of a button and then the center of button "5".
perl examples/X/xcalc07.pl
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
my $button_width = $width/5;
my $button_height = $height*0.8/8;
MoveMouseAbs($x + 2.5 * $button_width, $y+$height * 0.2 + 5.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
So let's do all the calculations
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
my $button_width = $width/5;
my $button_height = $height*0.8/8;
MoveMouseAbs($x + 2.5 * $button_width, $y+$height * 0.2 + 5.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 4.5 * $button_width, $y+$height * 0.2 + 4.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 1.5 * $button_width, $y+$height * 0.2 + 4.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 4.5 * $button_width, $y+$height * 0.2 + 7.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
perl examples/X/xcalc08.pl
That would be 5 * 7 =
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
my $button_width = $width/5;
my $button_height = $height*0.8/8;
MoveMouseAbs($x + 2.5 * $button_width, $y+$height * 0.2 + 5.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 4.5 * $button_width, $y+$height * 0.2 + 4.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 1.5 * $button_width, $y+$height * 0.2 + 4.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x + 4.5 * $button_width, $y+$height * 0.2 + 7.5 * $button_height);
ClickMouseButton M_LEFT;
sleep(2);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
Fetch the list of Child Windows
- GetChildWindows
- GetWindowName
my @children = GetChildWindows $Main;
foreach my $id (@children) {
my $name = GetWindowName($id) || '';
print "Child: '$id' '$name'\n";
}
perl examples/X/xcalc11.pl
Non of them has a title, this seem to be like a dead-end, but wait a second, maybe we can find the location of the child windows.
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
my @children = GetChildWindows $Main;
foreach my $id (@children) {
my $name = GetWindowName($id) || '';
print "Child: '$id' '$name'\n";
}
{
my ($x, $y, $width, $height) = GetWindowPos($Main);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
}
Locate the Child Windows
printf "Main %6s %6s %6s %6s\n", GetWindowPos($Main);
my @children = GetChildWindows $Main;
foreach my $id (@children) {
printf "%6s %6s %6s %6s %6s\n", $id, GetWindowPos($id);
}
perl examples/X/xcalc12.pl
Main: 18874425
Focus: 18874425
Main 6 26 226 304
18874426 6 26 226 304
18874428 186 298 40 26
18874429 142 298 40 26
18874430 98 298 40 26
18874431 54 298 40 26
18874432 10 298 40 26
18874433 186 268 40 26
18874434 142 268 40 26
18874435 98 268 40 26
18874436 54 268 40 26
18874437 10 268 40 26
18874438 186 238 40 26
18874439 142 238 40 26
18874440 98 238 40 26
18874441 54 238 40 26
18874442 10 238 40 26
18874443 186 208 40 26
18874444 142 208 40 26
18874445 98 208 40 26
18874446 54 208 40 26
18874447 10 208 40 26
18874448 186 178 40 26
18874449 142 178 40 26
18874450 98 178 40 26
18874451 54 178 40 26
18874452 10 178 40 26
18874453 186 148 40 26
18874454 142 148 40 26
18874455 98 148 40 26
18874456 54 148 40 26
18874457 10 148 40 26
18874458 186 118 40 26
18874459 142 118 40 26
18874460 98 118 40 26
18874461 54 118 40 26
18874462 10 118 40 26
18874463 186 88 40 26
18874464 142 88 40 26
18874465 98 88 40 26
18874466 54 88 40 26
18874467 10 88 40 26
18874468 10 28 216 46
18874469 17 31 204 38
18874470 145 53 18 15
18874471 109 53 34 15
18874472 79 53 26 15
18874473 49 53 26 15
18874474 22 55 26 15
18874475 36 34 186 17
18874476 22 34 10 15
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
printf "Main %6s %6s %6s %6s\n", GetWindowPos($Main);
my @children = GetChildWindows $Main;
foreach my $id (@children) {
printf "%6s %6s %6s %6s %6s\n", $id, GetWindowPos($id);
}
{
my ($x, $y, $width, $height) = GetWindowPos($Main);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
}
Using the keyboard
SendKeys("7*5=");
sleep(5);
perl examples/X/xcalc21.pl
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
SendKeys("7*5=");
sleep(5);
{
my ($x, $y, $width, $height) = GetWindowPos($Main);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
}
Separate the keystrokes
SendKeys("7");
sleep(2);
SendKeys("*");
sleep(2);
SendKeys("5");
sleep(2);
SendKeys("=");
sleep(2);
perl examples/X/xcalc22.pl
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw(:ALL);
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
if (!$Main) {
die("Couldn't find xcalc window in time!");
}
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
if ($Focus != $Main) {
die "The focus is not on the main window or you have two xcalcs open\n";
}
SendKeys("7");
sleep(2);
SendKeys("*");
sleep(2);
SendKeys("5");
sleep(2);
SendKeys("=");
sleep(2);
{
my ($x, $y, $width, $height) = GetWindowPos($Main);
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
}
Full solution for xcalc
perl examples/X/xcalc.pl keyboard
perl examples/X/xcalc.pl mouse
#!/usr/bin/perl
use strict;
use warnings;
use X11::GUITest qw{:ALL};
if (not @ARGV or
not grep {$ARGV[0] eq $_} qw(keyboard mouse)) {
die "Usage: $0 [keyboard|mouse]\n";
}
StartApp('xcalc');
my ($Main) = WaitWindowViewable('Calculator');
print "Main: $Main\n";
my $Focus = GetInputFocus();
print "Focus: $Focus\n";
die("Couldn't find xcalc window in time!") if not $Main;
my ($x, $y, $width, $height) = GetWindowPos($Main);
print "$x $y $width $height\n";
if ($ARGV[0] eq "keyboard") {
print "Testing keyboard\n";
SendKeys('7');
sleep(1); # just for the show
SendKeys('*');
SendKeys('5');
sleep(1); # just for the show
SendKeys('=');
}
if ($ARGV[0] eq "mouse") {
print "Testing mouse\n";
press(2,5); #7
sleep(1); # just for the show
press(5,5); #*
sleep(1); # just for the show
press(3,6); #5
sleep(1); # just for the show
press(5,8); #=
}
# Here we should be able to read the result and check if it is correct
sleep(2);
# Hopefully a good guess for clicking on the upper left corner of a window
# (the X - sign) and then pressing 'c' in order to close the applicaton
MoveMouseAbs($x+2, $y-2);
ClickMouseButton M_LEFT;
SendKeys('c');
sub press {
my ($w, $h) = @_;
MoveMouseAbs(button_width($w), button_height($h));
ClickMouseButton M_LEFT;
}
# We assume that the window has a display part at the top of about 20% of the
# total height and it has a keypad of 5x8 size.
sub button_height{
my $n = shift;
# 80% of the full window has 8 buttons in it
my $button_height = $height*0.8/8;
return int $y + $height*0.2 + ($n-0.5) * $button_height;
}
sub button_width {
my $n = shift;
my $button_width = $width/5; # total width has 5 buttons in it
return int $x + ($n-0.5) * $button_width;
}