January 21st, 2015 by depesz

Recently I saw this discussion on LinkedIn.

In there a guy asks whether modifying a script while it's executing will change
the way it executes.

This is of course trivial to check, with a simple program:

#!/usr/bin/env perl
 
for ( 1..100 ) {
    printf "test, unmodified\n";
    sleep 1;
}

I run it, and while it's running, change the file to:

#!/usr/bin/env perl
 
for ( 1..100 ) {
    printf "test, modified\n";
    sleep 1;
}

Of course it will print “test, unmodified” 100 times. Perl reads and compiles the whole file before it starts running it, so changing the source code doesn't change the state of a running program.

But – maybe it's possible to do it?

The simple version is to detect whether the file changed, and if it did – just re-run it.

#!/usr/bin/env perl
 
my $original_mtime = ( stat( __FILE__ ) )[9];
 
for ( 1..100 ) {
    print "test, unmodified\n";
 
    my $current_mtime = ( stat( __FILE__ ) )[9];
    if ( $current_mtime != $original_mtime ) {
        print "Script changed, reloading\n";
        exec $0;
    }
    sleep 1;
}

Now, when I ran it, and then modified its content, I got:

=$ ./test.pl
...
test, unmodified
Script changed, reloading
test, modified
...

This looks good. But we've lost state – that is, it will print “test, unmodified” some number of times, and then “test, modified” 100 times – as it doesn't know that it was printing something before. Of course, I could have added storing the state in an external file, and then loading it. This should be trivial (as long as we know what the state should contain), so I'll leave it out.
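For the curious, storing the state in an external file could look roughly like this. It's only a sketch: the state file name, the save_state() and load_state() helpers, and the choice of the core Storable module are assumptions made for illustration, and the loop is cut down to 3 iterations to keep it short.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use Storable qw( store retrieve );

# Assumption: the state lives in a fixed file in the system temp directory.
my $state_file = File::Spec->catfile( File::Spec->tmpdir(), 'reload-demo.state' );

# Write the state (a hash reference) to the given file.
sub save_state {
    my ( $file, $state ) = @_;
    store( $state, $file );
    return;
}

# Read the state back and remove the file; start fresh if there is none.
sub load_state {
    my ( $file ) = @_;
    return { iteration => 1 } unless -e $file;
    my $state = retrieve( $file );
    unlink $file;
    return $state;
}

my $state          = load_state( $state_file );
my $original_mtime = ( stat( __FILE__ ) )[9] // 0;

# 3 iterations instead of 100, to keep the example short.
while ( $state->{iteration} <= 3 ) {
    print "$state->{iteration} test, unmodified\n";
    $state->{iteration}++;

    my $current_mtime = ( stat( __FILE__ ) )[9] // 0;
    if ( $current_mtime != $original_mtime ) {
        print "Script changed, reloading\n";
        # Save the counter first, so the new copy can pick it up.
        save_state( $state_file, $state );
        exec $0, @ARGV;
    }
    sleep 1;
}
```

The new copy of the script starts by calling load_state(), so it continues counting where the old one stopped instead of starting again from 1.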

The thing is – doing the exec() is not really nice. Perhaps there is some simple way to handle reloading, and additionally, handling state without having to deal with temporary files?

Yes – we can use modules to do it. And there is a handy Module::Reload helper module.

So, let's write an external script that will stay constant, but will load a worker module from the current directory:

#!/usr/bin/env perl
 
use Module::Reload;
use lib '.';
use Worker;
 
for (1..100) {
    Worker->do_work();
    Module::Reload->check();
    sleep 1;
}

In the same directory, I also create a Worker.pm file, with this content:

package Worker;
 
sub do_work {
    print "Module, unmodified\n";
}
 
1;

Then I run test.pl, and while it's running, change the do_work function in Worker.pm. Result:

=$ ./test.pl
Module, unmodified
Module, unmodified
Use of uninitialized value $mtime in numeric gt (>) at /usr/local/share/perl/5.18.2/Module/Reload.pm line 25.
Module, unmodified
Module, modified
Module, modified
...

This works, but doesn't look nice. There is some warning. And the module itself – well, it does not really look all that complex. So perhaps we don't need this module, and instead we could just add reloading of the Worker module, and possibly even state storage?

I will also make it so that the external program doesn't need to know anything about the worker – it will just run some function repeatedly until it returns “undef”, meaning it has finished its work.

New test.pl:

#!/usr/bin/env perl
 
use lib '.';
use Worker;
 
my $library_file = $INC{'Worker.pm'};
my $library_mtime = ( stat( $library_file ) )[9];
 
while ( Worker->do_some_work() ) {
    my $new_mtime = ( stat( $library_file ) )[9];
    next if $new_mtime == $library_mtime;
 
    $library_mtime = $new_mtime;
 
    my $state = Worker->save_state();
    delete $INC{'Worker.pm'};
    require 'Worker.pm';
    Worker->load_state( $state );
}

Now, in the while loop, the worker is called to do “some” work – not all. After each bit, test.pl checks if Worker.pm changed, and if yes – reloads it, preserving state, and continues.
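The reload itself depends on how require uses %INC: a file that is already listed there is not loaded again, which is why the entry has to be deleted first. Here is a small self-contained sketch of just that part. DemoWorker is a made-up module name, and the write_module() helper and the temporary directory exist only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw( tempdir );

# Write (or overwrite) the module file with the given source code.
sub write_module {
    my ( $path, $source ) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print $fh $source;
    close $fh or die "Cannot close $path: $!";
    return;
}

# DemoWorker is a made-up module, created in a temporary directory.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'DemoWorker.pm' );
unshift @INC, $dir;

write_module( $path, <<'END_OF_MODULE' );
package DemoWorker;
sub version { return 'original' }
1;
END_OF_MODULE

require DemoWorker;
print DemoWorker->version(), "\n";

# Simulate somebody editing the file while the program runs.
write_module( $path, <<'END_OF_MODULE' );
package DemoWorker;
sub version { return 'modified' }
1;
END_OF_MODULE

# A second require does nothing while the file is still listed in %INC...
require DemoWorker;
print DemoWorker->version(), "\n";

# ...so the entry has to go before require will compile the file again.
delete $INC{'DemoWorker.pm'};
require DemoWorker;
print DemoWorker->version(), "\n";
```

This prints “original” twice and then “modified”: only the require that follows the delete picks up the new code.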

Worker.pm looks like:

package Worker;
 
our $iteration //= 1;
 
sub do_some_work {
    return if $iteration > 100;
    print "$iteration Original\n";
    $iteration++;
    sleep 1;
    return 1;
}
 
sub save_state {
    return $iteration;
}
 
sub load_state {
    my $class = shift;
    $iteration = shift;
}
1;

(In case you're not familiar: “$iteration //= 1” sets $iteration to 1 only if it's undef, so only on the first run.)

The two important bits are:

  • return if $iteration > 100 – this will signal to test.pl that the work has been done, and we can exit.
  • return 1; at the end of do_some_work – it means that there is still work to be done, and that test.pl should rerun do_some_work after checking for a new version of the worker

With this approach we have relative freedom about changing what has to be done – as long as:

  • Worker.pm compiles correctly
  • Worker module contains do_some_work(), save_state() and load_state() functions
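Since test.pl only passes whatever save_state() returns back to load_state(), the state doesn't have to be a single number. Below is a sketch of a worker that keeps a hash reference instead. The total field and the limit of 5 iterations are made up for the example, and the worker and a stand-in for test.pl are put in one file, without the actual reload, so that it runs on its own.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Worker;

# Hypothetical richer state: a counter plus a running total.
our $state = { iteration => 1, total => 0 };

sub do_some_work {
    return if $state->{iteration} > 5;
    $state->{total} += $state->{iteration};
    print "$state->{iteration} Original, total so far: $state->{total}\n";
    $state->{iteration}++;
    return 1;
}

sub save_state {
    return $state;
}

sub load_state {
    my $class = shift;
    $state = shift;
    return;
}

package main;

# Do a part of the work, then go through the same steps test.pl does.
Worker->do_some_work() for 1 .. 3;
my $saved = Worker->save_state();

# This is what re-running Worker.pm would do to the variable.
$Worker::state = { iteration => 1, total => 0 };

# Put the saved state back and finish the job.
Worker->load_state( $saved );
1 while Worker->do_some_work();
```

The work continues from iteration 4 with the total intact, because the hash reference taken before the simulated reload is handed back afterwards.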

If Worker.pm doesn't compile correctly, test.pl will die when trying to load it. Of course this can be trivially “worked around” by changing test.pl to:

#!/usr/bin/env perl
 
use lib '.';
use Worker;
 
my $library_file = $INC{'Worker.pm'};
my $library_mtime = ( stat( $library_file ) )[9];
 
while ( Worker->do_some_work() ) {
    my $new_mtime = ( stat( $library_file ) )[9];
    next if $new_mtime == $library_mtime;
 
    $library_mtime = $new_mtime;
 
    eval {
        my $state = Worker->save_state();
        delete $INC{'Worker.pm'};
        require 'Worker.pm';
        Worker->load_state( $state );
    };
}

That is – adding eval{} around the reloading of the module. If the reload fails, eval will catch the error, and the previous version of the code will still be in effect.

With something like this, we should add some kind of error handling – printing the error message, or logging it – but since it's just an example, I figured I don't need to add it here.
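If you want to see what such error handling might look like, here is a minimal sketch. It is not the only way to do it: reload_worker() and write_file() are hypothetical helpers, DemoWorker is a made-up stand-in for Worker.pm created in a temporary directory, and the error is simply printed with warn.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw( tempdir );

# Write (or overwrite) a file with the given source code.
sub write_file {
    my ( $path, $source ) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print $fh $source;
    close $fh or die "Cannot close $path: $!";
    return;
}

# Reload DemoWorker, keeping its state. Returns 1 on success, 0 on failure.
sub reload_worker {
    my $state = DemoWorker->save_state();
    my $ok    = eval {
        delete $INC{'DemoWorker.pm'};
        require 'DemoWorker.pm';
        1;
    };
    if ( !$ok ) {
        # $@ holds the message from the failed require.
        warn "Reload failed, keeping previous code: $@";
        return 0;
    }
    DemoWorker->load_state( $state );
    return 1;
}

my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'DemoWorker.pm' );
unshift @INC, $dir;

write_file( $path, <<'END_OF_MODULE' );
package DemoWorker;
our $iteration = 1;
sub do_some_work { return "iteration " . $iteration++ }
sub save_state   { return $iteration }
sub load_state   { my $class = shift; $iteration = shift; }
1;
END_OF_MODULE

require 'DemoWorker.pm';
print DemoWorker->do_some_work(), "\n";

# Break the module on purpose (syntax error) and try to reload it.
write_file( $path, "package DemoWorker;\nmy \$x = ;\n1;\n" );
my $reloaded = reload_worker();
print "reloaded: $reloaded\n";

# The old code and its state are still usable.
print DemoWorker->do_some_work(), "\n";
```

The failed reload ends up as a message on STDERR, and the loop can carry on with the code it already had. In real use you would probably log the message somewhere instead.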

Having written it – I don't think it's necessary. What's more – there are definitely Perl experts that will do the same in a nicer, safer way. But as a simple example of a technique – I think it does its job. Have fun 🙂
