Feb 12, 2009
 

Introduction

I recently switched to using Gnome’s Evolution groupware client for email and calendar management.  Although I really like Mozilla’s Thunderbird interface better for email, Evolution offers connectivity to Microsoft’s Exchange server, while Thunderbird does not.  Plus, Evolution provides built-in calendar support, which for Thunderbird is still in the works as a plug-in (Lightning, the add-on sibling of Sunbird), and again, with no Exchange connectivity support.  😥

Well, I am learning to like Evolution.  It has some nice features, like various spam-filter plug-ins (Bogofilter, SpamAssassin).  And, it generally works.  But now the question arises: how do I migrate my Evolution settings and email data from one machine to another?  Unfortunately, it’s not a simple matter of copying a single directory with its contents.  But, the solution is not too bad…

Solution

Evolution’s data and settings live in 4 places:

  • $HOME/.evolution – email data (Inbox, Sent, etc.)
  • $HOME/.gconf/apps/evolution – your account settings
  • $HOME/.gnome2_private/Evolution – your passwords
  • $HOME/.camel_certs – SSL Certificates, if any

However, you should not just close Evolution and copy these directories.  Evolution uses a calendar server and a Gnome settings server, which may keep some of these files open.  Therefore, you must shut down Evolution and the appropriate servers, and then archive the data, like so:

$ gconftool-2 --shutdown
$ evolution --force-shutdown
$ cd
$ tar -zcvf evolution-backup.tar.gz .evolution .gconf/apps/evolution .gnome2_private/Evolution .camel_certs

This creates a compressed archive file, which can be copied to another Linux box and unarchived, like so:

$ tar -zxvf evolution-backup.tar.gz -C ~/

Others have recommended shutting down the Evolution and Gnome settings servers on the new machine before unpacking the archive, and then reloading the Evolution settings from an XML dump, like so:

$ gconftool-2 --shutdown
$ evolution --force-shutdown
$ tar -zxvf evolution-backup.tar.gz -C ~/
$ gconftool-2 --unload evolution_setting.xml
$ gconftool-2 --load evolution_setting.xml

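I did not have to do this, but it may be useful.  Note that evolution_setting.xml is not part of the archive above; it would have to be exported on the old machine first, for example with gconftool-2 --dump /apps/evolution > evolution_setting.xml.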

After doing this, simply fire up Evolution, and you should be good to go!
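Since .camel_certs only exists if you actually have SSL certificates, the tar command above complains and exits with an error status when one of the four paths is missing.  Here is a minimal sketch (not a polished migration tool) that archives only the paths that exist.  The function name backup_evolution and its two arguments are made up for illustration, and it assumes you have already shut down Evolution and the Gnome settings server as shown above.

```shell
#!/bin/sh
# Sketch: archive only those Evolution paths that exist under a home directory.
# backup_evolution is a hypothetical helper, not part of Evolution or Gnome.
# Usage: backup_evolution HOME_DIR ARCHIVE
backup_evolution() {
    home_dir=$1
    archive=$2
    # Reuse the positional parameters as the list of paths to archive.
    set --
    for rel in .evolution .gconf/apps/evolution .gnome2_private/Evolution .camel_certs; do
        if [ -e "$home_dir/$rel" ]; then
            set -- "$@" "$rel"
        fi
    done
    if [ "$#" -eq 0 ]; then
        echo "nothing to back up in $home_dir" >&2
        return 1
    fi
    # -C keeps the archived names relative to the home directory,
    # so the archive unpacks cleanly with "tar -zxvf ... -C ~/".
    tar -zcf "$archive" -C "$home_dir" "$@"
}
```

A call would look something like backup_evolution "$HOME" "$HOME/evolution-backup.tar.gz", and the resulting file can be unpacked on the new machine exactly as described above.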

Feb 12, 2009
 

Problem

Recently, I switched to using Gnome’s Evolution for my email and calendar groupware client on my Gentoo Linux workstation.  I encountered multiple problems that had to be worked through.  One of the more noticeable and annoying problems was the messed-up icons.  All of the button icons were pictures of blank paper with a red X in the middle, similar to a “file not found” icon.

Gnome Evolution with broken icons.

Apparently, this is a fairly common problem on new installs where people primarily use KDE and not Gnome.  Here is a collection of solutions that may help.

Solutions

Do you have any gnome icons installed?  Make sure you have at least gnome-icon-theme and hicolor-icon-theme installed.  For Gentoo, this is performed, like so:

# emerge gnome-icon-theme hicolor-icon-theme

Do you have read access to the installed icons?  For whatever reason, the installed icons occasionally lose their read permission.  As root, or using sudo, you need to ensure the appropriate permissions, like so:

# chmod -Rf a+rX /usr/share/icons

Have you configured all your Gnome-based tools, including Evolution, to use an installed icon theme?  As your standard, desktop user (not root), check for the existence of this file:

$ cat ~/.gtkrc-2.0

If it does not exist, or if it does not specify an icon theme, you need to add one, like so:

$ echo gtk-icon-theme-name=\"gnome\" >> ~/.gtkrc-2.0

Restart your Evolution email client and enjoy your pretty icons!  Red X’s be gone!!!  🙂
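If you run the echo command above more than once, you end up with duplicate lines in ~/.gtkrc-2.0.  A small sketch that only appends the setting when no icon theme is configured yet might look like this.  The helper name ensure_icon_theme is made up for illustration; the setting name and the theme are the same ones used above.

```shell
#!/bin/sh
# Sketch: add an icon theme to a gtkrc file only if none is set yet.
# ensure_icon_theme is a hypothetical helper name.
# Usage: ensure_icon_theme RC_FILE THEME_NAME
ensure_icon_theme() {
    rc_file=$1
    theme=$2
    # Leave the file alone if it already names an icon theme.
    if [ -f "$rc_file" ] && grep -q '^[[:space:]]*gtk-icon-theme-name' "$rc_file"; then
        return 0
    fi
    printf 'gtk-icon-theme-name="%s"\n' "$theme" >> "$rc_file"
}
```

As your normal desktop user, you would call it as ensure_icon_theme ~/.gtkrc-2.0 gnome.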

References

Gnome Evolution with proper icons

Jan 23, 2009
 

Problem Introduction

I use an open-source solution for hosting my family’s photos, called Gallery. Recently, my gallery2 install crashed, and I lost my online database. “No big deal”, I thought. “I have all the pictures saved on my local computer.” WRONG-O!!! I hit two major snags: One, Gallery can stumble very easily while doing a bulk upload from a local server. Two, all of the album dates were stamped with the time of my recent upload, which destroyed my chronological sorting of the albums.

I started manually “editing” each album through Gallery2’s web interface, but that gets old real fast, so I decided to try a little MySQL wizardry.

Solution

Gallery2 stores most of its data, except for the photos and movies, in a MySQL database on the web-server. Of course, you can use other back-ends, but MySQL is very popular. The database can be manipulated using your favorite MySQL monitor, whether that occurs through a command shell or a web-interface, like phpMyAdmin.

My solution was to use MySQL to search through each album, find the oldest picture, and update the album’s origination date to match the oldest picture.

Using my favorite MySQL interface, I crafted the following SQL statement to examine the problem:

SELECT g_id AS albumId, g_title, g_originationTimestamp, (
    SELECT MIN(g_originationTimestamp)
    FROM g2_Item
    INNER JOIN g2_ItemAttributesMap ON g2_Item.g_id = g2_ItemAttributesMap.g_itemId
    WHERE g_parentSequence LIKE CONCAT( '%/', albumId, '/' )
) AS oldestPicture
FROM g2_Item
INNER JOIN g2_ItemAttributesMap ON g2_Item.g_id = g2_ItemAttributesMap.g_itemId
WHERE g_canContainChildren > 0

This produced results like:

albumId  albumTitle                 currentTime  oldestPicture
7        Gallery                    1232042846   NULL
58896    Caleb’s Turn               1232650930   1175871585
59400    Originals                  1232650977   1175871585
59416    Landscapes and Nature      1232651065   1232651128
59417    Blossoms and Sunsets       1232651128   1081207812
59600    icebergs                   1232651167   1232651167
59633    More Sunsets and Bathtime  1232651168   1174399484
59634    Originals                  1232651168   1174399529
59943    Trees and Geese            1232651264   1081295988
60013    Family Photos              1232653712   978487347
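The timestamps in these columns are Unix epoch seconds, which are hard to eyeball.  A quick sketch for turning one into a readable date from the shell, assuming GNU date (the "@" syntax is a GNU extension) and using a made-up helper name, epoch_to_date:

```shell
#!/bin/sh
# Sketch: convert Unix epoch seconds to a readable UTC date.
# Assumes GNU date; epoch_to_date is a hypothetical helper name.
epoch_to_date() {
    # "@N" means N seconds since 1970-01-01 00:00:00 UTC.
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_date 978487347     # oldest picture in "Family Photos"
epoch_to_date 1232653712    # album timestamp after the re-upload
```

For the “Family Photos” row, the oldest picture dates from early January 2001, while the album itself is stamped 22 January 2009, the day of the re-upload.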

As you can see, many of the albums had pictures much older than the timestamp on the folder. The theory appears sound. Let’s experiment!

Attempt #1

Since we are experimenting on your gallery database, obviously you want to back it up first. If I have to tell you that, you are in over your head. 😉

Using a modified version of the above query, I tried to update the origination timestamp to the value of the oldest picture, like so:

UPDATE g2_Item AS albumId SET g_originationTimestamp =
(
SELECT MIN(g_originationTimestamp)
FROM g2_Item INNER JOIN g2_ItemAttributesMap
ON g2_Item.g_id=g2_ItemAttributesMap.g_itemId
WHERE g_parentSequence LIKE CONCAT('%/', albumId.g_id, '/')
) WHERE g_canContainChildren > 0

Unfortunately, this produces the following error:

ERROR #1093 - You can't specify target table 'albumId' for update in FROM clause

Apparently, MySQL will not allow an UPDATE to modify a table that is also read by a subquery in the same statement. That seems reasonable. However, that is exactly what I needed to do. This forced me to create a temporary table to hold the intermediate results, and then use those results to update the desired table.

Attempt #2

Here was my final solution, which required 4 separate statements:

DROP TABLE IF EXISTS newAlbumTimes;

This first statement is only necessary if you iterate and experiment with this approach. It deletes the temporary table if it already exists, which may happen after you tweak something and try again on the same connection.

CREATE TEMPORARY TABLE newAlbumTimes (albumId INT(11), albumTitle VARCHAR(128), albumTimeStamp INT(11), oldestPicture INT(11))
SELECT g_id AS albumId, g_title AS albumTitle, g_originationTimestamp AS albumTimeStamp, (
    SELECT MIN(g_originationTimestamp)
    FROM g2_Item
    INNER JOIN g2_ItemAttributesMap ON g2_Item.g_id = g2_ItemAttributesMap.g_itemId
    WHERE g_parentSequence LIKE CONCAT( '%/', albumId, '/' )
) AS oldestPicture
FROM g2_Item
INNER JOIN g2_ItemAttributesMap ON g2_Item.g_id = g2_ItemAttributesMap.g_itemId
WHERE g_canContainChildren > 0
ORDER BY oldestPicture;

Now that is the “brains” of the operation. First, notice the select statement, very similar to the original query. However, these results are being fed into a “CREATE TEMPORARY TABLE” statement, which catches the results. Notice, we had to tell MySQL the structure of this temporary table, which should generally match the structure of the output columns. This temporary table will be destroyed when the connection closes. However, we will use this table in the meantime, to update the timestamps based on the oldest picture’s timestamp.

UPDATE g2_Item INNER JOIN newAlbumTimes
ON g2_Item.g_id = newAlbumTimes.albumId
SET g_originationTimestamp=oldestPicture
WHERE oldestPicture AND oldestPicture < albumTimeStamp;

Using the temporary table, the origination time is updated for all albums in the item table, but only if the oldest picture column is non-NULL and if the picture is stamped older than the album. Otherwise, we assume the time stamp for the album is better than what we have calculated, so we leave it alone.

UPDATE g2_Entity INNER JOIN newAlbumTimes
ON g2_Entity.g_id = newAlbumTimes.albumId
SET g_creationTimestamp=oldestPicture, g_modificationTimestamp=oldestPicture
WHERE oldestPicture AND oldestPicture < g_creationTimeStamp;

We also have to update the creation time stamps, which is the date actually displayed and permitted to be edited for the album. This is the statement that does what we want!

Conclusion

Running the above 4 statements in order produces the desired effect.  Preferably, these should be executed in a single session (one connection), because the temporary table disappears when the connection closes. However, this procedure only “bubbles up” the oldest time stamp by one level.  Most albums will be updated correctly after this.  However, if you have nested albums (albums inside albums), you will have to run this procedure at least once for each level of album (or folder) hierarchy.  This will cause your top level album to show a creation date equal to or older than the oldest picture anywhere in its hierarchy.
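If you save the 4 statements into a file, the “once per nesting level” repetition can be scripted.  This is only a sketch under a few assumptions: the file name fix_album_times.sql and the helper name bubble_up_timestamps are made up, and the client command is passed in as arguments so that each pass runs all 4 statements over a single connection.

```shell
#!/bin/sh
# Sketch: run a SQL file once per level of album nesting.
# bubble_up_timestamps is a hypothetical helper name.
# Usage: bubble_up_timestamps LEVELS SQL_FILE CLIENT [CLIENT_ARGS...]
bubble_up_timestamps() {
    levels=$1
    sql_file=$2
    shift 2
    pass=1
    while [ "$pass" -le "$levels" ]; do
        echo "pass $pass of $levels" >&2
        # One client invocation per pass: the temporary table lives
        # for exactly the duration of that connection.
        "$@" < "$sql_file" || return 1
        pass=$((pass + 1))
    done
}
```

For a gallery nested three levels deep, the call would be along the lines of bubble_up_timestamps 3 fix_album_times.sql mysql -u my_db_user_id -p my_db_name.  With -p, the client asks for the password on every pass.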

Notes

If you are running an older version (Gallery1) or newer version (Gallery3, currently in early development), you will obviously have to modify the above statements to match your database structure. Also, these statements were executed using MySQL 5.0.70. If you use an older version, like MySQL 4.0, you may find the nested queries do not work. This will require you to create an additional temporary table to cache those intermediate results, which is much more complicated.  It may be better to upgrade, if you can.

You can include additional constraints on the WHERE clause of the final two statements to only modify certain albums.  This can be useful if you only want to fix a certain subset of albums in your gallery.

Dec 05, 2008
 

Introduction

Every computer you buy or assemble comes with some decent integrated sound card.  A while back (~2000), I bought a “prosumer” (entry-level professional) sound card, the Mia by EchoAudio.  It was a pain at first, because the Windows drivers for XP were just not mature.  However, after about 6 months of updating drivers every other week, I finally got a rock-solid sound card that blew the doors off any integrated sound card I have used, even up to this day (12-2008).  All the integrated cards sound so “tinny” and “weak”, compared to the rich, full sound of this card.

For some time, I have avoided moving my main workstation to Linux, in some measure, due to the lack of drivers for this sound card.  However, a few weeks ago, I noticed an entry for the EchoAudio Mia in the kernel config!  Here is how I managed to get it working on Gentoo Linux.

Default ALSA Installation

The most modern sound system on Linux at this time is ALSA.  The Gentoo wiki page for installing it is here:

http://www.gentoo.org/doc/en/alsa-guide.xml

One important note:  ALSA can be compiled into the kernel or compiled separately as loadable modules.  Currently, Gentoo has mostly abandoned the in-kernel approach (which uses the alsa-drivers package), and it now uses the loadable module approach.  There’s no reason to buck the system here, so we are going to use the loadable module approach.

When you activate the appropriate ALSA kernel options, make sure you include the driver for the EchoAudio Mia, or whatever Echo Audio product you may be using.  Otherwise, you can follow the above guide up to the point where you are ready to run alsaconf.

Make sure you:  Recompile the kernel.  Copy it into place.  Update grub.conf.  Reboot.  You know the drill.  🙂

Modified ALSA Installation for Mia

The Mia driver depends on alsa-utils, but it also needs other ALSA packages, which are not necessary for other integrated sound cards, like hda-intel.  Furthermore, the default make.conf flags do not include the Mia components.  To include these, and to use the latest version of ALSA ;), first add a few keywords and compile flags to the Portage configuration, and then install ALSA along with the other necessary packages (as root):

# Use latest version of everything ALSA
$ echo -e "media-sound/alsa-tools ~amd64\nmedia-sound/alsa-utils ~amd64\nmedia-sound/alsa-firmware ~amd64\nmedia-sound/alsa-headers ~amd64\nmedia-libs/alsa-lib ~amd64" >> /etc/portage/package.keywords

# Include Mia during ebuilds
$ echo 'ALSA_CARDS="mia"' >> /etc/make.conf

# Check for weirdness:
$ emerge -pvt alsa-utils alsa-tools alsa-firmware alsa-lib alsa-headers

# Build!
$ emerge alsa-utils alsa-tools alsa-firmware alsa-lib alsa-headers

Now update the ALSA configuration using alsaconf, and you should be good to go!

$ alsaconf
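The long echo -e line used earlier for package.keywords is easy to mistype.  As a sketch, the same lines can be generated from a plain list of packages.  The helper name keyword_lines is made up; the package atoms and the ~amd64 keyword are the ones from the commands above.

```shell
#!/bin/sh
# Sketch: print one "package keyword" line per package atom.
# keyword_lines is a hypothetical helper name.
# Usage: keyword_lines KEYWORD ATOM [ATOM...]
keyword_lines() {
    arch_keyword=$1
    shift
    for atom in "$@"; do
        printf '%s %s\n' "$atom" "$arch_keyword"
    done
}

keyword_lines '~amd64' \
    media-sound/alsa-tools media-sound/alsa-utils \
    media-sound/alsa-firmware media-sound/alsa-headers \
    media-libs/alsa-lib
```

This only prints the lines.  To actually use them, redirect the output (as root) with >> /etc/portage/package.keywords.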

Other Tips

The best mixer to use is the echomixer, which is made for the EchoAudio products, like the Mia.

In the default configuration, most everything is muted, so you will have to slide up the appropriate sliders.  Just be careful not to overdrive the card.  Stop at “0 dB” or less.  Do not slide it up to “+6 dB”; otherwise, you get a fair amount of distortion.

Other pages that mention the “alsa-drivers” package are based upon the “IN-KERNEL” approach.  Those instructions are not compatible with these.  Be careful if you decide to “mix and match”.

References

  1. ALSA’s EchoAudio development status – http://www.webalice.it/g_pochini/ead
Dec 05, 2008
 

Introduction

Regrettably, there are some software applications that just run better on Windows, specifically, Windows XP.  Of course, Windows runs better on Linux, so I guess we can still hold to our axiom that “All things run better on Linux”. 😯 Ok, not really.  😀

In my case, I would like to run the full Office 2003 suite on Linux.  Using Wine is an option, but it can be a bit buggy.  CrossOver is a better option, but it costs money, and I am a cheapskate.  Plus, it’s like “double-taxation”.  I have to pay somebody a tax, so I can pay my Microsoft tax.  That ain’t right!  Well, I am not really a Linux purist – I am just a practical cheapskate.  And, I would like to learn about OS virtualization, and apparently, so would you!  Otherwise, you would not be reading this.  😉

Installing KVM on Gentoo

KVM is just one of many possible virtualization methods, which is a way to run one OS inside of another OS.  (Imagine a “window” that is running Windows XP inside, and it “thinks” it is the entire computer.  It does not realize that it is running inside of another “computer”.)

Note:  These instructions are for:

Host:  Gentoo 2008.0
RAM:  >= 1.5GB
Kernel:  2.6.27
KVM:  v79
Guest:  Windows XP Pro

Of course, these instructions may have to be varied slightly to accommodate your exact setup.

Here are my modified instructions, based on the Gentoo Wiki:

1. Update the kernel with IN-KERNEL KVM (no need for module mayhem):

$ cd /usr/src/linux
$ make menuconfig

[*] Virtualization --->
        --- Virtualization
        <*> Kernel-based Virtual Machine (KVM) support
        <*>   KVM for Intel processors support
        < >   KVM for AMD processors support
        <*>   PCI driver for virtio devices (EXPERIMENTAL)
        <*>   Virtio balloon driver (EXPERIMENTAL)

If you want to be able to do networking, you should also enable VLAN bridging and tapping, while you are here:

Device Drivers --->
    [*] Network device support --->
            <M> Universal TUN/TAP device driver support

Networking --->
    Networking options --->
        <*> 802.1d Ethernet Bridging
        <*> 802.1Q VLAN Support

Copy new kernel into place.  Update grub.conf.  Reboot using new kernel. … You know the drill. 🙂
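Which of the two KVM options to tick (Intel or AMD) depends on your processor.  The CPU flags in /proc/cpuinfo tell you: vmx marks Intel’s virtualization extensions and svm marks AMD’s.  Here is a small sketch of that check; the helper name kvm_cpu_vendor is made up, and it takes an optional file argument only so that it can be tried against a saved copy of cpuinfo.

```shell
#!/bin/sh
# Sketch: report which hardware virtualization flag the CPU advertises.
# kvm_cpu_vendor is a hypothetical helper name.
# Usage: kvm_cpu_vendor [CPUINFO_FILE]
kvm_cpu_vendor() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -qw vmx "$cpuinfo"; then
        echo intel      # enable "KVM for Intel processors support"
    elif grep -qw svm "$cpuinfo"; then
        echo amd        # enable "KVM for AMD processors support"
    else
        echo none       # no flag found; it may be disabled in the BIOS
        return 1
    fi
}
```

Running kvm_cpu_vendor with no arguments checks the machine you are sitting at.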

2. Ensure the latest version of KVM:

$ echo 'app-emulation/kvm ~amd64' >> /etc/portage/package.keywords

3. Activate useful USE flags:

$ echo 'app-emulation/kvm gnutls sdl' >> /etc/portage/package.use

4. Check emerge for weirdness and install:

$ emerge -pvt kvm usbutils bridge-utils usermode-utilities

These are the packages that would be merged, in reverse order:

Calculating dependencies... done!
[ebuild   R   ] sys-apps/usermode-utilities-20040406-r1  0 kB
[ebuild   R   ] net-misc/bridge-utils-1.4  0 kB
[ebuild   R   ] sys-apps/usbutils-0.73  USE="zlib -network-cron" 0 kB
[ebuild   R   ] app-emulation/kvm-79  USE="alsa esd gnutls modules ncurses sdl -havekernel -pulseaudio -test -vde" 0 kB

Looks ok to me.  Does it look ok to you?  Let’s go:

$ emerge kvm usbutils bridge-utils usermode-utilities

5. Setup access for non-root users:

For each non-root user, add them to the KVM group:

gpasswd -a <non-root-userid> kvm

Launching Guest for First Time and Installing Windows:

First, you need to create an “image” file, which will contain the entire Windows XP guest OS (think C:\ drive).  Here’s the default way:

kvm-img create winxp_raw.img 30G

This will create a RAW image format that is 30 GB in size.  This is the simplest and most portable image format.  However, it is not the coolest!

kvm-img create -f qcow2 winxp.img 30G

This does the same thing, but it uses the latest QEMU format, which enables additional features, like image overlays.

Second, if you use ALSA for your host’s sound, then you can enable it like so:

export QEMU_AUDIO_DRV=alsa

Third, install Windows XP into image.  Here’s the simplest method:

kvm -hda winxp.img -cdrom /dev/cdrom1 -boot d

This will do the same thing but it will use a local image of the install ISO (faster?), use 1GB of RAM (default is much less), use host’s local clock (helps Windows see the right time), emulate better VGA card (more colors and resolution), and allow access to 2 processors:

kvm -hda winxp.img -cdrom /winxp/ISO/WINXPSP2.ISO -m 1024 -localtime -vga std -smp 2 -boot d

Using The Virtualized Guest Windows XP

The emulated “box” will reboot once as part of the Windows XP installation process.  After it comes back up, you should be good to go!  You can now download programs, install programs, update the install, etc., just like you would with a regular Windows XP installation.  Of course, there will be some limitations, because the emulated hardware is not exactly feature-rich.

At some point, you will “shut down” the emulated Windows XP machine.  To restart it, use a similar command – with the exception of not booting from the install disk (or ISO):

kvm -hda winxp.img -cdrom /dev/cdrom1 -m 1024 -localtime -vga std -smp 2
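Since the install command and the everyday command differ only in the CD-ROM source and the -boot d flag, it is easy to keep both in one small script.  This is just a sketch: the helper name winxp_cmd is made up, the options are the ones used above, and it only prints the command line instead of running it.

```shell
#!/bin/sh
# Sketch: print the kvm command line for installing or for normal use.
# winxp_cmd is a hypothetical helper name.
# Usage: winxp_cmd IMAGE [INSTALL_ISO]
winxp_cmd() {
    image=$1
    install_iso=${2:-}
    set -- kvm -hda "$image" -m 1024 -localtime -vga std -smp 2
    if [ -n "$install_iso" ]; then
        # Installing: attach the ISO and boot from it.
        set -- "$@" -cdrom "$install_iso" -boot d
    else
        # Normal use: attach the physical drive, boot from the disk image.
        set -- "$@" -cdrom /dev/cdrom1
    fi
    printf '%s\n' "$*"
}

winxp_cmd winxp.img
winxp_cmd winxp.img /winxp/ISO/WINXPSP2.ISO
```

Printing first lets you double-check the flags before committing to a long install.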

Accessing Host Drives

To access a local partition, first ensure that samba is installed – not running – just installed.

Then, simply add the path to the mounted partition, like so:

kvm -hda winxp.img -cdrom /dev/cdrom1 -m 1024 -localtime -vga std -smb /path/to/dir

Otherwise, you can add the share name, if you have samba already running and properly configured, like so:

kvm -hda winxp.img -cdrom /dev/cdrom1 -m 1024 -localtime -vga std -smb <share_name>

Inside the Windows guest OS, the mounted share is available at:

\\10.0.2.4\qemu

Also, from inside the guest OS, you can SSH, SCP, SFTP, FTP, or telnet to the host, depending on running host services, using this IP:

10.0.2.2

Other options are listed on the Arch Linux Wiki.

Using Overlays

I have found that this process is not entirely stable.  Some combinations of host hardware, host OS, emulated hardware, and guest OS, work better than others.  If I tried to emulate too much hardware, the Windows XP installation would crash, so I typically had to install using the most modest, simplest emulation.

Also, I found that this process could be slow during “boot-up” and “installation”.  Maybe disabling ACPI emulation would help?

Anyway, you can quickly make a wrong turn and wreck your “virtual machine”, basically ruining the image that you spent so much time setting up and installing.  Fortunately, there are 2 techniques to help mitigate this annoyance.

One, with the emulator shut down, simply copy the image file to another location or file name to back it up, like so:

cp -fp winxp.img winxp_orig_install.img

Then you can always copy the good install back over a broken install, like so:

cp -fp winxp_orig_install.img winxp.img

Ta-Da!  Of course, the downside of this approach is rampant disk-usage.  You need double the disk space, possibly more, depending on how many backups you make.

Another technique is using “overlays”.  You can create an “overlay” of a good image like so:

kvm-img create -b winxp.img -f qcow2 winxp.ovl

Then you can boot from the overlay, just like you would any other image, like so:

kvm -hda winxp.ovl -cdrom /dev/cdrom1 -m 1024 -localtime -vga std -smp 2

The overlay contains a “diff” of the new state and the original image, so it is much smaller, since it only contains what changed. If the overlay gets corrupted, you can simply delete the overlay, create another, and go again!
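The “delete the overlay, create another” step can be wrapped up as a tiny helper.  This is a sketch only: reset_overlay and the IMG_TOOL variable are names I made up, the kvm-img options are the same ones used above, and it assumes the emulator is shut down when you run it.

```shell
#!/bin/sh
# Sketch: throw away a (possibly corrupted) overlay and start a fresh one.
# reset_overlay and IMG_TOOL are hypothetical names; kvm-img is the default tool.
IMG_TOOL=${IMG_TOOL:-kvm-img}

# Usage: reset_overlay BASE_IMAGE OVERLAY
reset_overlay() {
    base=$1
    overlay=$2
    # Everything changed since the base image was made is discarded here.
    rm -f "$overlay"
    "$IMG_TOOL" create -b "$base" -f qcow2 "$overlay"
}
```

For example, reset_overlay winxp.img winxp.ovl gets you back to the state of the original install.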

You can also stack overlays, but I think this can waste diskspace too, and it requires that you keep the whole “stack” in place.  Pull out one overlay in the stack, or just move it, and the whole thing tumbles down!  🙁

Based on a tip from Bryan Murdock’s blog for resizing image files, you can combine an overlay stack into a new, single, independent image file, like so:

# create a new image file, which will be the consolidated image
kvm-img create -f qcow2 winxp_new.img 30G

Download the latest Clonezilla Live CD (or DVD) ISO.

http://www.clonezilla.org

In a KVM session, boot from the downloaded ISO, and include your original overlay as HDA and your new image as HDB, like so:

kvm -cdrom clonezilla-live-1.2.1-17.iso -hda winxp.ovl -hdb winxp_new.img -m 1024 -vga std -boot d

Generally, you should accept the defaults, unless you know what you are doing, and of course, you do. 😉  The key is to choose the option for a “device-device disk/partition to disk/partition” clone, or something to that effect.  (I don’t remember the exact wording.)  Make sure you copy the complete contents, including the MBR.  Your source is HDA, and your target is HDB.  … The cloning takes a while.  After it finishes, be sure to halt, and then start up a new KVM session, using the new image file:

kvm -hda winxp_new.img -cdrom /dev/cdrom1 -m 1024 -localtime -vga std

Try hiding the original image and overlay files to see if it works.  It should!

Other Things

The default network setup is good for surfing the web, downloading stuff, and checking email.  However, if you want other devices on your LAN to “see” the guest OS as another machine, you will have to create a bridge and tap.  This gets a little more complicated.  See the references below for more details.

If the installation or something crashes, try restarting the machine – but, don’t boot from the installation disk.  Many times the install process completed “good enough” before crashing.  😮  Yeah, I know.  It smells funny to me too, but it works.  🙄  Just be sure to keep lots of backup copies of your images or overlays.

References

  1. http://en.gentoo-wiki.com/wiki/KVM
  2. http://kvm.qumranet.com/kvmwiki/HOWTO1
  3. https://help.ubuntu.com/community/WindowsXPUnderQemuHowTo
  4. https://help.ubuntu.com/community/KVM
  5. http://bryan-murdock.blogspot.com/2007/12/resize-qemukvm-windows-disk-image.html
  6. http://www.linuxjournal.com/video/run-your-windows-partition-without-rebooting
Nov 20, 2008
 

Introduction

In TikiWiki’s setup, an admin account is created.  The default password is “admin”.  And, you are forced to change that password as soon as you login.  Unfortunately, you are not required to provide an email address for the “admin” account during setup.

This sets the stage for two moments of sheer panic:

  1. You log out of admin, forget the password, and try to reset the password:  Resetting a password typically requires an email address; therefore, you cannot reset the password.  Yikes!  8-|
  2. While in admin you enable, “Challenge-Response Authorization”.  Later, you log out of admin and try to log back in.  Unfortunately, “Challenge-Response Authorization”, although more secure and therefore desirable, depends on the user additionally entering his email address.  But, the admin account has no email address, and so you cannot log back in as admin, even if you know the password.  Double Yikes!!! 8-|

If you get bitten, here are a couple of anti-venom therapies.

Solution

If you have database access (MySQL, in this case), either via a shell, MySQL client, or phpMyAdmin, you can directly update the database.  Instructions here are for shell access:

To simply reset the admin password to “admin”:

$ mysql -u my_db_user_id -p my_db_name
Enter password:

mysql> UPDATE `users_users` SET `password`='admin', `hash`= md5('adminadmin') WHERE
    -> `login`='admin';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> quit;
Bye

Once you log in, you should obviously change the password to a strong, non-default password.
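The md5('adminadmin') in the statement above looks like the login name followed by the password ('admin' + 'admin').  Assuming that reading holds for your TikiWiki version (I have only the statement above to go on), here is a sketch for computing the hash value for a different login or password.  The helper name tiki_hash is made up, and md5sum is the GNU coreutils tool.

```shell
#!/bin/sh
# Sketch: MD5 of login and password concatenated, mirroring md5('adminadmin').
# tiki_hash is a hypothetical helper name; the login+password layout is an
# assumption inferred from the UPDATE statement, not checked against TikiWiki.
# Usage: tiki_hash LOGIN PASSWORD
tiki_hash() {
    # printf avoids the trailing newline that echo would add to the input.
    printf '%s%s' "$1" "$2" | md5sum | cut -d' ' -f1
}

tiki_hash admin admin
```

Keep in mind that anything typed on the command line may end up in your shell history, so this is best kept for temporary passwords.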

If you are unable to login, and you cannot reset the password, because you forgot to assign an email address to the “admin” account, do the following:

$ mysql -u my_db_user_id -p my_db_name
Enter password:

mysql> UPDATE `users_users` SET `email`='myemail@server.com' WHERE `login`='admin';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> quit;
Bye

By setting the email address for the admin account, you can now reset the password, assuming you did not disable that feature too, before you logged out. >8-|

If you are still stuck, be sure to check out phpMyAdmin, which provides you a graphical tool to explore and edit the underlying MySQL database without knowing the command-line syntax.

References

http://doc.tikiwiki.org/tiki-index.php?page=lost+admin+password&bl=y

http://doc.tikiwiki.org/tiki-view_faq.php?faqId=7

Nov 12, 2008
 

Problem Introduction

I frequently use a workstation that sits behind an Adtran NetVanta 3120.  The NV3120 is a powerful little box.  It provides secure VPN access back to corporate headquarters, but it also provides a 4-port switch, a highly configurable firewall, and generally more bells and whistles than you could ever want.

Recently, I added a Hewlett-Packard Photosmart C7280 to the network.  However, it sits beyond the NV3120’s LAN, so other workstations on the greater LAN can use it, like my Gentoo laptop.

The default printer configuration went great!  I was printing in no time from my workstation behind the NV3120.  However, scanning was another issue.

Apparently, when used in scan mode, the HP C7280 originates traffic on a non-established port, so it becomes blocked or is otherwise lost.  I knew everything else was working fine, because I could bypass the NV3120 and scanning would work great!  But, that was not going to be acceptable for frequent use.

Network Topology

Here is an ASCII representation of the relevant network subsection:

                                            Incoming Line
                                                  |
                                         [ Wireless Router ]
                                            192.168.1.1
                      /                           |                        \
               192.168.1.100                192.168.1.101              192.168.1.102
            [ NetVanta 3120 ]    [ HP C7280 Printer-Scanner-Fax ]    [ Workstation #3 ]
                10.10.0.110                                               Laptop
             /             \
    10.10.0.99           10.10.0.100
[ Workstation #1 ]   [ Workstation #2 ]
  Windows XP Pro           Linux

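The critical path runs from the workstations behind the NetVanta 3120 (10.10.0.X), up through the NetVanta and the wireless router, and back down to the HP C7280 (192.168.1.101).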

Solution

Eventually, I called the Adtran tech support.  I was pleasantly surprised to receive a call back from a support engineer in short order.  He understood my problem very quickly, and he knew immediately what to do!  What follows are my scribbled notes for the steps he prescribed:  (Of course, your policy names and IP numbers may vary.)

  1. Backup NV3120 configuration, in case something goes wrong.  😉
  2. Configure NV3120 to grab static IP, not DHCP-based IP from wireless router:

    Click on:  System -> Public Interface -> IP Settings
    Complete as follows:
    IP:  192.168.1.100
    MASK:  255.255.255.0
    DEF GW:  192.168.1.1

  3. Add UDP relay for NetBios broadcast by HP C7280 printer (192.168.1.101) to be encapsulated and relayed through NV3120 (192.168.1.100) to its LAN (10.10.0.X) and vice-versa:

    Click on:  Data -> UDP Relay -> IP Helper Address
    Add following addresses:

    10.10.0.99 – Public (eth0)
    10.10.0.100 – Public (eth0)
    192.168.1.101 – vlan1
    UDP Forward Protocol:  netbios (port 137)  [Press “Add”]

  4. Allow traffic between 10.10.0.X subdomain and 192.168.1.X subdomain:

    Click on:  Data -> Firewall -> Security Zones -> Edit Security Zones -> Public

    Add Policy to Zone “Public”
    Type:  Allow
    Description:  Allow 192.168.1.X to 10.10.0.X
    Stateless Processing:  OFF
    Destination Security Zone:  <Any Security Zone>
    Source – Specified:  192.168.1.0 / 255.255.255.0
    Destination – Specified:  10.10.0.96 / 255.255.255.240
    Protocol:  any

    Use “arrows” to move new policy right below “VPN Selector” and before everything else.

  5. Allow traffic between 192.168.1.X subdomain and 10.10.0.X subdomain:

    Click on:  Data -> Firewall -> Security Zones -> Edit Security Zones -> Private

    Add Policy to Zone “Private”
    Type:  Allow
    Description:  Allow 10.10.0.X to 192.168.1.X
    Stateless Processing:  OFF
    Destination Security Zone:  <Any Security Zone>
    Source – Specified:  10.10.0.96 / 255.255.255.240
    Destination – Specified:  192.168.1.0 / 255.255.255.0
    Protocol:  any

    Use “arrows” to move new policy right above “NAT list wizard-ics” and below everything else.

  6. Create policy for UDP Relay:

    Click on:  Data -> Firewall -> Security Zones -> Edit Security Zones -> Public

    Add Policy to Zone “Public”
    Type:  Advanced
    Description:  Relay netbios
    Policy Action:  Allow
    Destination Security Zone:  <Self Bound>
    Stateless Processing:  OFF

    – Add New Traffic Selector –
    Type:  Permit
    Protocol:  UDP
    Source:  Any, Any
    Destination:  Any host, Port:  “Well Known” : 137 – netbios-ns

    Use “arrows” to move second from top, below “VPN selector”, but above recent “Allow 192.168.1.X to 10.10.0.X” policy.

  7. Reassign VPN Crypto Map – It occasionally gets lost during the above changes:

    Click on:  Data -> VPN -> VPN Peers -> Advanced VPN Policies -> Assign Crypto Maps to Interfaces:

    Public    VPN
    vlan1     none

  8. Save configuration changes and reboot NV3120 unit.  Backup configuration again, in case something goes wrong in the future.  😉
  9. On wireless router, add a “static route”, so traffic intended for the VPN subdomain (10.10.0.X) that leaves the printer (192.168.1.X) can find its way back to VPN subdomain and not onto global internet:

    On wireless router’s configuration page (not NV3120), click on:  Advanced -> Static Routes -> Add (Or, similar depending on brand and model):

    Name:  NV3120-VPN
    Private:  Off
    Active:  On
    Destination IP:  10.10.0.96
    Gateway IP:  192.168.1.100
    Metric:  2

    Beyond the destination and gateway IP’s, the exact settings and menu navigation path will vary depending on router’s brand and model.
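A note on the 10.10.0.96 / 255.255.255.240 entries used in steps 4 and 5:  a mask ending in 240 carves out blocks of 16 addresses, so this one covers 10.10.0.96 through 10.10.0.111, which includes both workstations and the NV3120’s own LAN address.  If you need to adapt the numbers to your network, the arithmetic can be checked with a sketch like this.  The helper names ip_to_int and in_subnet are made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether an IPv4 address falls inside a network/mask pair.
# ip_to_int and in_subnet are hypothetical helper names.

# Convert a dotted quad to one integer. The body runs in a subshell
# (note the parentheses) so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Usage: in_subnet IP NETWORK MASK   (exit status 0 means "inside")
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    # An address is inside when its masked bits equal the network's masked bits.
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

in_subnet 10.10.0.99 10.10.0.96 255.255.255.240 && echo "10.10.0.99 is covered"
in_subnet 10.10.0.112 10.10.0.96 255.255.255.240 || echo "10.10.0.112 is not covered"
```

Both messages are printed, since .99 is inside the block and .112 is the first address past its end.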

Explanation

Admittedly, the solution is a bit complex, but the problem is a bit complex too.  Part of the complication comes from the fact that the printer broadcasts various netbios-ns UDP packets to find computers on its domain.  However, the computer used in this case does not exist on that domain.  It exists on a private, VPN domain.  So, we have to not only configure the firewall to allow traffic, but we must also relay UDP broadcasts between the two domains.

Many thanks to the Adtran support engineer, who guided me through the above steps, including configuring the 3rd party router!

Nov 10, 2008
 

The Problem

Occasionally, I find a text file that was written on a Windows box that contains additional garbage text.  Most often the text displayed, looks like this:

/*^M
 * @(#)MyApplication.java  2.0  01 April 2005^M
 *^M
 * Copyright (c) 2003-2005 Werner Randelshofer^M
 * Staldenmattweg 2, Immensee, CH-6405, Switzerland.^M
 * This software is in the public domain.^M
 */^M^M

Or, even worse, as a single line, like this:

/*^M * @(#)MyApplication.java  2.0  01 April 2005^M *^M * Copyright (c) 2003-2005 Werner Randelshofer^M * Staldenmattweg 2, Immensee, CH-6405, Switzerland.^M * This software is in the public domain.^M */^M^M

Either way, this is annoying, if not unusable.

Brief Explanation

The primary cause of the problem is a difference in how ‘newline’ is encoded under the Unix and DOS (Windows) conventions.  The difference is long-standing, dating back to the days when printers were the primary ‘display’.

The Windows convention uses two ASCII characters: ‘carriage-return’ (which meant to send the printer head back to the beginning of the line) followed by ‘line-feed’ (which meant to roll the printer paper up one line).  Unix selected one of those characters (‘line-feed’) to do the same thing.

These symbols usually appear as:

^M^J

Or,

^M

Depending on the encoding, platform, and application.
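
If you want to see the actual bytes, here is a small shell sketch (the file names and the temporary directory are just for illustration) that writes the same two lines both ways and dumps them with od:

```shell
#!/bin/sh
# Write the same two lines with DOS (CR LF) and Unix (LF) line endings.
tmpdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$tmpdir/dos.txt"
printf 'one\ntwo\n'     > "$tmpdir/unix.txt"

# od -c prints every byte: \r is carriage-return (^M), \n is line-feed (^J).
od -c "$tmpdir/dos.txt"
od -c "$tmpdir/unix.txt"

# The DOS file is larger by one \r per line.
wc -c "$tmpdir/dos.txt" "$tmpdir/unix.txt"
```

The \r bytes that od shows in the first file are what a Unix editor displays as ^M.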

The Solution Using Emacs

On most Unix platforms, commands such as unix2dos and dos2unix can be used to convert a text file from Windows to UNIX format or vice-versa.  However, sometimes a file can get so garbled that even these tools do not work.  Regardless, it is nice to know how to fix this in Emacs.

The easiest way to fix the second case in Emacs is:

  1. Place the cursor on the first part of the strange character, the caret (^).
  2. Press C-SPC (Control + Space) to begin marking.
  3. Move to the right one character.  (You’ll notice that it jumps an extra character.  That is because ^M is really one ASCII character.)
  4. Press C-w to remove the text.
  5. Immediately, press C-y to yank the text back.
  6. Jump to the top of the document (Esc-< or M-<).
  7. Replace all occurrences:
    1. M-x replace-string
    2. Press C-y to paste in the text to be replaced, then press ‘Enter’.
    3. Press C-q C-j to replace with a ‘quoted’ ^J, which is the Unix newline (or, C-q C-m C-q C-j for Windows).
    4. Press ‘Enter’ to replace all occurrences.

A little experimentation will be necessary to adapt to other cases.  You can read more here:

http://lists.freebsd.org/pipermail/freebsd-questions/2006-October/134422.html
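
If Emacs is not handy, the same two repairs can be sketched with tr in the shell.  This is only a minimal sketch: the function names are mine, and tr works on raw bytes, so it assumes a plain single-byte or UTF-8 text file.

```shell
#!/bin/sh
# First case: delete every carriage-return, turning CR LF into LF.
dos_to_unix() { tr -d '\r'; }

# Second case (everything on one line): the file uses a bare
# carriage-return as its line ending, so turn each one into a line-feed.
cr_to_lf() { tr '\r' '\n'; }

printf 'one\r\ntwo\r\n' | dos_to_unix | od -c
printf 'one\rtwo\r'     | cr_to_lf    | od -c
```

Both functions read standard input and write standard output, so a file would be converted with something like dos_to_unix < broken.txt > fixed.txt (never redirect onto the same file you are reading).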

Oct 30, 2008
 

HOWTO Connect a Linux computer to an HP PhotoSmart C7280 Printer

The HP PhotoSmart C7280 All-In-One printer contains a photo printer, scanner, and fax machine.  It can be set up as a wired Ethernet print server, wireless 802.11g print server, or a local USB printer.  It is very nice, and if you watch the NewEgg specials, you can often find one for a very good price.  I have enjoyed using it from my Windows workstation; however, since I have the C7280 connected to my network through its Ethernet port (a wired print server), I would also like to be able to print to it from my Linux laptop.

Fortunately, most HP printers are well supported in Linux.  So, I had high hopes!

As mentioned in other posts, my current favorite distribution of Linux is Gentoo, so my directions will be for Gentoo; however, you can probably adapt them to your favorite distro.

CUPS

CUPS is the modern Unix/Linux printing interface.  It provides both a server and client for the common printing tasks (lpr, lpq, lpstat, etc.).  Therefore, CUPS must be installed before you can do anything else.

I added a few extra USE flags to my CUPS install, although I don’t think these are necessary in general:

$ echo 'net-print/cups dbus ppds' >> /etc/portage/package.use

Beyond that, installation is simple:

$ emerge cups

Since we are connecting to the C7280 via the network, no configuration changes are required for CUPS.  However, you will have to fire up the CUPS daemon and add it to your start-up services:

$ /etc/init.d/cupsd start
$ rc-update add cupsd default

You can find more info on configuring CUPS to work on Gentoo with other setups here:

http://www.gentoo.org/doc/en/printing-howto.xml

HPLIP

The HP printer drivers are based on a standard HPLIP package, which is used with all modern HP printers, and a PPD file, which is specific to your printer model.  The latest HPLIP package can be installed in Gentoo, like so:

# For AMD64, Intel Core2, and newer x86 64-bit archs
$ echo 'net-print/hplip ~amd64' >>/etc/portage/package.keywords
# Install HPLIP
$ emerge hplip

The latest PPD file for the C7280 should be downloaded from the Linux Printing repository.  Currently, the C7200 model covers the C7280, and its PPD can be downloaded from here:

http://www.openprinting.org/show_printer.cgi?recnum=HP-PhotoSmart_C7200

On a Gentoo box, the PPD file should be saved in a certain location, and only root should have access to it:

$ mv <path_to_download>/HP-PhotoSmart_C7200-hpijs.ppd /usr/share/ppd/HP/
$ chown root:root /usr/share/ppd/HP/HP-PhotoSmart_C7200-hpijs.ppd

With that put in place, you are now ready to configure the HPLIP program, like so:

$ hp-setup

The wizard should make everything self-explanatory, except you may have to manually search for the PPD file, if the wizard cannot find it for you.  When I used the wizard, it was able to find the printer automatically and very quickly.  However, I had to locate the PPD file for it.

If everything goes smoothly, you will be done.  All that remains is to restart cups, like so:

$ /etc/init.d/cupsd restart

If things don’t go smoothly, you may have to add the printer manually through the CUPS interface or to the printers.conf file, as I had to do.

Manually Adding the C7280 to CUPS

Unfortunately, the HPLIP setup wizard was not working correctly, and I had to manually add the printer to CUPS.  I used the web interface to CUPS, which can be accessed using a web-browser on the Linux box at:

http://localhost:631

From here, I clicked on “Add Printer”, and manually entered the necessary information.  (You should know the IP address of the C7280 printer on your network.)  Most of it was obvious, except these two bits:  The device connection type was:

AppSocket/HP JetDirect

And, the “Device URI” was:

socket://192.168.0.11:9100

Of course, you will have to change the above IP address to match your needs.  … If you have already configured a Windows box to use the same printer, you can get some clues for the above info in the Windows’ printer’s properties.

The CUPS wizard may request a user id and password.  Any requested userid is referring to root and root’s login password.  These are needed near the end of the CUPS wizard, so it can edit the CUPS configuration files for you.

After entering the necessary info, pointing to the downloaded PPD file, and completing the web install, I was printing my first test page in no time!

If you prefer to work on the command line, and you are comfortable with CUPS, here are the modifications to my CUPS’ files:

/etc/cups/printers.conf

# Printer configuration file for CUPS v1.3.8
# Written by cupsd on 2008-10-30 17:37
<DefaultPrinter HP-PhotoSmart-C7280>
Info HP PhotoSmart C7280
Location My Office
DeviceURI socket://192.168.0.11:9100
State Idle
StateTime 1225405864
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy stop-printer
</Printer>

That’s it!

And, for good measure, you should always restart CUPS after monkeying around with its files:

$ /etc/init.d/cupsd restart
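
For what it is worth, CUPS also ships an lpadmin command that can probably do the same job without the web wizard or hand-editing printers.conf.  I did not use it for this setup, so treat the following as an untested sketch: the helper name is made up, and it only prints the command so you can review it before running it as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an lpadmin command for a
# network printer reached over AppSocket/JetDirect on port 9100.
# Arguments: queue name, printer IP address, path to the PPD file.
print_lpadmin_cmd() {
  printf 'lpadmin -p %s -E -v socket://%s:9100 -P %s\n' "$1" "$2" "$3"
}

print_lpadmin_cmd HP-PhotoSmart-C7280 192.168.0.11 \
  /usr/share/ppd/HP/HP-PhotoSmart_C7200-hpijs.ppd
```

Here -p names the queue, -E enables it and accepts jobs, -v gives the device URI, and -P points at the PPD file.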

Conclusion

Well, it took a little longer than I first hoped, but it was not so bad.  Now, I can print, scan, and fax from HP PhotoSmart C7280 using my Gentoo Linux laptop. 🙂

Oct 21, 2008
 

The Spec

The Perl system() function is very powerful and elegant. It forks a child process, suspends the parent process, executes the provided command (potentially calling your shell to help parse the arguments), while seamlessly redirecting all IO between the parent and child process! The usage is simple, but what happens behind the scenes is amazing!

Unfortunately, there is no way to interrupt the system() function in Perl. Sure, you can kill the main Perl program, but that’s not what I want. I want to call system() with 2 additional arguments: timeout and maxattempts. This subroutine would operate just like the traditional system() function, unless operation time exceeded the timeout value, in which case, the command would be killed and restarted, until the maximum number of attempts was exceeded.

A Dead End

You can find many resources that detail how to timeout a long Perl operation, like so:

eval {
    local $SIG{ALRM} = sub { die "alarm clock restart" };
    alarm 10;
    flock(FH, 2);   # blocking write lock
    alarm 0;
};
if ($@ and $@ !~ /alarm clock restart/) { die }

Unfortunately, there is a little footnote that says you should not try this with system calls; otherwise, you get zombies. Sure enough, if you substitute a system() function for the above flock, the parent Perl script is alarmed by the timeout and exits the eval. Normally, this would kill the flock or any other function. But the system function persists. The parent may even complete the remainder of its program and exit, but the child will keep on ticking – not what I wanted. The second problem is that there is no way to access the process id of the command executed by the system() function; therefore, there is no way for the parent Perl process to kill a system function call – at least, no way that I have found.

The above link suggests using fork and exec to create your own function, which is ultimately what I did. So, let’s jump straight to the chase scene, shall we? Here’s my final solution.

Preferred Solution

#!/usr/bin/perl -w
use strict 'refs';
use POSIX ":sys_wait_h";

sub timedSystemCall {

  my ($cmd, $timeout, $maxattempts, $attempt, $origmax) = @_;

  # degenerate into system() call - infinite timeout, if timeout is undefined or negative
  $timeout = 0 unless defined($timeout) && ($timeout > 0);
  # degenerate into system() call - 1 attempt, if max attempts is undefined or negative
  $maxattempts = 1 unless defined($maxattempts) && ($maxattempts > 0);
  $attempt = 1 unless defined($attempt) && ($attempt > 0);
  $origmax = $maxattempts unless defined $origmax;

  my ($rc, $pid, $ret);

  eval {
    local $SIG{ALRM} = sub { die "TIMEOUT" };

    # Fork child, system process
  FORK: {
      if ($pid = fork) {
        # parent picks up here, with $pid = process id of child; however...
        # NO-OP - Parent does nothing in this case, except avoid branches below...
      } elsif (defined $pid) {  # $pid is zero if here defined
        # child process picks up here - parent process available with getppid()
        # execute provided command, or die if fails
        exec($cmd) || die("(E) timedSystemCall: Couldn't run $cmd: $!\n");
        # child never progresses past here, because of (exec-or-die) combo
      } elsif ($! =~ /No more processes/) {
        # Still in parent:  EAGAIN, supposedly recoverable fork error
        print STDERR "(W) timedSystemCall: fork failed.  Retrying in 5-seconds. ($!)\n";
        sleep 5;
        redo FORK;
      } else {
        # unknown fork error
        die "(E) timedSystemCall:  Cannot fork: $!\n";
      }
    }

    # set alarm to go off in "timeout" seconds, and call $SIG{ALRM} at that time
    alarm($timeout);
    # hang (block) until program is finished
    waitpid($pid, 0);

    # program is finished - disable alarm
    alarm(0);
    # grab output of waitpid
    $rc = $?;
  };                            # end of eval

  # Did eval exit from an alarm timeout?
  if (($@ =~ "^TIMEOUT") || !defined($rc)) {
    # Yes - kill process
    kill(KILL => $pid) || die "Unable to kill $pid - $!";
    # Collect child's remains
    ($ret = waitpid($pid,0)) || die "Unable to reap child $pid (ret=$ret) - $!";
    # grab exit output of child process
    if ($rc = $?) {
      # exit code is the upper byte of the wait status: shift it down
      my $exit_value = $rc >> 8;
      # killing signal is the lower 7 bits of the wait status
      my $signal_num = $rc & 127;
      # core-dump flag is the top bit (128) of the lower byte
      my $dumped_core = $rc & 128;
      # Notify grandparent of obituary
      print STDERR "(I) timedSystemCall:  Child $pid obituary: exit=$exit_value, kill_signal=$signal_num, dumped_core=$dumped_core\n";
    }
    # Can we try again?
    if ($maxattempts > 1) {
      # Yes! Increment counter, for print messages
      $attempt++;
      print STDERR "(W) timedSystemCall:  Command timed-out after $timeout seconds.  Restarting ($attempt of $origmax)...\n";
      # Recurse into self, while decrementing number of attempts. Return value from deepest recursion
      return timedSystemCall($cmd, $timeout, $maxattempts-1, $attempt, $origmax);
    } else {
      # No!  Out of attempts...
      print STDERR "(E) timedSystemCall:  Exhausted maximum attempts ($origmax) for command: $cmd\nExiting!\n";
      # Return error code of killed process - will require interpretation by parent
      return $rc;
    }
  } else {
    # No - process completed successfully!  Hooray!!!  Return success code (should be zero).
    return $rc;
  }
}

exit timedSystemCall("inf.pl", 5, 3);

This solution is preferred because it does not consume CPU while waiting for the child to complete or timeout. Furthermore, it’s the simplest and most elegant solution I have found.

This solution works because the child inherits the exact same environment as the parent, including its standard IO handles (STDOUT, STDIN, STDERR), just as does the command issued by the system() function. Therefore, when the child prints to its STDOUT, it is printing directly to the parent’s STDOUT. And, when the child requests input from its STDIN, it is querying its parent’s STDIN. Therefore, we are not required to perform any fancy polling to copy the child’s output to the parent’s output, or otherwise shuttle communication between the child and the parent’s environment. Moreover, if the parent is interrupted from the terminal (with Ctrl-C, for example), the signal is delivered to the child as well, because both run in the same foreground process group, so we don’t have to worry as much about leftover child processes.
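
For comparison, the same fork, wait, kill, and retry pattern can be sketched in shell.  This is a rough analogue under my own assumptions, not a translation of the Perl above: the function names are made up, a timeout of zero is not treated as “infinite”, and a background job in a script does not get the terminal’s STDIN the way the Perl child does.

```shell
#!/bin/bash
# timed_run TIMEOUT CMD [ARGS...]
# Start CMD in the background, then wait for it. A watchdog subshell sends
# SIGKILL if CMD is still running after TIMEOUT seconds.
timed_run() {
  local timeout_s=$1 cmd_pid watchdog_pid rc=0
  shift
  "$@" &
  cmd_pid=$!
  # The watchdog's output is discarded so its sleep never holds our stdout open.
  ( sleep "$timeout_s"; kill -KILL "$cmd_pid" ) >/dev/null 2>&1 &
  watchdog_pid=$!
  # Block (no polling) until the command exits or is killed.
  wait "$cmd_pid" || rc=$?
  # The command is gone either way: cancel and reap the watchdog.
  kill "$watchdog_pid" 2>/dev/null || true
  wait "$watchdog_pid" 2>/dev/null || true
  return "$rc"
}

# timed_retry TIMEOUT MAXATTEMPTS CMD [ARGS...]
# Re-run CMD after each timeout, up to MAXATTEMPTS times in total.
timed_retry() {
  local timeout_s=$1 max=$2 attempt=1 rc
  shift 2
  while :; do
    rc=0
    timed_run "$timeout_s" "$@" || rc=$?
    # 137 = 128 + 9 is how the shell reports death by SIGKILL.
    [ "$rc" -eq 137 ] || return "$rc"
    [ "$attempt" -lt "$max" ] || break
    attempt=$((attempt + 1))
    echo "(W) timed out after ${timeout_s}s, restarting ($attempt of $max)" >&2
  done
  echo "(E) exhausted $max attempts: $*" >&2
  return "$rc"
}

# Example: give "sleep 3" one second, and try twice.
timed_retry 1 2 sleep 3 || echo "gave up, status $?"
```

As in the Perl version, the waiting is done by a blocking wait, so nothing spins while the command runs.  One weakness of this sketch is that a command which itself exits with status 137 would be mistaken for a timeout.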

The hints for this solution came from an example on pg. 167 of O’Reilly’s Programming Perl, under the fork function description, and from pg. 554-555 of O’Reilly’s Perl Cookbook, under the discussion, “16.1. Gathering Output from a Program”.

Unfortunately, this was not the first solution I created. If you are interested, a few other solutions I found are provided following a few usage examples. Both of these other solutions mostly work; however, they have drawbacks, when compared to the above, preferred solution.

Usage Examples

If the above script is called using the infinite output script as a child, you get output like so:

perl: timedSystemCall("inf.pl", 5, 3);
1
2
3
4
5
(I) timedSystemCall:  Child 14672 obituary: exit=0, kill_signal=9, dumped_core=0
(W) timedSystemCall:  Command timed-out after 5 seconds.  Restarting (2 of 3)...
1
2
3
4
5
(I) timedSystemCall:  Child 14683 obituary: exit=0, kill_signal=9, dumped_core=0
(W) timedSystemCall:  Command timed-out after 5 seconds.  Restarting (3 of 3)...
1
2
3
4
5
(I) timedSystemCall:  Child 14685 obituary: exit=0, kill_signal=9, dumped_core=0
(E) timedSystemCall:  Exhausted maximum attempts (3) for command: inf.pl
$ echo $?
9

This particular child produces output every second for infinity, except it is limited by our new function for a 5-second timeout with a maximum of 3 attempts. The function politely reports all restarts on standard error, so as not to commingle with the standard output.
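
The numbers in the obituary lines come from splitting the raw wait status that Perl stores in $?.  Here is the same arithmetic as a tiny shell sketch (the function name is made up, and the two status values are just sample inputs):

```shell
#!/bin/sh
# decode_status STATUS
# Split a raw 16-bit wait status (what Perl stores in $?) into its parts.
decode_status() {
  status=$1
  exit_value=$(( status >> 8 ))     # upper byte: the child's exit code
  signal_num=$(( status & 127 ))    # lower 7 bits: the signal that killed it
  dumped_core=$(( status & 128 ))   # bit 7: set when a core file was written
  echo "exit=$exit_value, kill_signal=$signal_num, dumped_core=$dumped_core"
}

decode_status 9      # a child killed by SIGKILL, as in the obituaries above
decode_status 768    # a child that called exit(3)
```

This is also why the final echo $? above prints 9: the script hands the raw status of the killed child straight to exit.  Note that the shell’s own $? is a different animal, since the shell has already decoded the status for you.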

If the system() call does not exceed the timeout, or if the last two arguments are omitted, then the perl script ends as would be expected of a normal system() call, like so:

# complete in 3 seconds - before 5 sec timeout
perl: timedSystemCall("inf.pl 3", 5, 3);
1
2
3
$ echo $?
0
# degenerate into system() behavior
perl: timedSystemCall("inf.pl");
1
2
3
4
5
6
7
8
9
^C
Captured SIGINT.  Exiting after 9 seconds.
$ echo $?
130

Hopefully, you will find this function useful. If you are interested in better understanding a few alternatives, although lesser they may be, then read on! Otherwise, enjoy this new function!

Solution #1

My first solution hinges around the open3 function, which launches the input command and returns the essential process id, so we can kill it, if it runs too long. Output is synchronized by polling non-blocking versions of the child’s output handles, and dumping them to the parent’s output. This waiting loop is CPU bound, so it consumes 100% of one CPU, trying to keep the outputs synchronized – bad! Furthermore, the child’s input is not synchronized – very bad!

use IPC::Open3;
use Fcntl;
use POSIX ":sys_wait_h";

sub timedSystemCall {

  local ($cmd, $timeout, $maxattempts, $retry, $origmax) = @_;

  # degenerate into system() call - infinite timeout, if timeout is undefined or negative
  $timeout = 0 unless defined($timeout) && ($timeout > 0);
  # degenerate into system() call - 1 attempt, if max attempts is undefined or negative
  $maxattempts = 1 unless defined($maxattempts) && ($maxattempts > 0);
  $retry = 1 unless defined($retry) && ($retry > 0);
  $origmax = $maxattempts unless defined $origmax;

  local ($rc, $pid);

  eval {
    local $SIG{ALRM} = sub { die "TIMEOUT" };

    $pid = open3(\*WTR, \*RDR, \*ERR, $cmd) || die("(E) timedSystemCall: Unable to launch command - $cmd\n$!\n");
    # Make reads from RDR non-blocking (for F_GETFL, fcntl returns the current flags)
    my $rflags = fcntl(RDR, F_GETFL, 0) || die $!;
    fcntl(RDR, F_SETFL, $rflags | O_NONBLOCK) || die $!;
    # Make reads from ERR non-blocking
    my $eflags = fcntl(ERR, F_GETFL, 0) || die $!;
    fcntl(ERR, F_SETFL, $eflags | O_NONBLOCK) || die $!;
    #$pid = open3(">&STDIN", "<&STDOUT", "<&STDERR", $cmd) || die("(E) timedSystemCall: Unable to launch command - $cmd\n$!\n");

    alarm($timeout);

    # Is program finished?
    until (waitpid($pid, WNOHANG)) {
      # No!
      # NONBLOCKING: Did the program produce any output (STDOUT)?
      while (<RDR>) {
        # Yes - dump output to this program's STDOUT
        print STDOUT;
      }
      #NONBLOCKING: Did the program produce any errors (STDERR)?
      while (<ERR>) {
        # Yes - dump errors to this program's STDERR
        print STDERR;
      }
    } # exit until
    # program is finished - disable alarm
    alarm(0);
    # grab output of waitpid, and separate bytes
    $rc = $?;
    # close associated IO handles
    close(WTR);
    close(RDR);
    close(ERR);
  }; # end of eval

  # Did eval exit from an alarm timeout?
  if (($@ =~ "^TIMEOUT") || !defined($rc)) {
    # Yes - kill process
    kill(KILL => $pid) || die "Unable to kill $pid - $!";
    # Collect child's remains
    ($ret = waitpid($pid,0)) || die "Unable to reap child $pid (ret=$ret) - $!";
    # grab exit output of child process
    if ($rc = $?) {
      # exit code is the upper byte of the wait status: shift it down
      my $exit_value = $rc >> 8;
      # killing signal is the lower 7 bits of the wait status
      my $signal_num = $rc & 127;
      # core-dump flag is the top bit (128) of the lower byte
      my $dumped_core = $rc & 128;
      # Notify grandparent of obituary
      print "(I) timedSystemCall:  Child $pid obituary: exit=$exit_value, kill_signal=$signal_num, dumped_core=$dumped_core\n";
    }
    # Can we try again?
    if ($maxattempts > 1) {
      # Yes! Increment counter, for print messages
      $retry++;
      print "(W) timedSystemCall:  Command timed-out after $timeout seconds.  Restarting ($retry of $origmax)...\n";
      # Recurse into self, while decrementing number of attempts. Return value from deepest recursion
      return timedSystemCall($cmd, $timeout, $maxattempts-1, $retry, $origmax);
    } else {
      # No!  Out of attempts...
      print "(E) timedSystemCall:  Exhausted maximum attempts ($origmax) for command: $cmd\nExiting!\n";
      # Return error code of killed process - will require interpretation by parent
      return $rc;
    }
  } else {
    # No - process completed successfully!  Hooray!!!  Return success code (should be zero).
    return $rc;
  }
}

The intense CPU utilization and lack of STDIN synchronization make this solution undesirable and arguably a failure. It worked in my particular application, but it may not work in others, so this solution is an academic curiosity, but nothing more.

Solution #2

The second solution is similar to the first, because it depends on the open3 function. However, it directly connects the child’s IO handles to the parent’s, so that it behaves more like the final, preferred solution.

#!/usr/bin/perl -w
use strict 'refs';

use FileHandle;
use IPC::Open3;
use POSIX ":sys_wait_h";

sub timedSystemCall {

  local ($cmd, $timeout, $maxattempts, $retry, $origmax) = @_;

  # degenerate into system() call - infinite timeout, if timeout is undefined or negative
  $timeout = 0 unless defined($timeout) && ($timeout > 0);
  # degenerate into system() call - 1 attempt, if max attempts is undefined or negative
  $maxattempts = 1 unless defined($maxattempts) && ($maxattempts > 0);
  $retry = 1 unless defined($retry) && ($retry > 0);
  $origmax = $maxattempts unless defined $origmax;

  local ($rc, $pid, *DUPOUT, *DUPERR, *DUPIN);

  eval {
    local $SIG{ALRM} = sub { die "TIMEOUT" };

    # duplicate standard IO handles
    open DUPOUT, ">&STDOUT";
    open DUPERR, ">&STDERR";
    open DUPIN,  "<&STDIN";
    # launch child command, attached directly to standard IO handles
    $pid = open3("<&STDIN", ">&STDOUT", ">&STDERR", $cmd) || die("(E) timedSystemCall: Unable to launch command - $cmd\n$!\n");
    # select primary output, and then disable buffering (activate auto-flush)
    select STDERR; $| = 1;
    select STDOUT; $| = 1;

    # set alarm to go off in "timeout" seconds, and call $SIG{ALRM} at that time
    alarm($timeout);
    # hang (block) until program is finished
    waitpid($pid, 0);

    # program is finished - disable alarm
    alarm(0);
    # grab output of waitpid
    $rc = $?;
    # close child's associated IO handles
    close(STDOUT);
    close(STDERR);
    close(STDIN);
    # restore orig handles
    open STDOUT, ">&DUPOUT";
    open STDERR, ">&DUPERR";
    open STDIN, "<&DUPIN";
  }; # end of eval

  # Did eval exit from an alarm timeout?
  if (($@ =~ "^TIMEOUT") || !defined($rc)) {
    # Yes - kill process
    kill(KILL => $pid) || die "Unable to kill $pid - $!";
    # Collect child's remains
    ($ret = waitpid($pid,0)) || die "Unable to reap child $pid (ret=$ret) - $!";
    # close child's associated IO handles
    close(STDOUT);
    close(STDERR);
    close(STDIN);
    # restore orig handles
    open STDOUT, ">&DUPOUT";
    open STDERR, ">&DUPERR";
    open STDIN, "<&DUPIN";
    # grab exit output of child process
    if ($rc = $?) {
      # exit code is the upper byte of the wait status: shift it down
      my $exit_value = $rc >> 8;
      # killing signal is the lower 7 bits of the wait status
      my $signal_num = $rc & 127;
      # core-dump flag is the top bit (128) of the lower byte
      my $dumped_core = $rc & 128;
      # Notify grandparent of obituary
      print "(I) timedSystemCall:  Child $pid obituary: exit=$exit_value, kill_signal=$signal_num, dumped_core=$dumped_core\n";
    }
    # Can we try again?
    if ($maxattempts > 1) {
      # Yes! Increment counter, for print messages
      $retry++;
      print "(W) timedSystemCall:  Command timed-out after $timeout seconds.  Restarting ($retry of $origmax)...\n";
      # Recurse into self, while decrementing number of attempts. Return value from deepest recursion
      return timedSystemCall($cmd, $timeout, $maxattempts-1, $retry, $origmax);
    } else {
      # No!  Out of attempts...
      print "(E) timedSystemCall:  Exhausted maximum attempts ($origmax) for command: $cmd\nExiting!\n";
      # Return error code of killed process - will require interpretation by parent
      return $rc;
    }
  } else {
    # No - process completed successfully!  Hooray!!!  Return success code (should be zero).
    return $rc;
  }
}

exit timedSystemCall("inf.pl", 5, 3);

The advantage of this solution is that the standard IO handles (STDOUT, STDERR, STDIN) are directly connected to the child. The parent does not have to poll the child with non-blocking reads, and dump that output to the parent’s IO. So, this solution is somewhat simpler. Plus, it does not consume excess CPU, because there is no tight polling loop.

The other interesting thing about this solution is that the standard IO handles must be duplicated, or saved, before they are fed to the child process. Otherwise, when the child process is killed, the standard IO handles will be automatically closed. Any restarted children will not be able to duplicate them, so the open3 command fails. But, what is worse, the parent cannot report the cause of the error to the outside world. This brings sudden and silent death to the parent. However, if the standard IO handles are first saved, then they can be restored after the child is killed, hence the duplicate (“DUP”) IO handles. The above solution is a good example of this technique.
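
The same save-and-restore trick exists in the shell, which may make the idea easier to see.  This is only an analogy (the function name is made up, and sending stdout to /dev/null stands in for the handle being closed on us), not a piece of the Perl solution:

```shell
#!/bin/sh
# The body runs in a subshell (note the parentheses instead of braces),
# so the caller's own descriptors are never touched.
save_and_restore_stdout() (
  exec 3>&1            # duplicate stdout onto descriptor 3 (like DUPOUT)
  exec 1>/dev/null     # stdout now goes nowhere, a stand-in for a closed handle
  echo "this line is lost"
  exec 1>&3 3>&-       # restore stdout from the duplicate, then drop the spare
  echo "stdout is back"
)

save_and_restore_stdout
```

Without the first exec there would be nothing to restore from, which is the shell version of the silent death described above.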

The hint for this technique came from pg. 193 of O’Reilly’s Programming Perl, under the open function explanation. Another hint came from the middle of pg. 568 of O’Reilly’s Perl Cookbook, under the discussion, “16.8. Controlling Input and Output of Another Program”.

Further Thoughts

Here are a few links for further reading on the topic of busting out of a long Perl operation:

http://www.mail-archive.com/beginners@perl.org/msg81677.html
http://coding.derkeiler.com/Archive/Perl/comp.lang.perl.misc/2003-10/0422.html

Could there be a better way? Could you improve on my final solution? I am both an optomist and an optomizerist (sic), so if you can improve my solution, let me know! 🙂
