
Search And Replace A Block Of Text With Perl

By Michael Marr
Article Date: 2010-06-29

Searching for and then replacing text is such a fundamental function in computing that I was eager to take the opportunity to do such an operation over a large website still being served via static HTML pages.

I assumed that, armed with the right regular expression, I could easily conquer this task. However, it quickly became a tale of defeat followed by outstanding victory with the help of Perl.

My first option was to find a *nix command line tool to run my search and replace operation. Apparently, sed is this kind of tool. However, after spending hours trying to find a proper regular expression to do what I needed, I had to abandon this option. The problem was that my search string spanned multiple lines, and sed by default works through its input one line at a time; no matter what I tried, I couldn't get it to play nice with multiple lines. I was able to find a single line and replace it, but obviously this was not the complete operation I needed.
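To make the single-line limitation concrete, here is a minimal sketch. The file name page.html and its contents are made up for illustration; they are not from the site described in this article. Because sed's default cycle holds only one input line at a time, a pattern that includes a newline finds nothing to match.

```shell
# Throwaway working directory and a hypothetical three-line HTML snippet.
tmp=$(mktemp -d)
printf '<div>\nold\n</div>\n' > "$tmp/page.html"

# Try to match "<div>", a newline, then "old". sed sees one line at a
# time, so no single line ever contains both pieces and nothing is replaced.
sed_out=$(sed 's/<div>\nold/<div>\nnew/' "$tmp/page.html")

# Prints the file unchanged.
printf '%s\n' "$sed_out"
```

sed does have commands for pulling additional lines into its buffer, but they were not a route I got working here.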

My next idea was to write a Perl script to accomplish what I couldn't with sed. However, my first attempts mirrored sed a little too closely: again, I was only able to work with single lines. Luckily, I found this script: http://noctilucent.org/blog/2003/12/replacing-large-chunks-of-text-with-perl.html Following the instructions supplied by the author gave me a working script. However, like most things in life, it was not 100% tailored to my intended purpose. The script works great on one directory, but the site I needed to run a search and replace on was spread across multiple directories and subdirectories. I needed to run the script across all the files recursively. Here's how I did it:

find ./ -name '*.html' -exec perl -pi -0777 sub.pl {} \;

Running find on the current directory also descends into every subdirectory and matches any file whose name ends in .html. Replace '*.html' with whatever pattern fits your files. For each match, find executes Perl with the script sub.pl (from the tutorial linked above), substituting the path of the matched file for the {} placeholder. The terminating ; marks the end of the command that find runs on each result; it has to be escaped as \; (or quoted) so that the shell passes it through to find instead of interpreting it. As for the Perl switches: -p wraps the script in a loop that reads the input and prints the result, -i edits each file in place so the output is saved back into the file, and -0777 makes Perl read each file as one single string rather than line by line, which is what lets the regular expression match a string that spans multiple lines.



About the Author:
Michael Marr is an IT staff writer for WebProNews.





PerlProNews is an iEntry, Inc. ® publication - All Rights Reserved Privacy Policy and Legal