12.02.08 Using Perl To Count Letter Occurrence In A Document
By Dave Taylor
Hey I want a Perl script that reads a file and sends me the number of occurrences of the alphabets in that file... Could you please help me?

Dave's Answer:

This is an interesting little puzzle so what I will do is show how to write a quick, short Bourne Shell script for Linux (or Mac OS X if you crack open your Terminal.app program) that can do what you seek.

The key idea is that if you could transform the input to be one-character-per-line, it'd be unreadable for humans, but would make it really, really easy to sort and tally for a computer program.

How do you do that? With one of what I call the unsung heroes of the Unix command line, the "fold" command. Generally, people use fold to wrap overly long lines in text files (it's great for processing info prior to printing it, for example) but as with all great Unix command line utilities, it has parameters that let you change its behavior. And that's just what we'll do.
Try this yourself:

    $ date | fold -w4
    Wed 
    Nov 
    26 0
    9:56
    :48 
    MST 
    2008

That's with width=4. Turn it into "-w1" and each and every character is on its own line. (I won't reproduce it here because it's crazy long and you get the idea anyway, I hope!)

Now that each character is on its own line, it's simple to sort the output with "sort" so that identical characters end up next to each other, in alphabetical order. Tallying matching lines turns out to be a feature of another of the unsung heroes, "uniq". Check its man page and you'll see:

Continue reading this article.
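Putting the fold, sort, and uniq steps described above together, the whole idea might look something like the sketch below. This is not the script from the full article, just one way to wire the pieces up. The function name `count_letters` is made up for illustration, and filtering to letters only and folding upper case into lower case are assumptions about what the original question wants. It also assumes plain ASCII text, since fold may split multi-byte characters.

```shell
#!/bin/sh
# count_letters (hypothetical name): tally letter occurrences in the
# named files, or in standard input when no file is given.
count_letters() {
    # fold -w1 puts every character on its own line.
    # grep keeps only the lines holding a letter (assumption: digits,
    # spaces, and punctuation are not wanted in the tally).
    # tr folds upper case into lower case (assumption: 'A' and 'a'
    # count as the same letter).
    # sort groups identical letters; uniq -c prefixes each with its count.
    fold -w1 "$@" |
        grep '[[:alpha:]]' |
        tr '[:upper:]' '[:lower:]' |
        sort |
        uniq -c
}

# Quick demo on a short string.
printf 'Hello World\n' | count_letters
```

For the demo string, the tally shows "l" three times, "o" twice, and every other letter once. The width of the padding in front of each count varies between uniq implementations. To list the most frequent letters first, append `| sort -rn` to the call.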