Wednesday, October 27, 2010

Extracting names from a text file

With this post, I'm beginning a series of short utility scripts and essays that relate, in one way or another, to the general subject of indexing and data retrieval.

The first entry is a short Perl script (just 18 command lines) that extracts people's names wherever they occur within a provided text file. The output is an alphabetized list of names, with no repeats. The script is simple enough that it can easily be translated into any language that supports regular expressions (regex).

The script is available at:

http://www.julesberman.info/factoids/namesget.htm
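
To give a rough feel for the general idea, here is a minimal sketch in Python. It is not the Perl script linked above, and it may not match that script's logic. It assumes, purely for illustration, that a "name" is a capitalized word followed by another capitalized word, with an optional middle initial in between. The function name extract_names and the regex pattern are my own hypothetical choices.

```python
import re

# Assumed heuristic (not taken from the linked Perl script): a name is a
# capitalized word, an optional middle initial such as "Q.", and a second
# capitalized word.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z]\.)?\s+[A-Z][a-z]+\b")


def extract_names(text):
    """Return an alphabetized list of the unique name-like strings in text."""
    names = set()
    for match in NAME_PATTERN.finditer(text):
        # Collapse internal whitespace so a name split across a line break
        # is counted as the same name as its single-line form.
        names.add(" ".join(match.group(0).split()))
    return sorted(names)


if __name__ == "__main__":
    sample = (
        "Jules Berman wrote a script. Later, Ada Lovelace met "
        "Jules Berman and John Q. Public."
    )
    for name in extract_names(sample):
        print(name)
```

A heuristic this crude will produce false hits (any two adjacent capitalized words, such as the opening of a sentence followed by a proper noun) and will miss names with hyphens, apostrophes, or lowercase particles. The set removes repeats, and sorted() supplies the alphabetized output.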

Blog readers who are uninterested in indexing and data retrieval may want to visit my two other blogs,

Machiavelli's Laboratory (scientific ethics taught from the perspective of an unethical scientist)

and

Neoplasms (essays on tumor biology)

- © 2010 Jules Berman

key words: indices, indexing, indexes, index, data retrieval, information retrieval, informatics
