| 1 | +This is version 2.04 of agrep - a new tool for fast |
| 2 | +text searching allowing errors. |
| 3 | +agrep is similar to egrep (or grep or fgrep), but it is much more general |
| 4 | +(and usually faster). |
| 5 | +The main changes from version 1.1 are 1) incorporating Boyer-Moore |
| 6 | +type filtering to speed up search considerably, 2) allowing multiple patterns |
| 7 | +via the -f option; this is similar to fgrep, but from our experience |
| 8 | +agrep is much faster, 3) searching for "best match" without having to |
| 9 | +specify the number of errors allowed, and 4) ASCII input is no longer required. |
| 10 | +Several more options were added. |
| 11 | + |
| 12 | +To compile, simply run make in the agrep directory after untar'ing |
| 13 | +the tar file (tar -xf agrep-2.04.tar will do it). |
| 14 | + |
| 15 | +The three most significant features of agrep that are not supported by |
| 16 | +the grep family are |
| 17 | +1) the ability to search for approximate patterns; |
| 18 | + for example, "agrep -2 homogenos foo" will find homogeneous as well |
| 19 | + as any other word that can be obtained from homogenos with at most |
| 20 | + 2 substitutions, insertions, or deletions. |
| 21 | + "agrep -B homogenos foo" will generate a message of the form |
| 22 | + best match has 2 errors, there are 5 matches, output them? (y/n) |
| 23 | +2) agrep is record oriented rather than just line oriented; a record |
| 24 | + is by default a line, but it can be user defined; |
| 25 | + for example, "agrep -d '^From ' 'pizza' mbox" |
| 26 | + outputs all mail messages that contain the keyword "pizza". |
| 27 | + Another example: "agrep -d '$$' pattern foo" will output all |
| 28 | + paragraphs (separated by an empty line) that contain pattern. |
| 29 | +3) queries over multiple patterns with AND (or OR) logic. |
| 30 | + For example, "agrep -d '^From ' 'burger,pizza' mbox" |
| 31 | + outputs all mail messages containing at least one of the |
| 32 | + two keywords (the comma stands for OR). |
| 33 | + "agrep -d '^From ' 'good;pizza' mbox" outputs all mail messages |
| 34 | + containing both keywords. |
| 35 | + |
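To make the "at most 2 substitutions, insertions, or deletions" rule in item 1 concrete, here is a minimal Python sketch of approximate substring matching. It uses a plain dynamic-programming table, not the faster algorithms agrep itself implements, and the names `min_errors` and `approx_match` are hypothetical, chosen only for this illustration.

```python
def min_errors(pattern, text):
    """Fewest edits needed to turn `pattern` into some substring of `text`."""
    # prev[i]: fewest edits to match pattern[:i] against a substring of the
    # text ending at the current position. Before any text is read, matching
    # pattern[:i] costs i edits (every pattern character is unmatched).
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in text:
        cur = [0]  # the empty pattern prefix matches anywhere at no cost
        for i, pc in enumerate(pattern, start=1):
            cur.append(min(
                prev[i - 1] + (pc != ch),  # exact match or one substitution
                prev[i] + 1,               # text has an extra character
                cur[i - 1] + 1,            # pattern character has no partner
            ))
        best = min(best, cur[-1])
        prev = cur
    return best


def approx_match(pattern, line, k):
    """True if `line` contains `pattern` with at most k errors."""
    return min_errors(pattern, line) <= k


if __name__ == "__main__":
    print(approx_match("homogenos", "a homogeneous mixture", 2))
```

Reporting the smallest error count found, rather than testing against a fixed k, is roughly the idea behind the "best match" mode described above, although this sketch works on a single string instead of a file.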
| 36 | +Putting these options together one can ask queries like |
| 37 | + |
| 38 | +agrep -d '$$' -2 '<CACM>;TheAuthor;Curriculum;<198[5-9]>' bib |
| 39 | + |
| 40 | +which outputs all paragraphs referencing articles in CACM between |
| 41 | +1985 and 1989 by TheAuthor dealing with curriculum. |
| 42 | +Two errors are allowed, but they cannot be in either CACM or the year |
| 43 | +(the <> brackets forbid errors in the pattern between them). |
| 44 | + |
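The record and AND/OR semantics used in the query above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: records are paragraphs separated by empty lines, terms are matched exactly (no errors, no <> brackets, no regular expressions), and a query uses either ';' or ',' but not both. The function names are hypothetical.

```python
import re


def split_records(text):
    """Split text into paragraphs separated by one or more empty lines."""
    return [r for r in re.split(r"\n\s*\n", text) if r.strip()]


def record_matches(query, record):
    """';' joins terms with AND, ',' joins terms with OR."""
    if ";" in query:
        return all(term in record for term in query.split(";"))
    return any(term in record for term in query.split(","))


def search(query, text):
    """Return whole records, not single lines, that satisfy the query."""
    return [r for r in split_records(text) if record_matches(query, r)]


if __name__ == "__main__":
    sample = "burger night\nat noon\n\ngood pizza here\n\nnothing"
    for record in search("good;pizza", sample):
        print(record)
```

The point of the sketch is that the unit of output is the whole record: a match anywhere in a paragraph prints the entire paragraph.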
| 45 | +Other features include searching for regular expressions (with or |
| 46 | +without errors), unlimited wild cards, limiting the errors to only |
| 47 | +insertions or only substitutions or any combination, |
| 48 | +allowing each deletion, for example, to be counted as, say, |
| 49 | +2 substitutions or 3 insertions, restricting parts of the query |
| 50 | +to be exact and parts to be approximate, and many more. |
| 51 | + |
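The idea of charging different costs for different kinds of errors can be illustrated with a weighted edit distance. The Python sketch below compares two whole strings and is not agrep's implementation; the function name and the parameters `ins`, `dele`, and `sub` are assumptions made for this example and do not correspond to agrep's option names.

```python
def weighted_distance(a, b, ins=1, dele=1, sub=1):
    """Cheapest way to turn string a into string b under the given costs."""
    # prev[j]: cost of turning the processed prefix of a into b[:j].
    prev = [j * ins for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [i * dele]  # turning a[:i] into the empty string
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j - 1] + (0 if ca == cb else sub),  # keep or substitute
                prev[j] + dele,                          # delete ca
                cur[j - 1] + ins,                        # insert cb
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # With unit costs the two missing letters cost 2; with ins=3 they cost 6.
    print(weighted_distance("homogenos", "homogeneous"))
    print(weighted_distance("homogenos", "homogeneous", ins=3))
```

Setting one cost higher than the error budget effectively forbids that kind of error, which is one way to read "limiting the errors to only insertions or only substitutions".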
| 52 | +agrep is available by anonymous ftp from cs.arizona.edu (IP 192.12.69.5) |
| 53 | +as agrep/agrep-2.04.tar.Z (or in uncompressed form as agrep/agrep-2.04.tar). |
| 54 | +The tar file contains the source code (in C), man pages (agrep.1), |
| 55 | +and two additional files, agrep.algorithms and agrep.chronicle, |
| 56 | +giving more information. |
| 57 | +The agrep directory also includes two postscript files: |
| 58 | +agrep.ps.1 is a technical report from June 1991 |
| 59 | +describing the design and implementation of agrep; |
| 60 | +agrep.ps.2 is a copy of the paper as it appeared in the 1992 |
| 61 | +Winter USENIX conference. |
| 62 | + |
| 63 | +Please mail bug reports (or any other comments) |
| 64 | + |
| 65 | + |
| 66 | +We would appreciate it if users would notify us (at the address above) |
| 67 | +of any extensions, improvements, or interesting uses of this software. |
| 68 | + |
| 69 | +January 17, 1992 |
| 70 | + |
| 71 | + |
| 72 | +BUGS_fixed/option_update |
| 73 | + |
| 74 | +1. remove multiple definitions of some global variables. |
| 75 | +2. fix a bug in -G option. |
| 76 | +3. fix a bug in -w option. |
| 77 | +January 23, 1992 |
| 78 | + |
| 79 | +4. fix a bug in pipeline input. |
| 80 | +5. make the definition of word-delimiter consistent. |
| 81 | +March 16, 1992 |
| 82 | + |
| 83 | +6. add option '-y' which, if specified with the -B option, will always |
| 84 | +output the best matches without a prompt. |
| 85 | +April 10, 1992 |
| 86 | + |
| 87 | +7. fix a bug regarding exit status. |
| 88 | +April 15, 1992 |