[ILUG] Re: perl file processing
Marcus Furlong
furlongm at hotmail.com
Thu Oct 23 23:20:00 IST 2008
Brian Foster <blf <at> utvinternet.ie> writes:
>
> below's a quickly-put-together all-awk(1) solution,
> albeit if this was my problem I'd be more inclined
> to do some filtering first, probably with sed(1)
> like Francis did.
> cheers!
> -blf-
>
> #!/bin/gawk -f
> BEGIN {
>     state = 0
>     ncols = 0
>     nlines = 0
>
>     STDERR = "/dev/stderr"
> }
>
> state == 0 && $0 == "=== Stratified cross-validation ===" {
>     state = 1
>     next
> }
>
> state == 1 && $0 == "=== Detailed Accuracy By Class ===" {
>     state = 2
>     next
> }
>
> state == 2 && 2 <= NF && $NF ~ /^[A-Z]$/ {
>     for (n = 1; n < NF; n++) {
>         if ($n !~ /^[0-9.]*$/)
>             next
>     }
>     state = 3
>     ncols = NF
> }
>
> state == 3 && NF != ncols { exit } # goto END
>
> state == 3 && $NF !~ /^[A-Z]$/ { exit } # goto END
>
> state == 3 {
>     for (n = 1; n < ncols; n++)
>         col[n] += (0 + $n)
>     nlines++
>     next
> }
>
> END {
>     #debug print "EXIT(" FNR "): nlines=" nlines, "ncols=" ncols
>     if (nlines <= 0) {
>         print FILENAME ": Data not found, state =", state >STDERR
>         exit 1
>     }
>     for (n = 1; n < ncols; n++)
>         print col[n]/nlines
> }
>
Just a follow-up question to this. I'm trying to use this script for a
variable-width NxN matrix of the following form:
=== Confusion Matrix ===
   a   b   c   d   e   f   g   h   i   j   k   <-- classified as
 154  12  28   7  17   1  10  11  56  20  30 |   a = A
   6 174  11   3   2   2   3   3  16   6  20 |   b = B
   8   7 222   4   7   0   9   6  24   9  34 |   c = D
   8   3  21 154  20   9  37   4  42  45  29 |   d = F
   8   0   8   4 277   1   7   2   9  15  11 |   e = G
   4   3  11   7  13 185  43   5  32  19  18 |   f = Q
   0   1   6  13  16   9 242   2   9  36  19 |   g = T
   9   9  29   3  14   3  17 139  59  25  40 |   h = U
  10   2   3   1   3   2   1   4 167   7  67 |   i = V
  20   5  17  15  16   9  30  13  38 149  38 |   j = X
   6   6   6   1   2   1   2   3  39  10 266 |   k = Z
Using the above script, how can I sum the main diagonal, i.e. [0,0] to
[N-1,N-1], with [0,0] being the top-left corner?
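
One possible way to approach the diagonal sum, as a rough sketch rather than
an extension of Brian's script: on the r-th data row of the matrix, the
diagonal entry is simply field r, so awk only needs to count rows. The function
name diag_sum is made up for illustration, and the sketch assumes the header
row contains "<-- classified as", every data row contains a "|", and the
matrix ends at the first line without one.

```sh
#!/bin/sh
# diag_sum (hypothetical helper): print the sum of the main diagonal of the
# first "=== Confusion Matrix ===" block in the given file(s).
diag_sum() {
    awk '
        # Find the matrix heading.
        $0 == "=== Confusion Matrix ===" { state = 1; next }

        # Skip the "a b c ... <-- classified as" header row.
        state == 1 && index($0, "<-- classified as") { state = 2; row = 0; next }

        # Assumption: the first line without a "|" ends the matrix.
        state == 2 && index($0, "|") == 0 { exit }

        # On data row number "row", the diagonal entry is field "row".
        state == 2 { row++; sum += $row }

        END {
            if (row == 0)
                exit 1      # no matrix rows were seen
            print sum
        }
    ' "$@"
}
```

Called as `diag_sum results.txt` (results.txt being a placeholder file name),
this should print 2129 for the matrix above, i.e. 154 + 174 + 222 + 154 + 277
+ 185 + 242 + 139 + 167 + 149 + 266.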
Also, is it possible to pass the first string,
"=== Stratified cross-validation ===", in as a variable? I have multiple types
of tests to retrieve from each file, and hence multiple strings of text like
the above to search for. Currently I have just copied and pasted the script
for each string I need to search for.
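
On passing the string in: awk's -v option assigns a variable before the
program starts (before BEGIN), so a pattern can compare $0 against that
variable instead of a hard-coded string. The following is only a minimal
sketch of the mechanism, not a drop-in change to the script above; the
function name section_lines and the variable name header are invented for
illustration, and it assumes each section ends at the next "=== ... ===" line.

```sh
#!/bin/sh
# section_lines (hypothetical helper): print the lines that follow the first
# line exactly equal to the given header, up to the next "=== ... ===" line.
section_lines() {
    header=$1
    shift
    # Note: awk processes backslash escapes in -v values; the headers here
    # contain none, so they are passed through unchanged.
    awk -v header="$header" '
        $0 == header             { found = 1; next }   # start of the section
        found && /^=== .* ===$/  { exit }              # next heading: stop
        found                    { print }
    ' "$@"
}
```

For example, `section_lines "=== Confusion Matrix ===" results.txt` (again
with a placeholder file name) would print just the matrix rows. In the script
above, the same idea would presumably mean replacing the literal in the
`state == 0` rule with a variable set via -v.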