Tuesday, December 9, 2008

wp.factor

I came across a benchmark for comparing languages today. It did not contain a version for Factor, so I thought I would contribute one.

The idea is fairly straightforward, and in the words of the author:

read stdin, tokenize into words
for each word count how often it occurs
output words and counts, sorted in descending order by count

My attempt is below:

USING: arrays assocs kernel io math math.parser 
prettyprint sequences splitting sorting ;

IN: wp

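! Split a line on spaces, drop empty strings, and bump each word's count.
: count-words ( assoc string -- assoc' )
    " " split harvest [ over inc-at ] each ;

! Convert to a sequence of { word count } pairs, highest count first.
: sort-assoc ( assoc -- seq )
    >alist sort-values reverse ;

! Print one "word    count" line per pair.
: print-results ( seq -- )
    [ number>string "    " glue print ] assoc-each ;

! Count the words on every line of stdin, then print the sorted totals.
: wp ( -- )
    H{ } clone
    [ [ count-words ] each-line ]
    [ sort-assoc print-results ]
    bi drop ;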

MAIN: wp
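
To see what the two helper words do on their own, something like the following could be tried in the listener. This is only a sketch: it assumes the code above has been saved where Factor can load it as the wp vocabulary, and the sample sentence is made up for illustration.

```factor
USING: kernel prettyprint wp ;

! Start from a fresh hashtable, count the words in one sample line,
! then sort the pairs by count in descending order and print them.
H{ } clone
"the cat and the hat" count-words
sort-assoc .
```

The pair for "the" (count 2) should come first; the order of the words that are tied at 1 is not something this sketch relies on.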

You can run this from Factor by saving it as wp.factor somewhere Factor can find the wp vocabulary (typically work/wp/wp.factor under the Factor directory) and running from the shell:

cat file.txt | ./factor -run=wp

2 comments:

Matthew Maycock said...

Could you just get rid of the clone and drop?

mrjbq7 said...

Yes, and you can make it shorter still; see the version in my "re-factor" repository:

https://github.com/mrjbq7/re-factor/blob/master/wp/wp.factor