[KLUG Programming] sorting, uniq'ing, and grepping
Erik Gillespie
programming@kalamazoolinux.org
Mon, 14 Jul 2003 18:42:45 -0400 (EDT)
Hey Tony, here's my opinion for what it's worth:
I'd go with C because I know it better than Perl, but either would be a
good cross-platform solution. The downside of Perl is that you have to
make sure Perl is installed on every machine. With C you just need the
compiled code (built once per platform).
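Just as you were basically doing with your scripts, I would read both
files into a single array, call qsort() (#include <stdlib.h>; it's part of
standard C as well as POSIX, and every C implementation I've seen has it),
then make a single pass through the sorted array and flag duplicates. Once
the array is sorted, identical lines sit next to each other, so comparing
each entry with the one before it is enough.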
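
Here's a rough sketch of the sort-then-flag step, just to make the idea
concrete. It assumes both files have already been read into one array of
NUL-terminated strings with the trailing newlines stripped (that part is
left out). The names cmp_lines, sort_and_flag and is_dup are made up for
this example, so treat it as a starting point rather than finished code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() passes the comparator pointers to the array elements,
   so each argument here is really a pointer to a char *. */
static int cmp_lines(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort lines[0..n-1] in place, then make one pass.
   is_dup[i] is set to 1 when lines[i] equals the line before it in
   sorted order, and to 0 otherwise. Returns the number of flagged lines. */
size_t sort_and_flag(char **lines, size_t n, int *is_dup)
{
    size_t i, dups = 0;

    assert(n == 0 || (lines != NULL && is_dup != NULL));
    if (n == 0)
        return 0;

    qsort(lines, n, sizeof lines[0], cmp_lines);

    is_dup[0] = 0;                 /* the first line can't be a repeat */
    for (i = 1; i < n; i++) {
        is_dup[i] = (strcmp(lines[i - 1], lines[i]) == 0);
        dups += (size_t)is_dup[i];
    }
    return dups;
}
```

Only the pointers get moved around by qsort(), not the strings themselves,
so the sort cost doesn't depend on copying line contents. Printing just the
entries whose flag is 0 would give you the uniq'd output.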
You could hand-code a very elegant algorithm to do everything at once,
but sorting the data is the dominant cost. Unless your grandpa was
Dijkstra you'll be stuck with a complexity of O(n log n) for any
comparison-based sort, regardless of whether you write your own l33t
algorithm or you do things one step at a time.
Erik
--
Word of the Day:
febrile: feverish.
Days until Matrix Revolutions is released: 114