comparing two lists

Status
Not open for further replies.

bwanaaa

Senior member
Dec 26, 2002
739
1
81
Most applications that compare lists simply do a 'sequential' compare. But what if I want to compare the contents of two lists, regardless of the sequence of items? Something like what this web page does:
http://jura.wi.mit.edu/bioc/tools/compare.php

But I need this kind of application for offline use (that's why the web page above is insufficient).

Does anyone here know of a fast simple application that achieves this?
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,622
2,384
136
The simplest way I can think of is to sort both lists, then compare. What purpose is it for, what kind of input do you need it to take? That's simple enough that I can code it up in a jiffy.
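A minimal sketch of the sort-then-compare idea, assuming the items are plain strings held in memory (the function name `same_items` is just for illustration). Note that this treats duplicates as significant: two lists match only if each item appears the same number of times in both.

```python
def same_items(list_a, list_b):
    """Return True if both lists hold the same items, ignoring order."""
    # Sorting puts both lists into one canonical order, so a plain
    # element-by-element comparison no longer depends on input order.
    return sorted(list_a) == sorted(list_b)


left = ["c", "a", "b"]
right = ["b", "c", "a"]
print(same_items(left, right))
```

If duplicates should be ignored, comparing `set(list_a) == set(list_b)` would be the closer fit.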
 

xSauronx

Lifer
Jul 14, 2000
19,582
4
81
Excel's conditional formatting or data sorting/filtering would probably be pretty good for this. Do you need the kind of output that this website is providing?
 

esun

Platinum Member
Nov 12, 2001
2,214
0
0
They are treating them as sets, not "lists" in the traditional sense (you can see it says duplicate entries are removed). It's then just doing set operations (union, intersection, and subtraction).

I don't know of any applications off the top of my head, but if you know any programming language you could write it up very quickly. For example, in Python:

# The items are quoted so they are treated as strings, not undefined variables.
a = set(["a", "b", "c", "d", "e", "f"])
b = set(["c", "d", "b", "g", "h", "a"])

a.union(b)         # Would give you a or b
a.intersection(b)  # Would give you a and b
a.difference(b)    # Would give you a only
b.difference(a)    # Would give you b only

See: http://docs.python.org/library/stdtypes.html#set

You could easily make a script read the sets in from a file or from a CLI then print out the results.
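Something along these lines might do for the file-reading part. It is only a sketch: it assumes one item per line, and the names `load_items`, `read_set`, and `compare_sets` are made up for the example.

```python
def load_items(lines):
    """Build a set from an iterable of lines, skipping blank ones."""
    return {line.strip() for line in lines if line.strip()}


def read_set(path):
    """Read one item per line from a text file (assumed UTF-8)."""
    with open(path, encoding="utf-8") as handle:
        return load_items(handle)


def compare_sets(a, b):
    """Return the four results the web page shows, keyed by label."""
    return {
        "a or b": a | b,    # union
        "a and b": a & b,   # intersection
        "a only": a - b,    # difference
        "b only": b - a,
    }


def print_report(results):
    """Print each result set with its size, items in sorted order."""
    for label, items in results.items():
        print(f"{label} ({len(items)}):")
        for item in sorted(items):
            print(f"  {item}")
```

Usage would be something like `print_report(compare_sets(read_set("first.txt"), read_set("second.txt")))`, where the two file names are placeholders.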
 

bwanaaa

Senior member
Dec 26, 2002
739
1
81
Thank you for your replies. I have the exciting task of comparing router logs. At our facility there are routers that govern subnets. I am tasked with looking at Internet access patterns. Each router is configured to log the packets, so I get a loooonnng list from each router log. I need to compare logs and see if there are similar access patterns. It's a little bit like comparing DNA sequences: you might find similar subsequences, but they are not in the same place (in the log). To begin with I am simply going to look for one-line similarities. Then I will expand to 2-line, 3-line, etc. Multi-line comparisons also have to be fuzzy (I need to allow for matches to be separated by up to a few lines). I thought I'd start with one-line comparisons, but I need a program that I KNOW works so I can compare its results to my code. The set approach is appealing: just make an array of each list. Of course that is limited by the maximum size (dim statement) of the array.
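For reference, the one-line and exact n-line cases could be sketched with sets of line windows, as below. This is a rough sketch only: it assumes each log is already a list of lines with timestamps and other per-packet noise stripped out, the sample entries are invented, and it does not handle the fuzzy case where matching lines are separated by a few others.

```python
def windows(lines, n):
    """Return the set of all runs of n consecutive lines, each as a tuple."""
    # If the log is shorter than n, the range is empty and so is the set.
    return {tuple(lines[i:i + n]) for i in range(len(lines) - n + 1)}


def shared_windows(log_a, log_b, n=1):
    """Return the n-line runs that occur somewhere in both logs."""
    return windows(log_a, n) & windows(log_b, n)


# Made-up sample entries standing in for cleaned-up log lines.
log_a = ["GET x", "GET y", "GET z", "GET w"]
log_b = ["GET q", "GET y", "GET z", "GET x"]

print(shared_windows(log_a, log_b, 1))  # single lines found in both logs
print(shared_windows(log_a, log_b, 2))  # 2-line runs found in both logs
```

Python sets and lists grow as needed, so there is no fixed array size to declare up front; available memory is the practical limit.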

I wonder how Google eliminates duplicates from its search results. The MapReduce approach spawns (maps) multiple processes whose results are then combined (reduced).
 

MrDudeMan

Lifer
Jan 15, 2001
15,069
94
91
Perl also has a module to do something like this.

Code:
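#!/usr/bin/perl
use strict;
use warnings;

use List::Compare;

my @Llist = qw(0 1 2);
my @Rlist = qw(0 1 2 3 4);
my @Alist = qw(0 1 2 3 4);

# Passing more than two array references compares all of the lists at once.
my $lc = List::Compare->new(\@Llist, \@Rlist, \@Alist);

# Print charts showing which lists are subsets of, or equivalent to, the others.
$lc->print_subset_chart;
$lc->print_equivalence_chart;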
 