How can I count the number of times each line is repeated in a file? (Unix, example inside.)

LordJezo · Sep 17, 2003

I have a file with 1000s of lines on it.

Each line is an IP address.

How would I go about getting a summary report of the number of times each IP appears in the list?

For example, the file looks like this:

68.210.101.222
68.210.101.222
67.117.110.152
67.117.110.152
24.207.128.236
24.184.169.117
24.207.128.236
198.81.26.7
24.184.169.117
65.49.54.66

How would I get something along the lines of:

68.210.101.222 - 2
67.117.110.152 - 2
24.207.128.236 - 2
24.184.169.117 - 2
198.81.26.7 - 1
65.49.54.66 - 1

???

Thanks for any help..

incorrigible · Sep 17, 2003

Not too elegant, but it'll work:

First sort the data in the source file via:

# sort <sourcefile> > sortedfile

Next, save this little awk program to a file called countem.awk (or whatever you want):

############ awk file start ##################

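BEGIN { LAST = ""; COUNT = 0 }

{
    # first line: nothing to compare against yet
    if (NR == 1) {
        LAST = sprintf("%s", $0)
        COUNT++
        next
    }

    # compare as strings; a regex match (~) would treat the dots in an IP
    # as wildcards and would also accept partial matches
    if ($0 == LAST) {
        COUNT++
    }
    else {
        printf("%s - %d\n", LAST, COUNT)
        COUNT = 1
    }
    LAST = sprintf("%s", $0)
}

# flush the final run, unless the input was empty
END { if (NR > 0) printf("%s - %d\n", LAST, COUNT) }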

############ awk file end ##################

Then run:
# awk -f countem.awk sortedfile

It will give you the output you're looking for. :gift:
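
For comparison, the same sort-then-count-adjacent-runs idea can be sketched in Python. This is only an illustration, not part of the awk solution above; the function name count_sorted_runs is made up for the example, and it assumes the lines are already sorted so that duplicates sit next to each other.

```python
from itertools import groupby


def count_sorted_runs(lines):
    """Return (line, count) pairs for each run of identical adjacent lines.

    Assumes `lines` is already sorted, so equal lines are neighbours.
    """
    return [(value, sum(1 for _ in run)) for value, run in groupby(lines)]


if __name__ == "__main__":
    # A few addresses taken from the example file in the question.
    sample = [
        "68.210.101.222",
        "68.210.101.222",
        "67.117.110.152",
        "198.81.26.7",
        "67.117.110.152",
    ]
    for value, count in count_sorted_runs(sorted(sample)):
        print("%s - %d" % (value, count))
```

Like the awk version, this only needs to remember the current run, so it works on input that is sorted but too large to hold in a lookup table.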

Haden · Sep 17, 2003

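#!/usr/bin/perl

$name = $ARGV[0];

@arr = ();   # distinct lines, in order of first appearance
@cnt = ();   # $cnt[$i] is how many times $arr[$i] was seen
open(FL, $name) or die "Cannot open $name: $!";
while (<FL>)
{
    chomp;
    # linear search for the current line among those already seen
    $index = 0;
    ++$index until ($index > $#arr) || ($arr[$index] eq $_);
    if ($index > $#arr)
    {
        push(@arr, $_);
        push(@cnt, 1);
    } else
    {
        $cnt[$index]++;
    }
}
close(FL);

# print results, choose some readable form...
for ($i = 0; $i <= $#arr; $i++)
{
    print "$arr[$i] - $cnt[$i]\n";
}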

:beer:

Edit: just in case, to use it, throw the filename at the command line (./prgname file_to_analyze).
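
The script above reports lines in the order they first appear, which matches the sample output in the question, but it searches the whole list for every input line. A rough Python sketch of the same ordering with a dictionary lookup instead of a linear search might look like this. The function name count_in_first_seen_order is hypothetical, and the sketch relies on Python 3.7 or later, where dictionaries keep insertion order.

```python
def count_in_first_seen_order(lines):
    """Count each distinct line, keeping the order of first appearance."""
    counts = {}
    for line in lines:
        # dict lookup replaces the linear scan over previously seen lines
        counts[line] = counts.get(line, 0) + 1
    return list(counts.items())


if __name__ == "__main__":
    # The example file from the question.
    sample = [
        "68.210.101.222", "68.210.101.222",
        "67.117.110.152", "67.117.110.152",
        "24.207.128.236", "24.184.169.117",
        "24.207.128.236", "198.81.26.7",
        "24.184.169.117", "65.49.54.66",
    ]
    for value, count in count_in_first_seen_order(sample):
        print("%s - %d" % (value, count))
```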

LordJezo · Sep 17, 2003

Thanks!

Trying the solutions now!

I'll update on how they went.

glugglug · Sep 17, 2003

easiest way if this is UNIX or you have cygwin or similar utility installed:

sort filename | uniq -c

notfred · Sep 17, 2003

My perl solution:

#!/usr/bin/perl

open FILE, $ARGV[0] or die "Invalid file name";
while (<FILE>) {
    chomp $_;
    $lines{$_}++;
}
close FILE;

foreach $line (keys %lines) {
    print "$line - $lines{$line}\n";
}

Usage: thisfile.pl inputfile.txt
Or if you want to save to a file: thisfile.pl inputfile.txt > outputfile.txt
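
One thing to keep in mind with the hash approach is that Perl returns hash keys in no particular order, so the report order can vary. If the most frequent lines should come first, a sketch along these lines in Python could do it. It uses collections.Counter from the standard library; the function name format_report is invented for the example.

```python
from collections import Counter


def format_report(lines):
    """Return 'line - count' strings, most frequent line first."""
    # Strip the trailing newline so the last line of a file without one
    # is counted together with its duplicates.
    counts = Counter(line.rstrip("\n") for line in lines)
    return ["%s - %d" % (value, n) for value, n in counts.most_common()]


if __name__ == "__main__":
    sample = ["24.207.128.236\n", "198.81.26.7\n", "24.207.128.236\n"]
    for row in format_report(sample):
        print(row)
```

To run it against a real file, an open file object can be passed in place of the sample list, since iterating over a file yields its lines.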

Barnaby W. Füi · Sep 17, 2003

uniq -c wins by a long shot

thornc · Sep 19, 2003

Originally posted by: BingBongWongFooey
uniq -c wins by a long shot

BBWF, just out of curiosity how would you code something in Python for this problem??

LordSnailz · Sep 19, 2003

just passing by and wanted to give props to those that offered solutions ... you guys rock!

Barnaby W. Füi · Sep 19, 2003

Originally posted by: thornc

Originally posted by: BingBongWongFooey
uniq -c wins by a long shot

BBWF, just out of curiosity how would you code something in Python for this problem??

Pretty much the same as notfred did, just in Python.

import sys

matches = {}

# build up our counts
for line in sys.stdin:
    # drop the newline so a final line without one still matches its duplicates
    line = line.rstrip("\n")
    if line in matches:
        matches[line] += 1
    else:
        matches[line] = 1

# print out results
for line, count in matches.items():
    print("%d: %s" % (count, line))
