• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

Need help taking data from html into excel!

Akaz1976

Platinum Member
There is a table of data at following website and i was wondering if there was a way of taking that data into excel spread sheet.

website

I need two columns from that website. One containing date the other the value of Book to bill ratio for that date.

Can it be done?

Thanks

Akaz
 
Well its about 205 data points (17x12+1). But more importantly i do this sort of work often and would really like to learn a way that would help me in the future as well!

Akaz
 
Best way if the data is in a constan location within the page, analyze the doc and develop a script/simple parsing program to extract the fields you wish and put them into the data format that you need.

Do an initial search for the table and then parse out the individual fields desired.
 
If you run the source code from that page through this script:

#!/usr/bin/perl
open F, "in.txt";
open OUT, ">out.txt";
$stripped;
while ($in = <F> ) {
$in =~ s/<[^>]*?>//g;
$stripped .= $in;
}
$stripped =~ s/\s*\n\s*[\s\n]+/\n/g;
@stripped = split /\n/, $stripped;
for (my $i = 0; $i < @stripped; $i++) {
if ($stripped[$i] =~ /^\D/ || $stripped[$i] =~ /[A-Za-z]/ || $stripped[$i] =~ /^\d{4}$/) {
$stripped[$i] = '';
}
if ($stripped[$i] =~ /^\d\//) {
$stripped[$i] = "0$stripped[$i]";
}
$stripped[$i] =~ s/^(\d{2})\/(\d{2})(.*)/$2\/$1$3/;
}
@stripped = sort @stripped;
for (my $i = 0; $i < @stripped; $i++) {
$stripped[$i] =~ s/^(\d{2})\/(\d{2})(.*)/$2\/$1$3/;
}
$div ='';
foreach $line(@stripped) {
if($line){
$line =~ /^\d{2}\/(\d{2}).*/;
$new = $1;
print "|$new - $div| \n";
if($new eq $div){
print OUT "$line\t";
}
else {
$div = $new;
print OUT "\n$line\t";
}
}
}


You get the following output:


01/00--1.08 02/00--1.15 03/00--1.22 04/00--1.22 05/00--1.23 06/00--1.18 07/00--1.21 08/00--1.20 09/00--1.18 10/00--1.09 11/00--1.01 12/00--0.92
01/01--0.88 02/01--0.79 03/01--0.70 04/01--0.63 05/01--0.66 06/01--0.74 07/01--0.90 08/01--0.94 09/01--0.89 10/01--0.86 11/01--0.87 12/01--0.94
01/02--1.01 02/02--1.00 03/02--0.98 04/02--0.94 05/02--0.96 06/02--0.96 07/02--0.99 08/02--0.98 09/02--0.95 10/02--0.91 11/02--0.92 12/02--0.96
01/03--1.02
01/85--0.90 02/85--0.85 03/85--0.82 04/85--0.75 06/85--0.83 07/85--0.82 08/85--0.86 09/85--0.90 10/85--0.94 11/85--0.92 12/85--0.97
01/86--1.03 02/86--1.09 03/86--1.06 04/86--1.05 05/86--0.80 05/86--1.01 06/86--1.01 07/86--1.01 08/86--1.07 09/86--1.03 10/86--1.02 11/86--1.01 12/86--1.09
01/87--1.10 02/87--1.19 03/87--1.17 04/87--1.14 05/87--1.09 06/87--1.11 07/87--1.19 08/87--1.13 09/87--1.08 10/87--1.00 11/87--1.02 12/87--1.03
01/88--1.05 02/88--1.05 03/88--1.04 04/88--0.99 05/88--0.97 06/88--0.96 07/88--1.01 08/88--1.01 09/88--1.01 10/88--0.96 11/88--0.97 12/88--0.96
01/89--1.01 02/89--1.01 03/89--1.04 04/89--1.04 05/89--1.04 06/89--1.01 07/89--1.00 08/89--0.99 09/89--0.98 10/89--0.97 11/89--0.98 12/89--0.98
01/90--0.99 02/90--0.99 03/90--1.02 04/90--1.01 05/90--1.01 06/90--1.03 07/90--1.05 08/90--1.05 09/90--1.00 10/90--0.97 11/90--0.96 12/90--0.97
01/91--0.98 02/91--1.00 03/91--1.00 04/91--0.99 05/91--1.00 06/91--1.00 07/91--1.04 08/91--1.03 09/91--1.01 10/91--0.94 11/91--0.94 12/91--0.97
01/92--1.03 02/92--1.03 03/92--1.02 04/92--1.01 05/92--1.00 06/92--1.00 07/92--1.02 08/92--1.06 09/92--1.05 10/92--0.99 11/92--0.99 12/92--1.01
01/93--1.07 02/93--1.04 03/93--1.04 04/93--1.01 05/93--1.01 06/93--1.02 07/93--1.02 08/93--0.98 09/93--0.96 10/93--0.95 11/93--0.99 12/93--1.02
01/94--1.11 02/94--1.13 03/94--1.13 04/94--1.07 05/94--1.03 06/94--1.03 07/94--1.02 08/94--1.02 09/94--0.98 10/94--0.96 11/94--0.96 12/94--1.01
01/95--1.07 02/95--1.08 03/95--1.05 04/95--1.06 05/95--1.07 06/95--1.08 07/95--1.09 08/95--1.14 09/95--1.15 10/95--1.08 11/95--1.02 12/95--1.02
01/96--1.06 02/96--1.03 03/96--0.96 04/96--0.92 05/96--0.94 06/96--0.94 07/96--0.97 08/96--0.99 09/96--1.02 10/96--1.03 11/96--1.02 12/96--1.02
01/97--1.04 02/97--1.10 03/97--1.10 04/97--1.07 05/97--1.00 06/97--1.00 07/97--0.99 08/97--1.01 09/97--1.04 10/97--1.05 11/97--1.05 12/97--1.04
01/98--1.03 02/98--1.03 03/98--1.01 04/98--0.97 05/98--0.96 06/98--0.99 07/98--1.05 08/98--1.09 09/98--1.08 10/98--1.01 11/98--0.97 12/98--0.99
01/99--1.03 02/99--1.04 03/99--1.04 04/99--1.00 05/99--1.02 06/99--1.01 07/99--1.05 08/99--1.04 09/99--1.05 10/99--1.03 11/99--1.01 12/99--1.05
 
mm.. I am running office xp and when I right click on the table it gives me an option to export to excel.. which I did. If you need the data PM me and I will email it to you.

 
Back
Top