Offline copy of Wikipedia with images?

BigToque

Lifer
Oct 10, 1999
11,700
0
76
My school cuts off internet access during school hours, but we're supposed to still be able to get to Wikipedia. The IT staff here can't seem to figure out why Wikipedia loads but none of the images do, even though they've supposedly allowed wikipedia.org and wikimedia.org through whatever filter they have set up.

I've wasted enough time waiting for these people to fix the issue and would just like a local copy of Wikipedia. I know I can download a copy of the database, and I found the torrent that's linked from the Wikipedia website, but it only has the text (at least that's the impression I got from one of the comments left on the torrent).

How do I get a copy of Wikipedia that includes all the images?
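For what it's worth, the database dump that torrent mirrors comes from dumps.wikimedia.org, and it is indeed text only. A sketch of pulling the latest English-language articles dump directly, assuming the standard filename pattern still applies:

# Sketch: fetch the current English Wikipedia articles dump (text only).
# The "latest" symlink and filename pattern are the usual layout on
# dumps.wikimedia.org; check the dump index for the exact file.
# --continue resumes a partial download, useful for a file this large.
wget --continue https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2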
 

lxskllr

No Lifer
Nov 30, 2004
59,668
10,179
126
wget?

robots.txt said:
# Sorry, wget in its recursive mode is a frequent problem.
# Please read the man page and use it properly; there is a
# --wait option you can use to set the delay between hits,
# for instance.
#
User-agent: wget
Disallow: /
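Worth noting: that block targets wget's default user-agent string in recursive mode, and wget does have the politeness knobs the comment mentions. A minimal sketch for grabbing a single article plus its images (standard wget flags only; the URL is a placeholder, and the images live on a separate wikimedia.org host, hence --span-hosts):

# Sketch: one page plus the files needed to render it, with a delay
# between requests. --page-requisites pulls in images/CSS/JS, and
# --span-hosts/--domains allow following image links to upload.wikimedia.org.
wget --wait=2 --random-wait \
     --page-requisites --convert-links \
     --span-hosts --domains=wikipedia.org,wikimedia.org \
     https://en.wikipedia.org/wiki/Example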
 

lxskllr

No Lifer
Nov 30, 2004
59,668
10,179
126

robots.txt said:
#
# robots.txt for http://www.wikipedia.org/ and friends
#
# Please note: There are a lot of pages on this site, and there are
# some misbehaved spiders out there that go _way_ too fast. If you're
# irresponsible, your access to the site may be blocked.
#

# advertising-related bots:
User-agent: Mediapartners-Google*
Disallow: /

# Wikipedia work bots:
User-agent: IsraBot
Disallow:

User-agent: Orthogaffe
Disallow:

# Crawlers that are kind enough to obey, but which we'd rather not have
# unless they're feeding search engines.
User-agent: UbiCrawler
Disallow: /

User-agent: DOC
Disallow: /

User-agent: Zao
Disallow: /

# Some bots are known to be trouble, particularly those designed to copy
# entire sites. Please obey robots.txt.
User-agent: sitecheck.internetseer.com
Disallow: /

User-agent: Zealbot
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: Fetch
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: linko
Disallow: /

User-agent: HTTrack
Disallow: /


User-agent: Microsoft.URL.Control
Disallow: /

User-agent: Xenu
Disallow: /

User-agent: larbin
Disallow: /

User-agent: libwww
Disallow: /

User-agent: ZyBORG
Disallow: /

User-agent: Download Ninja
Disallow: /

#
# Sorry, wget in its recursive mode is a frequent problem.
# Please read the man page and use it properly; there is a
# --wait option you can use to set the delay between hits,
# for instance.
#
User-agent: wget
Disallow: /

#
# The 'grub' distributed client has been *very* poorly behaved.
#
User-agent: grub-client
Disallow: /

#
# Doesn't follow robots.txt anyway, but...
#
User-agent: k2spider
Disallow: /

#
# Hits many times per second, not acceptable
# http://www.nameprotect.com/botinfo.html
User-agent: NPBot
Disallow: /

# A capture bot, downloads gazillions of pages with no public benefit
# http://www.webreaper.net/
User-agent: WebReaper
Disallow: /

# Don't allow the wayback-maschine to index user-pages
#User-agent: ia_archiver
#Disallow: /wiki/User
#Disallow: /wiki/Benutzer

#
# Friendly, low-speed bots are welcome viewing article pages, but not
# dynamically-generated pages please.
#
# Inktomi's "Slurp" can read a minimum delay between hits; if your
# bot supports such a thing using the 'Crawl-delay' or another
# instruction, please let us know.
#
User-agent: *
Disallow: /w/
Disallow: /trap/
Disallow: /wiki/Especial:Search
Disallow: /wiki/Especial%3ASearch
Disallow: /wiki/Special:Collection
Disallow: /wiki/Spezial:Sammlung
Disallow: /wiki/Special:Random
Disallow: /wiki/Special%3ARandom
Disallow: /wiki/Special:Search
Disallow: /wiki/Special%3ASearch
Disallow: /wiki/Spesial:Search
Disallow: /wiki/Spesial%3ASearch
Disallow: /wiki/Spezial:Search
Disallow: /wiki/Spezial%3ASearch
Disallow: /wiki/Specjalna:Search
Disallow: /wiki/Specjalna%3ASearch
Disallow: /wiki/Speciaal:Search
Disallow: /wiki/Speciaal%3ASearch
Disallow: /wiki/Speciaal:Random
Disallow: /wiki/Speciaal%3ARandom
Disallow: /wiki/Speciel:Search
Disallow: /wiki/Speciel%3ASearch
Disallow: /wiki/Speciale:Search
Disallow: /wiki/Speciale%3ASearch
Disallow: /wiki/Istimewa:Search
Disallow: /wiki/Istimewa%3ASearch
Disallow: /wiki/Toiminnot:Search
Disallow: /wiki/Toiminnot%3ASearch
#
...

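The catch-all section at the end is the operative part: slow, friendly crawlers are welcome on the /wiki/ article pages, but the dynamic /w/ and /trap/ paths are off limits. A sketch of a recursive fetch scoped to match (standard wget flags; the depth and start URL are placeholders). Note that wget obeys robots.txt on recursive downloads by default, and the file above disallows wget's user agent outright, so in practice the database dumps are the sanctioned route:

# Sketch: shallow, slow, and scoped per the rules above.
# -X/--exclude-directories skips the disallowed dynamic paths.
wget --recursive --level=1 --wait=2 --random-wait \
     --exclude-directories=/w,/trap \
     --page-requisites --convert-links \
     --span-hosts --domains=wikipedia.org,wikimedia.org \
     https://en.wikipedia.org/wiki/Main_Page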
 

rivan

Diamond Member
Jul 8, 2003
9,677
3
81
Just buy a copy of the whole internet from the Elders.

[attached image: internet.jpg]
 

AstroManLuca

Lifer
Jun 24, 2004
15,628
5
81
Wow, when I was in school they would tell us to not use Wikipedia as a primary source, just as a jumping-off point. Now they're blocking everything EXCEPT Wikipedia so it's impossible to verify any of the information stored there?

Whoever is making the decision there is brain dead.
 

alangrift

Senior member
May 21, 2013
434
0
0
AstroManLuca said:
Wow, when I was in school they would tell us to not use Wikipedia as a primary source, just as a jumping-off point. Now they're blocking everything EXCEPT Wikipedia so it's impossible to verify any of the information stored there?

Whoever is making the decision there is brain dead.

NEVER use Wikipedia or blogs as primary sources (unless they did their own research).

I think people just copy the citations off Wikipedia anyway.
 

GlacierFreeze

Golden Member
May 23, 2005
1,125
1
0
BigToque said:
My school cuts off internet access during school hours, but we're supposed to still be able to get to Wikipedia. The IT staff here can't seem to figure out why Wikipedia loads but none of the images do, even though they've supposedly allowed wikipedia.org and wikimedia.org through whatever filter they have set up.

I've wasted enough time waiting for these people to fix the issue and would just like a local copy of Wikipedia. I know I can download a copy of the database, and I found the torrent that's linked from the Wikipedia website, but it only has the text (at least that's the impression I got from one of the comments left on the torrent).

How do I get a copy of Wikipedia that includes all the images?

Where do you go to school?

Sounds like a dumb school. Decided it was a good idea to block Internet access? Then allowed only Wiki? IT staff doesn't know why Wiki images won't load? wtf