How to list directory contents of a web folder via a Linux command?

Special K

Diamond Member
Jun 18, 2000
7,098
0
76
This seems like it would be simple, but Google is failing me. How can I retrieve a listing of the files contained within a directory on a website? I know about wget and curl, but those only seem to work when you know the exact filename. Is there a way to simply retrieve a listing of the files in a particular directory?
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
If we're talking about HTTP, you can't. Unless the server is set up to show directory listings, that information is intentionally hidden and cannot be made to show itself.

Now, if you have shell access to the box, a simple ls in the directory you want more information on will suffice.
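
For the case where the server does expose a listing, here is a minimal Python sketch that fetches the auto-generated index page and pulls the links out of it. The names LinkCollector, links_in_listing and list_web_directory are made up for illustration, and it assumes an Apache-style index in which every entry is a plain <a href> link. If listings are disabled, you will get the site's index page or an error instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_in_listing(html, base_url):
    """Return absolute URLs of the entries that live under base_url."""
    parser = LinkCollector()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        # Column-sort links on auto-generated indexes look like "?C=N;O=D".
        if href.startswith("?"):
            continue
        url = urljoin(base_url, href)
        # Drop the parent-directory link and anything outside this directory.
        if url.startswith(base_url) and url != base_url:
            urls.append(url)
    return urls


def list_web_directory(base_url):
    """Fetch base_url (which should end with "/") and list the links on it."""
    with urlopen(base_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return links_in_listing(html, base_url)
```

Calling list_web_directory("http://example.com/podcasts/") (a placeholder URL) would return the file URLs only if that server has directory indexing turned on.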
 

Special K

Yes, this is over HTTP. I found a site with a daily podcast I want to check out, and I want to set up a script to automatically download the latest one each night. The podcast filenames follow a regular format, except for an underscore at the end followed by what appear to be random characters, or at least characters without any discernible pattern. The filenames do have the current date in them.

UPDATE: I have discovered that the "random" characters are actually a 4-digit code, followed by a or p for AM or PM, that specifies the time the podcast was uploaded. For example, if the podcast was uploaded at 3:13 AM, the podcast file would end with _0313a. Nevertheless, this still doesn't help me, as the podcasts are uploaded at more or less random times. All I can be sure of is that there will be one AM and one PM podcast each day.

How can I download the latest one of these automatically if I don't know the exact complete filename? If I could do directory listings it would be simple.
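
If a listing were available, picking the newest file would mostly come down to decoding that time suffix. Here is a rough Python sketch, assuming the four digits are HHMM on a 12-hour clock and the trailing letter is a for AM or p for PM. The function name upload_time and the example names are hypothetical, not the site's real filenames.

```python
import re
from datetime import time

# Trailing code such as "_0945a": two hour digits, two minute digits, a or p.
SUFFIX_PATTERN = re.compile(r"_(\d{2})(\d{2})([ap])$")


def upload_time(stem):
    """Decode the trailing code of a filename (without extension) to a 24-hour time.

    Returns None when the name does not end in a valid code.
    """
    match = SUFFIX_PATTERN.search(stem)
    if match is None:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    if not (1 <= hour <= 12 and minute <= 59):
        return None
    # 12-hour to 24-hour: 12 AM becomes hour 0, 12 PM stays hour 12.
    hour = hour % 12 + (12 if match.group(3) == "p" else 0)
    return time(hour, minute)
```

Sorting the names for a given date by upload_time would then put the latest upload last.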
 

ViRGE

Special K said: "How can I download the latest one of these automatically if I don't know the exact complete filename? If I could do directory listings it would be simple."
Use their RSS feed to get the URL. This scenario is precisely what it's for.
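
Something along these lines would do it from a nightly script. This is a sketch in Python using only the standard library, assuming the feed is plain RSS 2.0 where each item has a pubDate and an enclosure element carrying the file URL. The function names and any feed URL you pass in are placeholders.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.request import urlopen, urlretrieve


def latest_enclosure_url(rss_xml):
    """Return the enclosure URL of the most recently published item, or None."""
    root = ET.fromstring(rss_xml)
    newest_url, newest_date = None, None
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        pub_date = item.findtext("pubDate")
        if enclosure is None or pub_date is None:
            continue  # not a downloadable episode, or no date to compare
        # pubDate uses the RFC 822 date format, e.g. "Mon, 01 Jan 2024 03:13:00 +0000".
        published = parsedate_to_datetime(pub_date)
        if newest_date is None or published > newest_date:
            newest_url, newest_date = enclosure.get("url"), published
    return newest_url


def download_latest(feed_url, destination):
    """Fetch the feed, then save the newest episode to destination."""
    with urlopen(feed_url) as response:
        url = latest_enclosure_url(response.read())
    if url is not None:
        urlretrieve(url, destination)
    return url
```

Run download_latest from cron with the podcast's feed URL and a local path. Comparing pubDate values instead of trusting item order means it still works if the feed is not sorted newest-first.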
 

Special K

ViRGE said: "If we're talking about HTTP, you can't. Unless the server is set up to show directory listings, that information is intentionally hidden and cannot be made to show itself.

Now, if you have shell access to the box, a simple ls in the directory you want more information on will suffice."

Is listing directory contents over HTTP a security risk or something? Why wouldn't they allow a user to do that?

They can always restrict what directories a user can view, but if all the files in a directory are downloadable by anyone, then I don't see the harm in allowing users to do an ls in that directory.
 

ViRGE

Special K said: "Is listing directory contents over HTTP a security risk or something? Why wouldn't they allow a user to do that?"
That's exactly it. They don't want every document visible, especially if there are settings files located in that directory, or if it means server-side scripts that should be executed get transmitted to the user instead. Furthermore, a lot of sites like to have directories jump to an index of some kind (e.g. forums.anandtech.com really goes to forums.anandtech.com/index.php), which requires that the directory redirect to that file. There's no convention in HTTP for both doing that and showing the contents of a directory; it's one or the other.

Honestly, this is why FTP was invented. However, that's all but gone by the wayside for most download services now.