
How hard is it to write a simple crawler in Java?

enwar3

Golden Member
A startup I want to work at wants me to complete a programming assignment (which seems pretty involved for an interview process). They asked for a simple crawler that makes a query to a search site and returns the number of results and the actual results in an array.

My question is, how long does this take to write in Java? Is this a huge project or is this something I can do in a couple hours? I'm pretty proficient in Java but I haven't done any iostream web stuff.
 
I like people who don't nef in our forum

Markbnj
Programming moderator
 
I've had companies require me to take timed, online tests that took close to an hour, or debug problem code, not to mention multiple interviews that each involved investing at least an hour and sometimes more. I think if a company asked me to submit a small program that might take up to two hours to create I would actually be pleased. I can do it at home, I don't need to put on a tie, and I know I can do a good job at it.
 
... I think if a company asked me to submit a small program that might take up to two hours to create I would actually be pleased...

To quote Admiral Ackbar: it's a trap. For two, maybe three, reasons.

1. No question is ever so well asked that there is not wiggle room in a correct answer. Only a complete program spec, covering all situations, is really sufficient. And adherence to a complete spec would take way too long for a take-home interview question. Language? Platform? Language version? Allowed libraries? Environment assumptions? Compiler? Compiler version? Efficiency? Memory usage? I/O requirements? Naming conventions? Program structure? etc.

2. Regardless of the quality of your coding skills, there is going to be something 'wrong' with it. Every company, and consequently, every technical interviewer, has a different coding practice and different coding standards. If the organization has a coding standard, the interviewer is used to that standard. Deviations from that standard automatically seem conspicuous, even if the code is correct.

E.g., How much error checking is expected? Too much and the code looks overly complicated. Too little, and the code looks careless. What is the company culture about this kind of thing? You, the interviewee, don't know, because you don't work for the company...

3. There is no guarantee that whoever reviews your code knows their @$$ from their elbow in the first place. It's very common to find that your interviewer knows less about programming (or whatever the field happens to be) than you do. However, interviews are commonly organized such that the interviewer is expected to be in the position of superiority, both intellectually and otherwise. When those roles are disturbed, people react in funny ways, e.g., with unjust negative opinions.

This point (#3) really only counts for half credit, because it is not specific to take-home coding assignments -- it can happen in any interview -- but a take-home assignment gives the @$$hat interviewer ammunition.
 
The scope would certainly have to be defined, but if you can't get past most of those issues in a discussion of your solution then you probably don't want to work for them anyway.
 
I've had companies require me to take timed, online tests that took close to an hour, or debug problem code, not to mention multiple interviews that each involved investing at least an hour and sometimes more. I think if a company asked me to submit a small program that might take up to two hours to create I would actually be pleased. I can do it at home, I don't need to put on a tie, and I know I can do a good job at it.

Agreed. No pressure from some employer looking over your shoulder while you work through a programming problem.

What I don't understand is why I have to go through 4-6 rounds of interviews over a period of several months just to get a software development position. I'm on my 4th round so far for a job I applied for back in November. It's really frustrating and nerve-racking to keep waiting and waiting to see if you will advance to the "semi finals".
 
A startup I want to work at wants me to complete a programming assignment (which seems pretty involved for an interview process). They asked for a simple crawler that makes a query to a search site and returns the number of results and the actual results in an array.

My question is, how long does this take to write in Java? Is this a huge project or is this something I can do in a couple hours? I'm pretty proficient in Java but I haven't done any iostream web stuff.

http://www.gotoandlearn.com/ has a similar task, though written in ActionScript, as part of the "Flex in a Week" class. (It's either at gotoandlearn or in the Adobe "Flex in a Week" series.)

It queries the Flickr API and returns a table of search results.

Might be worth taking a look at. Different code, similar task.
 
Tip: watch out for loops / cycles in the website.

You need to track all of the URLs you've visited so far, while still treating similar links with different parameters as distinct pages, e.g. .../contentserve.php?page=1234 is not the same page as .../contentserve.php?page=4567
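A minimal sketch of that bookkeeping, in case it helps. Everything named here is made up for illustration: the `VisitedTracker` class, the `normalize` rule (drop the `#fragment`, keep the query string), and the in-memory `linkGraph` map that stands in for actually fetching pages and extracting their links. A real crawler would probably want a stricter normalization (host case, trailing slashes, parameter order).

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class VisitedTracker {

    /**
     * Hypothetical normalization rule: resolve "." and ".." path segments and
     * drop the #fragment, but keep the query string, so that ?page=1234 and
     * ?page=4567 remain two different pages.
     */
    public static String normalize(String url) {
        String s = URI.create(url).normalize().toString();
        int hash = s.indexOf('#');
        return hash >= 0 ? s.substring(0, hash) : s;
    }

    /**
     * Breadth-first walk over an in-memory link graph (a stand-in for
     * downloading each page and extracting its links). Each normalized URL
     * is visited at most once, so cycles cannot cause an endless loop.
     * Returns the URLs in the order they were visited.
     */
    public static List<String> crawl(String start, Map<String, List<String>> linkGraph) {
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        List<String> order = new ArrayList<>();

        queue.add(normalize(start));
        while (!queue.isEmpty()) {
            String url = queue.poll();
            // Set.add returns false if the URL was already present: skip repeats.
            if (!visited.add(url)) {
                continue;
            }
            order.add(url);
            for (String link : linkGraph.getOrDefault(url, Collections.emptyList())) {
                queue.add(normalize(link));
            }
        }
        return order;
    }
}
```

The `HashSet` is what breaks the cycles: page A linking to B and B linking back to A only ever puts each of them in `order` once.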
 
I would look at using HTTP GET/POST methods with an HttpURLConnection object to send the search request and then receive the HTML page with the search results. Once you have the HTML, you can use an XML parser to parse it, provided the page is well-formed (XHTML-style) markup. HTML is close to XML, but real-world pages are often not valid XML and a strict parser will reject them. There's an XML parser in the Java standard library. From there it should be pretty straightforward.

It shouldn't be that hard of a project but I think it would take more than a couple of hours, depending on how familiar you are with this stuff already.
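Here is a rough sketch of the HttpURLConnection plus XML parser approach described above, using only the standard library. Treat the details as assumptions: the `SearchCrawler` class name, the `?q=` query parameter, and the idea that each hit is an `<a class="result" href="...">` element are all invented for the example, since the actual search site and its markup weren't specified. It also only works when the response is well-formed XML.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class SearchCrawler {

    /** Sends a GET request to baseUrl?q=<query> (hypothetical parameter name) and returns the body. */
    public static byte[] fetch(String baseUrl, String query) throws IOException {
        String encoded = URLEncoder.encode(query, "UTF-8");
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(baseUrl + "?q=" + encoded).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Parses a well-formed (XHTML-style) page and returns the href of every
     * <a class="result"> element. The number of results is the array length.
     * Throws IllegalArgumentException if the page is not well-formed XML.
     */
    public static String[] parseResults(byte[] page) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(page));
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> results = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                // Assumed markup: search hits carry class="result".
                if ("result".equals(a.getAttribute("class"))) {
                    results.add(a.getAttribute("href"));
                }
            }
            return results.toArray(new String[0]);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Page is not well-formed XML", e);
        }
    }
}
```

Usage would look something like `String[] hits = SearchCrawler.parseResults(SearchCrawler.fetch("http://example.com/search", "java crawler"));`, with `hits.length` as the result count. If the site serves ordinary tag-soup HTML, the parse step will throw, and a lenient HTML parser would be needed instead of the strict XML one.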
 