the reason for my absolute hatred of mechanical HDDs

Page 2 - AnandTech Forums

Voo

Golden Member
Feb 27, 2009
1,684
0
76
Insane numbers of files for the amount of added functionality, all too often creating a hard-to-manage mess, is not a non-existent problem (the OP's copying speed is just a nice side effect). The culture of adding libraries and frameworks every other day also helps make it worse, of course.
I don't see the number of class files (not even source files) as one of the major factors in code complexity. Otherwise the pinnacle of maintainable software would be a single 200k-LoC C file? I mean, I've read lots of SE material, but that does seem a bit far-fetched (especially considering modern IDEs) - any white papers about that phenomenon? Having the interface declaration at the top of a file instead of in another file doesn't matter much when you can get to the point of declaration with a single keystroke in any case.

And why Java would be especially susceptible to adding libraries is a mystery to me. If anything, the Java standard library is MUCH larger than the C++ STL (heck, I can't remember the last even medium-sized C++ project without Boost). And in lots of cases reinventing the wheel is a really bad idea (though admittedly MUCH more fun - and it's nothing new that writing code is much easier than reading it).

Ripping out an application's guts when it's a big complex application that is fairly well-documented, including unfixed bugs (the kind that are easier to work around than fix, due to risk): generally bad. Ripping out an application's guts when its buggy by design, can be replaced modularly, with a quality live testing environment: depends.
Well, I've talked with people from IBM and MS, among others, about this topic, and so far not a SINGLE one of them could name any large code base that was rewritten from scratch (in whole or in large part) where it turned out to be a success. Not one of them, not a single example. Now, all of them agreed with the overall premise that throwing away large amounts of code is a bad idea, so maybe we're all biased - but then: please name one larger example from an established SW company. I'd be more than interested in hearing it!

Note that this doesn't mean that refactoring code is bad - spending a few months on a large project fixing architectural problems or ugly code through careful refactoring is a great thing to do. Or rewriting a small part of the application due to performance problems? All fine by me too.

Wait, how did we end up with a discussion about SE? A bit OT here.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Well I've talked with people from IBM or MS amongst things about this topic and so far not a SINGLE one of them could name any large code base that was rewritten from scratch or large parts of it and where it turned out to be a success. Not one of them, not a single example - now all of them agreed with the overall premise that throwing away large amounts of code is a bad idea, so maybe we're all biased, but then: Please name one larger example of an established SW company. I'd be more than interested in hearing it!

Neither of those corporations is a great innovator. Besides, the problem is in the phrasing... What is the difference between a "large code base that was rewritten from scratch" and "brand-new software written from scratch to replace this older one that doesn't work well"? When fundamental architectural flaws prevent code from being reusable, a brand-new project is made to replace it - which is the same as rewriting the whole thing from scratch, except it isn't called that.

Also, there is no such thing as a large code base that was ever written truly from scratch.
You use libraries to make your life easier, and you use a compiler to condense a significant amount of assembly code into a single simple command. So at what point do you draw the line and say "this was written from scratch"?

ZFS comes to mind as an example of starting from scratch and getting a good result. Chrome is arguably one (they did base it on WebKit), and they also wrote their own JavaScript compiler for it for greater efficiency.

Also suspect is "large code base"... Starting from scratch is VERY common; it is usually found in things like video games and various open-source programs (individual programs). But rewriting a massive code base like an entire operating system is a very, very expensive task. It isn't that it wouldn't be beneficial, but that it would take far too long to be commercially viable.

What open-source OSes have been doing is assembling an OS from myriad components. Individual components are tossed in the bin and rewritten "from scratch" all the time, and then a new OS can be assembled from those components. For example, FreeBSD, Solaris, etc. dumped older ext3-based filesystems for the ZFS code base (they still have the capability to use those filesystems, but default to ZFS). And didn't MS dump FAT in favor of NTFS? Heck, MS's entire 9.x line of OSes was tossed in the bin, and later home OSes were based on the NT branch. Nexenta took the Solaris kernel and coupled it with GNU userland. Ubuntu, Kubuntu, and Xubuntu share a kernel but use alternative userlands. For a very, very long time Microsoft used such open-source operating systems as file servers, despite them being the competition.
 

Voo

Neither of those corporations is a great innovator
Yeah, and the people there surely have NO idea about what's going on in the software world elsewhere, so we obviously could only have discussed internal projects?

ZFS comes to mind as an example of starting from scratch and getting a good result. Chrome is arguably one (they did base it on WebKit), and they also wrote their own JavaScript compiler for it for greater efficiency.
Neither of those was rewritten from scratch; those were just new projects, because they saw that their competitors lacked some features they thought lots of people would deem important - so they didn't have any chance to do anything else. Also, considering that Chromium uses WebKit as their rendering engine, it's pretty funny to say they started from scratch - the rendering engine of a browser is one of its most important parts and a rather large part of it. Sure, replacing small parts where performance is important and you can improve it? As I said, that's fine - but that's all Google did with WebKit.

Also, both filesystems and browsers are great examples of stagnant markets without much innovation - just look at how much has changed in the browser landscape since Google released Chrome. If Mozilla had had the same development speed in 2007, when Chrome development started, Chrome would have had a much harder time catching up. Actually, Chrome took quite some time to catch up in terms of features compared to the other browsers anyhow - if the comparison weren't Firefox 3.x vs. Chrome 1.0 (you know, the one without even a bookmark manager) but Firefox 7.1 vs. Chrome 1.0, how many people do you think would have switched?

If MS hadn't released any new IE while Netscape was wasting three years rewriting their code base from scratch, the browser landscape today would look quite different.


And there's a difference between something that can be continually improved (say, a rendering engine) and replacing something with a completely different product with different feature sets and goals. That's like using Notepad as the starting point for Word - the two have so little in common that they're completely different products, although both are used to write text files. By contrast, Netscape could have improved their browser without throwing out all the code.
 

taltamir

Neither of those was rewritten from scratch; those were just new projects, because they saw that their competitors lacked some features
I call BS. Both have massively different architectures from anything else on the market, were written from scratch, and perform identical TASKS to everything previously on the market (store data, or view web pages) - but with a properly designed architecture.

Also, considering that Chromium uses WebKit
Which I explicitly stated
as their rendering engine, it's pretty funny to say they started from scratch - the rendering engine of a browser is one of its most important parts
You know what is also very important? Sorting algorithms, and file access methods... and everything, really, that is done by a DLL or a compiler. See my discussion of "from scratch" in the post you replied to.
As important a component as a rendering engine is, Chrome was built as a sandboxing, multi-process (rather than multi-threaded), garbage-aware architecture to run it. Oh, and their own custom compiler. But yes, they relied on WebKit. I love how much focus you place on Chrome, though, seeing as I said "arguably Chrome, because they imported WebKit", yet my prime example was ZFS. Your original claim was "rewritten from scratch or large parts of it". ZFS is from scratch. Chrome is arguably from scratch, and definitely a case where "large parts of it" are brand new.

and a rather large part of it. Sure, replacing small parts where performance is important and you can improve it? As I said, that's fine - but that's all Google did with WebKit.
Fun fact: WebKit is not Chromium.

Also both filesystems and browsers are great examples of stagnant markets without much innovation
Not for lack of need, but for lack of budget. It takes a lot of time and money, and not many have the will and the budget. ZFS was exceptional for exactly that reason.

And there's a difference between something that can be continually improved (say a rendering engine) and replacing something with a completely different product with different feature sets and goals.
Yes... but that is not what is being discussed. We are discussing replacing product A with product B that has the same goals but achieves them in a completely different architectural manner.

That's like using notepad as the starting point for Word - the two have so few in common that they're completely different products

No, the two do the exact same thing. They are just different programs with different architectures. Just like NTFS, FAT, ZFS, and ext3 are from-scratch replacements for each other that do the same thing in a different architectural manner to address flaws in previous designs (or due to legal reasons forbidding the use of prior proprietary code).
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Wait, how did we end up with a discussion about SE? A bit OT here.
Well, you know, until the mods come in...

I don't see the number of class files (not even source files) as one of the major factors in code complexity. Otherwise the pinnacle of maintainable software would be a single 200k-LoC C file? I mean, I've read lots of SE material, but that does seem a bit far-fetched (especially considering modern IDEs) - any white papers about that phenomenon? Having the interface declaration at the top of a file instead of in another file doesn't matter much when you can get to the point of declaration with a single keystroke in any case.
Then to the next, then the next, then the next, and at some point what the point of it all is starts coming together. A single big 200-KLOC file? No. But a group of like functionality in a single file, instead of dispersed across several files, each very small, doing very little on its own, and not seeming to serve any purpose until the whole group is put together. It's fewer keystrokes than having to go through more stages than in other languages I've used to work out what a random piece of code is trying to do at any given time, and what it isn't doing right. Lack of multiple inheritance, FI, adds files that do basically nothing, so if you didn't write them, you could end up navigating little 'this extends that, and here's one little near-pointless method' files, trying to find something useful.
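That navigation cost is easy to picture with a tiny Java sketch. All names here are hypothetical, not from any real project; the point is only to show the "one near-pointless method per file" pattern described above, where in a real code base each of these types would sit in its own .java file and none does anything meaningful on its own:

```java
// Hypothetical names: in a real Java project, each of these types
// would live in its own .java file.
interface Handler {
    String handle(String input);
}

abstract class AbstractHandler implements Handler {
    // One near-pointless hook; subclasses exist mainly to fill it in.
    protected String prefix() {
        return "";
    }

    public String handle(String input) {
        return prefix() + input;
    }
}

class LoggingHandler extends AbstractHandler {
    @Override
    protected String prefix() {
        return "[log] ";
    }
}

public class HierarchyDemo {
    public static void main(String[] args) {
        Handler h = new LoggingHandler();
        System.out.println(h.handle("request")); // prints "[log] request"
    }
}
```

Three types, three files in practice, and the actual behavior only becomes visible once you have read all of them together.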

And why Java would be especially susceptible to adding libraries is a mystery to me. If anything, the Java standard library is MUCH larger than the C++ STL (heck, I can't remember the last even medium-sized C++ project without Boost). And in lots of cases reinventing the wheel is a really bad idea (though admittedly MUCH more fun - and it's nothing new that writing code is much easier than reading it).
Poor included GUI(s) aside, I can think of Spring, Struts, Wicket, and JSF off the top of my head, and then there are the likes of Hibernate and iBATIS to make working with a DBMS more complicated (not because they necessarily have to, but because they allow programmers who can't handle SQL to use SQL DBMSes, and that rarely turns out well).

Well, I've talked with people from IBM and MS, among others, about this topic, and so far not a SINGLE one of them could name any large code base that was rewritten from scratch (in whole or in large part) where it turned out to be a success. Not one of them, not a single example. Now, all of them agreed with the overall premise that throwing away large amounts of code is a bad idea, so maybe we're all biased - but then: please name one larger example from an established SW company. I'd be more than interested in hearing it!
So would I. The phrasing was atrocious, and it's obvious I didn't proofread it, but nowhere was there anything about "from scratch". It should have read more like so:
Ripping out an application's guts when it's buggy by design, replacing it modularly, with a quality live testing environment: depends.
instead of:
Ripping out an application's guts when its buggy by design, can be replaced modularly, with a quality live testing environment: depends.
Start with what already needs changes and/or causes the most grief. Move on to whatever next most needs change or causes grief, and so on, doing more than just the minimum to fit requested changes, as part of a plan to make it work better into the future. From scratch would generally be a nightmare; even a gradual rewrite should only be done for code that consistently exhibits frailty. I have a feeling such code doesn't make it out of the likes of MS or IBM, on top of a from-scratch rewrite still generally being a bad idea.
 

Voo

Well, you know, until the mods come in...
;)

Then to the next, then the next, then the next, and at some point what the point of it all is starts coming together. A single big 200-KLOC file? No. But a group of like functionality in a single file, instead of dispersed across several files, each very small, doing very little on its own, and not seeming to serve any purpose until the whole group is put together. It's fewer keystrokes than having to go through more stages than in other languages I've used to work out what a random piece of code is trying to do at any given time, and what it isn't doing right. Lack of multiple inheritance, FI, adds files that do basically nothing, so if you didn't write them, you could end up navigating little 'this extends that, and here's one little near-pointless method' files, trying to find something useful.
Well, those abstract structures do have their purposes, but you surely can overdo it - though I don't see why the same code couldn't be produced in C# (you'd just have fewer files, but the same style could be - and certainly is - applied there). And personally, NOT having multiple inheritance is a big plus in my book - inheritance is abused often enough as it is ("Why the hell does this inherit from that?" is a question I ask myself often enough when reading OO code), and MI can lead to interesting problems down the road. Sure, it's useful sometimes, but in my experience only rarely. The distinction between interfaces and classes, with MI disallowed, works nicely imo. If you ask me, my biggest gripe with Java is not having first-class functions because they're against OOP principles - that leads to lots of useless boilerplate code and interfaces with one method - C# handles that MUCH more nicely (oh, and not having lambdas is evil for someone who likes functional programming ;) ). So I don't have a problem with thousands instead of hundreds of classes (and you don't seem to either, if I read you right), but more with pointlessly deep hierarchies - and in Java this just leads to lots of small classes compared to, say, C# or Python.
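The "interfaces with one method" boilerplate is easy to demonstrate with a small pre-Java-8 sketch (class and method names are made up for illustration): sorting a list by string length requires an anonymous class implementing the one-method `Comparator` interface, where a language with first-class functions would need a one-line lambda.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class BoilerplateDemo {
    // Sort strings by length. Without first-class functions (pre-Java-8
    // style), even this one-line comparison needs an anonymous class
    // implementing a one-method interface - the boilerplate in question.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new Comparator<String>() {
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        Collections.addAll(words, "pear", "fig", "apple");
        System.out.println(sortByLength(words)); // [fig, pear, apple]
    }
}
```

Five lines of ceremony for one line of logic; in C# the same call would be roughly `words.Sort((a, b) => a.Length - b.Length)`.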

While we're at it: what is FI?

Poor included GUI(s) aside, I can think of Spring, Struts, Wicket, and JSF off the top of my head, and then there are the likes of Hibernate and iBATIS to make working with a DBMS more complicated (not because they necessarily have to, but because they allow programmers who can't handle SQL to use SQL DBMSes, and that rarely turns out well).
But you get that in basically any other language used for this kind of work. And for lots of the things you named, a case can be made. E.g. ORMs avoid having to write lots and lots of boilerplate code while getting basically the same result, and at the same time avoid silly mistakes (and if you want to implement all the caching and so on yourself, you'll need quite a lot of rather general code - in the end you end up with a small Hibernate of your own). And if you start worrying about sharding and so on, we get into quite complicated code - I'm more than happy not having to write that myself. Having some framework that abstracts the database away seems hardly avoidable - you want to be able to abstract over different databases and, to some extent, the whole scalability problem, because even great programmers usually aren't good database experts, and can't spend all their time worrying about how to scale the backend. (That doesn't mean the programmers can do anything they want, though - the way large scalable applications are programmed is interesting; even a high-level overview of how e.g. Google does it is extremely interesting. Note that they actually use Hibernate internally for some projects, and afaik several of their guys contribute to Shards.)

Now, you can abuse it like any more complex technology, especially if you don't understand it (e.g. the N+1 problem) - but in my book it's still worth it.
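For readers unfamiliar with the N+1 problem mentioned here, a toy sketch shows the access pattern: one query to load the parent rows, then one additional query per parent. No real database or ORM is involved, and all names are hypothetical; each "query" just bumps a counter.

```java
import java.util.Arrays;
import java.util.List;

// Toy sketch of the N+1 query problem: lazily loading child rows
// inside a loop over the parents.
public class NPlusOneDemo {
    static int queryCount = 0;

    // Stands in for "SELECT id FROM orders".
    static List<Integer> loadOrderIds() {
        queryCount++;
        return Arrays.asList(1, 2, 3);
    }

    // Stands in for "SELECT * FROM items WHERE order_id = ?".
    static List<String> loadItemsFor(int orderId) {
        queryCount++;
        return Arrays.asList("item-a-" + orderId, "item-b-" + orderId);
    }

    // Naive lazy-loading pattern: 1 query for the orders + N queries for items.
    static int runNaive() {
        queryCount = 0;
        for (int id : loadOrderIds()) {
            loadItemsFor(id);
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("queries issued: " + runNaive()); // 4 for 3 orders
        // An ORM configured for eager fetching (or a hand-written JOIN)
        // would fetch the same data in one or two queries instead.
    }
}
```

With 1,000 orders the same loop issues 1,001 queries, which is why this pattern matters at scale even though the code looks harmless.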

So would I. The phrasing was atrocious, and it's obvious I didn't proofread it, but nowhere was there anything about "from scratch". [and so on]
Well, I can agree with that. Replacing some important small part of an application because you see it's just not up to the task - e.g. the JavaScript optimizer in Chrome - is fine with me. You can still use 98% of the existing code and still get the improved performance from the important part.
That, bundled with reasonable refactoring, is great - I've worked on projects where we decided to just stop implementing new features for a month or so and refactor problematic parts of the code base - in my experience that was usually totally worth the time. We didn't throw away much code, but the end result was MUCH better - and we could be quite sure we didn't introduce too many new bugs while doing it (and one finds lots of existing bugs that way, too).
 

Voo

I love how much focus you place on Chrome, though, seeing as I said "arguably Chrome, because they imported WebKit", yet my prime example was ZFS. Your original claim was "rewritten from scratch or large parts of it". ZFS is from scratch.
So please show me the application with a similar functional spec (which is what's usually understood in the software world by "rewriting" a product) that ZFS could have used. So which FS did exist back then that put its focus on data integrity, scalability and performance? It didn't exist? So how is it a REwrite if no comparable software with the same features even existed before it? The same goes for FAT and NTFS - even if MS had wanted to, they couldn't have improved FAT to include the same features as they wanted for NTFS, so they didn't have any alternative. Note that they didn't create a new FS just when they decided that FAT was ugly and they could do much better. No, they implemented a new FS as soon as it became obvious that the old one didn't have enough functionality (and that just changing a small part wouldn't do - they had done that well before deciding to create a new product).

Not for lack of need, but for lack of budget
So where exactly did Mozilla suddenly get all those extra funds since Chrome was released? They must have found a real goldmine - and not just noticed that not doing much for several years while a new competitor eats your lunch isn't a great idea.

No, the two do the exact same thing.
No, they don't. They were both designed for completely different operational areas. They both show text documents and allow you to edit them, but that's about it. Word's goals are much broader than Notepad's. Or are you claiming that their functional specifications would be identical? No idea what your definition of a rewrite is, but to cite Wikipedia (it's getting late here, so that will have to do):
Wikipedia said:
A rewrite in computer programming is the act or result of re-implementing a large portion of existing functionality without re-use of its source code

Did Google want to implement the existing functionality of Firefox or Internet Explorer? Well, obviously not - and you even say so yourself. Did ZFS want to implement the existing functionality of NTFS or ext3? Hardly.
 

taltamir

So which FS did exist back then that put its focus on data integrity, scalability and performance?
Those are measurements of the quality of the FS; the purpose is "be a file system", aka "store files on a disk". Scalability, data integrity, and performance are metrics of how good an FS is, and they required a rewrite from scratch because the previous FSes were architecturally flawed.

even if MS had wanted to, they couldn't have improved FAT to include the same features as they wanted for NTFS, so they didn't have any alternative.
So you admit it, then? Sometimes you have architecturally flawed software whose code cannot be reused and must be replaced.

So where exactly did Mozilla suddenly get all those extra funds since Chrome was released? They must have found a real goldmine - and not just noticed that not doing much for several years while a new competitor eats your lunch isn't a great idea.
I have no idea what you are trying to say in these sentences. As for where Mozilla gets its money: from Google.

Did Google want to implement the existing functionality of Firefox or Internet Explorer? Well, obviously not - and you even say so yourself. Did ZFS want to implement the existing functionality of NTFS or ext3? Hardly.
You are confusing functionality with quality.
Sandboxing = security.
Multi-threading/process = speed (for multi-core CPUs)
Data integrity = safety
Performance = Speed

Those are not "different design parameters"; those are measurements of the quality of the architecture of something that does the exact same thing, aka "view web pages" or "edit text documents" or "store data on a disk".
 

Voo

You are confusing functionality with quality.
Sandboxing = security.
Multi-threading/process = speed (for multi-core CPUs)
Data integrity = safety
Performance = Speed
No, those are all functional or non-functional requirements, which are defined by requirements engineering at the start of the project. If the spec didn't deem it important to allow for later improvements or changes in that area (and limiting the reach of a spec is rather important - you can't just say "Well, it should do everything and be improvable in any way later on!"), it may be extremely hard, or indeed impossible, to add some things later, or to put more focus on a non-functional requirement without losing some functional requirements.
In which case it is perfectly fine to create a new product with the new requirements in mind - most projects reach that stage sooner or later.

So you admit it then? sometimes you have an architecturally flawed software whose code cannot be reused and must be replaced.
You just don't get what REWRITE means. NTFS offered much more than just the "existing functionality" (functionality is a properly defined term in SE, so please don't just redefine it as you wish to support your argument) of FAT. Creating a new product with completely different functional and non-functional requirements is NOT the same as rewriting a product to fulfill exactly the same requirements an existing product already does.
 

taltamir

So your argument boils down to:
nobody ever successfully tosses a large code base to write a replacement from scratch, and if they do, it doesn't count, because it has "different features/design goals" such as "faster", "reliable", or "secure".

I fundamentally disagree with you that "it doesn't count", and I do not think there is any point in continuing this discussion, as we have both made our positions perfectly clear and simply disagree.
 

Voo

So your argument boils down to:
nobody ever successfully tosses a large code base to write a replacement from scratch, and if they do, it doesn't count, because it has "different features/design goals" such as "faster", "reliable", or "secure".
Not necessarily those non-functional features, because many (though not all! E.g. if one important requirement were to keep backwards compatibility with an older product, you may not be able to implement every desired security feature) can be integrated into an existing design. See modern Windows kernels and what they did from a reliability/security point of view.

It's more about functional requirements, which may very well be completely impossible to integrate into an existing product. I mean, trying to integrate, for example, all the important features of NTFS into FAT just wouldn't work (e.g. how would you implement alternate data streams in FAT?).

So my argument is that NTFS isn't a rewrite of FAT, as NTFS was clearly designed for different functional requirements which couldn't be implemented on the older spec (i.e. several old, and by now unnecessary or even detrimental, requirements in the FAT spec made it impossible to implement new features).

One of the reasons rewriting is such a bad idea in my book is that after several years you have basically exactly the same product (as that's the SE definition of a rewrite), which doesn't make customers really happy (at best they won't care: "Why should I pay money for exactly the same functionality I already have?"). If after several years you have a product with newly added functionality? Well, I can see the value in that ("Hard links? 32k character limitations? Sparse files? ADS? Cool features - where can I get NTFS?").

Writing this at 7 PM instead of 5 o'clock in the morning hopefully summarizes my point a bit more clearly - if we still disagree now, then at least it's not due to misunderstandings ;)
 

taltamir

One of the reasons rewriting is such a bad idea in my book is that after several years you have basically exactly the same product

Well, I can see the value in that ("Hard links? 32k character limitations? Sparse files? ADS? Cool features - where can I get NTFS?")

By your definition of rewrite, you are correct.
By my definition I am correct.

The only disagreement we seem to have is over the definition of the word "rewrite".