C++ vs Java - a motoring analogy

IainT

10,040 posts

240 months

Friday 8th June 2007
dilbert said:
I reckon it's pretty reasonable.

There are times when you just can't avoid using a manual. I've just been writing some code, and everything worked fine when working on small files. The trouble was that it just seemed to hang up and do nothing for ages if you loaded up bigger files, maybe 60 or 80 megabytes. (Prescott)

You start digging around and realise that the problem is actually coming from the way that you are allocating memory, or more specifically the lifetime of objects. Once the realisation is made, a couple of quick tweaks, and the software handles an 80 meg file just as easily as a small one. If your compiler or runtime environment is controlling cleanup of objects, I just can't see how one would make such optimisations.

Another issue that I can't see Java handling so well is that of heap corruption. So for example, I make a mistake somewhere, and corrupt a bit of the heap. I actually coded the heap that I normally use myself, so putting in checks to find the source of the problem is a doddle. The thing is that the boundary checks needed to find the problem can be so heavy that you can't get to the point where the problem occurs.

If you can't own your heap, I just can't see how you could perform such debugging. I suppose, being a virtual machine, access by reference can be protected by a software mechanism that C++ people would call an access violation or exception fault. Nevertheless I can't imagine the performance penalty of having to generate access violations in software rather than in hardware.
The problem with that way of thinking, as I see it, is that having to write my own heap and debug it just isn't useful in achieving my tasks.

If I'm writing a business app, why on earth do I want to be involved with that? It doesn't help me actually write the business app. As for your example of having to debug the heap - that's a great example of why many IT projects overrun and cost more than they should - the project team spend too much time fixing problems that either shouldn't exist or shouldn't be their problem.


The key thing is to use the relevant tool for the job. Java does hide/insulate the developer from some of the power of the underlying OS - functionality that one may well need to harness to achieve performance targets or work round some limitation of supplied libraries. For that, C++ (or, even better, assembler) would be a good tool.

If you want to create a spreadsheet to do your accounts it's sensible to use Excel (or equivalent) rather than write the spreadsheet programme...

dilbert

7,741 posts

233 months

Friday 8th June 2007
I'm no top level cross technology guru, but there is a question that I have, and it relates to the underlying compute support that all of these languages depend on.

Clearly the primary drive for languages like Java, C# and so on is not so much ease of memory management as cross-platform compatibility. I'd have thought that's the no-brainer. That's the thing that a C++ devotee like myself cannot argue with. In truth, there is no reason why C++ couldn't be a translated language. Depending on your view, you could argue that C# is just that. Fortunately it has a different name.

My question, though, relates to the way in which processor technology is developing. The highest-end microprocessors are now commonly dual core. One has to consider why. My interpretation is that the drive for performance is still there, but the constraints of silicon mean that it's difficult to deliver that performance in a single microprocessor core. It's a pretty surefire thing that the future is multiprocessor, and in real terms it's already here.

If you're going to use two cores efficiently then the code has to be written in such a way that the tasks can be distributed, and this is something that is still proving to be a challenge in many software development arenas. The thing is, though, that if you never delete an object because it's all taken care of for you, you never really know who owns it. In a single thread that doesn't matter. In two threads on the same processor it probably doesn't matter. In two threads on different processors it really does matter.
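
To make that ownership point concrete, here's a rough sketch in today's C++ (std::thread and std::unique_ptr post-date this thread, and the names are made up, so treat it as an illustration rather than anyone's actual code) of handing an object to another thread explicitly:

#include <memory>
#include <thread>
#include <vector>

struct Work {
    std::vector<double> samples;   // imagine this being large
};

// The worker takes the unique_ptr by value, so it becomes the sole owner
// and the Work object is deleted when the worker is finished with it.
void consume(std::unique_ptr<Work> w) {
    // ... process w->samples ...
}

int main() {
    std::unique_ptr<Work> w(new Work);
    w->samples.resize(1000000);
    std::thread worker(consume, std::move(w));   // ownership moves to the worker
    // From here on, the main thread can no longer touch the data - 'w' is empty.
    worker.join();
}

The hand-over and the eventual delete are both visible in the source, which is what lets you reason about which thread is responsible for what.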

So my question relates to the future of Java from the perspective that it may be inherently difficult to use it to take advantage of multiprocessor technologies. I don't use it, so I'm not sure. What I am sure about is that MS offer a product which has the platform independence capabilities of Java without the hidey-hole approach to memory management. Clearly my money is going to be on that as a future development platform.

My question, however, is: can Java compete at all in a multiprocessor world? If it can, can it do so as it is now, or must it change?

dern

14,055 posts

281 months

Friday 8th June 2007
JonRB said:
dern said:
I make an effort not to fall for the lure of supporting legacy systems for big money because I know that one day I'll be dropped along with the system and then have a hell of a hill to climb.
Yes, that's always a concern in the back of my mind too but at the moment I'm not resisting the lure, I'm embracing it. But there's still new development going on as well, which is good.

From having dabbled with C# I know that the hill isn't as steep as it could be as Microsoft seemed to have had C++ developers in mind when they specced C# (probably due to their huge investment in the language - Windows itself is written in C++, or was up until at least NT), but you're right, it is a concern.
You've already got the OO stuff though which I didn't have with C. The rest is just syntax at the end of the day.

GreenV8S

30,269 posts

286 months

Friday 8th June 2007
dilbert said:
So my question relates to the future of Java from the perspective that it may be inherently difficult to use it to take advantage of multiprocessor technologies. ... can Java compete at all in a multiprocessor world? If it can, can it do so as it is now, or must it change?
All the issues you're talking about have already been introduced and addressed by the Java threading model. Multi-threaded multi-core processing isn't fundamentally different to multi-threaded single core processing, and even the extra issues with distributed processing are fairly well understood and quite tractable. Distributed processing is where the future lies imo, now that communication bandwidth is so ubiquitous and small scale processing power is so cheap.

The issue about whether garbage collection should be done by the system or the programmer is just a matter of perspective. System programmers know that this is far too important to be left in the hands of application developers and has to be built into the runtime framework. And application developers know it's far too important to be left in the hands of system programmers and has to be managed by the application. Especially when the system programmers are the dozy twonks at JavaSoft, who wouldn't know a decent garbage collection strategy if it fell on them.

onomatopoeia

3,472 posts

219 months

Friday 8th June 2007
JonRB said:
plasticpig said:
If you are driving a manual car with C++ what is the guy who is using an assembly language driving?
Well, for a start, you're stretching the analogy because I defined it only for C++ and Java, so I'm throwing you an out-of-bounds error message. wink

But, if you insist, I'd say he was driving a car eligible for the London-Brighton run where you have to prime the fuel pump by hand, start it with a hand crank, advance or retard the ignition with a manual control, and double de-clutch when changing gear due to a lack of synchromesh. biggrin
Yet somehow while doing all this they get to Brighton in 15 minutes using only half a gallon of petrol. hehe

JonRB

Original Poster:

75,191 posts

274 months

Friday 8th June 2007
GreenV8S said:
Especially when the system programmers are the dozy twonks at JavaSoft, who wouldn't know a decent garbage collection strategy if it fell on them.
hehe

TheExcession

11,669 posts

252 months

Friday 8th June 2007
JonRB said:
GreenV8S said:
Especially when the system programmers are the dozy twonks at JavaSoft, who wouldn't know a decent garbage collection strategy if it fell on them.
hehe
and anther hehe

TheExcession

11,669 posts

252 months

Friday 8th June 2007
I think there's another couple of issues to this debate too.

1. Speed of code entry/development;

I started with C programming when IBM XTs were just starting to roll out. My first C compiler was Turbo-C, which came on, I think, 5 x 5-1/4 floppy disks! I got it off a friend at school and used to get my Dad to bring one of those 'portable' computers with a 5" green screen home so that I could practise. From the stuff that I'd been playing around on at school (BBCs, Apples, Spectrums, ZX81s) it was a real treat. I learnt all about link libraries and how to reduce your compile time from several hours to a few minutes by linking different obj files.

Then I got into programming on the Windows platform and still stayed doggedly with C programming until I got into Visual Basic. Man, even to this day I hate having to do anything in VB. Why? Simply because it takes so many key presses to achieve the same result that I could get in C in half the time.

I wonder if anyone else ever noticed this, but VB is such a long-winded language, and if the truth be known I find it incredibly hard to scan-read VB code, whereas C I could just look at and see what it was doing.


2. The shift to OO thinking (a bit OT, I know)

To be honest the step up from a procedural language (C) to a semi-OO language (as VB was in those days) didn't really confound me. I only really struggled when I took on Java. It took me a long while to shift my thought train. Once it clicked, I never looked back.


Returning a little now:

I don't want my code cluttered up with lots of memory allocation and deallocation code. TBH I don't even want to think about it; I want to concentrate on the application's functionality. As has been said already, that is what I'm being paid to do. That's what my contract deliverables are all about.

I want code that is quick and clean to read (VB out of the window there) and Java is the nearest thing to C that I know. I never went with C++, I can read it but writing it and knowing the nuances is something that I haven't had or got time to learn.

I also think a lot of it is to do with the languages you first cut your teeth on. I find nowadays that I can think in Java; I managed this with C too. I can see (C sic) pointers in my head, I understand what is going on in the background. I can structure my data in memory the way that I want to query it. With Java I suddenly found that the data and the code live together in classes - what a relief from C programming!

With Java I can code or write the design docs. I mean, as a language it just lives in my head and I 'think' in this language. I guess if we were talking French or German then I'd say I'm fluent. But like learning Latin, C programming is a fundamental that will give you a grounding in any computer language. (I guess someone is going to tell me that I missed something by not learning assembler now wink )

Programming... ahh... It's all beautiful.

best
Ex

(regarding key presses vs number of lines of code written I think I really need to look into Python and Ruby - just not had time yet)

dilbert

7,741 posts

233 months

Friday 8th June 2007
GreenV8S said:
dilbert said:
So my question relates to the future of Java from the perspective that it may be inherently difficult to use it to take advantage of multiprocessor technologies. ... can Java compete at all in a multiprocessor world? If it can, can it do so as it is now, or must it change?
All the issues you're talking about have already been introduced and addressed by the Java threading model. Multi-threaded multi-core processing isn't fundamentally different to multi-threaded single core processing, and even the extra issues with distributed processing are fairly well understood and quite tractable. Distributed processing is where the future lies imo, now that communication bandwidth is so ubiquitous and small scale processing power is so cheap.

The issue about whether garbage collection should be done by the system or the programmer is just a matter of perspective. System programmers know that this is far too important to be left in the hands of application developers and has to be built into the runtime framework. And application developers know it's far too important to be left in the hands of system programmers and has to be managed by the application. Especially when the system programmers are the dozy twonks at JavaSoft, who wouldn't know a decent garbage collection strategy if it fell on them.
Twonks or not, that all sounds a bit hand-wavy. I'd be the first to step back and say that the folks at Sun have some idea about the problems of multiprocessing. In fact it'd be difficult to imagine that they have less idea than I do.

The thing is though, it seems a bit odd that you'd have to tell your environment that a different thread or process is taking on an object that you own. For example, some static or read-only data could easily be shared between processes without worry. If that static data were big, it'd be really useful to be able to do that.

On the other hand, if the data were read/write, how would the environment tell that you'd passed the data to a different thread or process? To make things really complicated, you might have a situation where the read/write data is shared by two processes, and it's protected by a mutex.
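
As a purely illustrative sketch (again in later C++, with invented names), the read/write case might look like this - and note that nothing in the code tells a runtime when the shared object can safely be deleted, which is exactly the question being asked:

#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct Shared {
    std::mutex m;
    std::vector<int> values;   // read/write data touched by both threads
};

void writer(Shared* s) {
    for (int i = 0; i < 1000; ++i) {
        std::lock_guard<std::mutex> lock(s->m);
        s->values.push_back(i);
    }
}

void reader(Shared* s, long long* total) {
    std::lock_guard<std::mutex> lock(s->m);
    for (std::size_t i = 0; i < s->values.size(); ++i)
        *total += s->values[i];
}

int main() {
    Shared* s = new Shared;    // shared by both threads - who deletes it, and when?
    long long total = 0;
    std::thread t1(writer, s);
    std::thread t2(reader, s, &total);
    t1.join();
    t2.join();
    delete s;                  // here the programmer decides, explicitly
}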

I just can't envisage how a garbage collection environment could possibly figure out when to delete the object. Maybe the thing just eats memory until the whole program is finished, and abandons the heap at the end, but that seems a bit decadent for an application that might want to run for days, months or even years without crashing.

I suppose it's reasonable to assume that one could implement a heap in a Java environment, but surely that defeats the object. However I look at it, garbage collection seems to confuse the issue significantly.

I just don't see how it helps, that's all. Clearly it makes stuff that's easy even easier, but it seems that as one pushes on with it there's a rather nasty break, where things that were easy are suddenly rather more difficult, maybe even impossible.

Surely it's got to be easier to do your memory management as you go along, rather than having to sort it all out when it goes pear-shaped.

Size Nine Elm

5,167 posts

286 months

Friday 8th June 2007
Horses for courses...

Done a lot of C# and Java. And also written a heap system in C because (at that point) there wasn't a good instrumented heap management package available where you could turn on heap debugging.

A garbage-collected heap is fine for most application code, but even then you should understand what is going on under the hood so it doesn't bite you. Anything complex and algorithmic still needs to be careful about dropping references as early as possible - repeatedly garbage-collecting a gigabyte does tend to slow things down a tad.
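
A rough C++-flavoured sketch of the same principle - scope the big allocation tightly so it is released before the long-running phase, rather than letting it live to the end of the function (numbers and names invented):

#include <numeric>
#include <vector>

double process() {
    double summary = 0.0;
    {
        std::vector<int> big(250000000);   // roughly a gigabyte of ints
        // ... fill and use 'big' ...
        summary = std::accumulate(big.begin(), big.end(), 0.0);
    }   // 'big' is destroyed here, not at the end of the function
    // ... long-running work that only needs 'summary' goes here ...
    return summary;
}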

The MS .NET framework guys do seem pretty smart though; their stated aim is that a GC should cost about the same as a page fault.

I'll defer to Green V8S on the JavaSoft guys... smile

TheExcession

11,669 posts

252 months

Friday 8th June 2007
So should we put all this to a bit of a challenge then?

Perhaps we set a bit of a coding challenge, decide upon a requirement, code it up in all our various and preferred languages.

Ascertain the number of lines of code required and then do some performance testing?

best
Ex

JonRB

Original Poster:

75,191 posts

274 months

Friday 8th June 2007
TheExcession said:
So should we put all this to a bit of a challenge then?

Perhaps we set a bit of a coding challenge, decide upon a requirement, code it up in all our various and preferred languages.

Ascertain the number of lines of code required and then do some performance testing?
Count me out - I'm not interested in taking part in a religious war as to what is the "best" language - the entire point of my original post was why I (personally) stick with C++, the reason being that I enjoy working with it.

TheExcession

11,669 posts

252 months

Friday 8th June 2007
JonRB said:
I'm not interested in taking part in a religious war as to what is the "best" language - the entire point of my original post was why I (personally) stick with C++, the reason being that I enjoy working with it.
clap
Fair play Jon, I enjoy working with Java, but I still reckon I should be getting to grips with Python and Ruby. What I would say, and please don't take this as an insult, is that programming languages have moved on, and continue to move on, above and beyond where we are today.

best
Ex

dilbert

7,741 posts

233 months

Friday 8th June 2007
I'm not really interested in a coding war either. I've got enough on my plate as it is. Certainly some people are going to say that it's because I use C++. hehe

dern

14,055 posts

281 months

Saturday 9th June 2007
JonRB said:
TheExcession said:
So should we put all this to a bit of a challenge then?

Perhaps we set a bit of a coding challenge, decide upon a requirement, code it up in all our various and preferred languages.

Ascertain the number of lines of code required and then do some performance testing?
Count me out - I'm not interested in taking part in a religious war as to what is the "best" language - the entire point of my original post was why I (personally) stick with C++, the reason being that I enjoy working with it.
Same here. Nice to chat but I'm on holiday and have welding to do smile

GreenV8S

30,269 posts

286 months

Saturday 9th June 2007
dilbert said:
I just can't envisage how a garbage collection environment could possibly figure out when to delete the object. Maybe the thing just eats memory until the whole program is finished, and abandons the heap at the end, but that seems a bit decadent for an application that might want to run for days, months or even years without crashing.
Have a look at the Javasoft white papers on Java garbage collection and distributed garbage collection. Data can be shared transparently between threads; if the data is changing then you, the application designer, need to manage access to it, for example by synchronisation, to maintain integrity. Objects can be passed by value (copied) between processes, or referenced remotely. It's something that needs some thought, but the problems and the common solutions are fairly well known.

On the other hand, a garbage collection strategy that enables the VM to use as much memory as necessary and no more, seems to be completely beyond the reach of those twonks. They must have had half a dozen stabs at it, and *still* you have to tell it how much memory your application is going to need and tell it which GC algorithm you fancy trying today.

dilbert

7,741 posts

233 months

Saturday 9th June 2007
GreenV8S said:
dilbert said:
I just can't envisage how a garbage collection environment could possibly figure out when to delete the object. Maybe the thing just eats memory until the whole program is finished, and abandons the heap at the end, but that seems a bit decadent for an application that might want to run for days, months or even years without crashing.
Have a look at the Javasoft white papers on Java garbage collection and distributed garbage collection. Data can be shared transparently between threads; if the data is changing then you, the application designer, need to manage access to it, for example by synchronisation, to maintain integrity. Objects can be passed by value (copied) between processes, or referenced remotely. It's something that needs some thought, but the problems and the common solutions are fairly well known.

On the other hand, a garbage collection strategy that enables the VM to use as much memory as necessary and no more, seems to be completely beyond the reach of those twonks. They must have had half a dozen stabs at it, and *still* you have to tell it how much memory your application is going to need and tell it which GC algorithm you fancy trying today.
Funnily enough, as a result of this thread I have been doing some digging about just to see what's what. Without looking specifically for problems, those are some of the hits I get.

I can't fault a memory manager for reserving more memory than is needed, because in simple terms I know that is how a manual memory manager achieves speed. A heap is, by necessity, deep. At the bottom, reserved blocks are big and have the widest potential scope of allocation, and at the top reserved blocks are comparatively small and have the smallest scope of allocation.

The picture of how much memory is allocated varies depending on where you look. From the bottom it looks like, say, a meg is reserved, but from the top, in a specific application instance, maybe only 8 bytes is reserved. In practice the two applications may have an entitlement to half a meg each. The other application may be using all of its half-meg entitlement. Implicit allocation is then half a meg plus eight bytes, whilst explicit allocation is a whole meg.

As far as I can see, the idea that the system sees an allocation of one meg is fine, even though only half a meg plus eight bytes is actually being used. That's just performance-enhancing. If the application has two allocated chunks, frees one, and subsequently cannot re-reserve it, that's just bad.

This is the thing though: presumably you can make things safer by hanging onto the blocks and protecting them for a bit. The thing is that if you don't indicate the instant the chunk is free, you can't expect it to be available when you want it for something else. Of course if ownership is more ambiguous, the garbage collector merely needs to be more conservative, but that's done at the expense of freedom.

What I can't figure out though is why manual cleanup is so problematic for people. For me, most of anything I create on the heap is referenced by a pointer (as a class member). You create objects in the constructor, destroy them in the destructor; possibly they might be destroyed prior to serialization from disk.

For each item you create, you must delete it. It's a one liner. All these problems and issues, for the sake of avoiding one line of code? Whilst I can see that the platform independence issue *might* cause some of these new languages to be fifth generation, it does not seem to be the case that garbage collection is critical to fifth generation languages, IMHO.
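
A minimal sketch of that discipline, with a made-up class in the C++ style of the day: whatever the constructor news, the destructor deletes, so the clean-up is the one-liner that travels with the object:

#include <cstddef>

class FileBuffer {
public:
    explicit FileBuffer(std::size_t bytes)
        : size_(bytes), data_(new char[bytes]) {}   // create in the constructor

    ~FileBuffer() { delete[] data_; }               // the matching one-liner

    char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    // Copying is declared but not defined, the pre-C++11 way of disallowing it,
    // so ownership of the buffer stays unambiguous.
    FileBuffer(const FileBuffer&);
    FileBuffer& operator=(const FileBuffer&);

    std::size_t size_;
    char* data_;
};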

As for C++ being legacy: I'm happy to accept that people see it as legacy, but that does not mean it will become so. C++ is just an amalgam of ideas. Its lineage is such that it just tries to integrate as much as it can of all the capabilities that are seen as useful. Many people don't like that because there can be more than one way of doing something. For me.... Celebrate the difference!

Edited by dilbert on Saturday 9th June 13:11

kiwisr

9,335 posts

209 months

Saturday 9th June 2007
I think the analogy is not really very good: Java outperforms C++ in many areas, including memory allocation, but lags in others such as numerical processing.

I think the manual vs automatic analogy is more suited to the type of person rather than the type of technology.

Microsoft admit that even if you write VB using Visual Studio you can pretty much achieve everything C++ does with identical performance (there is a very short list somewhere with a few exceptions).

GreenV8S

30,269 posts

286 months

Saturday 9th June 2007
dilbert said:
For each item you create, you must delete it. It's a one liner. All these problems and issues, for the sake of avoiding one line of code?
It's not one line of code though. If you've got allocated objects that are referenced in many places you need to know when the last reference is being released in order to know whether it is time to dispose of the object. In C++ you can do this using smart pointers that keep track of the number of references to each object. The Garbage Collection approach achieves the same result by working out for itself which allocated objects are no longer referenced by anything, and disposing of them. Pros and cons to each approach, and it's largely down to whether you prefer to have the application designer or the runtime designer responsible for getting it right. Letting the runtime designer take care of it makes for a very easy life for the application designer, it's just a shame that the twonks at Javasoft keep doing such a mediocre job of it.
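
For anyone who hasn't met them, a tiny sketch of the reference-counted flavour, shown here with std::shared_ptr for brevity (at the time of this thread it would have been boost::shared_ptr or something hand-rolled):

#include <iostream>
#include <memory>

struct Widget {
    ~Widget() { std::cout << "Widget destroyed\n"; }
};

int main() {
    std::shared_ptr<Widget> a(new Widget);
    {
        std::shared_ptr<Widget> b = a;                  // second reference
        std::cout << a.use_count() << " references\n";  // prints 2
    }                                                   // 'b' released here
    std::cout << a.use_count() << " references\n";      // prints 1
    // When 'a' (the last reference) goes out of scope, the Widget is deleted.
}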

Edited by GreenV8S on Saturday 9th June 14:50

dilbert

7,741 posts

233 months

Saturday 9th June 2007
GreenV8S said:
dilbert said:
For each item you create, you must delete it. It's a one liner. All these problems and issues, for the sake of avoiding one line of code?
It's not one line of code though. If you've got allocated objects that are referenced in many places you need to know when the last reference is being released in order to know whether it is time to dispose of the object. In C++ you can do this using smart pointers that keep track of the number of references to each object. The Garbage Collection approach achieves the same result by working out for itself which allocated objects are no longer referenced by anything, and disposing of them. Pros and cons to each approach, and it's largely down to whether you prefer to have the application designer or the runtime designer responsible for getting it right. Letting the runtime designer take care of it makes for a very easy life for the application designer, it's just a shame that the twonks at Javasoft keep doing such a mediocre job of it.

Edited by GreenV8S on Saturday 9th June 14:50
I don't know about smart pointers.... Don't use any of them. I have classes a bit like MFC (but they're mine) for doing anything like that.

I have an array, which is literally a sequential block of memory containing data; it's fast for retrieving indexes in the middle, but slower when it comes to adding and removing stuff in the middle. Then I have a list, which is an old-fashioned linked list of data; it's fast for adding and removing stuff in the middle, but not so quick when it comes to indexing the middle. Working on the ends is fine in both. Obviously the array has to physically shift all the data to insert, but you always know where stuff is. With the list you don't have to move stuff to insert, but you don't know where things are unless you iterate.

Both are interchangeable and identical from a functional perspective, and include sort and join. You can stab at which you think is going to work best, implement it, and try the alternative if it doesn't work the way you had hoped.
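
The standard library's std::vector and std::list make much the same trade-off, and a typedef gives the same 'implement one, then try the alternative' approach; a sketch, not the poster's actual classes:

#include <iterator>
#include <list>
#include <vector>

// typedef std::vector<int> Seq;   // contiguous: fast indexing, slow inserts in the middle
typedef std::list<int> Seq;        // linked: fast inserts in the middle, no random access

void demo() {
    Seq s;
    for (int i = 0; i < 100; ++i)
        s.push_back(i);

    // Inserting in the middle: the vector shuffles everything along by one,
    // the list just relinks - but finding the middle costs a walk for the list.
    Seq::iterator mid = s.begin();
    std::advance(mid, 50);
    s.insert(mid, -1);
}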

Then I have a group of hashes that come in four flavours, are functionally equivalent and interchangeable, and can be accessed by signed integer, integer, double or string. Each is a sorted map for fast lookup.

For both lists and all four of the hashes, there are four implementations that allow them to contain signed integer, integer, double or pointer to object: a total of twenty-four separate interchangeable classes. The structural objects are all serializable, and I have schemes for compression, encryption, and storage. I don't use ANSI-style class templates.

Clearly none of these types explicitly point to a range of characters, but I have a separate implementation which is an object for string, which operates sort of like a BASIC string. There are two pointers which denote the selection, and you can operate "Left, Mid, Right". It provides bidirectional numeric conversions for int, float, oct, and hex. It's got a basic find. It has proved a suitable container for files up to 100MB, but in theory it should hold 4GB. String supports serialization, and direct file save/load.

Then I have a regex, which can load a string and perform a Perl-compatible regex or a more basic word search with case and word boundaries.

There's loads more; I'm currently running at about 230 separate classes for doing useful stuff in C++. Why am I doing it? Cos I've got no job, and it saves me getting bored. I'm trying to make it so that it hits the spot that none of the third-party software I ever used quite did!!!!

The thing is that you delete the root object once, and the whole thing is gone. The same is true for my thread objects and my window objects. You can either delete them, and they delete their kids, or you can send WM_DESTROY, and they delete themselves and their kids. Windows and threads are notified by their children of termination, and are automatically removed from a child list. This can lead to a situation where the pointer you just used to kill an object is valid when you used it to make the call, but not on return. It always works out because in C++, at least, different instances of a class share the same code. You use a pointer to call the close function, send a message or use delete, and by the time the function completes the pointer and any threads or windows are no longer valid. That's what I call memory management.
smile
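
A bare-bones sketch of that parent-owns-children arrangement, with invented names rather than the actual window/thread classes being described:

#include <cstddef>
#include <vector>

class Node {
public:
    Node* add_child() {
        Node* child = new Node;
        children_.push_back(child);
        return child;
    }

    ~Node() {
        // Destroying a node destroys its kids, which destroy theirs, and so on.
        for (std::size_t i = 0; i < children_.size(); ++i)
            delete children_[i];
    }

private:
    std::vector<Node*> children_;
};

int main() {
    Node* root = new Node;
    Node* branch = root->add_child();
    branch->add_child();
    root->add_child();
    delete root;   // one delete; the entire tree is cleaned up
}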

Edited by dilbert on Saturday 9th June 15:51