3DCoat Forums
AbnRanger

Don't upgrade that NVidia card just yet...


I'd been a loyal NVIDIA follower since day one, but not anymore. I was having "display driver stopped working" issues in Windows 7 with a GeForce GTX 460 (Fermi) card in a Core 2 Quad Q6600 machine. Out of disgust with NVIDIA for not addressing the issue, I replaced the GTX 460 with a Radeon HD 7850. The HD 7850 works flawlessly and even smokes the GeForce GTX 660 I have in my i7-3770K machine. I won't view AMD as inferior to NVIDIA anymore. Just my opinion, but I think (hope) OpenCL is the future.

Edited by bisenberger


That's rather odd. I've had a GTX 470 for about two years now and haven't experienced a single issue with the drivers. In fact, I've had about four NVidia cards and don't recall a single driver issue with any of them, and that's from Windows Vista all the way through Win 8.

The only thing that chaps my hide is all this dang marketing fluff about Kepler, when they knew full well they crippled its compute power, and it blows my mind that they would go backwards and keep shrinking the memory bus. It was 512 bits back in the GTX 200 series. Now, in the 600 series, it's 256! What is wrong with this picture?

Edited by AbnRanger


I'm having the same problem with my NVidia GTX 560 Ti with 2 GB of GDDR5, particularly on startup.

I'm not going to take sides on this. There's a lot to be said for CUDA and NVidia, but there's a lot to be said for OpenCL and AMD Radeon too. And the older drivers we're mostly speaking of here are not necessarily relevant to the latest breed of GPUs.

The Titan Ultras are about to come out, and they've got 6 GB of GDDR5 and 4,000 cores.

AMD Radeon is coming out with their own GPU to match, with 4,000 cores and 6 GB of GDDR5.

And the new Quadro K6000 is coming out with up to 24 GB of GDDR5.

They're all affordable.

Well, the Titan and the 4,000-core AMD GPU will be around $1,400 each. 2013 is a very exciting year for CG.

Edited by L'Ancien Regime


Nevertheless, after the V4 launch, I hope business is good enough that Andrew can contract a GPU programming specialist to take the ball and run with it.

Andrew did say this in another thread (which is good news):

"Don't worry. I have no any plans in visible future to stop or sell 3dc. We rather expect to expand."

So maybe there will be some new staff and great improvements after Version 4 is out...

Edited by TimmyZDesign

It wasn't with my GTX 470. I'm thinking the scaled-back bus width (from 320-bit down to 256-bit) is the culprit. Everything else is faster. That was a ridiculous move on NVidia's part. Nevertheless, AMD talks out of both sides of their mouth. They trumpet their streaming capability and yet consistently offer crappy driver support. It's like bragging about how big an engine you have in your car while the transmission is broken down and no attempt is made to get it up and running.

Again, it ain't about monopolies or using a standard everyone can utilize. It's about AMD not giving a flip whether or not your gaming card works with a professional CG app. They put all their development effort toward making sure their drivers work well with games, and games alone. The bit about Adobe offering some OpenCL support is not a show of commitment from AMD; it was Adobe kicking AMD in the rear to make their shizzle work, likely due to all the consumer push on Adobe for the support.

The fact that it works with some of Adobe's apps doesn't mean OpenCL support in 3D Coat would work well with AMD cards. The way AMD looks at it, if you buy a gaming card from them, you get gaming support/drivers. If you buy a professional card from them, you get professional/CG support.

Where are you getting this from? Have you written a line of code in your life? OpenCL is just as powerful as CUDA, and it works on CPU, GPU, FPGA, and ASIC hardware. Sadly, CUDA only works on NVIDIA GPUs.

All Adobe software supports OpenCL. It will continue to rise because it is an open standard, much like OpenGL. Everyone was talking down OpenGL when it was introduced. Now look where it is: it's used in every CG/CAD application on this planet.

In fact, the next one starting to rise is OpenRL. Anyway, this whole claim about AMD having bad drivers is a total sham. I have an AMD 5570 and 3D Coat works perfectly with it, in both OpenGL and DirectX.


It's really wrong to try to create a false dichotomy here: the dichotomy of AMD/OpenCL vs. CUDA/NVidia. I don't think it holds.

OpenCL is platform agnostic. It runs better with Nvidia than CUDA does.

[Image: quoted passage from Heterogeneous Computing with OpenCL by Gaster, Howes, Kaeli, et al.]

OpenCL > CUDA

And it's open, not proprietary or closed. Ultimately, isn't that to be preferred?

Edited by L'Ancien Regime


Quoting bisenberger: "I was having 'display driver stopped working' issues in Windows 7 with a GeForce GTX 460 (Fermi) card ... I replaced the GTX 460 with a Radeon HD 7850."

That happens because you can't use the same card for the viewport and for CUDA. If a CUDA kernel runs too long, the driver will time out. In fact, the reason it times out the driver is that the kernel runs too long and should be split into multiple parts.

With two cards you wouldn't get this problem; alternatively you can edit the registry (the driver-timeout settings), but that can have consequences.

That shows you how unoptimized the 3D Coat CUDA kernel is.
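
To make "split into multiple parts" concrete, here is a minimal, hypothetical CUDA sketch (this is not 3D Coat's actual code; the kernel, buffer size, and names are invented for illustration). Instead of one launch that walks the whole buffer and risks tripping the Windows display-driver watchdog (TDR, roughly two seconds by default when the same card also drives the display), the work is issued as a series of short launches, each finished before the next begins:

#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a heavy per-element operation.
__global__ void smoothChunk(float* data, size_t offset, size_t count)
{
    size_t i = offset + (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < offset + count)
        data[i] = 0.5f * data[i] + 0.25f;
}

int main()
{
    const size_t N     = 1u << 26;  // ~67M elements: too much for one long launch
    const size_t CHUNK = 1u << 22;  // ~4M elements per launch keeps each launch short

    float* d_data = nullptr;
    cudaMalloc(&d_data, N * sizeof(float));
    cudaMemset(d_data, 0, N * sizeof(float));

    for (size_t offset = 0; offset < N; offset += CHUNK) {
        size_t count = (N - offset < CHUNK) ? (N - offset) : CHUNK;
        int threads  = 256;
        int blocks   = (int)((count + threads - 1) / threads);
        smoothChunk<<<blocks, threads>>>(d_data, offset, count);
        cudaDeviceSynchronize();    // finish this slice before issuing the next one
    }

    printf("last CUDA status: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(d_data);
    return 0;
}

The other two workarounds are the ones mentioned above: put the CUDA work on a second card that isn't driving the display, or raise the watchdog limit via the TdrDelay registry value, which has its own risks.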


[Image: LuxMark benchmark scores]

I have no idea what LuxMark scores are supposed to mean/prove?

I'm even more mad at NVidia today. I did another test, navigating with an older card (GTX 275), and even that card outperforms the 670, when wireframe is enabled. It's weird though. The FPS is much better on the 670 when wireframe is turned off. So, I don't know whether I should sell the 670 and get a used 580, or see if Andrew can fix it on the software side.


Quoting the post above: "...I don't know whether I should sell the 670 and get a used 580, or see if Andrew can fix it on the software side."

Why don't you do the reasonable thing and buy a Titan or Titan Ultra?

http://www.extremetech.com/computing/153929-nvidias-gtx-titan-le-and-titan-ultra-leaked

4,000 CUDA cores... 6 GB of GDDR5.

Edited by L'Ancien Regime


"I have no idea what LuxMark scores are supposed to mean/prove?"

I'm wondering the same thing. Synthetic benchmarks don't mean much IMHO, but I'm sure this one does to owners of AMD cards who, after reading my post, are no doubt foaming at the mouth at this very moment lol. :)

I used NVidia cards in the beginning, switched to ATI (as they were called at the time) for a while, then went back to NVidia due in part to driver support. In over a decade of use, I've only ever run into two problems with an NVidia product. The first was a driver issue about three years ago that sounds very similar to bisenberger's problem, and the second was when I tried to set up SLI back when it was still relatively new tech. That one may have actually involved firmware rather than drivers, and the manufacturer bent over backwards to make it right.

As for my experience with OpenCL versus CUDA: using the former in V-Ray RT results in approximately the same performance as having it set to CPU mode, while the latter is much faster than either. On the other hand, I don't see any difference between CUDA and SIMP when comparing those two 3D Coat versions. Take from that what you will, but personally I think AbnRanger is right about the situation. Not sure what the problem is with his 670, though, as mine (made by Gigabyte) works great.

And before anyone freaks out, I don't have anything against AMD, and I would happily use a card designed by them provided some basic reassurances were met. After all, it's not just about performance, is it now?


I've talked to Andrew, and he said it's not worth bothering with CUDA: a 30% speedup in particular operations isn't worth months of work.

And after that, if CUDA changes again, which is inevitable, a developer has no control over those changes, and that greatly affects 3D Coat, which has only a one-man team. Even Luxology, with a modestly bigger team, did some feasibility studies over a year ago, and they concluded that GPU rendering for modo is not going to be developed anytime soon.

Why do you have a direct line with Andrew? :D

"I have no idea what LuxMark scores are supposed to mean/prove?"

I'm wondering the same thing. Synthetic benchmarks don't mean much IMHO, but I'm sure this one does to owners of AMD cards whom, after reading my post, are no doubt foaming from the mouth at this very moment lol. :)

I used Nvidia cards in the beginning, switched to ATI for a while as they were called at the time, then went back to Nvidia due in part to driver support. In over a decade of use, I've only ever run into two problems with an Nvidia product. The first was a driver issue about three years ago that sounds very similar to bisenberger's problem, and the second was when I tried to set up SLI back when it was still relatively new tech. That one may have actually involved firmware rather than drivers and the manufacturer bent over backwards to make it right.

As for my experience with CLI versus CUDA, using the former in Vray RT results in approximately the same performance as having it set to CPU mode, while the latter is much faster than either. On the other hand I don't see any difference between CUDA and SIMP when comparing these two 3D Coat versions. Take from that what you will, but personally I think AbnRanger is right about the situation. Not sure what the problem is with his 670 though as mine (made by Gigabyte) works great.

And before anyone freaks out, I don't have anything against AMD and I would happily use a card designed by them provided some basic reassurances were met. After all, it's not just all about performance, is it now.

Run a quick test, if you will. Start with the Basic Human model (not the Mannequin, the other model) from the splash screen. Toggle wireframe on. You can navigate around the model OK with wireframe on, as it's only about a 1-million-poly model. Now click the RES+ icon; I believe it will be about 6.5 million polys. With wireframe toggled on, you will notice a good deal of initial lag with every attempt to navigate around the model. Watch the lower-left part of the UI and notice how the FPS drops through the floor (after you let off your stylus/mouse), to around 5-10 fps. Now turn wireframe off and there is little problem. I seem to notice a bit of brush lag at times, and I have to think it's the scaled-back memory bus (bandwidth).

That would explain why it doesn't occur with the 470 or even the 275. They are slower in every other regard, but NVidia reducing the bus width seems to be an artificial crippling of the GeForce line. I will probably sell the card and try to get a used 580. As for Andrew not wanting to update CUDA: he has no way of knowing the scope of improvement. His assumption of only marginal improvement is no different from his view of multi-threading; he was just as reluctant in that regard. So I guess we'll just have to settle for lots of lag with large brushes and shrug it off, too.

Edited by AbnRanger


Quoting the post above: "That happens because you can't use the same card for the viewport and for CUDA. If a CUDA kernel runs too long, the driver will time out..."

Just for the record, I wasn't using 3D Coat (or any other CUDA-enabled app) when this happened.

Edited by bisenberger


I'm glad this thread happened. I was looking at NVidia and listening to their hype a lot, and rejecting AMD Radeon out of hand, solely because I like 3D Coat the best of all the sculpting programs (though I haven't tried Blender yet). This thread basically laid it all out on the table for me: 3D Coat's utilization of CUDA isn't really that important to me now. A new CPU (say dual Xeons), or maybe a 5 GHz i7, or perhaps the AMD 8350 or its successor (but that will only handle two GPUs due to bandwidth issues), with 64 GB of RAM (or more), will be all I need to sculpt all the voxels I'll ever need, regardless of CUDA or OpenCL offloading of tasks to the GPU. 100 million polys in LiveClay, no problem.

If that is the case, then the choice of GPU becomes one solely of real-time rendering, and AMD becomes far more competitive. The AMD 7970 has 6 GB of GDDR5 and about 2,500 cores, and the NVidia Titan is about the same. Only NVidia does both CUDA and OpenCL, while AMD is OpenCL only, but it runs OpenCL a lot faster.

AMD 7970 = $500

NVidia Titan = over $1,000

Now, I agree AMD has had problems in the past with drivers. I had an ATI Radeon 4870 X2, and not only did it screw up the interface in Vue, it was loud as hell, literally a screamer on hot days.

But three Titans or three AMD 7970s gets you real-time rendering.

So that's $3,000+ for NVidia or $1,500 for AMD.

I think for me the best bet is to try one AMD 7970 at $500 and see how the drivers are; if it's crap, no big deal, sell it at a loss on eBay.

Because CUDA for 3D Coat is no longer a consideration.

Edited by L'Ancien Regime


I'm still not ready to go with AMD just yet. Too many GPU renderers use only CUDA for me to limit myself in that regard. Holding out hope that more applications will get on board with OpenCL instead is a fruitless endeavor. They were trumpeting their streaming capability back when I had a 4850, and that card just would not work with some of my CG apps. If Andrew just solved the large-brush bottleneck with CUDA, or optimized the code in OpenGL or DX, that would solve my issues on the performance side.

I just recently discovered that some of the newer RAM modules have an XMP profile embedded, and most newer (aftermarket) motherboards can detect it. So if you buy RAM rated at 1866-2100 MHz, for example, you can select "Auto" (each motherboard differs in its terminology, but you can probably find it easily enough) and the motherboard will run on the timing settings in that embedded profile. Previously, I could never seem to get anywhere near the max speed listed by the RAM or motherboard manufacturers by adjusting the settings manually. You have four timing settings listed on each RAM module, but in the BIOS there are a bunch of other timing settings, so setting the timings manually is not going to get the average techie/geek very far.

You go into your BIOS to set this, and if you have 1866-2100 memory (and a pretty good multi-core CPU running at around 4 GHz+), you'll be surprised how much livelier the performance is. So fast RAM and a beefy CPU will benefit a 3D Coat user more than anything else. A word of warning, though: if you have an off-the-shelf system from Dell, HP, or the like, forget all of this. You are pretty much stuck at the rate you have. They keep the motherboard from allowing any overclocking of the CPU or RAM. Your board might support a better CPU model from the same line of CPUs, but that won't help you get faster RAM when the BIOS won't give you access to adjust its timings (auto or otherwise).


I followed your instructions, AbnRanger, and the experience was exactly as you predicted. It's a very jarring, sudden change in performance completely dependent on brush size. It's speedy and just fine, then totally drops off with just a tiny change to the brush size. Shrink it back down just a tiny fraction and it goes right back to being speedy.

Knowing Nvidia (and how corporate types think in general), their gaming cards are probably designed to work great only in games, while their pro cards are designed to work well only in CG apps. That way you're forced to buy both, or so they hope. Greed makes people do strangely illogical things. It would be interesting to see how AMD's 7970 would perform with 3DC if Andrew were to add OpenCL support.

The article posted by L'Ancien Regime is an interesting read. Thanks for sharing it with us! I'll probably replace my GTX 670 with whatever blows away the AMD 7970. I try not to upgrade too often because even though it can be fun, it's often also time consuming and I do so hate the inevitable troubleshooting that tends to go with it lol. :)

About memory with XMP, I had to turn it off because the timings it set would prevent my PC from getting past the BIOS screen, and sometimes not even that far. What I did was write down the settings it wanted to use, then entered identical settings into the BIOS myself using its manual override mode. Then it would boot perfectly fine and even ended up being super stable that way. Don't know why one way would work and the other wouldn't when the settings were identical, but there you have it. Fwiw they were Mushkin Enhanced Blackline Frostbyte DDR3-1600 rated for 9-9-9-24 timings at 1.5v. They easily ran at higher clock speeds so long as the timings were loosened, but after a lot of benchmarking I found that a slower frequency with tighter timings was actually a fair bit faster than a higher frequency with loose timings. Naturally YMMV.


Quoting the post above: "About memory with XMP, I had to turn it off because the timings it set would prevent my PC from getting past the BIOS screen... after a lot of benchmarking I found that a slower frequency with tighter timings was actually a fair bit faster than a higher frequency with loose timings."

It's running fine here, but what I had to do was adjust the main CPU speed plus the memory ratio to get the CPU speed I wanted while keeping the memory speed close to the max listed on the RAM. The motherboard needs to be able to handle the extra speed. For example, most 1366 (Intel X58) boards are rated for a max of only 1600. I put some Patriot Viper Extreme memory (rated 2000 MHz) in an Intel board for a render box and got no XMP profile option. All I could do was run the setup at optimized defaults, and it ran like crap.

As soon as I put in some sticks rated at 1600, I got an XMP profile, set it to that, and it ran fine. Maybe a BIOS update might help your situation, but often it's just the motherboard, period. The upper-tier boards (high-end ASUS, MSI, Gigabyte, EVGA) generally work the best.

Edited by AbnRanger


Quoting the post above: "I followed your instructions, AbnRanger, and the experience was exactly as you predicted..."

Actually, after all this discussion, I'm thinking that the whole GPU-card parallel-programming business may not be the right way to go. Andrew thinks it's a bitch to program, and so do the guys over at V-Ray.

V-Ray has a much more interesting take on it: they're going with Intel Xeons and the Xeon Phi coprocessor, which is much easier to program for multithreading and parallel computing. For not much more than an NVidia Titan you get a lot more cores: 240 threads, 8 GB of GDDR5 RAM, and 320 GB/s of peak memory bandwidth. One Xeon Phi will thus roughly equal four 8-core Xeons, for under $2,000. And it will have MKL, Intel's Math Kernel Library, built in.

I'm still not sure how many of these you could plug into your PCIe slots per Xeon CPU; I would think at least two.

This is the route SGI is going with its SGI UV chassis:

http://www.sgi.com/products/servers/uv/

[Image: Intel Xeon Phi "Knights Corner" coprocessor]

So forget CUDA, pass on OpenCL, and go for the Xeon Phi, and just get an AMD 7970 or a Titan for the viewport; or, if you've got money to burn, a Quadro or FireGL.

If Andrew can make 3D Coat scale to all those threads (and the new Xeon Phi coprocessor coming out in July will have 480 threads), that will be the real deal, rather than trying to turn your GPUs into CPU-like functionality.

And the NVidia 780 will be crippled just like the 680... what a joke.

Somebody send Andrew this for his birthday:


:D :D :D :D :D

http://www.amazon.ca/Intel-Xeon-Coprocessor-Performance-Programming/dp/0124104142

"Reinders and Jeffers have written an outstanding book about much more than the Intel® Xeon PhiT. This is a comprehensive overview of the challenges in realizing the performance potential of advanced architectures, including modern multi-core processors and many-core coprocessors. The authors provide a cogent explanation of the reasons why applications often fall short of theoretical performance, and include steps that application developers can take to bridge the gap. This will be recommended reading for all of my staff." -James A. Ang, Ph.D. Senior Manager, Extreme-scale Computing, Sandia National Laboratories

Edited by L'Ancien Regime


I think this is probably one reason why Cebas (a third-party software vendor for 3ds Max, Maya, and C4D) has dragged its feet for the past 3-4+ years with the R4 GPU upgrade. It looked like they were going to be all CUDA-based (hybrid CPU + GPU), but at one point this Intel card came into the picture, and I think that made them take another look. Who knows? GPU computing IS indeed the future, because CPU tech has been stuck in the mud, with no significant advances for the past 3-5 years. We've had 4-8 core CPUs since 2009... and we're still here.

