Angle between 2 vectors.

Anybody and everybody who has worked with 3D has, at one point or another, run into a situation where the angle between two vectors was needed. The solution is rather straightforward: take the dot product of the two (normalized) vectors and take the arccosine of that. That’s it, you have the angle between the two. Well yes, but then again I wouldn’t be writing this blog entry if there wasn’t a story behind this. Yes, the above method will definitely give you the angle between the vectors; well, most of the time anyway. It’s mathematically correct. However, taking the arccosine (acos) is prone to floating point domain errors and can cause simulations and calculations to go out of whack unexpectedly. That’s because the acos function in the standard library is only defined for inputs in the range [-1, 1]. Floating point error can push the dot product of two unit vectors slightly outside that range, and then acos will give out strange angle values and/or fail outright (typically returning NaN). You can’t always clamp these values to prevent the error, especially in situations where a smooth transition of angles is required. Clamping will cause a jerk in the simulation and might reduce precision where it is needed.
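To see the problem concretely, here is a hypothetical little snippet (not from the game’s code) that feeds acos a value just past 1.0, the kind of value a dot product of two “unit” vectors can actually produce:

#include <cmath>
#include <cstdio>

int main()
{
    // In practice, the dot product of two "unit" vectors can drift
    // just past 1.0 because of floating point error.
    double d = 1.0000000000000002;                  // smallest double greater than 1.0
    std::printf("acos(d) = %f\n", std::acos(d));    // prints nan: domain error
    return 0;
}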

So what do we do in such a case? Don’t worry, there is a solution: instead of using acos, use atan2. atan2 takes two arguments (y, x) and finds the counterclockwise angle in radians between the x-axis and the point (x, y) in 2-dimensional Euclidean space. More importantly, it is valid for all values of x and y except (0, 0), which is the origin btw. It uses the signs of both arguments to determine the quadrant of the result. The y and x in our case are simply the magnitude of the cross product (the perp-dot-product, if you are working in 2D) and the dot product respectively, or simply put

angle = atan2( magnitude(cross(a,b)),  dot(a,b) );

Just remember one thing: atan2 returns results in the range -pi to pi radians. (With the formula above the first argument is a magnitude and never negative, so the angle will land between 0 and pi.) So you may (or may not) have to adapt your code to handle that.
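Here is a minimal C++ sketch of the whole thing. The Vec3 type and helper functions are made up for illustration, not taken from the O2 Engine; any vector library will have equivalents:

#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical helpers for the sketch.
double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

double length(const Vec3& v)
{
    return std::sqrt(dot(v, v));
}

// Angle between a and b in radians, in [0, pi].
// No clamping needed: atan2 is well defined for any (y, x) except (0, 0).
double angleBetween(const Vec3& a, const Vec3& b)
{
    return std::atan2(length(cross(a, b)), dot(a, b));
}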

The HD 4850 and the story with AMD/ATI.

First the HD 4850. I was testing the game on the new HD 4850 (Palit 512MB) today and observed some interesting things about the card. For one, it gives serious bang for the buck. Doofus 3D clocked at about 140 FPS at a resolution of 1024×768, 16x AF, with graphics quality set to high. Even with 2x AA, Doofus 3D clocks more than 120 FPS, and I have a strong suspicion the game was going CPU bound at those frame rates, since the machine had a 3-year-old CPU. I can tell you for a fact the card is a serious performance monster, but then again Doofus 3D ain’t a top-line game. However, for me, this is the first time I have seen Doofus 3D under 4x AA and 16x AF running at a playable FPS, since up until now I have had only GeForce 6200, 6600 (and to some extent 8600) cards. There is no denying that the HD 4850 is more than worth its price for someone who is looking for a budget card and expects to run most of the top-line games today. The card runs a little hot, but that’s to be expected given the amount of triangles it can push and the effects it can deliver. Hats off to AMD/ATI in that regard. If you are someone who is looking for a mid-range card right now, the HD 4850 is excellent value for money.

That was the overview from a non-programming point of view. Now the programmer in me has something to say. The card may be excellent, however it’s not all that cozy with ATI drivers. The OpenGL drivers are a mess, with the bundled driver not even having support for extensions like EXT_stencil_two_side. Even basic functionality (for example glDrawRangeElements()) seems to be broken at times, even showing messed up graphics when using Vertex Arrays on older cards. The exact same functionality works fine under DirectX. Let’s just say it’s safe to assume that the GL drivers haven’t been updated in a while and/or AMD/ATI just isn’t interested. The only issues reported in this round of testing were on ATI cards, so I had to literally debug the application on ATI hardware to ascertain that these were indeed driver problems. Some of the issues I have mentioned occur on, guess what, the HD 4850 too. The only workaround seems to be vendor-specific hacks! That doesn’t make me a happy programmer at all!
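Just to illustrate what such a hack ends up looking like (a rough sketch with made-up flag names, not the actual engine code), the renderer can check the vendor and extension strings at startup and route around the broken paths:

#include <GL/gl.h>
#include <cstring>

// Sketch: decide at startup which GL paths to trust.
// The flag names here are hypothetical, for illustration only.
bool gUseTwoSidedStencil   = false;
bool gUseDrawRangeElements = true;

void detectDriverQuirks()   // call with a valid GL context current
{
    const char* vendor     = reinterpret_cast<const char*>(glGetString(GL_VENDOR));
    const char* extensions = reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS));

    // Only enable two-sided stencil if the extension is actually advertised.
    if (extensions && std::strstr(extensions, "GL_EXT_stencil_two_side"))
        gUseTwoSidedStencil = true;

    // Vendor-specific workaround: fall back to plain glDrawElements on ATI,
    // since glDrawRangeElements() proved unreliable in testing.
    if (vendor && std::strstr(vendor, "ATI"))
        gUseDrawRangeElements = false;
}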

The story with Direct3D is a lot better, and no issues were observed under the DirectX renderer of the game. That just tells you something, doesn’t it?

It’s a cracker!

It’s probably well known that GPUs are powerful beasts, and I have repeatedly pointed out on this blog that the awesome power of the GPU can be used for more than just graphics. For tasks and computations that can be executed in parallel, GPUs are a lot faster than CPUs and also more powerful. So it won’t come as a big surprise to learn that people have put GPUs to good use to do all kinds of stuff. GPGPU has been more than a buzzword of late, and with technologies like CUDA and Larrabee, it has become even easier to get at all this power. However, like every other piece of technology, GPGPU also has its downsides. An article I read recently briefly outlines how the GPU could be put to work as a generic brute-force cracker. I am no expert in cracking, but I have played around with GPGPU long enough to understand how serious this could be. I read the article and the first thought that crossed my mind was, “Hey, you know, this is the kind of thing the GPU actually excels at!”

GPUs today can deliver computational power in teraflops. Very soon we could have hardware that can do hundreds of times that. There is also another interesting thing that GPUs allow you to do: you can stack a series of these buggers together and achieve a phenomenal boost to this already awesome power. You could increase the computational power of a machine by several orders of magnitude by stacking GPUs in parallel. It’s a disturbing fact that such power, until a few years ago, was only available on top-of-the-line mainframes. Today you could build a machine with the power of a supercomputer from components probably available at your nearest computer hardware store. That just doesn’t bode well with the fact that anyone with a brain and time to kill can hack up a brute-force cracker and put it to work, and with enough “horsepower”, might even succeed.

As more and more powerful GPUs hit the market and as GPGPU technologies progress, we will see newer machines with unheard of computing power on our desks and laps. While this means more interesting games and faster number crunching for most of us, there are those who will put such tech to vile use. What we probably also need are better security systems and stronger encryption systems along with better games and faster number crunchers.

Tweaking the game to run on a wide range of hardware.

For the last week I have been involved in a rather uninteresting activity. I have been literally throwing the game at all possible hardware configs hoping it will run. All of this (yes, again) to find out how the game fares when exposed to different hardware configurations. If it seems like this activity is rather mundane, then let me assure you, it is. Well, not entirely 😀 . It takes some effort to get a game to scale seamlessly to all kinds of hardware, and currently I am enduring all the pain of crappy drivers and broken functionality, which, should I say, underscores some of the major headaches in real-time graphics development. It’s not like you can throw the game out with its peak settings on and expect it to run on a crappy Intel on-board graphics card. Such a thing will just end in disaster. The game must scale to different kinds of hardware, and in our case especially so; that too seamlessly and effectively.

Doofus 3D is uniquely placed. It doesn’t aim to be a top-line, hardware-intensive, hard-core-gamers-only, triple-A (AAA) title. Neither is it a 2D game capable of running flawlessly under software-rasterized graphics on your grandma’s old-school PC. It is geared more towards intermediate-level hardware, the kind most people have on their work laptops and home desktops. This effectively means an extremely wide range of hardware to cater to, and that in turn means scaling the game’s software paths (internally) based on a *lot* of underlying factors. Assuming a player has a specific piece of functionality available on his hardware setup can be disastrous. Such an assumption could mean a total failure of the game on a machine, and in the end the potential loss of a buyer.

While drawing up the specs of Doofus 3D we were especially careful not to go overboard with graphics galore. Even with careful planning there was significant feature creep, and with each new feature that was added, new countermeasures had to be put in place so that the game would still scale to lower-end hardware. Not everything was straightforward, but we did manage to push it through. If you have been following my blog for some time now, you will know that this is not the first time I have done this kind of testing. I (personally) run such tests after each beta (feature addition / feature freeze) of the game. That is probably why we haven’t faced too many problems this time around.

For Doofus 3D we followed a process that is a bit different from traditional software development. Every beta under this game project was actually a feature-complete, runnable version of the game. Any release before or between betas was an internal alpha version. A beta meant, “a set of features is complete enough to be tested”. After each beta, each feature was tested on various hardware setups. Something like an iterative method of software development, but not quite; I would say a process tailored specifically for our project, and more specifically for our situation, given our limitations.

Doofus 3D runs on most middle-rung hardware without too many problems. It will run on on-board graphics cards too, but I find Intel on-board graphics to be an abomination: hopeless hardware support for 3D graphics and equally crappy driver support! Enough reason for the engine to scale the game down to a low setting when it detects an Intel graphics card. The situation with NVIDIA and ATI cards is a lot better, with ATI’s low-end cards (at comparable price points) consistently outperforming NVIDIA’s. That said, NVIDIA has the most stable hardware and drivers, and most settings work uniformly across cards and driver setups, though there can be problems there as well. ATI’s drivers can be buggy at times and, in the case of OpenGL, totally broken. Fortunately the O2 Engine and the Doofus Game can use either Direct3D or OpenGL as rendering APIs. For any high-end card, or for that matter even most mid-range graphics cards, Doofus 3D is not a problem at all.
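For the curious, the default-preset selection boils down to something like this. This is a simplified, hypothetical sketch, not the engine’s actual code; the real checks look at a lot more than just the vendor string (extensions, VRAM, driver versions and so on):

#include <GL/gl.h>
#include <string>

enum class Quality { Low, Medium, High };

// Sketch: pick a default quality preset from the reported GL vendor.
// Call with a valid GL context current.
Quality defaultQualityPreset()
{
    const char* vendorStr = reinterpret_cast<const char*>(glGetString(GL_VENDOR));
    std::string vendor = vendorStr ? vendorStr : "";

    if (vendor.find("Intel") != std::string::npos)
        return Quality::Low;        // on-board graphics: start at the low setting

    if (vendor.find("NVIDIA") != std::string::npos ||
        vendor.find("ATI")    != std::string::npos ||
        vendor.find("AMD")    != std::string::npos)
        return Quality::High;       // discrete cards: start high, scale down if needed

    return Quality::Medium;         // unknown hardware: play it safe
}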