Tuesday, March 23, 2010

Instancing and Geometry Shader

ViruZ uses Instancing to draw its graphics.
Instancing is quite simple: You have one piece of geometry (e.g. a rectangle) and you wanna draw it multiple times.
Normally you'd just call the draw-function n times, but Instancing moves this loop to the graphics-card, meaning that the CPU is free to do other stuff while the graphics card handles drawing the rectangle n times.

This worked quite out-of-the-box for me. Good stuff.
Here's a screenshot:
It has my circle-shader, lightly modified. You can see that the edges of the circles are blurred a bit too much; I had simply been guessing the value back then. It's fixed now (at least for 1:1 ratio resolutions..).
I had to turn VSync off in my drivers to get the real FPS (the numbers in the console-window on the left). As you can see, the performance improved greatly (from ~100-120FPS to ~1900-2000FPS). I'm hoping this will make my game run smoother on older hardware.
Not all 50 viruses are visible, because the boundaries are now invalid (the canvas now goes from -1/-1 to +1/+1 instead of 0/0 to 800/600 :P).
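To make the boundary problem concrete, here is a minimal sketch of how a position on the old 800x600 canvas could be mapped into the -1..+1 range. The names Vec2 and pixelToNdc are made up for this example and are not from ViruZ; it also assumes the y-axis keeps its direction.

```cpp
#include <cassert>

// Hypothetical helper type for this sketch, not taken from ViruZ.
struct Vec2 { float x; float y; };

// Map a position from a width x height pixel canvas (origin at 0/0)
// to the -1..+1 range on both axes.
// Assumption: y keeps its direction; negate the result's y if the old
// canvas had its origin in the top-left corner.
Vec2 pixelToNdc(Vec2 p, float width, float height) {
    assert(width > 0.0f && height > 0.0f);
    return Vec2{ p.x / width * 2.0f - 1.0f,
                 p.y / height * 2.0f - 1.0f };
}
```

With this, the old canvas center 400/300 ends up at 0/0, and the corners 0/0 and 800/600 end up at -1/-1 and +1/+1.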

You might have heard about geometry shaders. They take vertices (points, that define geometry) and may output more vertices in return.

Before that, I've been uploading a quad (rectangle) into the graphics-memory, had a static array of 50 positions (representing the origins of the viruses) and then used Instancing to draw that quad 50 times. The shader took care of translating that quad onto the right position.

Now, with a geometry shader, I'm not uploading any geometry into the graphics-memory, but instead have a dynamically sized array of 2D points (again representing the origins of the viruses) and use that information for Instancing to draw.. well points ^^

I could be satisfied with points for now, but I'm planning on implementing more viruses with different shapes at some point, so squares alone won't always be the best solution.

My geometry shader takes those points and creates 4 vertices for each point by simply translating it to the lower left, then upper left, then upper right and finally lower right, resulting in a square shape.
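The same expansion can be sketched on the CPU side in C++. This is not the actual shader, just the arithmetic it describes; Vec2, expandPointToSquare and the halfSize parameter are made up for this example, and it assumes y points upwards.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper type for this sketch, not taken from ViruZ.
struct Vec2 { float x; float y; };

// Turn one point into the four corners of a square around it,
// in the order described above. Assumes y points upwards.
std::array<Vec2, 4> expandPointToSquare(Vec2 center, float halfSize) {
    assert(halfSize > 0.0f);
    return {{
        { center.x - halfSize, center.y - halfSize },  // lower left
        { center.x - halfSize, center.y + halfSize },  // upper left
        { center.x + halfSize, center.y + halfSize },  // upper right
        { center.x + halfSize, center.y - halfSize }   // lower right
    }};
}
```

In a real geometry shader each corner would be emitted as its own vertex; which output primitive fits this corner order best is left open here.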

I had some trouble with this geometry shader, mainly because I'd accidentally been trying to compile my pixel shader as a geometry shader, and it took quite a while to find out that the shader code was actually correct XD

And again a picture:
You can see that I've also implemented the outline of the virus.
It also seems to have slightly improved my performance, to ~2000-2050FPS.

I didn't have too much time, so I spent it on the stuff you can experiment with, instead of stopping a few times along the way to clean up.
So yeah, the same excuse goes for the blogpost-less week. Now I've left the written parts of my Abitur behind. Coulda done the math-test better; was spoilt by the really easy CS-test. That thing felt like a joke.

Thursday, March 11, 2010

OpenGL 4.0

Today, the specs of OpenGL 4.0 (and GLSL 4.00) were released. I haven't gone through all the changes yet (the OpenGL spec is almost 400 pages long! and the GLSL spec an additional 100 pages), but what I've seen so far is appealing.
Though I still have to learn how this weird packing-data-into-one-integer thing works. I don't know what the name is, but HistoPyramids, which I looked into a few days ago and didn't understand, seems to be a similar technique.

Well, what made me write this post is the statement that Khronos will add extensions to OpenGL 3(.3) to make as many of the OpenGL 4.0 features as possible available on older hardware.
So then I thought, what hardware actually supports OpenGL 3? And well, all ATi/AMD 2xxx or higher and the nVidia 8000-series or higher (didn't care about Intel and what not).
So I will look into the OpenGL 3-stuff I need/want to use, check whether OpenGL 2 supports it through extensions and finally port ViruZ to use OpenGL 2 instead.

btw, the port from cairo to OpenGL is harder than I thought. This time, to actually finish coding the game, I followed the idiom "Make a game, not an engine", thus the cairo-stuff is all over the place and not hidden behind some interface.
Well, my fault after all, since you should stick with your decision if you use that idiom.

Sunday, March 7, 2010


So I downloaded and compiled freeglut, a toolkit for OpenGL.
Then I downloaded GLEW, but it seemed somewhat resource-hungry for its use, because it keeps a complete list of function-pointers to the extensions.
Extensions for OpenGL are functions, which are not officially supported but help greatly with using newer stuff. There are common extensions, which have been approved by the ARB, vendor-specific extensions and the ones that multiple vendors have agreed upon.
You can't call the functions like you are used to in a programming-language. You have to query for them at run-time and hope that they're available and supported by the hardware.

To write modern OpenGL 3.x code you need to use extensions.

So I've looked in the smooth_triangle-example of freeglut and ported that code to a very bad C++-version using fancy stuff like std::vectors and std::strings. I made the design so that I should be able to copy most of that code if I'm getting serious.
Their way of getting extensions, which I adopted, is neither hidden nor pretty, and requires more work to set up, but I prefer that solution.
Basically, I need to declare each function myself and then I have an init-function to initialize all functions.
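Roughly, the declare-it-yourself approach could look like the sketch below. It is not the freeglut example code: the Resolver type, myDrawArraysInstanced and initExtensions are made-up names, and the resolver is passed in so the sketch compiles without any OpenGL headers. With freeglut, glutGetProcAddress has a similar shape and could play the resolver's role.

```cpp
#include <cassert>

// Generic "pointer to some function" type, as typically returned by loaders.
using AnyProc = void (*)();
// Hypothetical resolver: takes a function name, returns its address or nullptr.
using Resolver = AnyProc (*)(const char* name);

// Hand-written declaration of one function pointer. Real code would use the
// GL typedefs (GLenum, GLint, GLsizei) and the platform's calling convention.
using DrawArraysInstancedFn = void (*)(unsigned mode, int first, int count, int instances);
DrawArraysInstancedFn myDrawArraysInstanced = nullptr;

// Init-function: resolve every pointer once at start-up and report
// whether all of them were found.
bool initExtensions(Resolver resolve) {
    assert(resolve != nullptr);
    myDrawArraysInstanced =
        reinterpret_cast<DrawArraysInstancedFn>(resolve("glDrawArraysInstanced"));
    return myDrawArraysInstanced != nullptr;
}
```

If initExtensions returns false, the driver doesn't offer the function and the program has to fall back to something else or quit.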

After two stupid mistakes, that took a while to find and were easy and fast to fix after that, I had it running.
I played around with the Shader a little, changed the triangle into a quad and it looked quite pretty. So pretty, that I wanted a wallpaper-formatted picture. So I removed the borders I had on the quad, enabled fullscreen-mode and took this:
And it started out with this:

What I've been doing to create this effect is calculating the light-intensity via the formula color.r * 0.3 + color.g * 0.59 + color.b * 0.11. I multiply by different numbers instead of just taking the average, because green looks brighter than red, which looks brighter than blue. The specific numbers seem to be common, you can simply google for them.
To create corners (sharpen the intensity-effect) I first raise each color-value, which is in range [0..1], to the fourth power (that darkens dark colors more than bright ones).
To make this intensity-effect more visible, I multiplied with 8 and finally multiplied the intensity with the actual color-value.
Or in GLSL:

#version 140

smooth in vec4 fg_SmoothColor;
out vec4 fg_FragColor;

void main(void){
  // Raise each channel to the 4th power, weight it by perceived brightness, then boost by 8.
  float i = dot(pow(fg_SmoothColor.rgb, vec3(4.0)), vec3(0.3, 0.59, 0.11)) * 8.0;
  fg_FragColor = vec4(i, i, i, 1.0) * fg_SmoothColor;
}

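For checking the numbers by hand, the same arithmetic can be written as a small C++ function. Rgb and sharpenedIntensity are names made up for this sketch; it only mirrors the intensity formula, not the final multiplication with the color.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical color type for this sketch; channels are expected in [0..1].
struct Rgb { float r, g, b; };

// Fourth power per channel, weighted sum (0.3 / 0.59 / 0.11), then times 8.
float sharpenedIntensity(Rgb c) {
    assert(c.r >= 0.0f && c.r <= 1.0f);
    assert(c.g >= 0.0f && c.g <= 1.0f);
    assert(c.b >= 0.0f && c.b <= 1.0f);
    const float r4 = std::pow(c.r, 4.0f);
    const float g4 = std::pow(c.g, 4.0f);
    const float b4 = std::pow(c.b, 4.0f);
    return (r4 * 0.3f + g4 * 0.59f + b4 * 0.11f) * 8.0f;
}
```

A mid-gray of 0.5 per channel gives 0.5^4 = 0.0625, the weights add up to 1, and 0.0625 * 8 = 0.5, so mid-gray keeps its brightness while darker colors fall off quickly and brighter ones get boosted.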
Here's a little less intense version. People with darker desktops (like me) will prefer it. If I darken it more, it doesn't look good anymore.
If anyone wants another resolution, just leave a comment. I can then let my computer calculate a perfectly scaled image, without quality loss :)

I also have a small bug, and that is I get an INVALID_OPERATION error in my exit-function, but I simply blame GLUT for that.

Wednesday, March 3, 2010


You know... like blog and update in one word.... whatever.

Again, I have nothing fancy to tell.
Wanted to work on my assembler-program since I actually wanted to have a working prototype by tomorrow and completely forgot about it, but instead I wasted half of my time playing Call of Duty 2.
I have the basic flesh of the assembler done, now I only need to implement the pseudo-Ops (probably the hardest :P) and decent file parsing. I don't think my current parsing works fine, even considering that I didn't spend enough time on it.

I also wanted to code a lock-free queue for ViruZ, so I can easily implement my different looping-method (one thread renders, others update), but I did something wrong and now it only draws one virus :P
A lock-free queue is basically a queue you can do stuff on without having to wait for other operations to finish.
Normally you need to wait for other operations to finish if you're working with threads sharing data, as the possibility exists that one thread might read some data while the other one is writing to it at the same time. And that can be catastrophic.
A queue is like a waiting-line you know from real-life. You can put various stuff into it and the first thing that got in is the first thing that's going to get out. The last thing in will be the last thing out.
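To give an idea of what such a queue can look like, here is a minimal sketch of the simplest variant: a fixed-size ring buffer for exactly one writing thread and one reading thread. It is not the ViruZ queue (that one has to deal with several update threads), SpscQueue is a made-up name, and it relies on std::atomic from C++11.

```cpp
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer. One slot always stays
// empty, so the queue holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "need at least one usable slot");
public:
    // Call from the producer thread only. Returns false if the queue is full.
    bool push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire)) return false;
        slots_[tail] = value;
        // Publish the slot only after it has been written.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Call from the consumer thread only. Returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = slots_[head];
        // Hand the slot back to the producer only after it has been read.
        head_.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    T slots_[Capacity];
    std::atomic<std::size_t> head_{0};  // next slot to read
    std::atomic<std::size_t> tail_{0};  // next slot to write
};
```

Neither side ever waits for the other: push and pop simply report failure when the queue is full or empty, and the caller decides what to do then.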

I have another algorithm for the lock-free queue in mind, though it's a bit more complex. Dunno if I will bother as I don't know if this even is a bottleneck in my code.

I've also read some stuff about shaders in OpenGL and I'm ready to give my first shader a try. I just need to find some time.
When I begin implementing OpenGL and using that as my rendering-backend things should get faster by themselves.

Oh, I also created the tag 'ViruZ' and added it to all my ViruZ-related posts.