Friday, 29 July 2016

The Vision Quest

Greetings!

If you happened to read my last blog post you saw that I fixed all the assistants to work with an OpenGL 3.2 Core Profile. Let's take a step back and see why this fix was necessary and what was wrong with the code.

History

In Krita 3.0 we introduced 'Instant preview'. This is a mechanism for speeding up big brush strokes on large canvases and uses OpenGL3. Before this mechanism Krita exclusively used OpenGL2 and below.

A side-effect of OpenGL3 is that it deprecated some functions from older versions. Now, normally this isn't a problem as Windows and Linux support a thing called Compatibility Profile, which allows the user to use new and deprecated functions together.

However, Mac OS does not support this compatibility profile, which leaves us two choices: either don't use any functionality from OpenGL 3, or remove all deprecated functions from our code.

Our solution

By now you might have guessed that we chose the latter option and set out to remove all deprecated functionality. The problem here is that not all of this legacy code is in Krita; some of it is actually in Qt (the library we use for the graphical user interface). More specifically, we use some Qt functions which contain legacy code to draw our canvas decorations (brush outline, assistants, etc.).

Since we don't have direct control over the Qt code, we decided to copy their legacy code into Krita and use this copied code to implement our fix. This means that drawing the decorations would now make use of our copied code instead of the Qt code. Ultimately however, we don't want to keep using this copied code as it would be a nightmare to keep up-to-date with the current Qt version. Therefore, the plan was to implement our fix and send our patch back to the guys over at Qt for them to merge it into their library.


So what did this fix involve exactly?

Qt:
  • Updating every Qt shader to use GLSL 1.5
  • Creating a VAO + several VBOs for uploading data to the GPU
  • Dynamically switching between old OGL versions and new versions
Krita:
  • Updating Krita shaders to use GLSL 1.5
  • Creating appropriate VAO + VBOs for tool outlines and canvas textures
  • Dynamically switching between old OGL versions and new versions

Making a custom Qt installation

As you saw the fix worked well using our copied Qt code, but now the next step was to move these fixes to the current Qt 5.7 code. Of course, Qt 5.7 contained some changes that weren't in our old copied code, so I had to merge my changes manually into the new files. Luckily this all went well and my first custom Qt installation was born.

And then, le moment suprême, as we run Krita with this custom Qt version...

Well.. unfortunately it didn't run...

On start-up Krita complained that there wasn't a valid context bound or that the OpenGL implementation does not support buffers. This happened in a piece of code that is completely unrelated to my fixes, but one that I luckily recently had a look at.

The unnerving thing is that my fix contains nothing that meddles with the OpenGL context and doesn't touch the file that gave the error. What's even worse, when debug printing the current context in that file it looked perfectly intact. So what could possibly be causing this?

Well it turned out that there is no such error when I run Krita without my fix, so it had to be something I had done. Alas, there was nothing left to do but to very slowly remove parts of my fix until the error stopped appearing, while at the same time keeping the code runnable.

Finally, I found the troublesome piece of code. It was already present in Qt and I had commented it out as it is chock-full of deprecated functions. The act of commenting out this piece of code apparently has severe consequences on unrelated files. I have no idea why...

Once I uncommented this piece of code again the error disappeared and nothing else broke, soooo... ¯\_(ツ)_/¯

Sending the patch to Qt

Last Wednesday I cleaned up the fixes and sent in a change request to the Qt people. Over the coming days we will discuss the best way to implement parts of it in preparation of them taking in the changes so that we may drop our copied code and just use Qt as-is.

Their vision is to keep support for deprecated functionality, but to also allow the user to pick an OpenGL3.2 Core Profile which removes all these functions. This means I will have to implement checks in the fix to see which profile the user has requested. This incremental preparation of the patch will happen over a couple of weeks as we get closer to a solution we are both happy with.

Bonus talk

As one might imagine it is not a super fast process to update fixes and wait for comments and critique from the patch reviewers. This leaves me with some extra time in my Summer of Code to look at other parts of Krita. In particular, I am interested in the deep dark depths of the Krita painting engine.

The first part of these depths that I looked at was the way in which parts of the canvas are updated as people paint on it. This happens in a tile based manner.
The canvas is divided into tiles of 256x256 pixels, and only the tiles that a paint stroke hits get updated. An image would look something like this to Krita internally:


You notice I've drawn some red borders around the tiles. These borders represent where we extend each tile by 16 pixels on every side. This tile + border together is a 256x256 texture (so the effective size of the actual image tile is only 224x224).

Why do we extend each tile by 16 pixels? Well we keep what is called 'levels of detail' of the image. Effectively what this means is that we keep lower quality versions of the image (also called mip-maps). These levels of detail are progressively lower in resolution by powers of 2. So if the original image had a resolution of 1024x1024 its mip-maps would be: 512x512, 256x256, 128x128, 64x64 etc.
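To make the halving concrete, here is a minimal sketch that lists the mip-map resolutions for a given image size. The function name `mip_chain` is made up for this illustration and is not taken from Krita.

```python
def mip_chain(width, height, levels):
    """Return the resolutions of `levels` successive mip-maps below the original."""
    sizes = []
    for _ in range(levels):
        # Each level halves both dimensions, never going below one pixel.
        width = max(1, width // 2)
        height = max(1, height // 2)
        sizes.append((width, height))
    return sizes


print(mip_chain(1024, 1024, 4))
```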

To see why these levels of detail are useful we have to dive into the implementation of 'Instant preview'. Essentially what that mechanic does is simulate a user's brush stroke on a lower level of detail where it is much faster to calculate and show this preview to the user, while in the background it is applying the brush stroke to the actual image. This gives the user an 'instant preview' of the brush stroke and retains the integrity of the image.

But I still haven't told you about why we need this border around the image. Well this has to do with the filtering we perform. To show a high-quality image at all zoom levels we might apply filters such as bilinear interpolation. For every pixel you see on screen bilinear interpolation takes the 4 pixels closest to the pixel you want to calculate and averages these according to how close they are.

In the image below you see a pixel with an imprecise position (because it has been zoomed in/out) called (x, y) for which we want to calculate the colour, and the 4 closest pixels in the actual image (x1, y1), (x2, y1), (x2, y2) and (x1, y2). The colour of the pixel is then taken as a weighted average of the colours of these four pixels, where each pixel's colour is weighted by the area of the rectangle spanned by (x, y) and the pixel diagonally opposite it.
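As a rough sketch of that weighting (not Krita's code, which leaves this to the GPU), the function below samples a greyscale image stored as a list of rows. It assumes neighbouring pixels are one unit apart and that (x, y) lies far enough inside the image for all four neighbours to exist. The name `bilinear` is purely illustrative.

```python
import math


def bilinear(image, x, y):
    """Sample `image` (a list of rows of numbers) at the fractional position (x, y)."""
    x1, y1 = math.floor(x), math.floor(y)
    x2, y2 = x1 + 1, y1 + 1
    fx, fy = x - x1, y - y1  # how far (x, y) sits from the top-left neighbour

    # Each neighbour is weighted by the area of the rectangle opposite to it,
    # so the closest neighbour gets the largest weight.
    return (image[y1][x1] * (1 - fx) * (1 - fy)
            + image[y1][x2] * fx * (1 - fy)
            + image[y2][x1] * (1 - fx) * fy
            + image[y2][x2] * fx * fy)


tiny = [[0, 10],
        [20, 30]]
print(bilinear(tiny, 0.5, 0.5))  # exactly in the middle of the four pixels
```

The four weights always add up to 1, so sampling exactly on top of a pixel returns that pixel's colour unchanged.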


Now you have an idea of how bilinear interpolation works, you might ask yourself how this works when the pixel is at the edge of the image. Because obviously there aren't any pixels outside of the image to sample colours from.

Well this is exactly why we need an extra border of pixels around the image. We need at least one extra pixel around the image in order to handle the corner cases of bilinear interpolation. But what colour should this border be? It should be the colour of the pixel directly next to it! So in a way we are just taking all the pixels of the image edge and copying them to form a 1 pixel border.
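The copying step can be sketched in a few lines. This is only an illustration on a list-of-rows image; `pad_with_edge` is a hypothetical helper, not a function from Krita.

```python
def pad_with_edge(image, border=1):
    """Return a copy of `image` surrounded by `border` pixels copied from its edges."""
    padded_rows = []
    for row in image:
        # Repeat the first and last pixel of every row to the left and right.
        padded_rows.append([row[0]] * border + list(row) + [row[-1]] * border)
    # Repeat the (already widened) first and last rows above and below.
    top = [list(padded_rows[0]) for _ in range(border)]
    bottom = [list(padded_rows[-1]) for _ in range(border)]
    return top + padded_rows + bottom


for row in pad_with_edge([[1, 2], [3, 4]]):
    print(row)
```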

But.. we have a 16 pixel border? Here is where the mip-mapping comes in. If we want to have a 1 pixel border at the lowest level of detail, we should have a border that is 2 pixels on the next higher level of detail (LoD). This is the case because if the second-to-lowest LoD is halved in size to form the lowest LoD we end up with a 1 pixel border.

In Krita we store five levels of detail (including the original image) and so we need a 1px, 2px, 4px, 8px and finally a 16px border on the original image.
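The arithmetic behind those numbers is short enough to write out. The sketch below only restates the figures from this post (five levels, 256x256 textures); the names are made up for the example.

```python
def border_widths(levels, lowest_border=1):
    """Border width needed at each level of detail, lowest resolution first."""
    # Every step up in resolution doubles the border of the level below it.
    return [lowest_border * 2 ** i for i in range(levels)]


TEXTURE_SIZE = 256  # tile plus border, as described above

widths = border_widths(5)
full_res_border = widths[-1]
effective_tile = TEXTURE_SIZE - 2 * full_res_border  # border on both sides

print(widths, full_res_border, effective_tile)
```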

So far I have been talking about these borders in the context of the whole image, but actually we need this border around every tile, as the tiles are little images that together form the complete image. So now you hopefully understand the red lines on the Kiki image.

Speed up

You might be wondering why I am telling you all of this. While going through the code that handles all this tile business I found out that the code that extends each tile by 16 pixels takes up half of the processing time of each tile. This means that when you are drawing, half of the time it spends updating your canvas is spent on extending the tiles a little bit.


Here is my tiny benchmark of updating a full 8000x8000 canvas with and without tile borders:

Time taken on updating 8000x8000 canvas with borders: ~402ms

Run   Time taken on CPU   Time taken on GPU   Total time
1     123ms               106.3ms             229.3ms
2     237ms               208.4ms             445.4ms
3     140ms               135.7ms             275.7ms
4     155ms               148.1ms             303.1ms
5     325ms               256.3ms             581.3ms
6     279ms               249.7ms             528.7ms
7     237ms               208.2ms             445.2ms
8     283ms               267.2ms             550.2ms
9     225ms               209.0ms             434.0ms
10    122ms               109.8ms             231.8ms

Time taken on updating 8000x8000 canvas without borders: ~194ms

Run   Time taken on CPU   Time taken on GPU   Total time
1     55ms                52.8ms              107.8ms
2     87ms                123.8ms             210.8ms
3     82ms                125.3ms             207.3ms
4     46ms                122.8ms             168.8ms
5     245ms               197.3ms             442.3ms
6     53ms                45.6ms              98.6ms
7     46ms                125.1ms             171.1ms
8     47ms                122.9ms             169.9ms
9     50ms                122.9ms             172.9ms
10    61ms                124.7ms             185.7ms
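The two averages quoted above can be reproduced from the total-time columns. The snippet below simply copies those totals from the tables and computes the means, plus the share of the time that the border extension accounts for.

```python
from statistics import mean

# Total times in milliseconds, copied from the two tables above.
with_borders = [229.3, 445.4, 275.7, 303.1, 581.3, 528.7, 445.2, 550.2, 434.0, 231.8]
without_borders = [107.8, 210.8, 207.3, 168.8, 442.3, 98.6, 171.1, 169.9, 172.9, 185.7]

avg_with = mean(with_borders)
avg_without = mean(without_borders)

# Fraction of the update time that disappears when the borders are skipped.
border_share = (avg_with - avg_without) / avg_with

print(f"with borders:    {avg_with:.1f} ms")
print(f"without borders: {avg_without:.1f} ms")
print(f"border share:    {border_share:.0%}")
```

Ten runs with this much spread are only a rough indication, so the resulting share should be read as "about half" rather than as a precise figure.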


I think the current implementation of this extending has a lot of opportunity to be optimised. So the time I have left while waiting for Qt critique I will spend on trying to get this border extension implementation optimised and possibly getting a nice speed-up on the painting. I doubt it will be twice as fast, because I am sure there are a lot of other things going on during a paint stroke, but it will at least go some way to squeezing more performance out of Krita.

Tuesday, 28 June 2016

The Road of Trials

Hello all!

In my last blog post I said that I would work on extending support for paint operations like 'fill'. I have done so, albeit more as a necessity in fixing the assistant code. Moreover, I have fixed a number of other paint operations which are vital in painting the various assistants Krita offers currently.

Tool Outline

Before I talk about the assistant fixes I would like to talk about my fix of the tool outline code. This code is responsible for drawing the brush outline which follows the cursor and some of the selection previews while the user is dragging the selection over the canvas. Before I started porting our decoration code to OpenGL 3.2, some work had already been done by another member of the Krita community (beelzy). She changed several deprecated drawing functions to make use of a vertex array object and multiple vertex buffer objects. The idea was that we bind a single VAO and, instead of uploading the data directly to the GPU, first upload it to a vertex buffer object which can then be used for drawing the shape.

This approach works fine for drawing the canvas on the screen and drawing the chequerboard texture you see if your layer is transparent, however it broke down for drawing the tool outline. The reason for this is that the tool outline is.. well.. a line! So as opposed to drawing a quadrilateral polygon (rectangle) for the canvas we now want to render a line. This isn't a problem on its own, but the data required to draw a 'quad' is significantly different from the data required for a line. Besides, the data required to draw a selection as it is being dragged over the canvas changes significantly in a short period of time, whereas the data required to draw the canvas shape remains the same for longer periods of time.
Wouldn't it be nice if we could have one buffer for storing the data required to draw the canvas shape, and another buffer for storing the data required to draw the volatile lines?

To address this issue I split our original vertex array object (VAO) into two of these objects. One specifically meant for drawing quadrilaterals, and the other specifically for drawing lines. On top of that, I specified in these objects that the quad data doesn't change very much by setting it to GL_STATIC_DRAW and that the line object does change a lot by setting it to GL_STREAM_DRAW. The OpenGL documentation explains these constants better than I can so I will just post them here.

GL_STATIC_DRAW: The data store contents will be modified once and used many times.
GL_STREAM_DRAW:  The data store contents will be modified once and used at most a few times.

So that is just what we want! We define the data required to draw a rectangle once and render it many times. In contrast, we keep redefining the data required to draw the line at that particular moment and draw it at most a few times.

Now the astute reader might note that the canvas size and shape might change as well during zooming or rotating. And that's true! Here is a little dilemma we have to resolve. The traditional way of handling these transformations is by using a matrix. The matrix contains all the transformations necessary so that, when it is applied to every vertex of the polygon in the Vertex Shader, the whole polygon is transformed to its new location and shape. This doesn't change the vertex positions stored on the graphics card, but rather puts them in the correct position every frame.

A more naive approach might be to just re-upload the already transformed vertices to the graphics card. The reason why I say naive is that this approach would obviously cause a huge slowdown when we are working with a lot of vertices (like a game model). Uploading tens of thousands of vertices as opposed to a matrix consisting of 16 floating point numbers every time a model changes position would be stupid. However, here we are only working with 4 vertices. In this case we would only have to re-upload 8 floating point numbers to the graphics card in order to update the shape. Moreover, maybe we don't even have to re-upload the data on every frame but only when the user changes the size or position of the canvas. Turns out this approach is not so naive and may be the best option.
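To put numbers on the "8 floats" argument, here is a small CPU-side sketch of transforming the four canvas corners before they would be re-uploaded. It is only an illustration under simple assumptions (2D corners, one rotation, one uniform scale, one offset); `transform_quad` is a made-up name and no OpenGL calls are involved.

```python
import math


def transform_quad(corners, angle_deg, scale, offset):
    """Rotate, scale and translate 2D corners; returns a flat list of floats."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    flat = []
    for x, y in corners:
        flat.append(scale * (x * cos_a - y * sin_a) + offset[0])
        flat.append(scale * (x * sin_a + y * cos_a) + offset[1])
    return flat


canvas = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
data = transform_quad(canvas, angle_deg=0, scale=2.0, offset=(10, 5))

# This flat list is all that would have to be sent to the graphics card.
print(len(data), data)
```

In this scheme the list only needs to be rebuilt when the zoom, rotation or position of the canvas actually changes, not on every frame.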

The metric 12 fiasco

For the past few weeks my terminal has been spammed by a single warning over and over again. "QOpenGLWidget::metric(): unknown metric 12". So this week I decided to go out and investigate what was causing this so that I could debug in peace. After tedious grepping and taking notes I traced it back to metric 12 coming from somewhere inside QWidget. So now that I knew where metric 12 was coming from and what it was supposed to do imagine my surprise when I went into QOpenGLWidget and found that the handling of metric 12 had simply been commented out. D'oh! Turns out that during the beta phase of Qt 5.6.0 they commented out this handling and our forked paint engine was based on this beta version. So I updated my Qt to the new version 5.6.1 and copied its handling of metric 12 back into our paint engine. Let's hope there aren't more surprise changes between our forked paint engine and Qt 5.6.1's paint engine.

Perspective assistant

In my previous post I noted that many assistants crashed upon the first click on the canvas. For some profane reason I decided to tackle the perspective assistant first, and boy did I regret that.

Here is a partial compendium of all the different things the perspective assistant consists of:

  • Corner nodes
  • Nodes between corner nodes
  • Handles for nodes between corner nodes
  • Grid between nodes
  • Shapes behind the icons of the assistant widget
  • Icons on the assistant widget
  • Lines while moving nodes
  • Lines drawn in hidden mode
  • Crosshair where perspective lines meet

So where to even begin? Well the way I started was putting down a return statement at the start of every perspective function I could find (and there are quite a few). Then, as I removed these statements one by one, Krita would crash multiple times. Each of these crashes relates to a path ultimately leading to a function in the paint engine that was still using legacy code.

One such path led to a new branch in the code that is responsible for painting strokes (aptly named stroke()). This new branch involved code for painting non-opaque strokes. Qt does this by using a stencil buffer (for reasons which are unclear to me). This branch seems to be called for the drawing of the grid lines which are (barely) transparent.

Another path was responsible for drawing the assistant widget which consists of a circle and a rounded rectangle combined to form the background and 3 icons pasted on top of it. The icons are drawn using QPainter::drawPixmap, which calls drawImage, which calls drawTexture, which calls.. wait what's another word for image?

And then the circle.. ultimately it gets drawn using a function called drawVertexArrays() in the paint engine. This function receives all the vertices of the circle. But then how do you fill that circle? Well there are many ways, however since I saw that it was being drawn using GL_TRIANGLE_FAN, which makes a fan of triangles, I assumed the following shape:

The first vertex in the array would be taken as the central hub and all subsequent vertices would form a triangle with the hub and the previous vertex. But there's a problem: the data I received in the function didn't contain a central vertex. How does this work? Well, it turns out it works pretty cleverly by just taking one of the circle vertices as the central hub. It results in a circle that looks like this in wire-frame:
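A quick way to convince yourself that this works is to build the fan by hand. The sketch below lists the triangles a triangle fan forms from a vertex array (hub first, then each pair of consecutive vertices) and checks that, for points on a circle, their areas add up to the area of the whole polygon. The helper names are invented for this example.

```python
import math


def fan_triangles(vertex_count):
    """Index triples of a triangle fan: vertex 0 is the hub of every triangle."""
    return [(0, i, i + 1) for i in range(1, vertex_count - 1)]


def circle_vertices(n, radius=1.0):
    """n points spread evenly over a circle, with no centre vertex."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def triangle_area(a, b, c):
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2


points = circle_vertices(8)
triangles = fan_triangles(len(points))
fan_area = sum(triangle_area(points[a], points[b], points[c]) for a, b, c in triangles)

# Area of a regular polygon with n corners on a unit circle.
polygon_area = 0.5 * len(points) * math.sin(2 * math.pi / len(points))

print(triangles)
print(math.isclose(fan_area, polygon_area))
```

Because the outline is convex, a fan that starts at any point on the rim still covers the whole shape, and it needs one vertex less than a fan with a separate centre.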



I should note at this point that the assistant code is very old and quite messy which resulted in a lot of confusion over which code was responsible for what. I will probably visit it again at some point to clean it up and hopefully make the assistant faster.

Some more bugs caused by my own stupidity kept me busy for the remainder of the time this week, but I won't bother you with the details (I'm ashamed).

Previously I made a graph to keep track of which decorations were broken and which were fixed. Well... turns out it was less useful than I expected. The perspective assistant was hoarding all the methods required to draw every assistant. And since I fixed the perspective assistant, by proxy I fixed every other assistant that we offer. In addition, I fixed the tool outline painting which was also used by many selections, so this is the status now:



I expected some assistants to be fixed, but I certainly didn't expect this. In any case there are still a lot of code paths in the paint engine which aren't called by Krita but still need to be ported to OpenGL 3.2.

Finally, the way that I currently fix the paint engine code is by uploading the data to a VBO instead of directly passing it to the graphics card (which isn't allowed anymore). One can imagine however that adding this extra step in between doesn't mean the code becomes faster. The way in which OpenGL 3.2 could be faster is by uploading our data once and then drawing it multiple times. So over the coming weeks I will investigate if it is possible to cache certain drawing operations so that drawing the same thing over and over again doesn't need any uploading of data to the graphics card. That is where a speed up will come from and that is what Qt would be happy with.
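The caching idea could look roughly like the sketch below. This is purely a thought experiment and not how Qt or Krita manage buffers: the class `UploadCache` is hypothetical, and a plain Python list stands in for the real vertex buffer so that the number of uploads can be counted.

```python
class UploadCache:
    """Remembers vertex data that has already been 'uploaded' once."""

    def __init__(self):
        self._buffers = {}
        self.uploads = 0

    def buffer_for(self, vertices):
        key = tuple(vertices)
        if key not in self._buffers:
            # Stand-in for creating a VBO and sending the data to the GPU.
            self.uploads += 1
            self._buffers[key] = list(vertices)
        return self._buffers[key]


cache = UploadCache()
outline = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]

for _ in range(100):            # draw the same shape for 100 frames
    cache.buffer_for(outline)

cache.buffer_for([0.0, 0.0, 2.0, 2.0])  # a different shape needs a new upload
print(cache.uploads)
```

A real version would also have to decide when to throw old buffers away, which this sketch ignores completely.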

Oh, and here's a little picture of all the assistants in action:

Tuesday, 21 June 2016

Crossing the Threshold to the Special World


In my last post I spoke about how the stroke method of the paint engine was now working properly. Over the past week I have been cleaning up my solution for this and making it ready for extension. The coming week I will extend the solution to many of the other methods in the paint engine (like fill, which is responsible for.. you guessed it.. filling shapes!). Fixing this method should allow a bunch more decorations to work properly.

In other news, last Monday was my final bachelor thesis presentation which was the last thing I had to do for school this year. I managed to graduate with a 9/10 for the whole thesis process and my supervisors were allegedly very happy with me. Now I cross over the threshold of the academic life to the working life of which the first three months will be reserved for Krita.

Today I gathered up all the decorations I could find in Krita and made this table to keep track of what is broken and what is fixed as I will spend the coming weeks extending my solution.


Working - The decoration works perfectly and looks good
Visible - The decoration shows but it looks bad
Not visible - The decoration doesn't crash but we can't see it either
Crashes - The decoration crashes Krita instantly

Looks like there is some work to do, so bye all!

Meeting with the Mentor

My Google Summer of Code officially started on the 9th of June.

On Tuesday last week I met up with my mentor to discuss the way forward. We looked through the code previous contributors had written to fix the OS X issue and the changes I had made to the paint engine.

After some cleaning of debug statements and redundant calls we managed to merge the old code with the current state of the master branch. And then the moment...

We launched up Krita expecting a complete disaster, but what it blessed us with was a small green square sitting tranquilly on top of the canvas. What does it mean?

The green square is a token of freedom from the decoration oppression. A green square drawn in full OpenGL 3.2 glory... Ok, admittedly, it wasn't that glorious, but what it really meant was that we were able to perform strokes using the QPainter class while having no support for legacy OpenGL functions.

If you remember from my last post, the main show stopper for the release of instant preview on Mac OS X was that all the decorations would be broken if we did so. The fact that we are now able to make strokes using the same version of OpenGL as we need for instant preview means that slowly we are reclaiming the possibility of switching to OpenGL 3.2 and having both instant preview and decorations.

What's more is that besides the amazing green square we also suddenly found other decorations to be working for the first time on OS X. This is due to me testing my solution on the stroke method of the paint engine. So now every decoration that solely consists of a stroke renders properly.

Friday, 13 May 2016

The Call To Adventure

So.. Looks like I will be participating in Google Summer of Code this year. Officially it starts on the 23rd of May, but my thesis is in the way, so I will be starting about two weeks later.

What will I be doing you ask? Well, as some people know Krita on Mac OS X is not quite there yet. Some of the new cool functionality added to Krita 3.0 is forcefully omitted from the OS X release. Deep down in the depths of Krita painting we paint decorations using Qt's kindly provided QPainter class. This class allows us to make pretty lines and shapes very easily, and is perfectly suited to drawing all of the overlay functionality (such as grids, cursors, guides, etc.). What could possibly go wrong there? Well, even though we are grateful to have such easy rendering functionality, the backend of those functions hasn't exactly kept up with the times.

In 2008 a new version of OpenGL came out (version 3.0) that threw out much of the old functionality and told programmers to do all of it themselves. You should upload your own data to the graphics card! You should define your own shading algorithms! And you should keep track of your own transformation matrix stack! Sounds like a lot of hassle, but the advantage is that we are not stuck in the rigidity of what OpenGL allows us to do. The problem is that OpenGL doesn't have a clue about what kind of application we are making, so it provides some general functions to us which may or may not suit our needs. And when these functions do not suit our needs, we are going to feel it in the performance.
Want to render a complicated model with many thousands of vertices? Well sorry, I only know how to redundantly upload all these vertices to the graphics card on every frame.

So now we have reached the year 2016 and the new functionality of Krita makes thankful use of this new OpenGL version that allows us to cleverly upload data to the graphics card. But here comes the showstopper, because we still use that old and slow OpenGL version to draw our decorations. D'oh!
Luckily this isn't too much of a problem: the old version is still fast enough to draw a couple of decorations without breaking a sweat. Indeed, mixing both versions works like a charm... at least... on Windows and Linux..

With the advent of OpenGL version 3.0 the notion of deprecation was introduced. Many of the features that were in use before 3.0 were now replaced with newer better ways to accomplish the same and were marked as deprecated. In order to stay compatible with both ways of doing things, generally OpenGL doesn't really care if you use deprecated functionality together with the new functionality. A program that uses this type of mix is therefore said to use the 'Compatibility Profile'. The compatibility profile allows the programmer to use older and newer versions interchangeably. On the other hand we find something called 'Core Profile (with forward compatibility)'. This core profile removes all deprecated functionality and prevents the programmer from using those functions.

As it happens, the Windows and Linux platforms support programs that were written using either core profile or compatibility profile. In contrast, the Mac OS X platform currently only supports using OpenGL versions lower than 3.0, or using versions higher than 3.0 but only in core profile. This poses a problem for Krita, because if we pick a version lower than 3.0, then OS X users don't have access to our new performance features (which use the new OpenGL version). However, if we pick a version higher than 3.0 (as we currently do) then we are disallowed from using the QPainter class (which uses deprecated functionality) to draw our decorations. It appears we are at an impasse...

Well, here is where I come in. It will be my responsibility in this Google Summer of Code to modernise this QPainter class and make it play nicely with the rest of Krita. I will need to make sure that all deprecated functionality is purged from its implementation and that it makes efficient use of the new functionality OpenGL has provided us with.

If all goes to plan it means that OS X users will get to enjoy the new performance enhancements brought to Krita 3.0 whilst at the same time not losing all of their decorations.