Not all dot products are equal

I had another interesting debugging session recently with quite an unexpected conclusion. It all started when we received a new crash report - a certain platform (and one platform only) was crashing in a fairly old and seemingly innocent fragment of code: someModule->GetElements(elements); std::sort(elements.begin(), elements.end(), [&origin](const Element* a, const Element* b) { return a->pos.DistSqr(origin) < b->pos.DistSqr(origin); }); The crash was fairly rare, although quite consistent in one particular location in the game. It happened in release builds only and showed up as an access violation inside the predicate.

operator[] considered harmful

I’ve always felt conflicted about the subscript operator[] in standard containers. It makes sense for random access structures like vectors, but it gets a little bit problematic elsewhere. Consider this seemingly innocent piece of code (it’s not 1:1, but this is code you can easily find in many codebases): if(someMap[key].value < x) { someMap[key].value = x; } Two lines of actual code, but more than one problem: this snippet is potentially both incorrect and inefficient.
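
The excerpt stops before naming the problems, so the following is only a sketch of one common reading: `operator[]` on a `std::map` inserts a default-constructed element when the key is missing, and the snippet searches the map twice. The `Entry` type, the `std::string` key and the `RaiseValueIfPresent` helper are assumptions made for illustration, not code from the post.

```cpp
#include <map>
#include <string>

// Hypothetical value type; the excerpt only shows a `.value` member.
struct Entry {
    int value = 0;
};

// Raises the stored value for `key` to at least `x` using a single lookup.
// Unlike `someMap[key]`, nothing is inserted when the key is missing.
// Returns true if the key was present.
bool RaiseValueIfPresent(std::map<std::string, Entry>& someMap,
                         const std::string& key, int x)
{
    const auto it = someMap.find(key);  // one tree search instead of two
    if (it == someMap.end()) {
        return false;                   // key absent: leave the map untouched
    }
    if (it->second.value < x) {
        it->second.value = x;
    }
    return true;
}
```

Whether a missing key should be skipped or inserted depends on the caller. If insertion is wanted, `try_emplace` (C++17) also keeps it to a single lookup.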

Know your assembly (part 5)

The other day I was looking at a crash dump for a friend. The discussion that followed made me realize it might be worth writing a short post explaining why two seemingly almost identical function calls sometimes behave very differently. Consider the following code snippet (simplified):

    struct Foo
    {
        __declspec(noinline) const int& GetX() const { return x; }
        virtual const int& GetY() const { return y; }

        int x, y;
    };

    int Cat(int);
    int Lol(Foo* f)
    {
        const int& x = f->GetX();
        const int& y = f->GetY(); // ***

        return Cat(x+y);
    }

Simple multithreading tricks

Today I’d like to share a simple multithreading trick you can use to minimize “bubbles” in your pipeline. “Simple” because it applies mostly to “oldschool” threading systems, i.e. the ones with a main thread concept that kicks jobs and eventually flushes them. Cool kids using proper task graphs where everything just flows beautifully should not need it. Imagine a simple scenario: our thread produces some work, continues with whatever it’s doing, eventually waits for the job to finish (ideally this overlaps the work from the previous point, so not much to do here) and processes the results.

Zig pathtracer

If you’ve been reading this blog for a bit, you might know that when I experiment with new programming languages, I like to write a simple pathtracer, to get a better “feel”. It’s very much based on smallpt (99-line pathtracer in C++), but I do not want to make it as short as possible. I am more interested in how easily I can get it to run and what language ‘features’ I can use.

A debugger barrier

I’ve recently been asked by a friend for a little help with debugging a problem he was running into. Occasionally the program would freeze while trying to process a chunk of data and never move on to the next one. The application is heavily threaded and processing is done by thread B, while thread A does its own job and periodically checks if the work has been finished. If so, it sends it for further transformations and queues more work for thread B.

Grafana for dummies

Grafana is a very popular “analytics platform” or in more professional terms - a system to create pretty graphs. It’s very popular for monitoring system metrics, but really can be used for any timeseries data. It supports a plethora of data sources and there is a decent chance you can use one of the off-the-shelf solutions to do 99% of the work for you (for example for some basic system metrics, especially on Linux).

Microcorruption writeup

Microcorruption is my ongoing “distraction” – it’s an online CTF. I’m way late to the party and have been doing it on and off since… 2013. How does it work exactly? To use their own description: “tl;dr: Given a debugger and a device, find an input that unlocks it. Solve the level with that input. You’ve been given access to a device that controls a lock. Your job: defeat the lock by exploiting bugs in the device’s code.

API granularity

I’d be the first to admit I don’t have much experience designing public APIs. I typically work on code that’s fairly specific to the game we’re making and while some of it is expected to be reused, our potential user pool is very limited. We’re still talking about just one team, so <40 people and definitely not thousands. I’m still successfully using some little utilities/helpers I designed/coded 10+ years ago, but every now and then I run into a situation where decisions taken back then catch up with me and force me to rewrite the API.

Ranged based "for" story

Consider the following, seemingly innocent (and completely made up) fragment of code: typedef std::map<int, std::string> MyMap; void foo(const MyMap& m) { for(const std::pair<int, std::string>& i : m) { if(i.first) { printf("%s\n", i.second.c_str()); } } } Looks simple enough, right? Just iterate over all elements of the map and print the value for non-zero keys. We use the range-based for construct and references, so we do not expect any copies. Let’s just quickly make sure it all works as expected and consult Compiler Explorer.
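
The excerpt ends before the reveal, so this is a sketch based on a standard C++ fact rather than on the post's own conclusion. The element type of `std::map<int, std::string>` is `std::pair<const int, std::string>`, so a `const std::pair<int, std::string>&` cannot bind to an element directly and binds to a converted temporary instead. The helper names below are made up for illustration.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

typedef std::map<int, std::string> MyMap;

// The key inside a map element is const, which is easy to overlook.
static_assert(std::is_same<MyMap::value_type,
                           std::pair<const int, std::string>>::value,
              "map elements are pair<const Key, T>");

// Same loop as in the excerpt, but the reference type matches the element
// type exactly, so it binds to the stored element. Returns the number of
// values printed.
int PrintNonZeroKeys(const MyMap& m)
{
    int printed = 0;
    for (const MyMap::value_type& i : m) {
        if (i.first) {
            std::printf("%s\n", i.second.c_str());
            ++printed;
        }
    }
    return printed;
}

// Checks that `const auto&` refers to the elements stored in the map
// (same addresses as the iterator yields), i.e. no temporaries are involved.
bool BindsToMapElements(const MyMap& m)
{
    MyMap::const_iterator it = m.begin();
    for (const auto& i : m) {
        if (&i != &*it) {
            return false;
        }
        ++it;
    }
    return true;
}
```

Spelling the type as `MyMap::value_type` or `auto` avoids having to remember the `const` on the key.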

Two Stage Push & Pop

I probably mentioned this before, but SPSC (single producer, single consumer) queue is one of my favorite structures. If implemented correctly, it’s actually 100% lock-free and is also surprisingly versatile (not all problems require MPMC!). I typically use a slightly modified version of this or an unbounded version, based on code by Dmitry Vyukov. Both implementations are very simple and possibly lack some of the modern C++ bells and whistles, like emplace, but these should not be hard to add.
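
For readers who have not seen one, here is a minimal sketch of a bounded SPSC queue. It is a generic ring buffer with two atomic indices. It is not the version linked above, nor Dmitry Vyukov's unbounded one, and the names `SpscQueue`, `TryPush` and `TryPop` are assumptions made for this example.

```cpp
#include <atomic>
#include <cstddef>

// Bounded single-producer/single-consumer queue (sketch).
// Exactly one thread may call TryPush and exactly one thread may call TryPop.
// One slot is always left empty to tell "full" from "empty", so the queue
// holds at most Capacity - 1 items. T must be default-constructible and copyable.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "one slot is kept empty");

public:
    // Producer side. Returns false if the queue is full.
    bool TryPush(const T& item)
    {
        const std::size_t tail = m_tail.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == m_head.load(std::memory_order_acquire)) {
            return false;  // full
        }
        m_items[tail] = item;
        // Publish the item: the consumer's acquire load of m_tail sees the write above.
        m_tail.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false if the queue is empty.
    bool TryPop(T& out)
    {
        const std::size_t head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = m_items[head];
        // Hand the slot back to the producer.
        m_head.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    T m_items[Capacity];
    std::atomic<std::size_t> m_head{0};  // next slot to read, written by the consumer only
    std::atomic<std::size_t> m_tail{0};  // next slot to write, written by the producer only
};
```

Each index has a single writer, which is why plain loads and stores with acquire/release ordering are enough and no compare-and-swap loop is needed.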

Circular buffers (to the rescue)

Circular buffers are one of my favorite data structures (if not the favorite) for some quick&dirty debugging. A simple, not production-ready version can be implemented in a few lines of code (not ashamed to admit, I usually just copy-paste these and remove them when I’m done) and they’re a great tool to “record” a history of generic events. Any time I run into a seemingly random/unpredictable issue that might take a long time to repro, they’re on my short list.
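
As an illustration of the "few lines of code" claim, a throwaway version might look like the sketch below. `DebugRing` and its members are hypothetical names, and the struct is deliberately single-threaded and unsynchronized, in the same "not production ready" spirit.

```cpp
#include <cassert>
#include <cstddef>

// Fixed-size event history: keeps the last N recorded events and
// overwrites the oldest one once full. Not thread-safe.
template <typename T, std::size_t N>
struct DebugRing {
    T items[N] = {};
    std::size_t count = 0;  // total number of events ever recorded

    void Record(const T& e)
    {
        items[count % N] = e;
        ++count;
    }

    // Number of events currently retained (at most N).
    std::size_t Size() const { return count < N ? count : N; }

    // i == 0 is the oldest retained event, i == Size() - 1 the newest.
    const T& At(std::size_t i) const
    {
        assert(i < Size());
        const std::size_t first = count < N ? 0 : count % N;
        return items[(first + i) % N];
    }
};
```

In a debugging session the buffer is typically a global that gets inspected in the debugger or in a crash dump after the problem reproduces, so `At` is mostly a convenience.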

Vanishing warning

Yet another MSVC story. Visual Studio has a nice compile-time warning when trying to access a static array with an invalid index - C4789. According to the documentation it’s mostly meant for various ‘copy’ functions (memcpy/strcpy etc), but it seems to work on ‘simple’ accesses as well. Consider (here’s a Godbolt link): struct Tab { float tab[2]; }; void Foo(const Tab&); void Bar(float forward) { Tab tab; tab.tab[0] = forward; tab.tab[2] = forward; // OOB access!

Compilers are smart

I recently transitioned to Visual Studio 2017 and while it went relatively painlessly, the new and improved optimizer uncovered some subtle issues lurking in the code (to be fair, Clang/GCC have been behaving the same way for a long time now). The code in question was actually quite ancient and originated from this Devmaster forum post (gone now, but it can still be found using the Wayback Machine): Fast and accurate sine/cosine. To be more precise, it was this version with ‘fast wrapping’:


Ever since I was a kid I was always fascinated by the northern lights and hoped to see them in person one day. Poland is too far south/densely populated to spot them, though, so it wasn’t a very realistic dream. In 2009 I did move to Sweden, but still, this was Stockholm area, so my chances might have been bigger, but not by much (plus only been there for a year).

Neural networks and the stock market

Machine learning and neural networks seem to be all the rage recently. I was sifting through my old drives the other day and found my master’s thesis. It’s actually vaguely related: the subject was Neural Networks for Stock Market Forecasting. The complete thesis can be downloaded here, but comes with 2 major caveats: it’s 100% in Polish (so the actual title is Sztuczne sieci neuronowe w prognozowaniu kursów giełdowych) and it’s over 15 years old. Sadly, I could not find an accompanying MATLAB program I wrote.

Who ate my stack

The Old New Thing is one of my favorite blogs. It’s a collection of Windows development anecdotes, but every now and then Raymond will post a gnarly debugging/crash story. I’ve recently found some of my old notes related to a crash I was chasing in a third-party library. They reminded me a little bit of The Old New Thing, so I decided to try something similar. The whole thing took place a few years ago, when we were using some super early versions of a certain vendor library (no sources).

Cache effects, illustrated

A few years ago I published an article about love and care for your cache. Every now and then I receive emails asking for clarification or with some extra questions. It seems like the basic rules are easy enough, but once you add multiple cores to the mix, it might get a little bit confusing. If we only have one core everything is fairly simple. There’s just 1 cache, which is essentially a variant of a hash table with a limited number of slots.
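
To make the hash-table analogy concrete, here is a toy sketch of how an address could map to a slot in a direct-mapped cache. The line size and slot count are made-up parameters and the function names are hypothetical. Real CPU caches are usually set-associative and differ per level, so this shows the idea only and does not model any particular hardware.

```cpp
#include <cstdint>

// Assumed parameters for illustration only.
constexpr std::uint64_t kLineSize = 64;   // bytes per cache line
constexpr std::uint64_t kNumSlots = 512;  // number of lines the cache can hold

// The "hash function": drop the offset within the line, then wrap around
// the limited number of slots.
constexpr std::uint64_t SlotForAddress(std::uint64_t address)
{
    return (address / kLineSize) % kNumSlots;
}

// Two addresses on different lines that map to the same slot evict each
// other in a direct-mapped cache, much like colliding keys in a hash table
// that keeps one entry per bucket.
constexpr bool CompeteForSlot(std::uint64_t a, std::uint64_t b)
{
    return (a / kLineSize != b / kLineSize) &&
           SlotForAddress(a) == SlotForAddress(b);
}
```

With these numbers, addresses that are a multiple of 64 * 512 = 32768 bytes apart land in the same slot.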

A Halloween Story - get_temporary_buffer

A little bit late, as Halloween was yesterday, but I think std::get_temporary_buffer is scary enough to qualify. A co-worker called me today to show me an ‘interesting’ crash. It was deep in the guts of the STL, more specifically in the std::_Merge method. The corresponding C++ line seemed innocent: *_Dest++ = _Move(*_First1++); Nothing very interesting, not our code and yet it was crashing with: Exception thrown at 0x01293C05 in foo.exe: 0xC0000005: Access violation reading location 0xFFFFFFFF

Digital Dragons 2017

I’ve just returned from the Digital Dragons 2017 conference. I presented there last year and enjoyed the event so much I decided to repeat it. They changed the venue this year and I must say it was even better. Last year had a very nice ‘artsy’ feel to it, but ICE has a much better infrastructure. This year I talked about our networking system. You can find the slides here - “Networking Architecture of Warframe”