Coding is cheating

Programming is perhaps the only job where lying, cheating and deceiving will not only get you paid but also praised for being innovative and creative. That’s because computers are severely limited. We’re literally fitting square pegs (real world) into round holes (0’s and 1’s).

Assuming you already know that everything in the computer world is represented in binary - combinations of 0’s and 1’s - consider the simplest example - trying to store the value 0.2:

0.2 decimal = 0.00110011... binary

There is no precise representation in binary for the decimal 0.2. Instead it’s a repeating pattern of 0011 after the binary point. To make matters worse, a computer will only store a limited number of digits for a number, say 32 bits for a fraction like the one above. And due to the specifics of number storage, only 24 of those bits hold significant digits, which reach as far as the 26th place of the recurring pattern 00110011.... The rest is rounded off and gone as if it never existed. So the number a computer will actually store is:

0.20000000298023223876953125

Close enough to round it off to 0.2, but still, what a cheat!
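If you'd like to see the cheat for yourself, here is a minimal sketch in Python. It assumes the common IEEE 754 single-precision format for a 32-bit fraction, and the helper name as_float32 is made up for this example. The struct module rounds to the nearest storable value, so the very last stored bit is rounded rather than simply cut off.

```python
import struct
from decimal import Decimal

def as_float32(x: float) -> float:
    """Squeeze x into 32 bits and read it back (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

stored = as_float32(0.2)

# Decimal(float) expands the stored bits exactly, with no rounding for display.
exact_stored = Decimal(stored)
print(exact_stored)        # 0.20000000298023223876953125

# The raw 32 bits: 1 sign bit, 8 exponent bits, 23 fraction bits.
bits = struct.unpack("<I", struct.pack("<f", 0.2))[0]
bit_string = f"{bits:032b}"
print(bit_string)          # 00111110010011001100110011001101
```

Look at the tail of the bit string: the 1100 pattern repeats until it runs out of room and ends in a rounded-up 1101.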

We continue to work around constraints, this time of memory. All computer memory is limited and we shouldn’t waste it unnecessarily. So when we want to store a value in a program, we often won’t store the actual value, but a pointer instead:

value = "banana"
valuePointer = &value

The exact code will differ depending on the language used, but essentially it says:

  1. save the value banana under the name value, then
  2. assign to the name valuePointer the memory address of value (it points to where the original value is stored).

[Image: pointer illustration]

As a consequence we use much less memory, because the banana is stored only once. But there's a side effect (sometimes desired): if we later change to value = "kiwi", then valuePointer will also suddenly return kiwi.
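The same side effect can be sketched in Python. Python has no raw pointers or & operator, so this example assumes a one-element list as a stand-in for the memory cell. The names value and value_pointer mirror the snippet above, but the list trick is only an illustration.

```python
# A one-element list plays the role of the memory cell holding the data.
value = ["banana"]

# A second name for the very same cell. Nothing is copied here.
value_pointer = value
assert value_pointer is value   # both names refer to one object in memory

# Change what is stored in the cell...
value[0] = "kiwi"

# ...and the "pointer" sees the change, because it never had its own copy.
print(value_pointer[0])         # kiwi
```

Note that writing value = ["kiwi"] instead would create a brand new cell and leave value_pointer looking at the old banana. The effect depends on changing the stored data in place.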

Let’s look at something more tangible - a sphere.

[Image: sphere]

You know what a sphere looks like, and you can recognize one when you see it. But a computer is inherently incapable of producing a real sphere (though that’ll change once ray tracing goes mainstream, thanks Marek). For reasons that require a university course to explain, 3D spheres are drawn with… triangles.

[Image: sphere drawn with triangles]

There are just so many of them, and they’re so tiny, that you are fooled into seeing a smooth surface. In the first 3D games that surfaced in the 90’s you could actually see the flat, angular faces. Nowadays computers have enough horsepower to draw millions of triangles without breaking a sweat.
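To get a feel for the trick, here is a rough sketch that builds a unit sphere out of triangles. It assumes one simple approach, a latitude/longitude grid, which is not necessarily what any particular game or graphics card does. The function name sphere_triangles and its parameters stacks and slices are invented for this example.

```python
import math

def sphere_triangles(stacks: int, slices: int):
    """Approximate a unit sphere with flat triangles on a lat/long grid."""
    def vertex(i, j):
        theta = math.pi * i / stacks       # 0 at the north pole, pi at the south
        phi = 2 * math.pi * j / slices     # angle around the vertical axis
        return (math.sin(theta) * math.cos(phi),
                math.sin(theta) * math.sin(phi),
                math.cos(theta))

    triangles = []
    for i in range(stacks):
        for j in range(slices):
            a, b = vertex(i, j), vertex(i, j + 1)
            c, d = vertex(i + 1, j), vertex(i + 1, j + 1)
            if i > 0:                      # at the north pole a == b, skip it
                triangles.append((a, b, c))
            if i < stacks - 1:             # at the south pole c == d, skip it
                triangles.append((b, d, c))
    return triangles

coarse = sphere_triangles(8, 16)      # few triangles, visibly angular
fine = sphere_triangles(200, 400)     # many triangles, looks smooth
print(len(coarse), len(fine))         # 224 159200
```

Every corner of every triangle sits exactly on the sphere, but everything in between is flat. Raise the two numbers far enough and your eyes stop noticing.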

Another funny concept is lazy loading. We usually store data in some kind of database, which makes it expensive to retrieve. There’s the time needed for a network call, for the database engine to read files from disk, and so on. It all adds up, so we want to make as few database calls as possible, so that users won’t have to constantly stare at a screen saying “loading”.

Let’s say you want to open up a contract. We’ll represent it in code as an object that includes all relevant information - ID, date of signing, ship-to and bill-to companies etc. We’ll also tell you it has line items, which we may conveniently program as:

getLineItems()

where calling the above function will return the list of line items. However… we don’t really have those line items ready to display, because we purposefully didn’t ask the database for them just yet. You might not need them at all, and may just want to check some basic details of the contract. So only at the moment you ask for line items explicitly, and getLineItems() is called, do we make the query to the database (and make you wait for it), then return and display the list.
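Here is a minimal sketch of that idea in Python. The Contract class, the fake_database_query function and the sample line items are all made up for illustration, with a plain function standing in for the real database. Remembering the result after the first call is an extra assumption on top of what's described above.

```python
class Contract:
    def __init__(self, contract_id, fetch_line_items):
        self.contract_id = contract_id
        self._fetch_line_items = fetch_line_items  # the expensive call, not run yet
        self._line_items = None                    # nothing loaded so far

    def get_line_items(self):
        # Only now, when somebody actually asks, do we go to the database.
        if self._line_items is None:
            self._line_items = self._fetch_line_items(self.contract_id)
        return self._line_items


queries = []  # records every trip to our pretend database

def fake_database_query(contract_id):
    queries.append(contract_id)
    return ["10 x widget", "2 x gadget"]

contract = Contract(42, fake_database_query)
queries_after_open = len(queries)    # 0: opening the contract cost no query

items = contract.get_line_items()    # first request triggers the query
contract.get_line_items()            # second request reuses the stored list
print(queries_after_open, len(queries), items)
```

Opening the contract is instant, and you only pay for the line items if you actually look at them.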

Finally, some problems in computing are very hard and extremely expensive to calculate at scale, even for the modern beastly machines we have available. If you’re using any sort of map application, you’re seeing one such problem: calculating the best route between two points.

In order to perfectly calculate the best route - be that the shortest or the quickest one, whatever the criterion - the computer would have to calculate the distances and routes between every single pair of points in the database. The number of calculations to perform would be roughly the square of the number of points. Warsaw alone has thousands of addresses. Think how many points there would be on the route between, say, Warsaw and Berlin.

The trick we use in these hard cases is heuristics, which boils down to using extra information we may have and allowing for suboptimal results, provided the ones we deliver are good enough. For finding the best route on a map, we already know the locations (latitude and longitude) of all points. We can use that information to limit the area in which we’ll calculate the routes, often to a shape resembling an ellipse:

[Image: shortest path calculation area]

We won’t consider points and roads outside of this area at all. That’s why when trying to cross Warsaw from north (say Marymont) to south (Ursynów), the GPS might offer you a straight line through the city center, while a quicker and more convenient route may lead along the city bypass. But the calculation is much faster.
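The filtering step could look something like the sketch below. It assumes flat x/y coordinates instead of real latitude and longitude. The stretch factor of 1.2, the function name inside_search_ellipse and the three sample points are invented for the example. A real navigation system will be far more elaborate.

```python
import math

def inside_search_ellipse(point, start, goal, stretch=1.2):
    """True if going start -> point -> goal is at most `stretch` times
    the straight-line distance, i.e. point lies within an ellipse whose
    foci are start and goal."""
    detour = math.dist(start, point) + math.dist(point, goal)
    return detour <= stretch * math.dist(start, goal)

start, goal = (0.0, 0.0), (10.0, 0.0)
points = {
    "city_center": (5.0, 1.0),    # almost on the straight line
    "bypass": (5.0, 6.0),         # a wide arc around the city
    "far_away": (30.0, 30.0),     # nowhere near the trip
}

# Only these points take part in the expensive route calculation.
candidates = [name for name, p in points.items()
              if inside_search_ellipse(p, start, goal)]
print(candidates)                 # ['city_center']
```

The bypass gets thrown out before any route is even considered, which is exactly the kind of good-enough shortcut described above.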

It’s all cheating. Bending and stretching the material we are working with - computers - in order to deliver bigger, better and more vibrant experiences to users. We’re not sorry, not at all. It’s like solving elaborate puzzles every single day, while getting paid and praised for it. The joy of programming.


I make software. And other things. Mostly in Warsaw, Poland, from wherever there’s an Internet connection, power outlet and fresh coffee. I love to read and learn how the world works. You should follow me at @mpaluchowski.
