How do I recognise a string as Malbolge source code?

Suppose I am given a string that looks like complete garbage and am asked to identify what it could possibly be. There are tell-tale signs that cryptanalysts use to form a set of hypotheses to be tested. Are there such signs for Malbolge? Take, for example, the following string.

D'`%$p"[m}YziUxBe-2>0/pL,%7#FE~ffezcaw<^)Lxwvun4lTj0nmlejc)J`&dFE[!BXWV[ZSwQuUTMLpP2NGFEiC+G@EDCB;_?!=<;:3W765.-Q1*).-,+$#G'&feB"!a}v<]\xqpo5srqpohg-eMibg`_%cE[`Y}]?UTYRvV87MqQPONMFKJCgA)?cCB;@?87[5{38765.-Q10pM:

There are 2 best solutions below

0
On BEST ANSWER

The initial meaning of a Malbolge command is based on its ASCII code plus its position in the program. In most programs, this leads to a fairly recognisable pattern: sequences of ASCII codes going backwards.

Let's take the following cat program as an example (source):

(=BA#9"=<;:3y7x54-21q/p-,+*)"!h%B0/.
~P<
<:(8&
66#"!~}|{zyxwvu
gJ%

The most obvious part of the program, which helps to visibly distinguish it from keyboard mashing, is the zyxwvu, which is recognisable as a portion of the lowercase English alphabet written backwards. (In fact, the "!~}|{ preceding it is also made up of consecutive ASCII codes, wrapping around from ~ to !.) There are also other, less obvious examples of reverse consecutive ASCII codes in the program, such as -,+*)"! on the first line.
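As a rough illustration of that visual check, the sketch below scans a string for runs of reverse-consecutive ASCII codes, treating ! followed by ~ as a wrap-around. The function names and the min_len threshold of 4 are my own choices for the example, not anything defined by Malbolge.

```python
def follows(prev, cur):
    """True if cur is the character one code below prev (wrapping '!' -> '~')."""
    p, c = ord(prev), ord(cur)
    return c == p - 1 or (p == 33 and c == 126)


def descending_runs(text, min_len=4):
    """Return every maximal run of reverse-consecutive ASCII codes of at least min_len."""
    runs = []
    start = 0
    for i in range(1, len(text) + 1):
        # Keep extending the current run while the sequence keeps descending.
        if i < len(text) and follows(text[i - 1], text[i]):
            continue
        if i - start >= min_len:
            runs.append(text[start:i])
        start = i
    return runs


FIRST_LINE = '(=BA#9"=<;:3y7x54-21q/p-,+*)"!h%B0/.'
FOURTH_LINE = '66#"!~}|{zyxwvu'

print(descending_runs(FIRST_LINE))
print(descending_runs(FOURTH_LINE))
```

Long runs are only a hint, of course: a short threshold will also fire on ordinary text now and then, so this is a heuristic for raising suspicion rather than a test.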

These reversed sequences of ASCII codes correspond to runs of the same command repeated. It's also possible to discover "broken sequences", which are an even bigger clue. Look at the first line and compare it to a reverse ASCII sequence (with ! signs showing where they match):

(=BA#9"=<;:3y7x54-21q/p-,+*)"!h%B0/.
DCBA@?>=<;:9876543210/.-,+*)('&%$#"!
  !!   !!!!  ! !! !! ! !!!!!   !

The thing that first drew my eye to the line was 7x54-21; it's the ASCII digits written backwards, but slightly corrupted. That's because there's more than one command in that section, but there are enough repeats of the same command to produce a noticeable pattern. Expanding the pattern shows that it matches at lots of other points in the line, too; that's because the same command is running at all those points in the program. Because Malbolge has only 8 commands, you'll discover that every command in the program thus belongs to one of 8 reverse-ASCII sequences.
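The comparison above can be reproduced mechanically. The following is a minimal sketch that lines a string up against the reverse-ASCII sequence starting at a chosen character (D here, as in the comparison) and marks the matching positions; match_line and its parameters are hypothetical names used only for this example.

```python
def match_line(line, first="D"):
    """Mark with '!' each position where line agrees with the descending
    ASCII sequence that starts at `first` (wrapping within codes 33..126)."""
    marks = []
    for i, ch in enumerate(line):
        # Character expected at position i if the sequence were unbroken.
        expected = (ord(first) - 33 - i) % 94 + 33
        marks.append("!" if ord(ch) == expected else " ")
    return "".join(marks)


FIRST_LINE = '(=BA#9"=<;:3y7x54-21q/p-,+*)"!h%B0/.'
MARKS = match_line(FIRST_LINE)

print(FIRST_LINE)
print(MARKS)
```

Trying other starting characters should pick out the other sequences in the same way, each one corresponding to a different command.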

(In order to verify the code as correct Malbolge, you'll need to make sure that they're the 8 specific sequences that correspond to commands, which is what the interpreter does. But that's overly complicated if you're merely trying to determine whether a piece of code is Malbolge or not; just looking for zyxwv or EDCBA or 87654 or the like is normally enough of a giveaway on its own to make me suspect that unknown source code is Malbolge.)
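For anyone who does want the stricter check, here is a minimal sketch of it. It relies on details that are not stated in this thread and come from my understanding of the reference interpreter: the command value is (ASCII code + position) mod 94, the eight accepted values are 4, 5, 23, 39, 40, 62, 68 and 81, and whitespace is skipped without advancing the position. Treat those as assumptions to confirm against the interpreter you actually use; looks_like_malbolge is a hypothetical name.

```python
# Assumed set of accepted (code + position) % 94 values, one per command.
VALID_INSTRUCTIONS = {4, 5, 23, 39, 40, 62, 68, 81}


def looks_like_malbolge(source):
    """Return True if every non-whitespace character decodes to one of the
    eight command values at its position (assumed load-time rule)."""
    position = 0
    for ch in source:
        if ch.isspace():
            continue  # assumption: whitespace does not occupy a position
        code = ord(ch)
        if not 33 <= code <= 126:
            return False  # outside the printable ASCII range
        if (code + position) % 94 not in VALID_INSTRUCTIONS:
            return False
        position += 1
    return True


CAT_PROGRAM = "\n".join([
    '(=BA#9"=<;:3y7x54-21q/p-,+*)"!h%B0/.',
    '~P<',
    '<:(8&',
    '66#"!~}|{zyxwvu',
    'gJ%',
])

print(looks_like_malbolge(CAT_PROGRAM))
print(looks_like_malbolge("hello world"))
```

Passing this check only says the text would load; it says nothing about whether the program does anything sensible once it runs.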

For the example string in the question, it looks a lot more like Malbolge than random keyboard mashing, due to substrings like xwvu, nmlej, and the very suspicious QPONMFKJ; those are the sort of substrings that rarely happen by chance in random data, but are very common in Malbolge code. I thus suspect that it's either genuine Malbolge code, or code that has been slightly altered.

2
On

Run it past a Malbolge interpreter. If you don't get a syntax error, it's valid code.

Determining whether it is useful code is a different matter entirely.