Interesting how the obfuscated code is explained by slowly unobfuscating it step by step. This is the reverse of how obfuscated code is normally created: by starting with understandable code, and then slowly obfuscating it bit by bit (as I explained for this IOCCC submission [1]).
I say normally because one could also have a superoptimizer search for a minimal program that achieves some desired behaviour, and then one has to understand it by slowly unraveling the generated code.
[1] https://tromp.github.io/maze.html
I prefer to learn more about it first from understandable code; that way there is less brainfuck.
But cool stuff anyway
this one started pretty obfuscated :)
https://x.com/warianoguerra/status/1576166873296941056
I’ve used reverse Polish notation as an interview question many times. It works well because if someone’s never seen it you can learn a lot about their basic understanding of algorithms. But if they are aware of how easy it is you can extend it forever by adding symbols, improving the algo they build, or doing something like this.
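For anyone who hasn't seen the exercise: the core of an RPN evaluator is just a stack and a switch over operators. A minimal sketch (my own, covering only the four basic operators, not any particular candidate's version):

```javascript
// Minimal reverse Polish notation evaluator.
// Tokens are numbers or the operators + - * /.
function evalRPN(tokens) {
  const stack = [];
  for (const tok of tokens) {
    if (tok === "+" || tok === "-" || tok === "*" || tok === "/") {
      const b = stack.pop(); // right operand is popped first
      const a = stack.pop();
      stack.push(tok === "+" ? a + b
               : tok === "-" ? a - b
               : tok === "*" ? a * b
               : a / b);
    } else {
      stack.push(Number(tok));
    }
  }
  return stack.pop();
}

// "3 4 + 2 *" means (3 + 4) * 2
console.log(evalRPN("3 4 + 2 *".split(" "))); // 14
```

The "extend it forever" part is exactly why it works as an interview question: new operators, unary words, variables, or error handling each bolt on naturally.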
You are a crazy sadistic bastard
Such is the result of delving into languages such as Forth[0].
Can you make programs with it smaller than assembly language? Sure.
Will you come out the other side a mad hatter speaking of things such as words, dictionaries, and washing machine firmware? Well, I can only speak for myself...
:-D
0 - https://forth-standard.org/
> if you take the time to understand what this code does, you’ll learn a surprising amount about WebAssembly!
It's a shame the article mostly teaches code-golf tricks, and the actual Wasm info is left to a single commented code block.
Nonetheless, an interesting article about JavaScript quirks!
“Fits in a tweet” can be safely assumed to mean lots and lots of code golf.
Really interesting. I considered having a go at an RPN-to-WASM converter when I was making https://c50.fingswotidun.com/
I ended up converting the RPN style notation into a JavaScript string and creating a new function, which lets the JIT sort it out.
https://c50.fingswotidun.com/show/?code=xy!2*!2y!*6%2Bo2%2Fv...
which has the code shown in the `code=` parameter of the URL.
It would be interesting to see the performance difference from a wasm version, but in the end I found the human(ish)-readable expression to be quite useful too.
Originally I created an interpreter for it as a texture maker for code-golfed JavaScript games: https://github.com/Lerc/stackie
There's potential for a WASM implementation to be both smaller than the small version and faster than the fast version.
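The "convert the RPN to a JavaScript string and let the JIT sort it out" approach can be sketched like this (my own toy version; stackie's real token set and semantics differ):

```javascript
// Compile an RPN string of numbers, x, y, and + - * / into a JS function.
// Each token is translated once into source text; new Function() hands the
// result to the engine, whose JIT then optimizes it like any other function.
function compileRPN(src) {
  const stack = [];
  for (const tok of src.trim().split(/\s+/)) {
    if ("+-*/".includes(tok)) {
      const b = stack.pop();
      const a = stack.pop();
      stack.push(`(${a} ${tok} ${b})`);
    } else if (tok === "x" || tok === "y") {
      stack.push(tok); // free variables of the generated function
    } else {
      stack.push(String(Number(tok)));
    }
  }
  return new Function("x", "y", `return ${stack.pop()};`);
}

const f = compileRPN("x x * y y * +"); // builds: return ((x * x) + (y * y));
console.log(f(3, 4)); // 25
```

The appeal of this design is that the interpreter loop runs once per expression, at compile time, rather than once per pixel when used as a texture maker.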
post co-author here, let me know if you have any questions :)
Have you or the other co-author ever worked with/in Forth[0]?
0 - https://forth-standard.org/
Yes, it has always been an influence for me. In fact, 9 years ago I implemented a Forth interpreter[1] in plain WAT[2] by de-obfuscating an IOCCC Forth implementation[3] and reimplementing it in Wasm and JS[4]
[1] https://github.com/marianoguerra/ricardo-forth
[2] https://github.com/marianoguerra/ricardo-forth/blob/master/s...
[3] https://www.ioccc.org/1992/buzzard.2/
[4] https://github.com/marianoguerra/ricardo-forth/blob/master/s...
WASM is cool; I've started implementing a CPU in Verilog that runs unmodified WASM, but I'm finding that the feature creep in the instruction set (SIMD, GC) takes away from the initial values behind WASM (simple, small)
You can ignore SIMD and GC (for now). SIMD explodes the complexity of Wasm, especially when WebGPU already exists. I am curious how you are handling layout and how you are handling all the irregular sizes.
I don't think WASM's value was ever in a hardware instantiation of the actual instruction set.
Oh, I don't think so either, but if you think back to the asm.js times, there was a clear goal of "simple and higher perf", but now it's going in a direction for maximum compatibility with existing stacks (GC, WASI, etc) at "any" cost
This is really impressive. It is over 140 characters, but I guess "a tweet" can be any length now.
Co-author of the post here — we had 280 characters in mind. :-)
I have never used Twitter so I might be mistaken, but I believe the limit has been 280 for a while now, which is why the first one, at 269 bytes, would also have fit.
Twitter was based on SMS; the standard SMS character limit is 160. They used 140 so they could use the remaining 20 characters for other purposes.
140 and 160 are related when it comes to SMS.
The GSM-7 alphabet is the most common one in use with SMS (or at least it was; UCS-2 is more common now with emoji and such).
160 is the number of GSM-7 characters.
160 * 7 / 8 = 140, which is the number of bytes in the user-data portion of the TPDU.
I don't think the Twitter choice of 140 had anything to do with this, though; it's just a coincidence. Back in the dumbphone days, the only way to receive tweets while mobile was via the texting interface, and it would prepend the username. I don't think reserving 20 characters for the username has anything to do with how many bits are used to represent the alphabet.
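The 7-bit packing arithmetic above, spelled out as a sanity check (plain arithmetic, nothing SMS-specific):

```javascript
// GSM-7 stores each character in 7 bits, packed tightly into 8-bit bytes.
const septets = 160;              // max GSM-7 characters per SMS
const userDataBits = septets * 7; // 1120 bits
const userDataBytes = userDataBits / 8;
console.log(userDataBytes);       // 140 bytes of TPDU user data
```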
That's coincidence, though. I used Twitter to keep in touch with friends via SMS in 02008, and the messages had space for a prelude to say who they were from. In the opposite direction, you could use that space to tell Twitter to send the message privately to someone.
The username length restriction might come partly from that. They could surely relax it by now, though. I saw it at play this week when @SecondGentleman (15 characters) changed to @SecondGent46.
Yeah it was changed to 280 for all users in 2017. That's still the default limit, but paying users can exceed it now.
Is this a Tweet or a Xit (zit)? ha.
This is cool though, I love these programs that exist within these constraints, like Dwitter does with the demoscene.
I call them Xeets now.
I prefer calling it an x-cretion.
Xitter, with the X pronounced as a soft Sh