
On Systems, Incentives, and Machine Intelligence
If you're new here, I'd recommend visiting my about page to see what this site is about.
I expect and welcome criticism, and have enabled comments on all of my posts. I enforce civility and professionalism, and encourage substantive contributions via a minimum character count. Beyond that, I welcome input from researchers, industry professionals, and anyone else, regardless of background or field of study. My goal is to be proven wrong so that I can correct my assumptions.
My Best Work
Will We Ever Illuminate the Black Box?
Can Less Democracy Save Democracy?
Bots Will Win the Detection War
Latest
- ChatGPT-integrated Smartshell
I recently started playing with the idea of a “smart” command-line shell integrated with ChatGPT and implemented in Python. The code for my rudimentary implementation is freely available here, and should be simple enough to install and run on a Linux system.
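As a rough illustration of the idea (this is not the implementation linked above, just a sketch of the core loop), such a shell amounts to a dispatcher that routes each input line either to the language model or to the system shell. The `ask_llm` helper here is a hypothetical placeholder for a chat-completion API call:

```python
import subprocess

def dispatch(line: str):
    """Classify one input line: '?'-prefixed lines are LLM queries,
    everything else is an ordinary shell command."""
    line = line.strip()
    if line.startswith("?"):
        return ("llm", line[1:].strip())
    return ("shell", line)

def ask_llm(query: str) -> str:
    # Hypothetical placeholder: a real version would send `query` to a
    # chat-completion API and return the reply text.
    raise NotImplementedError

def repl():
    """Minimal interactive loop for the toy smart shell."""
    while True:
        try:
            line = input("smartshell> ")
        except EOFError:
            break
        if line.strip() in ("exit", "quit"):
            break
        kind, payload = dispatch(line)
        if kind == "llm":
            print(ask_llm(payload))
        else:
            # Run everything else as a normal shell command.
            subprocess.run(payload, shell=True)
```

The `?` prefix is an arbitrary convention chosen for this sketch; the real project may use a different trigger.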
- XBee Protocol Design (from scratch!)
This post was entirely AI-generated by Anthropic’s Claude. I did this because I’ve gotten very busy with college and my research projects, and I’d rather put out AI-generated content than no content at all. For the record, my friend is not named “Julia”; Claude hallucinated that part. I also added some Markdown formatting to make the presentation a little nicer. For those interested in prompt engineering (or those who just want to read human-written content), the prompt I used is at the end of the article and contains all the same information.
- Insights from Playing with Language Models
Ever since the groundbreaking release of ChatGPT, I’ve been wanting to look into these “large language models” (referred to from here on as LLMs). LLMs, at their core, are autoregressive transformer-based machine learning models scaled up to learn from vast collections of data scraped from the internet. The central concession of an autoregressive model is that it cannot have infinite memory: it takes the prior $n$ tokens as input to generate the $(n+1)$-th token, then discards the earliest token and appends the newly generated one in a sliding-window fashion, before passing the result back into the model to generate the $(n+2)$-th token. While one might not expect that intentionally forgetting the earliest inputs would make for an effective language model, results ever since OpenAI’s Generative Pre-trained Transformer (GPT) have proven otherwise. Combined with major advancements in other areas of NLP, like Google’s SentencePiece tokenization, researchers have achieved record-breaking performance on many natural-language tasks using autoregressive language models. The most recent iteration of OpenAI’s GPT, GPT-4, even outperforms most human test-takers on professional exams in law and medicine.
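The sliding-window generation loop described above can be sketched in a few lines of Python. The `model` callable here is a stand-in for a real LLM forward pass (it is not any particular library’s API), mapping the current window of tokens to the next token:

```python
from collections import deque

def generate(model, prompt_tokens, n_context, n_new):
    """Autoregressively generate `n_new` tokens with a fixed context window.

    `model` is a hypothetical callable that maps a list of tokens to the
    next predicted token, standing in for an actual LLM forward pass.
    """
    # Keep only the most recent n_context tokens; deque's maxlen gives
    # us the sliding window for free.
    window = deque(prompt_tokens[-n_context:], maxlen=n_context)
    output = []
    for _ in range(n_new):
        next_token = model(list(window))  # predict token n+1 from the prior n
        output.append(next_token)
        window.append(next_token)         # oldest token is discarded automatically
    return output
```

With a toy "model" that just increments the last token, `generate(lambda t: t[-1] + 1, [1, 2, 3], n_context=2, n_new=3)` returns `[4, 5, 6]`: each step sees only the two most recent tokens, exactly the forgetting behavior described above.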
Subscribe to CrossCurrents
Get notified when new posts are published