Feb. 28, 2026

Using AI agents for pixel art animations

Claude Code has a mascot named Claw'd, who appears in all of its feature announcement posts. I find these animations extremely cute and wondered whether I could create such animations myself.

I showcased before how coding agents can actually be used as general agents to do things outside of coding - like generating infographic videos. I also recently subscribed to the Google AI Pro plan and have been using the Gemini CLI with Gemini 3.1 Pro to make some frontend design changes for FanMeter.

So I thought: why not test both agents to see which one generates the better animation? I launched both of them in their respective YOLO modes (--dangerously-skip-permissions for Claude Code and --yolo for Gemini CLI) and gave this prompt:

Create a pixel art animation GIF of a 26 year old guy who spends his week teaching AWS classes (offline), working fulltime (WFH) as a DevOps engineer and is also addicted to AI software development (Claudoholic). High FPS, high definition. Not less than 10 seconds.

Gemini finished first and gave me this. I'd give it a 4/10 at best:

Gemini 3.1 Pro with Gemini CLI

Claude took a while but gave me a much better output. An impressive 7/10:

Opus 4.6 with Claude Code

Interestingly, because I use Claude as my daily driver, it knew how I got the idea for Fan Meter and added a part where I wake up in the middle of the night to build something.

[... 72 words]

Claude Code PM Noah Zweben posted this gif today to announce that the remote control feature is rolling out to Pro users because "they deserve to use the bathroom too". I related so hard that I couldn't resist posting it here.

Claude Code remote control gif



Feb. 22, 2026

Lessons from Building FanMeter

Idea originated at 1:45 AM on 21st February. MVP shipped ~27 hours later, at 5 AM on 22nd February. And no, I have not touched the code since, and I don't feel bad about it.

At the time of writing this entry, I have had 4 hours of sleep in the past ~62 hours. I still feel excited to blog about it.

Now, I know that spending 27 hours to vibe code an application in 2026 is probably too much time; I know this because I did ship another vibe-coded app in less than 2 hours. But fanmeter.in is different because it is the first one I'm proud of. It is an application I'd use myself (I couldn't say that about the other one). It is also one that I know my friends and family enjoy playing, at least enough to share it with their own friends.

Here are some of my learnings from building it:

1. Taste is the differentiator

One of the main reasons I'm happy with Fan Meter is how the UI design turned out. Inspired by Greg Isenberg's podcast with Suraya Shivji, I looked for inspiration on cosmos and generated the design screens using Flux 2 Pro on Weavy AI.

I find this to be a great workflow. The search and similarity algorithm on cosmos is top-notch, which lets you choose a theme and get an image model to design UI screens matching it. There was a lot of back-and-forth iteration in this process as I kept chasing and fine-tuning for the exact look and feel I was happy with. I will be experimenting a lot more, and will write a blog post on getting LLMs to build your app's frontend to match your taste.

[... 742 words]

Feb. 21, 2026

[...] But I do love the concept [of OpenClaw] and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level.

Feb. 20, 2026

Gemini 3.1 Pro Preview scored highest in the Artificial Analysis Intelligence Index but its most significant advantage might be its price and token efficiency. Our evaluations cost <50% to run on Gemini 3.1 Pro Preview compared to Claude Opus 4.6 (max) and GPT-5.2 (xhigh)

Feb. 19, 2026

Claude Sonnet 4.6, the Token Muncher

Anthropic released Claude Sonnet 4.6 yesterday. The rumours from a few weeks ago were that the next model from the Sonnet family would be Sonnet 5, but here we are.

The main observation from the announcement is its improvement in computer use skills and performance in economically valuable office tasks (it is the #1 model in the Office tasks benchmark: GDPval-AA).

Personally, I was excited to see how similar it is to Opus 4.5 in the benchmarks, which means we are getting Opus 4.5-level performance at the same cost as Sonnet 4.5 ($3/$15 per million tokens).

[...some early developers] often even prefer it [Sonnet 4.6] to our smartest model from November 2025, Claude Opus 4.5

Agent Output

Experiment

I have always liked data visualization and I'm really enjoying using these models to generate visualizations for trends and datasets that I find interesting. Inspired by this post on r/dataisbeautiful, I wanted to check if Sonnet 4.6 could generate a similar one but for Indian states instead. So I launched a new Claude Code session with --dangerously-skip-permissions and gave it the following prompt:

Hey Claude, create an infographic image (maybe a line graph) showing the average male height by birth year in Indian states. You can use the internet to fetch the data that is required to create this image. Try to find the data from the oldest possible to the latest possible. Include the top 10 most interesting states in terms of trends (be sure to include AP). Make it look aesthetic. Include some text highlighting one interesting trend. Don’t ask me any questions. You should not complete until you’ve generated an image in the current directory for me to view.

[... 742 words]
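For anyone curious what a script produced by such a prompt might look like, here is a minimal sketch. The states, heights, and trend annotation below are synthetic, made up purely for illustration (a real run would fetch actual survey data):

```python
# Minimal sketch of the kind of chart script such a prompt might yield.
# All heights below are synthetic illustrative numbers, NOT real data.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

birth_years = list(range(1960, 2001, 10))
# Hypothetical average male height (cm) by birth year for a few states
states = {
    "Andhra Pradesh": [163.0, 163.8, 164.5, 165.4, 166.2],
    "Punjab":         [168.0, 168.6, 169.3, 170.1, 170.8],
    "Kerala":         [165.5, 166.4, 167.2, 168.1, 168.9],
}

fig, ax = plt.subplots(figsize=(8, 5))
for state, heights in states.items():
    ax.plot(birth_years, heights, marker="o", label=state)

ax.set_xlabel("Birth year")
ax.set_ylabel("Average male height (cm)")
ax.set_title("Average male height by birth year (synthetic data)")
ax.legend()
# Highlight one trend with a text annotation, as the prompt asks
ax.annotate("Steady rise across cohorts",
            xy=(1990, 170.1), xytext=(1963, 170.5),
            arrowprops=dict(arrowstyle="->"))
fig.savefig("height_trends.png", dpi=150)
```

The interesting part of the real task is the data gathering, not the plotting; the plotting itself is a few dozen lines like these.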


Feb. 13, 2026

Is it me or is the rate of model releases accelerating to an absurd degree? Today we have Gemini 3 Deep Think and GPT 5.3 Codex Spark. Yesterday we had GLM5 and MiniMax M2.5. Five days before that we had Opus 4.6 and GPT 5.3. Then maybe two weeks before that, I think, we had Kimi K2.5.
logicprog

GPT-5.3-Codex-Spark and AI coding addiction

OpenAI announced the release of their new coding model GPT-5.3-Codex-Spark today, only a week after the release of GPT-5.3-Codex. They say it has been designed for real-time coding, capable of serving more than 1,000 tokens per second. Real-time coding here means seeing the results of your requested changes immediately through near-instant responses. It runs on Cerebras for high-speed inference.

When I read 'ultra-fast model', I first thought of Fast mode for Opus 4.6 in Claude Code. But the primary difference is that Fast mode is the same model with a different API configuration that prioritizes speed over cost, while Codex-Spark is a different model with a drop in quality and capabilities.

Also interesting to note: the reduced latency is not just due to improved model speed, but also to improvements made to the harness itself:

"As we trained Codex-Spark, it became apparent that model speed was just part of the equation for real-time collaboration—we also needed to reduce latency across the full request-response pipeline. We implemented end-to-end latency improvements in our harness that will benefit all models [...] Through the introduction of a persistent WebSocket connection and targeted optimizations inside of Responses API, we reduced overhead per client/server roundtrip by 80%, per-token overhead by 30%, and time-to-first-token by 50%. The WebSocket path is enabled for Codex-Spark by default and will become the default for all models soon."

I wonder if all other harnesses (Claude Code, OpenCode, Cursor, etc.) can make similar improvements to reduce latency. I've been vibe coding (or doing agentic engineering) with Claude Code a lot for the last few days, and I've had some tasks take as long as 30 minutes.
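The gain from a persistent connection is easy to demonstrate in miniature. This toy benchmark (nothing to do with the Codex harness itself) uses plain TCP sockets against a local echo server: reconnecting for every message, like HTTP without keep-alive, versus reusing one connection, like a WebSocket:

```python
# Toy benchmark: a fresh TCP connection per message vs. one persistent
# connection, against a local echo server. Illustrative only.
import socket
import threading
import time

srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # pick a free ephemeral port
srv.listen()
port = srv.getsockname()[1]

def serve():
    # Echo server: handles one client at a time (our clients are sequential)
    while True:
        conn, _ = srv.accept()
        with conn:
            while data := conn.recv(64):
                conn.sendall(data)

threading.Thread(target=serve, daemon=True).start()

N = 200

# Like HTTP without keep-alive: connect, send, receive, close, repeat
t0 = time.perf_counter()
for _ in range(N):
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"ping")
        c.recv(64)
per_request = time.perf_counter() - t0

# Like a persistent WebSocket: one handshake, then many roundtrips
t0 = time.perf_counter()
with socket.create_connection(("127.0.0.1", port)) as c:
    for _ in range(N):
        c.sendall(b"ping")
        c.recv(64)
persistent = time.perf_counter() - t0

print(f"reconnect each time: {per_request:.3f}s, persistent: {persistent:.3f}s")
```

Even on loopback, skipping the per-message handshake wins; over a real network, with TLS on top, the difference per roundtrip is far larger.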

[... 178 words]


Feb. 11, 2026

If you are in any situation where being right matters, you would, at this point, be making a mistake to not ask a frontier LLM for help. That can mean checking your own work, second opinions on other experts, or getting help with a complex problem. Have judgement, but use them

Feb. 10, 2026

TIL that I can use Force Touch on Mac to look up text and see more information such as definitions, Wikipedia pages, movies, music, podcasts, etc. I had accidentally force touched many times, but because I never did it intentionally, I never really read what was in the Look Up popup.

My default mode of looking things up is to launch a private browser window and Google the text. That usually ends with me reading the Wikipedia page. Now I can just force touch the text, read the Wikipedia preview, and only then dive into the full page if I find it interesting.

I learned this when reading the Ghostty documentation.

Feb. 9, 2026

A sign of maturity is the ability to love one part of somebody and hate another, without throwing the entire person aside the moment you find something about them that isn’t the way you like. I’ve learned some of my best lessons from the worst people. Sunday thoughts.

Feb. 8, 2026

Using Claude Code as a general agent

When Anthropic announced Claude Skills in October 2025, Simon Willison said this in his blog post:

Claude Code is, with hindsight, poorly named. It’s not purely a coding tool: it’s a tool for general computer automation. Anything you can achieve by typing commands into a computer is something that can now be automated by Claude Code. It’s best described as a general agent. Skills make this a whole lot more obvious and explicit.

While I had this in mind, using Claude Code for music is something that hadn't crossed my mind before. Josh Cohenzadeh got Claude Code to write an original song, an original EDM song, an original rock song with vocals, and an original album, actually getting it to create an audio file for each using one-shot prompts. He didn't mention the model he used in his blog post, but I assume it is Opus 4.5 with extended thinking (the transcripts he added to the post had thinking blocks).

Reading the blog post made me realize that I am still blinded by Claude Code's poor naming, because I haven't used it for anything other than building software. I use it every day and I am aware that it is not "just a coding tool", yet I haven't really thought of doing anything else with it.

So, I thought, why not get it to create a video? Specifically, a bar chart race video illustrating the most popular girl names in the Telugu states from the 1950s to the 2020s (because why not?). So that's what I did. I started a Claude Code session with --dangerously-skip-permissions and gave it the following prompt (heavily inspired by Josh's prompts):

[... 323 words]
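If you want a feel for the underlying technique, here is a minimal bar-chart-race sketch using matplotlib's FuncAnimation. The names and counts are entirely synthetic, invented for illustration (a real run would use actual birth-records data):

```python
# A minimal bar-chart-race sketch with matplotlib's FuncAnimation.
# All names and popularity counts below are synthetic, made up for
# illustration only.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter

decades = ["1950s", "1960s", "1970s", "1980s",
           "1990s", "2000s", "2010s", "2020s"]
# Hypothetical popularity counts per decade for a few names
counts = {
    "Lakshmi": [90, 85, 70, 60, 45, 30, 20, 15],
    "Padma":   [70, 75, 65, 50, 35, 25, 15, 10],
    "Sravani": [ 5, 10, 25, 45, 70, 60, 40, 30],
    "Ananya":  [ 1,  2,  5, 15, 40, 75, 90, 85],
}

fig, ax = plt.subplots(figsize=(8, 4.5))

def draw(frame):
    ax.clear()
    # Re-rank the names every frame so bars reorder as popularity shifts
    ranked = sorted(counts.items(), key=lambda kv: kv[1][frame])
    ax.barh([name for name, _ in ranked], [c[frame] for _, c in ranked])
    ax.set_xlim(0, 100)
    ax.set_title(f"Most popular girl names (synthetic) - {decades[frame]}")

anim = FuncAnimation(fig, draw, frames=len(decades))
anim.save("name_race.gif", writer=PillowWriter(fps=2))
```

The agent's job is mostly assembling the real dataset and polishing the styling; the race mechanic itself is just a per-frame re-sort and redraw like this.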

Feb. 2, 2026

Birthing Jacho - my first AI agent

Discovery

It's wild what's happening right now. I first saw "clawd bot" show up as a search suggestion when I was searching for Claude-related subreddits on Reddit. I thought it was strange that so many people made the same typo for Claude. The next morning I saw that Dreams of Code on YouTube had posted a video about Clawdbot. Then I saw Matthew Berman's video. I wondered if it was a coordinated marketing campaign.

What's so special?

My first reaction was "How is it any different from Claude Code?" OpenClaw's website says:

"The AI that actually does things. Clears your inbox, sends emails, manages your calendar, checks you in for flights. All from WhatsApp, Telegram, or any chat app you already use."

"All from WhatsApp, Telegram, or any chat app you already use": I think this is the main part that makes it different. The rest of OpenClaw's features can also be achieved by Claude Code with little to no effort.

Trying it out

After installing and checking it out on my local machine first, I figured it was best to have it running on a server 24/7 and talk to it through Telegram. I launched a c7i-flex.large EC2 instance (2 vCPU, 4 GB RAM) with Ubuntu and installed it (curl -fsSL https://openclaw.ai/install.sh | bash). The baseline memory usage is 620 MB, but I imagine that adding a provider like WhatsApp needs more, which explains why the onboarding process kept crashing with a JavaScript heap out-of-memory error when I first tried it on a t2.micro instance.

[... 405 words]

Oct. 20, 2025

5 years of no social media

It was in October-November of 2020 that I decided to quit social media. I was an active user back then: a bunch of stories every day, a new post every once in a while, comments, tagging, and all of what was usual at that time.

Why I quit

There was enough content on the internet warning about the dangers of social media, and I was always aware of it. I was spending hours cycling through the apps, and it was clear that I was an addict. I quit it all after watching The Social Dilemma on Netflix.

Downsides

There are very obvious downsides, like not being up to date with what my friends are doing day to day and not staying connected with older friends or acquaintances. I do not understand trending memes. It is also harder to grow a personal brand (which is very important). But I find this a trade-off I am okay with, considering the long-term benefits.

How does it feel?

Honestly, I'm very proud. I feel like people have problems that are non-existent for me. I feel normal. The analogy I like is that of a diet. I feel healthy because I am conscious of what I consume.

Information diet

I am intentionally oblivious to events happening in the world. When I was an active user, I constantly saw content or news that would either enrage or upset me. Daily news is not important news. If something is important enough, I will hear it from friends or family.

[... 68 words]

May 22, 2025

Thinking is a commodity

I am undecided on how I feel about LLMs (especially reasoning models). I have always been careful about my thoughts and decision making. I like to do things most people label as "boring" work, like DYOR (Doing Your Own Research) and RTFM (Reading the Fucking Manual).

My personal experience has been that doing the "boring" work is essential to think clearly. It is what solidifies the concepts & strengthens the fundamentals. Good decision making requires clear thoughts & strong fundamentals.

But given that LLMs have now done the boring work (pretraining) and can also do the reasoning, anyone using LLMs is no longer thinking. And because everyone is using LLMs, everyone is basically thinking the same. This lack of diversity in thinking bothers me a lot.

When I look at a PR (pull request) full of AI-generated code, I don't know how to feel about it. Should I be frustrated that the PR author has not done the thinking, or does it even matter, if the code works?

LLM thinking comes at a price, and it can think more deeply if you pay more. If you do the "boring" work yourself, you fall behind. Does money matter now more than ever? Food for thought (no pun intended).

April 28, 2025

Getting the day back

For me, the best part of the show Adolescence is the final episode, especially how the family "gets the day back" twice after going through two terrible experiences. Even though it is a fictional story, the family's resilience got me emotional. If such things happened to me, I'm not sure I'd even be able to get the week back. There's something incredibly inspiring about seeing people get through such horrible experiences.

The concept of "getting the day back" resonated with me because it is very similar to the Buddhist teaching that feelings are impermanent. Conceptually, I think I have a good understanding of it, but not so much in practice.

Thanks to the show, I now have a phrase I can tell myself as part of my practice.