Feb. 28, 2026

Using AI agents for pixel art animations

Claude Code has a mascot named Claw'd, who appears in all of its feature announcement posts. I find these animations extremely cute and wondered whether I could create such animations myself.

I showcased before how coding agents can actually be used as general agents to do things outside of coding - like generating infographic videos. I also recently subscribed to the Google AI Pro plan and have been using the Gemini CLI with Gemini 3.1 Pro to make some frontend design changes for FanMeter.

So I thought: why not test both agents to see which one generates the better animation? I launched both of them in their respective YOLO modes (--dangerously-skip-permissions for Claude Code and --yolo for Gemini CLI) and gave this prompt:

Create a pixel art animation GIF of a 26 year old guy who spends his week teaching AWS classes (offline), working fulltime (WFH) as a DevOps engineer and is also addicted to AI software development (Claudoholic). High FPS, high definition. Not less than 10 seconds.

Gemini finished first and gave me this. I'd give it a 4/10 at best:

Gemini 3.1 Pro with Gemini CLI

Claude took a while but gave me a much better output. An impressive 7/10:

Opus 4.6 with Claude Code

Interestingly, because I use Claude as my daily driver, it knew how I got the idea for Fan Meter and added a part where I wake up in the middle of the night to build something.

[... 72 words]

Claude Code PM Noah Zweben posted this gif today to announce that the remote control feature is rolling out to Pro users because "they deserve to use the bathroom too". I related so hard that I couldn't resist posting it here.

Claude Code remote control gif



Feb. 22, 2026

Lessons from Building FanMeter

Idea originated at 1:45 AM on 21st February. MVP shipped ~27 hours later, at 5 AM on 22nd February. And no, I have not touched the code since, and I don't feel bad about it.

At the time of writing this entry, I have had 4 hours of sleep in the past ~62 hours. I still feel excited to blog about it.

Now, I know that spending 27 hours to vibe code an application in 2026 is probably too much time; I know this because I did ship another vibe-coded app in less than 2 hours. But fanmeter.in is different because it is the first one I'm proud of. It is an application I'd use myself (I couldn't say that about the other one). It is also one that I know my friends and family enjoy playing, at least enough to share it with their own friends.

Here are some of my learnings from building it:

1. Taste is the differentiator

One of the main reasons I'm happy with Fan Meter is how the UI design turned out. Inspired by Greg Isenberg's podcast with Suraya Shivji, I looked for inspiration on cosmos and generated the design screens using Flux 2 Pro on Weavy AI.

I find this to be a great workflow. The search and similarity algorithm on cosmos is top-notch, which lets you choose a theme and get an image model to design UI screens matching it. There was a lot of back-and-forth iteration in this process as I kept chasing and fine-tuning for the exact look and feel I was happy with. I will be experimenting a lot more, and will write a blog post on getting LLMs to build your app's frontend to match your taste.

[... 742 words]

Feb. 21, 2026

[...] But I do love the concept [of OpenClaw] and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level.

Feb. 20, 2026

Gemini 3.1 Pro Preview scored highest in the Artificial Analysis Intelligence Index but its most significant advantage might be its price and token efficiency. Our evaluations cost <50% to run on Gemini 3.1 Pro Preview compared to Claude Opus 4.6 (max) and GPT-5.2 (xhigh)

Feb. 19, 2026

Claude Sonnet 4.6, the Token Muncher

Anthropic released Claude Sonnet 4.6 yesterday. The rumours from a few weeks ago were that the next model from the Sonnet family would be Sonnet 5, but here we are.

The main observation from the announcement is its improvement in computer use skills and performance in economically valuable office tasks (it is the #1 model in the Office tasks benchmark: GDPval-AA).

Personally, I was excited to see how similar it is to Opus 4.5 in the benchmarks, which means we are getting Opus 4.5-level performance at the same cost as Sonnet 4.5 ($3/$15 per million tokens).

[...some early developers] often even prefer it [Sonnet 4.6] to our smartest model from November 2025, Claude Opus 4.5

Agent Output

Experiment

I have always liked data visualization and I'm really enjoying using these models to generate visualizations for trends and datasets that I find interesting. Inspired by this post on r/dataisbeautiful, I wanted to check if Sonnet 4.6 could generate a similar one but for Indian states instead. So I launched a new Claude Code session with --dangerously-skip-permissions and gave it the following prompt:

Hey Claude, create an infographic image (maybe a line graph) showing the average male height by birth year in Indian states. You can use the internet to fetch the data that is required to create this image. Try to find the data from the oldest possible to the latest possible. Include the top 10 most interesting states in terms of trends (be sure to include AP). Make it look aesthetic. Include some text highlighting one interesting trend. Don’t ask me any questions. You should not complete until you’ve generated an image in the current directory for me to view.

[... 742 words]
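For anyone curious what a script produced by such a prompt might look like, here is a minimal sketch. The states, heights, and trend annotation below are synthetic, made up purely for illustration (a real run would fetch actual survey data):

```python
# Minimal sketch of the kind of chart script such a prompt might yield.
# All heights below are synthetic illustrative numbers, NOT real data.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

birth_years = list(range(1960, 2001, 10))
# Hypothetical average male height (cm) by birth year for a few states
states = {
    "Andhra Pradesh": [163.0, 163.8, 164.5, 165.4, 166.2],
    "Punjab":         [168.0, 168.6, 169.3, 170.1, 170.8],
    "Kerala":         [165.5, 166.4, 167.2, 168.1, 168.9],
}

fig, ax = plt.subplots(figsize=(8, 5))
for state, heights in states.items():
    ax.plot(birth_years, heights, marker="o", label=state)

ax.set_xlabel("Birth year")
ax.set_ylabel("Average male height (cm)")
ax.set_title("Average male height by birth year (synthetic data)")
ax.legend()
# Highlight one trend with a text annotation, as the prompt asks
ax.annotate("Steady rise across cohorts",
            xy=(1990, 170.1), xytext=(1963, 170.5),
            arrowprops=dict(arrowstyle="->"))
fig.savefig("height_trends.png", dpi=150)
```

The interesting part of the real task is the data gathering, not the plotting; the plotting itself is a few dozen lines like these.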


Feb. 13, 2026

Is it me or is the rate of model releases accelerating to an absurd degree? Today we have Gemini 3 Deep Think and GPT 5.3 Codex Spark. Yesterday we had GLM5 and MiniMax M2.5. Five days before that we had Opus 4.6 and GPT 5.3. Then maybe two weeks before that, I think, we had Kimi K2.5.
logicprog

GPT-5.3-Codex-Spark and AI coding addiction

OpenAI announced the release of their new coding model GPT-5.3-Codex-Spark today, only a week after the release of GPT-5.3-Codex. They say it has been designed for real-time coding, capable of serving more than 1,000 tokens per second. Real-time coding here means seeing the results of your requested changes immediately through near-instant responses. It runs on Cerebras for high-speed inference.

When I read 'ultra-fast model', I first thought of Fast mode for Opus 4.6 in Claude Code. But the primary difference is that Fast mode is the same model with a different API configuration that prioritizes speed over cost, while Codex-Spark is a different model with a drop in quality and capabilities.

Also interesting to note: the reduced latency is not just due to improved model speed, but also to improvements made to the harness itself:

"As we trained Codex-Spark, it became apparent that model speed was just part of the equation for real-time collaboration—we also needed to reduce latency across the full request-response pipeline. We implemented end-to-end latency improvements in our harness that will benefit all models [...] Through the introduction of a persistent WebSocket connection and targeted optimizations inside of Responses API, we reduced overhead per client/server roundtrip by 80%, per-token overhead by 30%, and time-to-first-token by 50%. The WebSocket path is enabled for Codex-Spark by default and will become the default for all models soon."

I wonder if all other harnesses (Claude Code, OpenCode, Cursor, etc.) can make similar improvements to reduce latency. I've been vibe coding (or doing agentic engineering) with Claude Code a lot for the last few days, and I've had some tasks take as long as 30 minutes.
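The gain from a persistent connection is easy to demonstrate in miniature. This toy benchmark (nothing to do with the Codex harness itself) uses plain TCP sockets against a local echo server: reconnecting for every message, like HTTP without keep-alive, versus reusing one connection, like a WebSocket:

```python
# Toy benchmark: a fresh TCP connection per message vs. one persistent
# connection, against a local echo server. Illustrative only.
import socket
import threading
import time

srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # pick a free ephemeral port
srv.listen()
port = srv.getsockname()[1]

def serve():
    # Echo server: handles one client at a time (our clients are sequential)
    while True:
        conn, _ = srv.accept()
        with conn:
            while data := conn.recv(64):
                conn.sendall(data)

threading.Thread(target=serve, daemon=True).start()

N = 200

# Like HTTP without keep-alive: connect, send, receive, close, repeat
t0 = time.perf_counter()
for _ in range(N):
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"ping")
        c.recv(64)
per_request = time.perf_counter() - t0

# Like a persistent WebSocket: one handshake, then many roundtrips
t0 = time.perf_counter()
with socket.create_connection(("127.0.0.1", port)) as c:
    for _ in range(N):
        c.sendall(b"ping")
        c.recv(64)
persistent = time.perf_counter() - t0

print(f"reconnect each time: {per_request:.3f}s, persistent: {persistent:.3f}s")
```

Even on loopback, skipping the per-message handshake wins; over a real network, with TLS on top, the difference per roundtrip is far larger.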

[... 178 words]


Feb. 11, 2026

If you are in any situation where being right matters, you would, at this point, be making a mistake to not ask a frontier LLM for help. That can mean checking your own work, second opinions on other experts, or getting help with a complex problem. Have judgement, but use them

Feb. 10, 2026

TIL that I can use Force Touch on Mac to look up text and see more information such as definitions, Wikipedia pages, movies, music, podcasts, etc. I had accidentally force touched many times, but because I never did it intentionally, I never really read what was in the Look Up popup.

My default mode of looking things up is to launch a private browser window and Google the text. That usually ends with me reading the Wikipedia page. Now I can just force touch the text, read the Wikipedia preview, and only then dive into the full page if I find it interesting.

I learned this when reading the Ghostty documentation.

Feb. 9, 2026

A sign of maturity is the ability to love one part of somebody and hate another, without throwing the entire person aside the moment you find something about them that isn’t the way you like. I’ve learned some of my best lessons from the worst people. Sunday thoughts.

Feb. 8, 2026

Using Claude Code as a general agent

When Anthropic announced Claude Skills in October 2025, Simon Willison said this in his blog post:

Claude Code is, with hindsight, poorly named. It’s not purely a coding tool: it’s a tool for general computer automation. Anything you can achieve by typing commands into a computer is something that can now be automated by Claude Code. It’s best described as a general agent. Skills make this a whole lot more obvious and explicit.

While I had this in mind, using Claude Code for music is something that hadn't crossed my mind before. Josh Cohenzadeh got Claude Code to write an original song, an original EDM song, an original rock song with vocals, and an original album, actually getting it to create an audio file for each using one-shot prompts. He didn't mention the model he used in his blog post, but I assume it is Opus 4.5 with extended thinking (the transcripts he added to the post had thinking blocks).

Reading the blog post made me realize that I am still blinded by Claude Code's poor naming, because I haven't used it for anything other than building software. I use it every day and I am aware that it is not "just a coding tool", yet I haven't really thought of doing anything else with it.

So, I thought, why not get it to create a video? Specifically, a bar chart race video illustrating the most popular girl names in the Telugu states from the 1950s to the 2020s (because why not?). So that's what I did. I started a Claude Code session with --dangerously-skip-permissions and gave it the following prompt (heavily inspired by Josh's prompts):

[... 323 words]
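If you want a feel for the underlying technique, here is a minimal bar-chart-race sketch using matplotlib's FuncAnimation. The names and counts are entirely synthetic, invented for illustration (a real run would use actual birth-records data):

```python
# A minimal bar-chart-race sketch with matplotlib's FuncAnimation.
# All names and popularity counts below are synthetic, made up for
# illustration only.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter

decades = ["1950s", "1960s", "1970s", "1980s",
           "1990s", "2000s", "2010s", "2020s"]
# Hypothetical popularity counts per decade for a few names
counts = {
    "Lakshmi": [90, 85, 70, 60, 45, 30, 20, 15],
    "Padma":   [70, 75, 65, 50, 35, 25, 15, 10],
    "Sravani": [ 5, 10, 25, 45, 70, 60, 40, 30],
    "Ananya":  [ 1,  2,  5, 15, 40, 75, 90, 85],
}

fig, ax = plt.subplots(figsize=(8, 4.5))

def draw(frame):
    ax.clear()
    # Re-rank the names every frame so bars reorder as popularity shifts
    ranked = sorted(counts.items(), key=lambda kv: kv[1][frame])
    ax.barh([name for name, _ in ranked], [c[frame] for _, c in ranked])
    ax.set_xlim(0, 100)
    ax.set_title(f"Most popular girl names (synthetic) - {decades[frame]}")

anim = FuncAnimation(fig, draw, frames=len(decades))
anim.save("name_race.gif", writer=PillowWriter(fps=2))
```

The agent's job is mostly assembling the real dataset and polishing the styling; the race mechanic itself is just a per-frame re-sort and redraw like this.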

Feb. 2, 2026

Birthing Jacho - my first AI agent

Discovery

It's wild what's happening right now. I first saw "clawd bot" show up as a search suggestion when I was searching for Claude-related subreddits on Reddit. I thought it was strange that so many people made the same typo for Claude. The next morning I saw that Dreams of Code on YouTube had posted a video about Clawdbot. Then I saw Matthew Berman's video. I wondered if it was a coordinated marketing campaign.

What's so special?

My first reaction was "How is it any different from Claude Code?" OpenClaw's website says:

"The AI that actually does things. Clears your inbox, sends emails, manages your calendar, checks you in for flights. All from WhatsApp, Telegram, or any chat app you already use."

"All from WhatsApp, Telegram, or any chat app you already use": I think this is the main part that makes it different. The rest of OpenClaw's features can also be achieved by Claude Code with little to no effort.

Trying it out

After installing and checking it out on my local machine first, I figured it was best to have it running on a server 24/7 and talk to it through Telegram. I launched a c7i-flex.large EC2 instance (2 vCPU, 4 GB RAM) with Ubuntu and installed it (curl -fsSL https://openclaw.ai/install.sh | bash). The baseline memory usage is 620 MB, but I imagine that adding a provider like WhatsApp needs more, which explains why the onboarding process kept crashing with a JavaScript heap out-of-memory error when I first tried it on a t2.micro instance.

[... 405 words]

Oct. 20, 2025

5 years of no social media

It was in October-November of 2020 that I decided to quit social media. I was an active user back then: a bunch of stories every day, a new post every once in a while, comments, tagging, and all of what was usual at that time.

Why I quit

There was enough content on the internet warning about the dangers of social media, and I was always aware of it. I was spending hours cycling through the apps, and it was clear that I was an addict. I quit it all after watching The Social Dilemma on Netflix.

Downsides

There are very obvious downsides, like not being up to date with what my friends are doing day to day and not staying connected with older friends or acquaintances. I do not understand trending memes. It is also harder to grow a personal brand (which is very important). But I find this a trade-off I am okay with, considering the long-term benefits.

How does it feel?

Honestly, I'm very proud. I feel like people have problems that are non-existent for me. I feel normal. The analogy I like is that of a diet. I feel healthy because I am conscious of what I consume.

Information diet

I am intentionally oblivious to events happening in the world. When I was an active user, I constantly saw content or news that would either enrage or upset me. Daily news is not important news. If something is important enough, I will hear it from friends or family.

[... 68 words]

May 22, 2025

Thinking is a commodity

I am undecided on how I feel about LLMs (especially reasoning models). I have always been careful about my thoughts and decision making. I like to do things most people label as "boring" work, like DYOR (Doing Your Own Research) and RTFM (Reading the Fucking Manual).

My personal experience has been that doing the "boring" work is essential to think clearly. It is what solidifies the concepts & strengthens the fundamentals. Good decision making requires clear thoughts & strong fundamentals.

But given that LLMs have now done the boring work (pretraining) and can also do the reasoning, anyone using LLMs is no longer thinking. And because everyone is using LLMs, everyone is basically thinking the same. This lack of diversity in thinking bothers me a lot.

When I look at a PR (pull request) full of AI-generated code, I don't know how to feel about it. Should I be frustrated that the PR author has not done the thinking, or does it even matter, if the code works?

LLM thinking comes at a price, and it can think more deeply if you pay more. If you do the "boring" work yourself, you fall behind. Does money matter now more than ever? Food for thought (no pun intended).

April 28, 2025

Getting the day back

For me, the best part of the show Adolescence is the final episode, especially how the family "gets the day back" twice after going through two terrible experiences. Even though it is a fictional story, the family's resilience got me emotional. If such things happened to me, I'm not sure I'd even be able to get the week back. There's something incredibly inspiring about seeing people get through such horrible experiences.

The concept of "getting the day back" resonated with me because it is very similar to the Buddhist teaching that feelings are impermanent. Conceptually, I think I have a good understanding of it, but not so much in practice.

Thanks to the show, I now have a phrase I can tell myself as part of my practice.