Raw notes include useful resources, incomplete thoughts, ideas, and learnings as I go about my day. You can also subscribe to the RSS feed to stay updated.
Anthropic just launched their much-awaited Claude Opus 4 and Claude Sonnet 4 models. They tweeted:
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4.
Claude Opus 4 is our most powerful model yet, and the world’s best coding model.
Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
Claude Sonnet 4 is for "near instant response" whereas Claude Opus 4 is for extended thinking and deeper reasoning. Both are significantly better at software engineering than Claude's previous models, as well as OpenAI's and Google's latest models (OpenAI Codex-1, OpenAI o3, OpenAI GPT-4.1, and Google Gemini 2.5 Pro).
Starting today, Claude Opus 4 is available to paid users, and Claude Sonnet 4 is available to free users as well on claude.ai.
The SEO landscape is changing and it's going to be heavily influenced by AI in the coming years. And here is a list of some really useful research papers that one should study to stay relevant.
I asked Grok 3's DeeperResearch tool to find a list of technologies throughout history that were claimed to replace software engineers, and it pulled up some cool mentions:
1. Automatic Programming and Compilers (1950s)
Era: 1950s
Description: Early efforts to automate programming, such as the A-0 compiler by Grace Hopper, focused on translating high-level languages like Fortran into machine code.
Claims: Suggested that programming could be automated, reducing the need for low-level coding expertise.
Impact: Enabled higher-level programming but did not eliminate the need for programmers to design algorithms and logic.
Source: Wikipedia: Automatic Programming
2. Fourth-Generation Programming Languages (4GLs) (1970s-1990s)
Era: 1970s-1990s
Description: High-level languages like SQL and FoxPro designed to be closer to human language, enabling non-programmers to perform tasks like database queries.
Claims: Hyped as potentially eliminating the need for traditional programmers, with claims that they were the "last generation" requiring code writing.
Impact: Simplified specific tasks but were limited for complex projects, requiring professional developers for broader applications.
Source: Wikipedia: Fourth-generation programming language
I came across a very interesting LinkedIn post by Judah Diament where he makes the point that vibe coding won't be replacing software engineers. Below are some interesting fragments of the post:
Vibe coding enables people who aren't well trained computer scientists to create complete, working applications. Is this a breakthrough? Not even close - there have been such tools since the late 1980s. See, for example: Apple HyperCard, Sybase PowerBuilder, Borland Delphi, FileMaker, Crystal Reports, Macromedia (and then Adobe) Flash, Microsoft VisualBasic, Rational Rose and other "Model Driven Development" tools, IBM VisualAge, etc. etc. And, of course, they all broke down when anything slightly complicated or unusual needed to be done (as required by every real, financially viable software product or service), just as "vibe coding" does.
Then he goes on to explain why vibe coding won't be replacing software engineers:
To claim that "vibe coding" will replace software engineers, one must: 1) be ignorant of the 40 year history of such tools or 2) have no understanding of how AI works or 3) have no real computer science education and experience or 4) all of the above, OR, most importantly, be someone trying to sell something and make money off of the "vibe coding" fad.
I like how the last paragraph is framed; it's definitely some food for thought.
"Comedy hits you in the head, drama hits you in the heart. If you want people to remember your work, you need both: comedy to lower their guard, drama to make them feel."
OpenAI has launched Codex, a cloud-based agent that writes code and works on multiple tasks at once. It can be accessed from inside ChatGPT at chatgpt.com/codex, but visiting this URL just redirected me back to ChatGPT, as it's only for ChatGPT Pro users, not Plus users.
Currently, it's in research preview, but it's said to have features like:
writing code for you
implementing new features
answering questions about your codebase
fixing bugs, etc.
The implementation is very interesting as it runs in its own cloud sandbox environment, and can be directly connected to your GitHub repo. It performs better than o1-high, o4-mini-high, and o3-high.
The cool thing is, it can also be guided by an AGENTS.md file placed within the repository.
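From what the launch describes, AGENTS.md is just a Markdown file with instructions for the agent: how to navigate the codebase, which commands to run for tests, and so on. A hypothetical sketch (the sections and paths here are my own, not an official template):

```markdown
# AGENTS.md

## Code style
- TypeScript, strict mode; prefer named exports.

## Testing
- Run `npm test` before finishing a task.
- Lint with `npm run lint`; fixes must pass CI.

## Navigation notes
- App code lives in `src/`; never edit generated files in `dist/`.
```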
Today, we’re also releasing a smaller version of codex-1, a version of o4-mini designed specifically for use in Codex CLI.
Yes, they're also releasing something for the Codex CLI. And about pricing and availability:
Starting today, we’re rolling out Codex to ChatGPT Pro, Enterprise, and Team users globally, with support for Plus and Edu coming soon.
For developers building with codex-mini-latest, the model is available on the Responses API and priced at $1.50 per 1M input tokens and $6 per 1M output tokens, with a 75% prompt caching discount.
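Here's a minimal sketch of what calling it might look like with the openai npm package (the model name comes from the quote above; the rest is standard Responses API usage, so treat the details as an assumption until you check the docs):

```javascript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Ask codex-mini-latest for code via the Responses API
const response = await client.responses.create({
  model: "codex-mini-latest",
  input: "Write a function that reverses a linked list in JavaScript.",
});

console.log(response.output_text);
```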
I am excited to see how this compares to Claude 3.7 Sonnet and Gemini 2.5 Pro in terms of coding, fixing bugs, designing UI, etc. I also uploaded a quick video about it that you can watch on YouTube.
I have been coming across a lot of cool MCP servers while browsing the internet, so I decided to create a dedicated page and keep collecting MCPs here. I have a JSON file where I can add new MCP servers, and they automatically show up in card format here.
BioMCP: Connects AI systems to authoritative biomedical data sources
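For reference, each entry in the JSON file is just a name, a description, and a link. A hypothetical shape (the exact field names are specific to my setup):

```json
[
  {
    "name": "BioMCP",
    "description": "Connects AI systems to authoritative biomedical data sources",
    "link": "https://…"
  }
]
```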
Connecting ChatGPT to Airtable gives you the superpower to get answers to 100s of questions in no time. Here's how to do that:
You need the following things to be able to connect ChatGPT to Airtable:
A paid Airtable account (the lowest plan is $24/month)
OpenAI API key (you'll have to set up a payment method on OpenAI, here)
The Scripting extension from Airtable (no additional cost), and
A script to call the OpenAI API inside Airtable
And below is the function that you can use to call the OpenAI API from inside Airtable and get the output.
```javascript
// Paste this into Airtable's Scripting extension
const openaiApiKey = "YOUR_OPENAI_API_KEY"; // your OpenAI API key

async function getGPTResponse() {
  const userInput = "why is the sky blue?";
  const maxTokens = 500;
  const temperature = 0.7;
  const model = "gpt-4.1";
  const systemPrompt = "be precise";

  const messages = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userInput },
  ];

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${openaiApiKey}`,
    },
    body: JSON.stringify({
      model,
      messages,
      max_tokens: maxTokens,
      temperature,
    }),
  });

  const data = await res.json();
  return data.choices?.[0]?.message?.content || null;
}
```
Here, userInput is the prompt that you give the AI, maxTokens is the maximum number of tokens in the response, temperature is the model temperature, and systemPrompt is the system prompt. The prompt here is hardcoded, but you can modify the script to dynamically fetch prompts from each row and then get the outputs accordingly.
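A minimal sketch of that dynamic version, assuming you change getGPTResponse to accept the prompt as a parameter, and assuming fields named "Prompt" and "Output" (rename everything to match your base):

```javascript
const table = base.getTable("Content"); // your table name here
const query = await table.selectRecordsAsync({ fields: ["Prompt", "Output"] });

for (const record of query.records) {
  const prompt = record.getCellValueAsString("Prompt");
  // skip empty prompts and rows that already have an output
  if (!prompt || record.getCellValueAsString("Output")) continue;

  const result = await getGPTResponse(prompt);
  await table.updateRecordAsync(record.id, { "Output": result });
}
```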
ChatGPT is very good at adapting this implementation to your base data: just give it the above script and other details in the prompt, and it will give you the final code that you can put inside the Scripting extension.
Also, there's a generic version of this script at InvertedStone that you can get and use. You can generate almost any kind of content using this script, not just with ChatGPT but also with other AI models like Claude, Gemini, Perplexity, and more.
The ultimate test of whether I understand something is if I can explain it to a computer. I can say something to you and you’ll nod your head, but I’m not sure that I explained it well. But the computer doesn’t nod its head. It repeats back exactly what I tell it. In most of life, you can bluff, but not with computers.
Came to know that Google Docs now has "Copy as Markdown" and "Paste from Markdown" options under the Edit menu at the top. Selecting some text enables the copy option, and any Markdown you paste is inserted into the document with proper formatting.
Very cool!
By the way, Google Docs already had the option to download the entire document as a .md file, but these copy and paste options are even more user-friendly.
I saw a person using React Router inside Next.js and I have so many questions. The navigation is visibly very fast, but my questions are:
Is it good for public pages? Because I think it will have the same SEO issues as SPAs.
Does it make the codebase more complicated?
Upon looking around, I found a detailed blog post on building an SPA using Next.js and React Router. It mentions the reason for not using the Next.js router:
Next.js is not as flexible as React Router! React Router lets you nest routers hierarchically in a flexible way. It's easy for any "parent" router to share data with all of its "child" routes. This is true for both top-level routes (e.g. /about and /team) and nested routes (e.g. /settings/team and /settings/user).
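To make that concrete, here's a minimal sketch of the nesting and data sharing the quote describes, using React Router v6's Outlet context (my own toy example, not code from the post):

```jsx
import { createBrowserRouter, Outlet, useOutletContext } from "react-router-dom";

// "Parent" route: holds data once and shares it with all child routes
function SettingsLayout() {
  const settings = { team: "Acme", user: "sara" };
  return <Outlet context={settings} />;
}

// Child routes read the shared data via the outlet context
function TeamSettings() {
  const settings = useOutletContext();
  return <h1>Team: {settings.team}</h1>;
}

function UserSettings() {
  const settings = useOutletContext();
  return <h1>User: {settings.user}</h1>;
}

const router = createBrowserRouter([
  {
    path: "/settings",
    element: <SettingsLayout />,
    children: [
      { path: "team", element: <TeamSettings /> }, // /settings/team
      { path: "user", element: <UserSettings /> }, // /settings/user
    ],
  },
]);
```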
I do understand why someone would want to use Next.js but I have yet to learn more about this React Router thing.
BRB.
Update:
Josh has written a new short blog post about how he did it, definitely worth reading and understanding the process.
Just noting this for myself for future reference: whenever I have to create cards, I must use this simpler method each time.
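Assuming markup like a `.card-container` with `.card` children:

```html
<div class="card-container">
  <div class="card">…</div>
  <div class="card">…</div>
  <div class="card">…</div>
</div>
```

the CSS is just: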
```css
.card-container {
  display: grid;
  grid-template-columns: repeat(auto-fit, minmax(300px, 1fr));
  gap: 20px;
  margin: 0 auto;
}
/* and then whatever CSS for .card here */
```
I’ve compiled a list of websites for important web technologies that are likely to have old but functional designs. These are fundamental tools for the internet, often open-source, and their websites prioritize functionality over aesthetics, reflecting their long-standing nature.
I came across a GitHub repo containing the complete Python code to host and run a WhatsApp AI chatbot. I have forked the repo, as I am thinking of making such a chatbot for myself. The requirements are mentioned as:
WaSenderAPI: Only $6/month for WhatsApp integration
Gemini AI: Free tier with 1500 requests/month (see the sketch after this list)
Hosting: Run locally or on low-cost cloud options
No WhatsApp Business API fees: Uses WaSenderAPI as an affordable alternative
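I haven't looked at WaSenderAPI's docs yet, so no code for that part, but the Gemini side is only a few lines with the official @google/generative-ai package (a sketch; the model name is my choice, not necessarily what the repo uses):

```javascript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

// Generate a reply for an incoming WhatsApp message
const result = await model.generateContent("Reply politely to: what are your store hours?");
console.log(result.response.text());
```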
I will learn more about the WhatsApp Business API and how it can be used to create a WhatsApp chatbot for specific topics that people can interact with, and then how it can all be monetized.
Stripe has developed a new approach to analyzing transactions using a transformer-based foundation model. Earlier, they relied on traditional machine learning models, which had limitations; the new model is supposed to increase conversion even further and significantly reduce fraudulent transactions.
Gautam Kedia, an AI/ML engineer at Stripe, explained this in a detailed X post. He mentions:
So we built a payments foundation model—a self-supervised network that learns dense, general-purpose vectors for every transaction, much like a language model embeds words. Trained on tens of billions of transactions, it distills each charge’s key signals into a single, versatile embedding.
This approach improved our detection rate for card-testing attacks on large users from 59% to 97% overnight.
While I did have a loose idea of what a transformer is, I looked up its definition again to understand it better in the context of payments:
A Transformer is a type of neural network architecture that has revolutionized natural language processing (NLP) and is now being applied to other domains, as seen in the Stripe example. Its key innovation is the attention mechanism.
The attention mechanism allows the model to weigh the importance of different parts of the input sequence when processing any single part.
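To get a feel for what "weighing the importance of different parts of the input" means, here's a toy sketch of scaled dot-product attention over three made-up items (just the core formula, softmax of q·k divided by sqrt(d); nothing to do with Stripe's actual model):

```javascript
// Three items in a "sequence", each embedded as a 2-d vector
const keys = [
  [1.0, 0.0], // item 1
  [0.9, 0.1], // item 2 (similar to item 1)
  [0.0, 1.0], // item 3 (very different)
];
const values = [10, 20, 30]; // a scalar "payload" per item
const query = [1.0, 0.0];    // we're processing something like item 1

const d = query.length;
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);

// Attention scores: similarity of the query to every key, scaled by sqrt(d)
const scores = keys.map((k) => dot(query, k) / Math.sqrt(d));

// Softmax turns scores into weights that sum to 1
const exps = scores.map(Math.exp);
const sum = exps.reduce((a, b) => a + b, 0);
const weights = exps.map((e) => e / sum);

// Output: weighted mix of the values; similar items contribute more
const output = values.reduce((acc, v, i) => acc + weights[i] * v, 0);

console.log(weights); // items 1 and 2 get most of the weight
console.log(output);
```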
Further, I asked Gemini to explain this entire thing to me in simpler words, and here's how it explained it:
Think of it like reading a book. An older model might read word by word and only remember the last few words. A Transformer, with its attention mechanism, can look back at earlier parts of the book to understand the meaning of the current sentence in the broader context. In the payment world, this means understanding the significance of a transaction not just in isolation, but in the context of previous transactions.
Someone added more than 81 MCP tools to their Cursor IDE, and it started showing a warning saying "too many tools can degrade performance", suggesting the use of fewer than 40 tools.
you'll be able to disable individual tools in 0.50 :)
But the problem still remains: if MCPs are the future, there has to be a way for them to be managed automatically, so that I don't need to manually enable or disable tools.
Firefox's source code has moved to GitHub. I don't know how this changes things for Firefox, but there must be some reason for it. A person who works at Mozilla commented:
The Firefox code has indeed recently moved from having its canonical home on mercurial at hg.mozilla.org to GitHub. This only affects the code; bugzilla is still being used for issue tracking, phabricator for code review and landing, and our taskcluster system for CI.
On the backend, once the migration is complete, Mozilla will spend less time hosting its own VCS infrastructure, which turns out to be a significant challenge at the scale, performance and availability needed for such a large project.
I think it's actually an understandable strategic move from Mozilla. They might lose some income from Google and probably have to cut staff. But to keep the development of Firefox running, they want to involve more people from the community, and GitHub is the tool that brings the most visibility on the market right now and is known by many developers. So the hurdle to getting involved is much lower.
I think you can dislike the general move to a service like GitHub instead of GitLab (or something else). But I think we all benefit from the fact that Firefox's development continues and that we have a competing engine on the market.
Some folks seem excited about the migration, whereas others are upset about the move to a closed-source platform, GitHub. But if this really makes the browser better, I am excited for the move.
1. Run `llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF`
Note: you may need to add `-ngl 99` to enable GPU (if you are using an NVidia/AMD/Intel GPU)
Note (2): You can also try other models here
2. Open index.html
3. Optionally change the instruction (for example, make it return JSON)
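Under the hood, llama-server exposes an OpenAI-compatible API (on localhost:8080 by default), so I'd guess index.html does something roughly like this (my sketch, assuming a `canvas` holding the current webcam frame; not the demo's actual code):

```javascript
const res = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "What do you see?" },
          // a webcam frame, captured from a <canvas> as a base64 data URL
          { type: "image_url", image_url: { url: canvas.toDataURL("image/jpeg") } },
        ],
      },
    ],
    max_tokens: 100,
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);
```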
A person on Reddit created an MCP server to control a single LED bulb via natural language. It does look like overkill, but that's not the point. The person asks it to blink the LED twice, and it does. Beautiful.
The tech used is:
Board/SoC: Raspberry Pi CM5 (a beast)
Model: Qwen-2.5-3B (Qwen-3: I'm working on it)
Perf: ~5 tokens/s, ~4-5 GB RAM
And the control pipeline is explained as:
MCP-server + LLM + Whisper (All on CM5) → RP2040 over UART → WS2812 LED
Everything runs locally on the Raspberry Pi CM5 device, and here's the entire code on GitHub that one can use.
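Their code is Python, but just to illustrate the shape of it, here's a minimal sketch of exposing a blink tool with the MCP TypeScript SDK (the tool name, schema, and the UART bit are my own placeholders, not their implementation):

```javascript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "led-control", version: "1.0.0" });

// The LLM can call this tool when you ask it to "blink the LED twice"
server.tool("blink_led", { times: z.number().int().min(1) }, async ({ times }) => {
  // here you'd send the command to the RP2040 over UART,
  // which in turn drives the WS2812 LED
  return { content: [{ type: "text", text: `Blinked the LED ${times} time(s)` }] };
});

// Expose the server over stdio so an MCP client can connect
await server.connect(new StdioServerTransport());
```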
Came across a Reddit post where the person bought a second-hand Lenovo Thinkpad P53 for €150 and successfully removed the supervisor password from it.
I found it very cool how the person unlocked the BIOS, so I am saving this post for future reference, in case I decide to get something like this for myself. There are some additional resources shared for the same as well - like this forum post and this YouTube video.
The history of flatbread goes back to 550 BC, when Persian soldiers used to bake it, and the first mention of the word "pizza" was recorded in AD 997 in Italy.
Here's a cool timeline for the history of pizza that you can refer to. It has multiple major events listed from 550 BC till 2020 - very interesting to go through.