Notes on machine learning, VLMs, and building things.
Google’s new Gemini 3.1 Flash TTS adds natural-language ‘audio tags’ like [excitement] or [like dracula] that steer voice style inline. I run it in English and Hindi and embed the results.
Two worked examples — a still photo and a drone video — running Falcon Perception and Gemma 4 locally on an M2 Max. Counts you can audit, from an agent that never does arithmetic in its head.
Hi-res ESRI imagery + Gemma 4 in the PR #926 agent loop on three brick-kiln classes. Twelve tiles, four Falcon queries each, three prompt variants — with per-tile mask grids and Gemma’s reasoning shown. 67% zero-shot, and the remaining 33% is instructive.
Gemma 4 sounds confident. Falcon Perception actually measures. Together they beat either alone — here is the evidence.
A hands-on guide to automating web browsing for AI agents using the agent-browser CLI
Learn how to build MCP servers with FastMCP - enabling LLMs like Claude to interact with your custom tools, databases, and APIs.
Configure Brave browser to hide personal browsing history during classroom demos and presentations
A comprehensive terminal optimization covering security fixes, modern tooling, and performance improvements
Learn how to build neural networks that gracefully handle missing input channels by explicitly encoding missingness patterns
Understanding vector databases through text and image search
Learn how to call REST APIs from Google Sheets using a custom Apps Script function.
Comparing DINOv2 fine-tuning vs training a CNN from scratch for binary image classification
Visualizing ERA5 data grid coverage over India
Exploring how temperature scaling affects the randomness and diversity of language model outputs through mathematical analysis and interactive visualizations
Visualizing and filtering multi-image classification samples from the VOC dataset for improved training data quality
Demonstrating how logit masking enforces valid transitions in sleep stage classification
Setting up Python Environment on Linux Remote Servers with GPU Support
Tired of waiting 5+ minutes for Quarto to rebuild all 145 posts when you only changed one? Here’s how to make GitHub Actions only rebuild what actually changed.
Transferring large projects with thousands of small files over SSH can be painfully slow. Here’s how we solved it with parallel transfers.
Keyboard shortcuts on Mac
from ultralytics import YOLO, checks, hub import pandas as pd
import numpy as np import matplotlib.pyplot as plt %matplotlib inline %config InlineBackend.figure_format = 'retina' import torch import torch.nn as nn import torch.nn…
import torch import torch.nn as nn import torch.optim as optim from torch.utils.data import DataLoader, TensorDataset import numpy as np import matplotlib.pyplot as plt
import numpy as np import matplotlib.pyplot as plt %matplotlib inline %config InlineBackend.figure_format = 'retina'
import numpy as np import matplotlib.pyplot as plt %matplotlib inline import torch import torch.nn as nn import torch.nn.functional as F %config…
import torch import torch.nn as nn import matplotlib.pyplot as plt import numpy as np # Retina mode %config InlineBackend.figure_format = 'retina'
# Create…
import tiktoken
encoding = tiktoken.get_encoding("cl100k_base")
encoding.encode("Hello World! This is a simple notebook")
[9906, 4435, 0, 1115…
import numpy as np import time import matplotlib.pyplot as plt import pandas as pd # Retina display %config InlineBackend.figure_format = 'retina'
log_size …
import matplotlib.pyplot as plt import torch %matplotlib inline %config InlineBackend.figure_format='retina'
# Download some MNIST to demonstrate…
import networkx as nx import numpy as np import matplotlib.pyplot as plt import pandas as pd %matplotlib inline # Retina display %config InlineBackend.figure_format =…
import numpy as np import matplotlib.pyplot as plt import torch import seaborn as sns import pandas as pd dist =torch.distributions sns.reset_defaults() sns.set_co…
import torch import torch.nn as nn import torch.nn.functional as F import torch.optim as optim import numpy as np import matplotlib.pyplot as plt from torch.utils.da…
from jax import vmap, jit, grad import jax.numpy as jnp # Enable 64-bit mode from jax.config import config config.update("jax_enable_x64", True) import matplotl…
import jax.numpy as jnp import jax from jax import random import tensorflow_probability.substrates.jax as tfp tfd = tfp.distributions import pandas as pd import matpl…
Some useful tidbits in sympy
A programming introduction to Autoencoders in JAX
Probability Calibration
Multi-output Gaussian Process
import numpy as np import matplotlib.pyplot as plt import torch import seaborn as sns from functools import partial sns.reset_defaults() sns.set_context(context="ta…
import numpy as np import matplotlib.pyplot as plt import torch import seaborn as sns import pandas as pd import pyro dist =pyro.distributions sns.reset_defaults() …
import numpy as np import matplotlib.pyplot as plt import torch import seaborn as sns import pandas as pd t_dist =torch.distributions sns.reset_defaults() sns.set_…
import torch dist = torch.distributions import matplotlib.pyplot as plt import seaborn as sns import numpy as np %matplotlib inline
learn
import torch from jax import grad import jax.numpy as jnp
How to learn the parameters of a GP
using Plots theme(:default) using LinearAlgebra using LaTeXStrings
Blurring an image selectively using Affinity Photo
Audio filtering techniques and applications
Running Python scripts on server over ssh and getting back content
Some of my shortcuts on the iPad
My iPad computing setup
My Mac Setup
Implementation and visualization of Generative Adversarial Networks
Using GPy and some interactive visualisations for understanding GPR and applying on a real world data set
From the ground up!
A programming introduction to Active Learning with Bayesian Linear Regression.
A programming introduction to NNs.
Simple scripts for downloading weather data
A programming introduction to Bayesian Linear Regression.
A minimal example of using markdown with fastpages.
An interactive exploration of Gaussian processes.
HashMaps for programming interviews
How is the world changing over the years!
AQ sensing in India
A programming introduction to query by committee strategy for active learning
Denoising
Some personal reflections..
Neural networks to learn the embeddings, and how to combine them!
Adagrad optimizer for matrix factorisation
What if we start from some prior!
Exploring data in Matplotlib
Constrained NMF using CVXPY!
Out of tensor factorisation
What if we need to predict for entries not within the matrix?!
Towards amazing plots in research papers!
Maximize based on what you know, re-estimate!
Simulating a continuous HMM
---
title: "Writing"
subtitle: "Notes on machine learning, VLMs, and building things."
listing:
  contents: posts
  sort: "date desc"
  type: default
  categories: true
  sort-ui: false
  filter-ui: false
  fields: [image, date, title, description, categories, author, reading-time]
page-layout: full
---