This site is built with fastpages, an easy-to-use blogging platform with extra features for Jupyter Notebooks.

Posts

  • Performer vs Softmax — From Kernel View to Fair Speed Tests

  • Indic CLIP Multimodal Understanding for Indic Languages

  • Why Batch Size Matters - The Surprising Difference Between Batch Training and Averaging

  • Attention Paths and Rank Collapse Part 1

  • Unveiling Position Encoding in Transformers - From Absolute to Relative with RoPE

  • Deep-Contextualized Embeddings ( ELMO )

  • Residual Learning

  • Creating a Maze Solver using Pix2Pix

  • Understanding Transformer Positional Encodings - A Mathematical Deep Dive

  • Gradient Clipping and Adaptive Learning Rates

  • Tight-fisted Optimizer ( Tiger )

  • im2col

  • AEDA ( An Easier Data Augmentation Technique for Text Classification )

  • Temporal Convolution Networks
