Pinned · Published in Towards Data Science
Quality > quantity: Cleaning noisy datasets using training dynamics
May 6, 2021
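For a taste of the idea, here is a minimal sketch (my own illustration, not the article's code) of filtering by training dynamics: log each example's gold-label probability across epochs, then flag examples the model stays consistently unsure about as likely label noise. The array, thresholds, and variable names below are all made up.

```python
import numpy as np

# epoch_probs[e][i] = model's probability of example i's gold label at epoch e.
# Illustrative values; in practice you would log these during training.
epoch_probs = np.array([
    [0.90, 0.20, 0.60],
    [0.95, 0.15, 0.70],
    [0.97, 0.10, 0.80],
])

confidence = epoch_probs.mean(axis=0)   # mean gold-label probability per example
variability = epoch_probs.std(axis=0)   # spread of that probability across epochs

# Examples that are low-confidence AND low-variability are candidate label noise:
# the model is consistently, not just occasionally, unsure about them.
noisy = np.where((confidence < 0.3) & (variability < 0.1))[0]
print(noisy)  # -> [1]
```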
Pinned · Published in Towards Data Science
Harvesting the power of Mutual Information to find bias in your NLP dataset
Local mutual information to find biased terms in NLP datasets, and why it should be preferred over pointwise mutual information.
Jan 14, 2021
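The gist, sketched with made-up probabilities: local mutual information weights PMI by the joint probability p(term, label), so a rare term that co-occurs with a label by fluke no longer outranks a frequent, genuinely biased one.

```python
import math

def pmi(p_xy, p_x, p_y):
    """Pointwise mutual information of a (term, label) pair, in bits."""
    return math.log2(p_xy / (p_x * p_y))

def lmi(p_xy, p_x, p_y):
    """Local mutual information: the pair's contribution to MI, p(x,y) * PMI."""
    return p_xy * pmi(p_xy, p_x, p_y)

p_y = 0.01  # a rare label

# A rare term that only ever appears with the rare label...
print(pmi(3e-5, 3e-5, p_y), lmi(3e-5, 3e-5, p_y))   # ~6.64 bits, ~0.0002
# ...versus a frequent term only mildly over-represented with that label.
print(pmi(2e-3, 5e-2, p_y), lmi(2e-3, 5e-2, p_y))   # ~2.00 bits, ~0.0040
# PMI ranks the rare fluke first; LMI ranks the frequent, dataset-wide bias first.
```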
Batch-GPT: We Slashed Our OpenAI API Costs by Over 50%
We created a drop-in service that cuts OpenAI costs while keeping your code unchanged. 💸🔌 Perfect for thrifty AI developers! 🧠💰
Oct 11
PyTorch Lightning ⚡️: You’re probably using the wrong metric for early-stopping or model…
While using the PyTorch Lightning Trainer API, we monitor some metric for early-stopping, model checkpointing, etc. An example…
Apr 15, 2023
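A minimal sketch of the pattern in question: whatever key you pass as `monitor` must be a metric your LightningModule actually logs via `self.log(...)`, and `mode` must match its direction. The metric name "val_f1" and the hyperparameters here are placeholders.

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

# "val_f1" must match a key logged in the LightningModule,
# e.g. self.log("val_f1", f1) inside validation_step.
early_stop = EarlyStopping(monitor="val_f1", mode="max", patience=3)
checkpoint = ModelCheckpoint(monitor="val_f1", mode="max", save_top_k=1)

trainer = Trainer(callbacks=[early_stop, checkpoint], max_epochs=20)
# trainer.fit(model, datamodule=dm)  # model and dm are placeholders
```

Note that `mode="max"` is correct for a score like F1; monitoring a loss with the default `mode="min"` silently does the wrong thing if you swap in a score metric.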
Load pre-trained GloVe embeddings in torch.nn.Embedding layer… in under 2 minutes! 🏎🔥
A no-nonsense tutorial for loading pre-trained GloVe word embeddings into a torch.nn.Embedding layer
Apr 25, 2021
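The gist in a few lines: parse the GloVe text file into a weight tensor aligned with your vocabulary, then hand it to `nn.Embedding.from_pretrained`. The vocabulary and the file path below are assumptions for illustration.

```python
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "the": 1, "cat": 2}   # placeholder vocabulary
emb_dim = 100
weights = torch.zeros(len(vocab), emb_dim)

# Each line of a GloVe file is: word v1 v2 ... v_dim
with open("glove.6B.100d.txt", encoding="utf-8") as f:   # path is an assumption
    for line in f:
        word, *vec = line.rstrip().split(" ")
        if word in vocab:
            weights[vocab[word]] = torch.tensor([float(v) for v in vec])

embedding = nn.Embedding.from_pretrained(weights, freeze=False, padding_idx=0)
```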
Published in Towards Data Science
Visualize BERT sequence embeddings: An unseen way
Exploring an unseen way of visualizing sequence embeddings generated across BERT’s encoder layers (Python notebook included)
Jan 1, 2021
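The core mechanic this relies on, sketched with a made-up input sentence: ask the encoder for all hidden states, then pool each layer into a single sequence vector so the layers can be projected and plotted.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

inputs = tok("The movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.hidden_states: tuple of (embedding layer + 12 encoder layers),
# each of shape (1, seq_len, 768). Mean-pool tokens per layer.
layer_vecs = torch.stack([h.mean(dim=1).squeeze(0) for h in out.hidden_states])
print(layer_vecs.shape)  # torch.Size([13, 768]) — ready for PCA/t-SNE plotting
```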
Published in Towards Data Science
Is your NLP dataset suffering from bias?
Your first step towards understanding dataset biases in NLP and what it means to have a biased dataset
Dec 19, 2020
Published in Towards Data Science
🤗Transformers: Retraining roberta-base using the RoBERTa MLM Procedure
In this tutorial we will retrain a RoBERTa LM on a more personalised, downstream-oriented dataset.
Dec 13, 2020
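In outline, the standard 🤗 Transformers MLM recipe this follows: a masked-LM head on roberta-base, a data collator that applies RoBERTa-style dynamic masking, and the Trainer. The corpus, output directory, and hyperparameters below are placeholders.

```python
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

texts = ["your domain-specific text goes here",    # placeholder corpus
         "more in-domain sentences"]
tokenized_dataset = [tok(t, truncation=True, max_length=128) for t in texts]

# Dynamic masking: 15% of tokens are masked on the fly, per batch.
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm=True,
                                           mlm_probability=0.15)

args = TrainingArguments(output_dir="roberta-retrained", num_train_epochs=3,
                         per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args, data_collator=collator,
                  train_dataset=tokenized_dataset)
# trainer.train()
```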