Argentinian Insult Generator

A humorous NLP experiment created during the 2022 World Cup. Generates unique, culturally accurate Argentinian insults using Markov chains and Twitter data.

Python · NLP · Markov Chains · Web Scraping · Fly.io · Flask
Insults in DB: 200+

World Cup Wins: 1 🇦🇷

# Overview

Created specifically for the Qatar 2022 World Cup craze, this project started as a fun experiment to capture the passion and linguistic creativity of Argentinian football fans.

We scraped tweets posted during the matches and curated a dataset of over 200 authentic insults. The system trains a lightweight language model and Markov chains on these samples to generate new, grammatically coherent (and hilarious) combinations that still sound authentic.

It became a viral hit among friends during Argentina's run to the championship.

Problem

During the World Cup, we wanted to celebrate the unique 'folklore' of Argentinian football fandom, especially the fans' creative use of language in banter, and we wanted to do it through code and automation.

Solution

I built a web application backed by a probabilistic text generation model. By feeding it a curated dataset of real tweets, the algorithm learns word probabilities and sentence structures to construct new, never-before-seen insults that maintain the specific cadence and slang of the region.
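As a rough illustration of the idea described above, here is a minimal word-level Markov chain in plain Python. It is a sketch under assumptions, not the project's actual code: the function names `build_chain` and `generate`, the `START`/`END` sentinel tokens, and the `state_size` parameter are all hypothetical, and the placeholder sentences in `corpus` are invented for the example rather than taken from the real dataset.

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking sentence boundaries.
START, END = "<s>", "</s>"

def build_chain(sentences, state_size=2):
    """Map each state (a tuple of `state_size` words) to the words seen after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] * state_size + sentence.split() + [END]
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            # Repeated followers are kept on purpose: picking uniformly from
            # this list samples each follower in proportion to its frequency.
            chain[state].append(words[i + state_size])
    return chain

def generate(chain, state_size=2, max_words=30, rng=random):
    """Walk the chain from the start state until END or `max_words`."""
    state = (START,) * state_size
    output = []
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:  # empty corpus or unknown state
            break
        next_word = rng.choice(followers)
        if next_word == END:
            break
        output.append(next_word)
        state = state[1:] + (next_word,)
    return " ".join(output)

# Invented placeholder sentences, NOT samples from the project's dataset.
corpus = [
    "sos un desastre jugando",
    "sos un queso jugando al fútbol",
    "no le pegás ni a un arco de frente",
]

chain = build_chain(corpus, state_size=1)
print(generate(chain, state_size=1))
```

With `state_size=1` the walk can splice sentences together at shared words such as "jugando" or "un"; with a larger state size on a corpus this small it can only reproduce the training sentences verbatim.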

# Features

  • Custom NLP engine based on Markov Chains
  • Curated dataset of 200+ authentic slang tweets
  • Instant text generation with 'Copy to Clipboard'
  • Lightweight, server-side rendering for speed
  • Minimalist, football-themed UI

# Screenshots

Generator Interface

Minimalist interface generating unique combinations of slang.

Example Output

An example of a generated phrase capturing the local dialect.

Challenges

  • Cleaning and normalizing Twitter data (slang, typos, abbreviations) to ensure coherent output.
  • Tuning the Markov chain 'state size' to balance between copying phrases and generating nonsense.
  • Deploying a Python application with minimal latency on a free-tier PaaS.
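The data-cleaning challenge could look something like the sketch below. It assumes a simple regex-based pipeline, which may differ from what the project actually does; `normalize_tweet`, `clean_corpus`, and the tiny `ABBREVIATIONS` table are hypothetical names and values chosen for illustration.

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character

# Hypothetical abbreviation table; a real one would be much larger.
ABBREVIATIONS = {"q": "que", "xq": "porque", "tmb": "también"}

def normalize_tweet(text):
    """Strip Twitter-specific noise and expand a few common abbreviations."""
    text = unicodedata.normalize("NFC", text)  # consistent accent encoding
    text = URL_RE.sub(" ", text)               # drop links
    text = MENTION_RE.sub(" ", text)           # drop @mentions
    text = HASHTAG_RE.sub(r"\1", text)         # keep the hashtag word, drop '#'
    text = text.lower()
    text = REPEAT_RE.sub(r"\1", text)          # "desastreeee" -> "desastre"
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)

def clean_corpus(tweets):
    """Normalize every tweet, dropping empty results and exact duplicates."""
    seen, cleaned = set(), []
    for tweet in tweets:
        normalized = normalize_tweet(tweet)
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned

print(normalize_tweet("@juan Q DESASTREEEE xq no corre!!! https://t.co/xyz #Mundial"))
# → que desastre porque no corre! mundial
```

Collapsing only runs of three or more characters leaves legitimate doubles such as "ll" and "rr" intact, at the cost of occasionally over-shortening an elongated double letter.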

Learnings

  • Fundamentals of Natural Language Processing and probabilistic models.
  • Techniques for web scraping and data sanitization.
  • Deployment workflows with Docker and Fly.io.
  • How simple algorithms can effectively mimic complex cultural linguistic patterns.

Tech Stack

NLP/AI

Markov Chains · Python · NLTK · Custom Micro-LLM

Backend

Flask · Python

Data

Twitter Scraper · JSON

Deployment

Fly.io

Collaborators

Alvaro Galisteo

Co-creator

Quick Info

Status: Live

Started: November 2022

Completed: December 2022